Speakers of different languages follow a three-way split in how they express motion events in speech, with greater emphasis on manner in satellite-framed languages (English), greater emphasis on path in verb-framed languages (Turkish), and comparable expression of manner and path in equipollently-framed languages (Chinese). According to the thinking-for-speaking account, these language-specific patterns can affect speakers’ representation of motion events, but only when they verbalize the event. In this study, we asked whether language might influence the learning of novel words, particularly when the words are accompanied by gestures. We examined the effects of language type (equipollently-framed: Chinese; satellite-framed: English; verb-framed: Turkish) and modality (speech-only, gesture+speech) on learning pseudowords for motion (manner, path). Our results showed that speakers of all three languages learned pseudowords for manner and path, but Chinese speakers showed lower accuracy scores and slower rates of learning. Regardless of the language they spoke, participants learned manner words more accurately than path words, with no added benefit of instruction with gesture+speech over speech-only. Taken together, our study extends the lack of a language effect on nonverbal event representation, found when speakers are not verbalizing the event, to the domain of novel word learning across structurally different languages.