The performance of language is multimodal, not confined to speech. A review of monkey and ape communication demonstrates greater flexibility in the use of hands and body than in vocalization. Nonetheless, the gestural repertoire of any group of nonhuman primates is small compared with the vocabulary of any human language and thus, presumably, with that of the transitional form called protolanguage. We argue that it was the coupling of gestural communication with enhanced capacities for imitation that made possible the emergence of protosign, which provided essential scaffolding for protospeech in the evolution of protolanguage. Accordingly, we argue against a direct evolutionary path from nonhuman primate vocalization to human speech. The analysis refines aspects of the mirror system hypothesis on the role of the primate brain’s mirror system for manual action in the evolution of the human language-ready brain.