Published online by Cambridge University Press: 05 December 2012
The recognition that speech formulas play a role in first language acquisition—that children reuse sequences of words taken directly and seemingly unanalyzed from the input—goes back to the earliest days of the field. Until fairly recently, however, such formulaic language was considered part of an early and soon-superseded stage of development. The last decade has seen the rise of a perspective on language development in which such formulas are central to language acquisition across development. According to this perspective, which is often known as the usage-based theory of language development, acquisition begins when children identify, infer a communicative function for, and start to utilize pieces of language of different sizes (single words and multiword sequences). Generalization, and as a result grammar, is an emergent property resulting from the ongoing coexistence of such sequences in a shared representational space. The growth in popularity of such an account, which represents a radical break from traditional models of grammatical development, has resulted in large part from the appearance of very large corpora of child–caregiver interactions. Such corpora have supported a new understanding of the challenges and opportunities facing the learner, as well as allowing new naturalistic analyses of children's productions and the creation of stimuli for experiments, all of which offer considerable support for the usage-based position. This article offers a review of these developments.
Ambridge, B., & Lieven, E. (2011). Child language acquisition: Contrasting theoretical approaches. Cambridge, UK: Cambridge University Press.
This textbook provides an evidence-based review of the central issues in first-language acquisition research, including many of the examples discussed in this article.
Bannard, C., & Matthews, D. (2008). Stored word sequences in language learning: The effect of familiarity on children's repetition of four-word combinations. Psychological Science, 19, 241–248.
This article reports on a repetition experiment in which two- and three-year-old children were found to be better (fewer errors, shorter duration) at producing the first three words of frequent four-word sequences (e.g., a drink of milk) than they were at producing the same three words when part of infrequent sequences (e.g., a drink of tea). It provides the first experimental evidence that young children have dedicated representations for frequent multiword sequences.
Matthews, D., & Bannard, C. (2010). Children's production of unfamiliar word sequences is predicted by positional variability and latent classes in a large sample of child directed speech. Cognitive Science, 34, 465–488.
This article provides experimental evidence that two- and three-year-old children extract generalizations from the input when sequences of repeated words occur with points of significant variation (e.g., they extract the schematic phrase a piece of X when exposed to multiple sequences such as a piece of toast, a piece of string, etc.). The article also explores some distributional factors that appear to determine when children will identify such patterns.
Rowland, C. F. (2007). Explaining errors in children's questions. Cognition, 104, 106–134.
This article reports on a corpus analysis of the questions produced by 10 children. While a very large proportion of the error-free questions were structured around highly frequent formulas, errors occurred when children were required to deviate from such patterns. The article concludes that children's early questions are produced by reusing word sequences taken directly from the input.
Stoll, S., Abbot-Smith, K., & Lieven, E. (2009). Lexically restricted utterances in Russian, German, and English child directed speech. Cognitive Science, 33, 75–103.
This article describes an analysis of the word sequences children hear in three different languages and provides evidence that opportunities for formulaic learning abound even in languages with freer word order and richer morphology than English.