Introduction
Williams syndrome (WS) is a rare neurodevelopmental disorder occurring in about 1 in 7,500 to 1 in 20,000 live births and caused by a micro-deletion on one copy of chromosome 7, which results in atypical physical, cognitive and behavioural phenotypes (Kozel et al., 2021; Royston et al., 2019; Tassabehji et al., 1999). Individuals with WS have been described as presenting with relatively good vocabulary and phonological skills and relatively spared grammar in the face of weaker pragmatic skills and moderate-to-severe deficits in nonverbal tasks including problem solving, spatial and number cognition and planning (Mervis & John, 2010). Reports have also highlighted relatively good performance on complex language structures such as passives, negations, and conditionals (Bellugi et al., 1988, 1994), inflections and derivations (Clahsen & Almazan, 1998) as well as increased use of narrative enrichment devices (Bellugi et al., 1994). Such observations made WS a popular example supporting the idea of an independent “language module” containing abstract grammatical representations (Pinker, 1999; Zukowski, 2005).
This assumption of a relative strength in language has been challenged by studies arguing that language in individuals with WS is either delayed or less developed than expected for their mental age. For example, individuals with WS produced more errors in the areas of lexical selection, word order, gender agreement and verb inflections, and showed poorer grammatical comprehension compared to neurotypical controls (Karmiloff-Smith et al., 1997; Volterra et al., 1996). Individuals with WS also performed similarly to or less well than populations with comparable intellectual skills such as Down syndrome on processing wh-questions and passives (Joffe & Varlokosta, 2007). Furthermore, children with WS showed no verbal advantage over children with developmental language disorder on standardized tests or a narrative task (Stojanovik et al., 2004). Thomas et al. (2001) reported that participants with WS had difficulties generalizing past tense rules to novel verbs, often omitting obligatory inflections. Expressive language has been described as stylistically different, featuring atypical vocabulary, stereotyped phrases, idioms, overfamiliar language and excessive use of social evaluative devices including prosodic cues and dramatic narrative elements (Reilly et al., 2004; Thomas et al., 2006; Udwin & Yule, 1990), although these findings have not always been replicated (Crawford et al., 2008; Stojanovik & van Ewijk, 2008).
The neuroconstructivist explanation has been that rather than accessing a preserved language module, individuals with WS acquire language in a qualitatively different manner (Karmiloff-Smith et al., 2012; Levy & Eilam, 2013; Thomas et al., 2001). Pointing and categorization emerge late relative to lexical acquisition (Laing et al., 2002), and the developmental trajectory has been characterized by a stronger correlation between grammatical capacity and verbal working memory (Robinson et al., 2003). One proposal is that language processing in WS relies less on grammatical and lexical-semantic information and more on shallow acoustic and phonological features, with a bias towards imitation (Thomas & Karmiloff-Smith, 2003). An individual with WS may produce phrases or utterances because they heard them before and represent them as one unit, and not because they use abstract grammatical information to combine individual words and morphemes. This more shallow production may give the appearance of unaffected processing.
Such explanations call for the distinction between analytic and holistic (or gestalt) processing, which is rooted within usage-based linguistics (e.g., Jackendoff, 2003; Langacker, 1987; Tremblay & Baayen, 2010). Analytic processing uses abstract representations of phrasal structures to combine individual words and morphemes and is able to generate novel expressions. Holistic processing, on the other hand, involves learning and retrieving word combinations, such as single phrases, but also entire sentences, as a single unit (a formula). Imitation can be driven by holistic representations; however, holistic forms are also an essential part of everyday language use (van Lancker Sidtis & Rallon, 2004). Theories such as construction grammar (Goldberg, 2006, 2019) suggest that everyone uses both analytic and holistic processing, with the contribution of each changing depending on situational requirements. While proficient speakers have the capacity to analyze these utterances, in principle holistic phrases can be used without grammatical and lexical-semantic interpretation of their individual constituents.
It has been proposed that in typical development, children first primarily employ holistic processing, resulting in conservative and repetitive production of language formulas, and only later acquire more abstract grammatical representations that, along with lexical growth, enable more creative and flexible language (Bannard & Matthews, 2008; Lieven et al., 2009). Faster and more accurate processing of formulaic language in adults suggests that holistic knowledge remains relevant even after maturation (Conklin & Schmitt, 2012; Tremblay & Baayen, 2010). Formulaic language is also more likely preserved in people with neurological conditions such as aphasia or dementia (van Lancker Sidtis, 2012; Zimmerer et al., 2018, 2020). Because formulas can be long and morphologically rich, they give the impression of islands of intact grammatical knowledge, when instead they are likely either well-trained combinations or lexicalized multiword sequences.
Since one predictor of holistic representations is frequency of (co-)occurrence of words in everyday language use, acquisition of formulaic phrases is supported by sensitivity to statistical patterns in language. In WS, this sensitivity has been demonstrated in artificial language learning studies. Using the word segmentation learning paradigm, Cashon et al. (2016) demonstrated that 20-month-old infants with WS could distinguish words from part-words after brief familiarization with statistically structured syllable sequences. Stojanovik et al. (2018) found that participants with WS prefer statistical representations to more abstract grammatical rules. They compared processing biases in artificial language learning performance between participants with WS, mental-age matched typically developing (TD) children, and chronological-age (CA) matched TD individuals. In a brief familiarization phase, participants listened to spoken syllable sequences generated by a simple Markov grammar. It was explained that sequences were magic spells, and they were presented along with a cartoon magician. In the test phase, participants distinguished between correct and incorrect “spells” based on what they learned from the familiarization set. Participants with WS and younger, mental-age matched TD children preferred sequences that resembled exemplars from familiarization. CA-matched TD participants, on the other hand, accepted sequences that were grammatical, regardless of familiarity, demonstrating their ability to acquire more abstract grammatical knowledge. These results suggest that TD individuals switch from familiarity- to rule-based processing in their development while, in individuals with WS, the bias towards familiarity may remain for much longer.
Based on current evidence, one could hypothesize that natural language processing in WS would also be atypically biased towards co-occurrence of specific words and acquisition of holistically processed, formulaic language. Do individuals with WS produce more familiar language? In the current study we examined statistical properties of words and word combinations in narrative samples of individuals with WS, whom we compared with CA-matched and language-age (LA) matched TD controls to identify both delays and atypical trajectories.
We determined the usage-frequency of each word in a narrative sample, using the spoken section of the British National Corpus (BNC XML Edition, 2007) as reference. Higher frequency indicates that a word is more common in typical language use, which has been associated with ease of production. We analyzed word combinations by determining their collocation strength, again using the BNC as reference. Collocation strength shows how often words appear together, relative to how often each occurs in general. Collocation strength is therefore not merely a function of frequency. For example, I go is a more frequent bigram than I tell according to the BNC; however, the collocation strength of the latter is seven times as high, because when the words I and tell occur, they more likely appear together relative to appearing in other contexts. We extracted these variables using the Frequency in Language Analysis Tool (FLAT), a script which had previously been employed to study statistical properties of language production in adults with stroke aphasia (Zimmerer et al., 2018) and neurodegeneration (Zimmerer et al., 2020).
This study is a secondary analysis of language transcripts collected by Stojanovik et al. (2007), originally for their investigation of intonation. They found that the ability of participants with WS to produce and understand intonation was poorer than that of CA-matched controls, but mostly in line with that of LA-matched controls. A subsequent study by Stojanovik and van Ewijk (2008) investigated lexical production. The authors report that individuals with WS did not differ from controls with regard to lexical diversity and the number of low frequency words produced.
We investigated additional features in order to present a broader profile and context and to address our questions regarding holistic vs. analytic processing. Our analysis includes two dimensions of language production: (1) complexity, which includes mean length of utterance (MLU), proportion of complex sentences, verb phrase complexity, and morphosyntactic errors, and (2) familiarity, which includes a more comprehensive measure of lexical frequency than the one used by Stojanovik and van Ewijk (2008), and the collocation strength of word combinations. We predicted that CA-matched controls would produce the most complex and least familiar language, and that individuals with WS would produce the least complex and most familiar language. Our statistical analyses, however, tested broader hypotheses, namely that groups would differ from one another.
Methods
Participants
Twelve children with WS (9 female, 3 male; mean age = 9.05 yrs) were recruited through the Williams Syndrome Foundation (UK). Ethnicity was not a recruitment criterion; however, all participants taking part in the study were white. Diagnosis was confirmed by a positive fluorescent in situ hybridization (FISH) test. Children’s language skills were tested using the Test of Reception of Grammar (TROG-2; Bishop, 2003). The TROG-2 is a sentence-picture matching test, and sentences were read to the participants by the experimenter. The test contains a variety of unfamiliar sentences of increasing grammatical complexity, including subject-verb-object, subject-verb-adjunct, spatial prepositional phrases, sentences with pronouns, passive constructions and center-embedded clauses. Sentences and distractor images are designed in a way that the participant needs to interpret the grammatical structures to perform well. Non-verbal reasoning was assessed using Raven’s Coloured Progressive Matrices (RCPM; Raven, 1984).
Children with WS were matched to two TD control groups (Table 1). Fourteen LA-matched controls (12 female, 2 male; mean age = 5.78 yrs) did not differ from the WS group on the TROG-2, t(24) = -.297, p = .769, but were significantly younger, t(24) = 4.739, p < .001. Fifteen CA-matched controls (13 female, 2 male; mean age = 9.91 yrs) did not differ in age from the WS group, t(25) = -1.210, p = .237, but scored significantly higher on the TROG-2, t(25) = -14.389, p < .001. RCPM raw scores differed significantly across groups, F(2, 38) = 53.253, p < .001, with post-hoc tests identifying a significant difference between CA-matched controls and participants with WS, p < .001. LA-matched controls performed better than participants with WS, and that difference was close to the significance threshold, p = .063.
Procedure
Children had been asked to generate a story using the wordless picture book ‘Frog, where are you?’ (Mayer, 1965). Samples had been orthographically transcribed using the Systematic Analysis of Language Transcripts (SALT; Miller & Chapman, 1985) and utterances had been segmented based on the conventions presented by Crystal et al. (1976). We manually annotated transcripts for features selected from the Northwestern Narrative Language Analysis (Thompson, 2013; see Appendix A for an example from this study). The features were sentence type (simple or complex; the latter defined by clause embedding or non-canonical word order), number and types of clause embedding, verb argument structure (number of arguments), and morphosyntactic errors. Annotators were blind to the participants’ group membership. We calculated interrater reliability by computing intraclass correlation coefficients for each variable, based on a subsample of 10 transcripts that were rated by a second annotator. Interrater reliability was satisfactory (sentence complexity: ICC(1,2) = .997; verb argument structure: ICC(1,2) = .945; grammatical errors: ICC(1,2) = .887). To investigate familiarity, usage-frequency was extracted for words and bigrams (two-word combinations) from the spoken subsection of the British National Corpus (BNC, 2007) using the Frequency in Language Analysis Tool (FLAT; Zimmerer et al., 2018, 2020). All words in each sample were included, as were all bigrams except for ungrammatical combinations and words separated by a sentence or utterance boundary. Based on these variables, we calculated the following measures:
Complexity
Mean length of utterance in words (MLU-w)
The number of word tokens divided by the number of utterances. MLU can be measured in words or morphemes; both variables correlate with one another very strongly in a number of languages including English, which has relatively few inflectional markers (Ezeizabarrena & Garcia Fernandez, 2018; Parker & Brorson, 2005).
Word count
Total number of words produced.
Sentence complexity
The number of complex sentences (i.e., containing non-canonical word order and/or clause embedding) divided by the total number of sentences. We excluded nominal sentences, i.e., sentences without a finite verb.
Verb argument structure
The number of verb arguments in each sample divided by the number of verb tokens.
Morphosyntactic errors
The number of grammatically incorrect utterances divided by the number of utterances (including abandoned utterances and nominal sentences).
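The complexity ratios defined above can be illustrated with a minimal sketch. The `Utterance` record, its field names, and the annotation values in `sample` are hypothetical and serve only for illustration; they are not the coding format or data of this study. The sketch assumes that every utterance counts as one sentence and that a sample contains at least one non-nominal sentence and one verb token.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Hypothetical annotation record; field names are illustrative only."""
    words: list[str]
    is_nominal: bool = False         # sentence without a finite verb
    is_complex: bool = False         # clause embedding or non-canonical order
    verb_args: tuple[int, ...] = ()  # argument count for each verb token
    has_error: bool = False          # contains a morphosyntactic error


def complexity_measures(utterances):
    """Compute word count and the four complexity ratios for one sample."""
    n_utt = len(utterances)
    word_count = sum(len(u.words) for u in utterances)
    # Nominal sentences are excluded from the sentence complexity ratio only.
    sentences = [u for u in utterances if not u.is_nominal]
    verb_tokens = sum(len(u.verb_args) for u in utterances)
    total_args = sum(sum(u.verb_args) for u in utterances)
    return {
        "word_count": word_count,
        "mlu_w": word_count / n_utt,
        "sentence_complexity": sum(u.is_complex for u in sentences) / len(sentences),
        "verb_argument_structure": total_args / verb_tokens,
        "error_proportion": sum(u.has_error for u in utterances) / n_utt,
    }


# Toy sample with made-up annotations.
sample = [
    Utterance("and he pushed him off the cliff".split(), verb_args=(2,)),
    Utterance(["splash"], is_nominal=True),
    Utterance("and he runned".split(), verb_args=(1,), has_error=True),
    Utterance("and when the boy woke up he saw that the jar was empty".split(),
              is_complex=True, verb_args=(1, 2, 1)),
]
measures = complexity_measures(sample)
```

Note that the error proportion uses all utterances as its denominator, including nominal sentences, whereas sentence complexity uses only sentences with a finite verb.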
Familiarity
Lexical frequency
FLAT determined the average frequency of content words (words with a strong semantic representation, e.g., “table”, “blue”, or “swim”) and function words (words with a primarily grammatical function, e.g., “the”, “she”, “what”) separately based on the BNC. Averages for each participant were calculated based on types, i.e., each unique word was only entered once.
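A type-based average of this kind can be sketched as follows. This is not FLAT's implementation: the frequency values and the function word list are invented placeholders rather than BNC data, and skipping words that are missing from the reference list is an assumption of the sketch.

```python
from statistics import mean

# Hypothetical reference frequencies; NOT actual BNC values.
TOY_FREQ = {"the": 40000.0, "and": 25000.0, "because": 1000.0,
            "boy": 200.0, "frog": 4.0, "jar": 6.0}
# Hypothetical closed-class list standing in for word-class labels.
FUNCTION_WORDS = {"the", "and", "because"}


def mean_type_frequency(tokens, freq=TOY_FREQ, function_words=FUNCTION_WORDS):
    """Average reference frequency over unique words (types), by word class."""
    types = set(tokens)  # each unique word enters the average only once
    known = [w for w in types if w in freq]  # assumption: skip unknown words
    function = [freq[w] for w in known if w in function_words]
    content = [freq[w] for w in known if w not in function_words]
    return {
        "function": mean(function) if function else None,
        "content": mean(content) if content else None,
    }


tokens = "the boy and the frog and the jar".split()
result = mean_type_frequency(tokens)
```

Because averages are computed over types, the three tokens of "the" in the toy sample contribute to the function word average only once.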
Bigram collocation strength
We followed the procedure from previous studies (e.g., Zimmerer et al., 2020). We calculated collocation strength for each bigram in a sample. For example, for the sentence The boy went to sleep, we analysed the collocation strength of the boy, boy went, went to and to sleep. We excluded ungrammatical bigrams from this part of the analysis (but counted these as morphosyntactic errors) and bigrams which crossed sentence or utterance boundaries. We also excluded immediate repetitions (e.g., the second and then in and then and then he got up). For quantifying collocation strength, we used t-scores (Gries, 2010), which, compared with the better known measure Mutual Information, do not inflate collocation strength when the frequency of the combination is low. We computed t-score averages for each participant based on bigram types, and only included bigrams with a frequency of one or more, as t-scores for bigrams with a frequency of zero cannot be computed.
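The following sketch illustrates bigram extraction and a t-score computation. It uses a commonly cited formulation of the collocation t-score, (O - E) / sqrt(O), where O is the observed bigram frequency and E = f(w1) f(w2) / N is the frequency expected if the two words were independent in a corpus of N tokens; whether FLAT uses exactly this formulation is an assumption. All counts are made up and are not BNC values, and the repetition filter is a simplified stand-in for the exclusion rule described above.

```python
import math


def bigrams(utterance_tokens):
    """Adjacent word pairs within one utterance, skipping immediate repeats."""
    pairs = list(zip(utterance_tokens, utterance_tokens[1:]))
    kept = []
    for i, pair in enumerate(pairs):
        # In "and then and then he", the pair at index 2 repeats the pair at
        # index 0, so the second "and then" is dropped (simplified rule).
        if i >= 2 and pair == pairs[i - 2]:
            continue
        kept.append(pair)
    return kept


def t_score(pair_freq, w1_freq, w2_freq, corpus_size):
    """Collocation t-score: (observed - expected) / sqrt(observed)."""
    if pair_freq < 1:
        return None  # cannot be computed for unattested bigrams
    expected = w1_freq * w2_freq / corpus_size
    return (pair_freq - expected) / math.sqrt(pair_freq)


# Made-up counts for illustration; NOT taken from the BNC.
N = 1_000_000
WORD_FREQ = {"the": 50_000, "boy": 400, "went": 900, "to": 25_000, "sleep": 100}
PAIR_FREQ = {("the", "boy"): 200, ("boy", "went"): 4,
             ("went", "to"): 400, ("to", "sleep"): 36}

pairs = bigrams("the boy went to sleep".split())
scores = {p: t_score(PAIR_FREQ.get(p, 0), WORD_FREQ[p[0]], WORD_FREQ[p[1]], N)
          for p in pairs}
```

In the toy data, went to receives a high score because it occurs far more often than the frequencies of went and to alone would predict, whereas boy went scores close to zero despite both words being attested together.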
Proportion of bigrams in BNC
We computed the proportion of bigrams produced by the individual which occur in the BNC, i.e., have a frequency of one or more, as another measure of familiarity. This variable works in conjunction with collocation strength in order to describe word combinations produced by the participant.
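As a minimal sketch, the proportion can be computed by checking each bigram against reference counts. The counts below are invented, and counting bigram types rather than tokens is an assumption of this sketch, since the description above does not specify the unit.

```python
def proportion_attested(sample_bigrams, reference_counts):
    """Share of unique bigrams with a reference frequency of one or more."""
    types = set(sample_bigrams)  # assumption: count types, not tokens
    attested = [b for b in types if reference_counts.get(b, 0) >= 1]
    return len(attested) / len(types)


# Made-up reference counts; NOT taken from the BNC.
REFERENCE = {("the", "boy"): 200, ("boy", "went"): 4}
sample_bigrams = [("the", "boy"), ("boy", "went"),
                  ("the", "boy"), ("went", "frogward")]
proportion = proportion_attested(sample_bigrams, REFERENCE)
```

Bigrams that fail this check are exactly those for which no t-score can be computed, which is why the two measures are reported together.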
Results
Analysis plan
Because of the novelty of this research, we tested bidirectional hypotheses. We compared group means for each dependent variable. The main effect of group was tested using one-way ANOVAs, followed by pairwise comparisons between groups. Bonferroni correction for the three pairwise comparisons sets an adjusted significance threshold of p = .017 (.05/3). We nevertheless mention all pairwise differences with p < .1 to highlight the respective variables’ potential for future work on the topic.
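The thresholds in this plan can be made explicit with a short sketch. The function name and category labels are hypothetical; the example p-values passed to it in the usage lines are pairwise values reported in the group comparisons of this study.

```python
ALPHA = 0.05
N_COMPARISONS = 3  # WS vs. CA, WS vs. LA, CA vs. LA

# Bonferroni: divide the nominal alpha by the number of comparisons.
adjusted_alpha = ALPHA / N_COMPARISONS  # approximately .017


def classify(p, adjusted=adjusted_alpha, mention_cutoff=0.1):
    """Label a pairwise p-value according to the analysis plan."""
    if p < adjusted:
        return "significant"
    if p < mention_cutoff:
        return "below .1 only"
    return "above .1"


labels = {p: classify(p) for p in (0.002, 0.036, 0.488)}
```

Under this rule, a difference with p = .036 is mentioned as potentially relevant for future work but is not treated as significant.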
Group comparisons
See Table 2 for a summary of group performance containing group averages, standard deviations, and main effects of group. LA-matched controls produced fewer words than CA-matched controls, and significance was close to the adjusted threshold (p = .022), while the difference between CA-controls and speakers with WS was close to the unadjusted threshold (p = .052). With regards to complexity, MLU-w showed that CA-matched controls produced longer utterances than both LA-matched controls (p = .002) and individuals with WS (p < .001). LA-matched controls produced longer utterances than individuals with WS, but this difference too was not significant according to the adjusted threshold (p = .036). CA-matched controls also produced more complex sentences than both LA-matched controls (p = .006) and individuals with WS (p = .001). The difference between WS and LA-matched individuals was not significant (p = .488). Groups did not differ on complexity of verb argument structure or the proportion of morphosyntactic errors.
Lexical familiarity effects were not significant for content words, though the difference between individuals with WS and LA-matched controls was notable (p = .058), as LA-matched controls produced less frequent content words. The effect was greater and significant for function words: individuals with WS produced more frequent function words than CA-controls (p < .001). The difference between CA- and LA-matched controls was close to the adjusted significance threshold (p = .024).
Collocation strength was the only variable on which individuals with WS were significantly different from both control groups. Word combinations were more strongly collocated in individuals with WS than in CA-matched (p = .003) and LA-matched (p = .005) participants. Control groups did not differ significantly on collocation strength.
Groups also differed in the proportion of bigrams that occur in the BNC, but the main effect reached only p < .1. It was driven by CA-matched controls producing fewer combinations that occur in the corpus than individuals with WS, a difference that did not meet the adjusted significance threshold (p = .03).
Relationship between language production and standardized testing measures
Post-hoc, we investigated how properties of “Frog Story” narrations related to TROG-2 and RCPM scores (Table 3). Overall, individuals with higher TROG-2 and RCPM scores produced longer samples, longer utterances, more complex sentences, less frequent function words, weaker collocations and fewer bigrams which occur in the BNC. However, because TROG-2 and RCPM scores were strongly correlated, one cannot confidently separate the contributions of these two predictors.
Discussion
Analysis of spontaneous language production in a narrative task revealed substantial and significant differences between individuals with WS, CA-matched controls, and younger LA-matched controls. Our data characterize language in individuals with WS as containing mostly grammatically correct, but short and syntactically simple utterances with a tendency to overuse familiar (strongly collocated) word combinations. In the context of previous studies and usage-based theories of language, we regard the results as evidence for language in WS being more dependent on holistic representations, which is likely the result of a bias towards statistical processing of word co-occurrence patterns rather than application of abstract grammatical knowledge. However, before considering the implication of this finding, we provide context for the other results. We interpret as evidence for delay a pattern in which individuals with WS differ from CA-, but not LA-matched controls, while we regard differences between the WS and LA-matched groups as evidence for divergent developmental trajectories.
Considering language complexity, the CA-controls produced longer utterances, and proportionally more sentences with non-canonical structures and embedded clauses, than both LA-controls and individuals with WS. These results are in line with previous findings that showed delays in the language of people with WS with regard to both MLU (Levy & Eilam, 2013) and sentence complexity (Reilly et al., 2004; Stojanovik et al., 2004).
We found no significant group differences for usage-frequency of content words. This finding supports views that lexical difficulties play a relatively small role in WS, and is corroborated by previous results which suggest that lexical diversity is also not affected in WS (Stojanovik & van Ewijk, 2008). However, we consider that lexical effects may be diminished by lexical constraints of the task, since all children described the same content (the “Frog story”). Investigations of spontaneous conversations can address this limitation.
We did find strong effects of WS on the frequency of function words. These are a crucial aspect of grammatical knowledge. Function word frequency is rarely investigated, but it appears that it can be used to characterize language production. Previously, Mok et al. (2022) found that younger TD children produced significantly more frequent function words than older children. We found the same age difference in our comparison between CA- and LA-controls, and while it did not meet Bonferroni-adjusted criteria for statistical significance, this finding is worth highlighting for further investigations. Importantly, the difference between individuals with WS and CA-matched controls was greater and significant and may provide another way of capturing grammatical deficits in WS. Data from studies on reading suggest that less frequent function words are more demanding (Ong & Kliegl, 2008). However, we do not understand exactly what makes less frequent function words (e.g., because) more difficult than more frequent words (e.g., and). Less frequent function words may occur in more complex sentence structures and propositional representations. We suggest future research could break our binary distinction between content and function words into further categories. Future projects may look further into variables related to lexical frequency, such as grammatical function, phonological complexity, and age of acquisition.
Collocation strength is the only measure on which individuals with WS differed significantly from both control groups, after correction for multiple comparisons and with large effect sizes. When individuals with WS combined words, they did so in ways which are more common, rather than in rare or novel ways. Group comparisons suggest that this may not be an effect of cognitive delay (younger and older controls did not differ from one another), but rather a substantial deviation from the typical trajectory. As reviewed in the introduction, data from artificial grammar learning suggest a stronger bias towards familiarity of stimuli in individuals with WS. We provide evidence that this bias is present in spontaneous language production and suggest that it shapes language organization at the cognitive level in individuals with WS. High collocation strength is one indicator that a combination is processed as a holistic, formulaic unit, with fewer demands on abstract, grammatical processes. We propose that while TD children switch from predominantly holistic to more analytic language, enabling greater combinatorial creativity, individuals with WS rely on familiar and more fixed constructions for longer (if not throughout life), at the cost of generative capacities. Learning of formulas can be supported not only by statistical processing, but also by processing of prosodic contour, found to be a relative strength in WS.
This explanation supports neuroconstructivist views, which propose that children with WS acquire language in a different way (Grant et al., 2002; Joffe & Varlokosta, 2007; Levy & Eilam, 2013; Thomas et al., 2001). The bias towards holistic processing may be present in other domains. For example, individuals with WS may process faces holistically (“globally”) rather than as a combination of individual features (Annaz et al., 2009; Tager-Flusberg et al., 2003). More studies on the relationship between holistic language and processing in other domains could contribute to establishing a fuller cognitive profile of WS.
Our study did not find a difference between the WS group and controls in the proportion of morphosyntactic errors, which contradicts previous results (Joffe & Varlokosta, 2007; Karmiloff-Smith et al., 1997). These contrasting results might be explained by the choice of the language elicitation task. Studies that indicate more erroneous language production in WS used tasks which constrained production to specific linguistic structures which were hypothesized to be difficult (Faitaki & Murphy, 2020). Our spontaneous narrative speech elicitation task did not constrain participants in such a way. Participants could have favoured selection of constructions they could produce with greater accuracy.
The relative lack of morphosyntactic errors in a narrative production would be a demonstration of how a reliance on familiar word combinations can mask possible language differences. Here, one could see parallels between WS and dementia. In early work on grammar in dementia, a lack of grammatical errors led to the conclusion that grammar was unimpaired (Kempler et al., 1987). Later studies found a decrease in grammatical complexity, and finally an overreliance on formulaic language, also detectable using collocation strength measures (Bates et al., 1995; Zimmerer et al., 2020). Familiar language naturally is unlikely to strike the listener as unusual, which explains why early studies, which did not focus on familiarity of language, did not reveal atypical patterns.
Our work also parallels suggestions that language development in autistic individuals may rely more on holistic processing (or “gestalt” processing), resulting in acquisition and use of phrases and utterances which formally suggest complexity but may be unanalysed (Noens & Berckelaer-Onnes, 2005). Such bias in processing may support production of connected language where analytical understanding of language is less developed, but may underlie phenomena like echolalia and inaccurate pronoun use.
One important limitation of the current study is sample size, which limits the power of our statistical models, particularly since substantial individual differences have been reported in WS (Brock, 2007). Research on larger samples would also enable more complex models to investigate interactions between variables. This issue is common in WS research because the syndrome is rare. One alternative to studies involving larger samples is replication using other, smaller samples available to individual labs. Public availability of samples from individuals with WS can also aid research, as WS is not well-represented in public language corpora. For example, the CHILDES database (MacWhinney, 2000) currently only features two transcripts of a Spanish-speaking child with WS. Unfortunately, sharing our samples publicly was not covered in the original ethical approval. Future studies may also choose to elaborate on measures of lexical frequency and grammatical function. Content and function words are very large categories which each contain words with very different semantic, grammatical, and discourse functions.
Research on language in WS has seen a shift away from theories which propose that WS offers evidence that a language “module” can function independently of other cognitive deficits. Our work suggests that our understanding of WS can be supported by frameworks which regard language processing as a combination of two types of representations: more abstract and analytic grammatical frames, which enable more creative and flexible language use, and holistic, fixed representations which are acquired by statistical learning and are cognitively less demanding. A bias towards these holistic representations may be related to general cognitive deficits in WS.
Competing interest
The authors declare none.
Appendix A. Linguistic levels, features, codes and example of sample coded transcript
-
1. The boy was watch/ing out for the owl.
I: [s]
II: [ss][as][e0]
V: [ob2xy]
-
2. And he call/ed ‘frog, where are you’.
I: [s]
II: [ss][con][e0][wqj]
V: [cxy][copyp]
-
3. And a deer hold/ed hold[EW:held] him on his head.
I: [*s][g]
II: [ss][as][e0]
V: [ob2xy]
-
4. And he run/ed run[EW:ran].
I: [*s][g]
II: [ss][as][e0]
V: [ob1x]
-
5. And he push/ed him off the cliff.
I: [s]
II: [ss][as][e0]
V: [phob2xy]
-
6. ’splash’!
I: [ns]
II: -
V: -
-
7. And when the boy woke up he saw that the jar was empty.
I: [s]
II: [cs][as][e2][ac][cc]
V: [op2x][cxs’][copyp]
Auxiliary verbs were not coded.