It has become uncontroversial that bilingual children’s proficiency in each language is influenced by the amount of exposure they experience in each language (Paradis, 2017; Unsworth, 2017). A growing body of research is now turning to the impact of qualitative aspects of language exposure (see, e.g., the double Special Issue dedicated to this topic by the Journal of Child Language; Blom & Soderstrom, 2020).
Qualitative aspects of the language environment can manifest themselves along several dimensions. Language “richness” has been taken to encompass structural complexity, lexical diversity, and the use of different registers (e.g., during literacy activities; see, e.g., Hart & Risley, 1995; Hoff, 2006; Paradis, 2011a; Rowe, 2008; Scheele, Leseman, & Mayo, 2010). Another dimension is the diversity of the language experience, indexed by the diversity of contexts of language interaction, or the number of conversation partners in each language (Place & Hoff, 2011). Footnote 1 Non-native input can be a factor too, depending on the proficiency level of the interlocutors (Hoff, Core, & Shanks, 2020; Paradis, 2011b; Unsworth, Brouwer, de Bree, & Verhagen, 2019) and on the type of interlocutor (e.g., second language input from older siblings has been shown to correlate with better language outcome measures in bilingual children; Sorenson-Duncan & Paradis, 2020). In sum, optimal input is likely to result from sufficient time spent with the relevant interlocutors and sufficient density, diversity and complexity of the actual language used during that time. Footnote 2
In this paper, I adopt a broad definition of input quality, taking into account any aspect of the linguistic environment that goes beyond input quantity as it is traditionally operationalized (i.e., as “time spent with interlocutors X, Y, Z,” which in fact only measures the opportunity for language interaction with those interlocutors, whether it is realized or not). Under this broad definition, input quality can include quantitative aspects of language exposure, such as the actual amount of linguistic interaction (e.g., caregivers’ verbal responsiveness; Hoff, 2006), density of language use (e.g., lexical or clausal density; Huttenlocher, Haight, Bryk, Seltzer, & Lyons, 1991), and other distributional properties of the input.
Socioeconomic status (SES) is often used as a proxy for the quality of language exposure. In the language acquisition literature, SES has been operationalized as parental education (most often maternal education), household affluence (estimated from parental occupation, entitlement to free school meals, or postcodes), or indices of deprivation. Most studies use a single measure of SES, but combined measures have been argued to be more informative, as they capture several aspects of the child’s environment (Gatt, Baldacchino, & Dodd, 2020). Low SES has a robust impact on cognitive development and language outcomes. In their review, Perkins, Finegood, and Swain (2013) highlight two potentially explanatory dimensions, stemming from the well-documented association between low SES and (a) high chronic stress (which impacts the cognitive control system underpinning language development) as well as (b) lower quality of the home environment in terms of literacy practices, parental language, and parenting styles (all of which have an impact on children’s language development).
In bilingual children, the presumed association between SES and language “richness” is likely to be mediated by parental education. The language in which parents have obtained a higher education diploma does itself correlate with their child’s development in that language (Hoff, Burridge, Ribot, & Giguere, 2017). High SES might also confer an indirect advantage through better quality of the child’s environment (e.g., better nutrition and medical care, reduced exposure to toxins and noise; Meir & Armon-Lotem, 2017; Perkins et al., 2013). It could also be associated with differences in social practices and attitudes toward education (Scheele et al., 2010) or degree of assimilation in the “majority” culture (which can lead to more opportunities for literacy and language-rich activities in the “majority” language; Pearson, 2007).
SES has been shown to correlate with bilingual children’s vocabulary size (Gatt et al., 2020; Gathercole, Kennedy, & Môn Thomas, 2016; Hoff, 2006), morphosyntax (Chiat & Polišenská, 2016; Meir & Armon-Lotem, 2017), and receptive grammar skills (Gathercole et al., 2016). The impact of SES is, however, varied. Using a composite measure of parental education and parental occupation, Gathercole et al. (2016) found differences across age groups: in the younger participants (preschool), the receptive language scores were more strongly correlated with quantitative aspects of language exposure, while in older participants (teens and older), SES had a stronger impact. Unsworth et al. (2019) also observed a differential impact of SES within preschoolers, depending on the aspect of language considered: maternal education predicted receptive vocabulary scores but not morphosyntax, semantic fluency, or sentence repetition. Another type of differential impact of SES was observed by De Cat (2020). Here, the relationship between parental occupation and proficiency in the language of schooling was modulated by the amount of cumulative exposure to that language in 5- to 7-year-olds: the more cumulative exposure to the school language, the stronger the correlation between SES and sentence repetition score.
From this body of research, it appears SES interacts in complex ways with quantitative aspects of language exposure and with different aspects of language competence. Is it possible to predict which aspect(s) of language competence might be affected by SES-related properties of the child’s language environment? This will depend on whether certain aspects of language are more sensitive to input. For instance, Tsimpli (2014) argues that aspects of language that involve grammar-external components (such as pragmatics, lexical knowledge, or working memory) develop beyond the age of 5 in monolinguals, and are sensitive to input effects in bilinguals. By contrast, language-internal phenomena (defined in terms of syntax and semantics) are generally acquired by monolinguals before the age of 5, and sensitive to age of onset effects in bilinguals (rather than input effects). Although Tsimpli defines input effects in terms of quantity of exposure, it is reasonable to assume that qualitative aspects will also have an impact.
How “input effects” are defined might itself have an impact, however. Is it frequency of current exposure, or cumulative exposure? In bilingual children, cumulative exposure to a language depends both on age of onset and on frequency of exposure. In their study of gender marking in Greek versus in Dutch by 5- to 10-year-old bilinguals (whose other language is English), Unsworth et al. (2014) demonstrated that age of onset was not sufficient to account for the different patterns of acquisition of Greek versus Dutch gender: the effect of age of onset was modulated by the cumulative quantity of exposure. By extension, simultaneous bilinguals might not turn out more proficient than successive bilinguals in language X if the amount of exposure to that language has been low. Unless one of the two dimensions is controlled (i.e., operationalized as a group-level factor with minimal variation within group), it is impossible to measure the effect of input versus age of onset separately.
The aim of this study is to probe the role of SES as a hypothetical proxy for input quality in bilingual children, and to explore its interplay with input quantity (operationalized as cumulative exposure, as justified above). I will focus on 5- to 7-year-old children with English as an additional language who are in monolingual English education in the United Kingdom, and who vary in terms of age of onset of English exposure, amount of English exposure, as well as SES.
The paper is articulated around two sets of questions:
1. What is the relationship between SES and quantity of exposure, as predictors of proficiency in the school language? Does it vary depending on how SES is operationalized?
2. Do the effects of language exposure and SES vary across different aspects of language proficiency? Is the effect of SES only detectable where language exposure has a significant impact?
The first set of questions will be investigated in three stages. First, the relationship between different operationalizations of SES will be explored, comparing bilinguals and monolinguals recruited from the same schools. This will allow us to identify associations and potential sociocultural confounds, and thereby inform the interpretation of any differences across SES operationalizations as predictors of language proficiency (later in the paper). Second, I will test the robustness of the nonlinear interaction between SES and cumulative language exposure (which had been observed in the same children using sentence repetition as outcome measure, and parental occupation as the SES measure): is this result replicated with different SES measures, using global proficiency as outcome variable? The alternative SES measures will be derived through different combinations of parental education, parental occupation, and indices of deprivation. Third, I will compare the informativity of alternative SES measures (as predictors of school language proficiency), to determine which measure is optimal for the present data set, that is, for a highly diverse sample.
The second set of questions will be explored through an item analysis of sentence repetition data. The respective impact of SES and cumulative language exposure will be probed across syntactic structures and across error types (i.e., lexical, inflectional or functional). Footnote 3 I will leave the robust evaluation of Tsimpli’s hypothesis for future research.
Based on the literature reviewed above, the hypotheses are as follows:
1. If SES indexes input quality, parental education should be a significant dimension of SES as a predictor of the child’s language proficiency.
a. Parental education alone might be the most informative predictor.
b. An SES measure that gives a stronger weight to parental education should be a more informative predictor than other measures.
2. The influence of input quality will necessarily be modulated by input quantity. SES is expected to have an impact only on those aspects of language proficiency that are affected by cumulative language exposure in the age group under consideration.
Method
This study is based on the secondary analysis of data which were collected as part of an investigation of the relationship between executive function skills (cognitive flexibility, inhibitory control, and working memory) and language experience in young bilingual children with unbalanced exposure to two languages, probing these children’s ability to make referential choices appropriate to their listener’s information needs (see Serratrice & De Cat, 2020). In this section, I describe the population sample and the measures from the original study that are relevant for the present purpose.
Participants
Ethical approval was obtained from the University of Leeds (Ref. PVAR 12-007) and parental consent was obtained prior to data collection. One hundred and seventy-four children between the ages of 5 and 7 participated in the study. The school language was exclusively English for all the children. Half of the children (N = 87) were also exposed to an additional language at home in varying degrees. There was a total of 28 home languages in our sample: Arabic (9%), Bengali, Cantonese, Catalan, Dutch, Farsi, French (8%), Greek, Hindi, Italian, Kurdish, Mandarin, Marathi, Mirpuri, Nepalese, Pashto, Polish, Portuguese, Punjabi (21%), Shona, Somali, Spanish (6%), Swedish, Tamil, Telugu, Thai, Tigrinya, and Urdu (17%). Footnote 4 Bilingual and monolingual children were recruited from the same schools (in the North of England) for maximum comparability. None of the children were excluded.
Table 1 summarizes the distribution of the two groups in gender and age. For ease of presentation, children with any amount of exposure to a language other than English are referred to as “bilinguals”; children who had no exposure to a language other than English are referred to as “monolinguals”. All children were reported by the school to be developing typically and did not have any known hearing deficits.
Environmental variables
Estimates of the amount of exposure to English were calculated on the basis of information gathered via parental questionnaires (through a simplified version of the BiLEC; Unsworth, 2013). Current exposure to English was calculated as a proportion of the child’s total interaction time (assumed to equate to waking hours). The total number of hours of interaction with each interlocutor was multiplied by the proportion of the time English was used with that interlocutor. These products were summed across interlocutors, and the sum was divided by the total number of hours of potential interaction time. Cumulative exposure to English was estimated as the number of months since onset of exposure to English, multiplied by the proportion of current exposure to English. Cumulative exposure to English therefore equates to the total number of months-equivalent of full-time English exposure. We also collected self-reported estimates of parental proficiency in English (on a 5-point Likert scale).
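The two exposure estimates described above can be illustrated with a minimal Python sketch. The class and function names, the weekly time unit, and the example child are all hypothetical; the questionnaire items themselves are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Interlocutor:
    hours: float          # hours of interaction with this interlocutor (assumed: per week)
    english_share: float  # proportion of that time spent in English (0 to 1)


def current_exposure(interlocutors, total_hours):
    """Proportion of the potential interaction time that takes place in English."""
    english_hours = sum(i.hours * i.english_share for i in interlocutors)
    return english_hours / total_hours


def cumulative_exposure(months_since_onset, current):
    """Months-equivalent of full-time English exposure."""
    return months_since_onset * current


# Hypothetical child: 40 h at school (all English), 50 h with a parent (20% English),
# out of 100 waking hours of potential interaction time.
people = [Interlocutor(40, 1.0), Interlocutor(50, 0.2)]
current = current_exposure(people, total_hours=100)
print(round(current, 2))                           # 0.5
print(round(cumulative_exposure(36, current), 1))  # 18.0
```

In this made-up case, 36 months since onset at 50% current exposure count as 18 months-equivalent of full-time English exposure.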
The SES of the children’s families was estimated on the basis of information gathered via a parental questionnaire. Three types of measures were collected.
(a) Parental occupation data was scored using the reduced method of the National Statistics Socioeconomic Classification of professional occupation (NS-SEC). The score obtained ranged from 2 (for the highest SES) to 13 in our data set. We reversed it for ease of interpretability (so that a higher score reflected a more privileged background). The distribution will be shown in Figure 3 later in the paper. (b) Parental education was documented on a 5-point scale (1 = no education, 2 = primary school, 3 = secondary school, 4 = further education, and 5 = university). (c) The index of deprivation risk was based on 5 indicators:
If there were two parents in the household, the score was calculated for each parent, and the lower risk of the two was retained. In our population sample, very few families had a low-SES risk of 4 (n = 4) or 5 (n = 2), so we collapsed them into Risk Level 3 (n = 11).
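The household-level rule can be sketched as follows, assuming each parent's risk score (0 to 5) has already been computed from the indicators. The function name and the `cap` parameter are hypothetical.

```python
def household_risk(parent_scores, cap=3):
    """Keep the lower risk score among the parents, then collapse scores above `cap`."""
    lowest = min(parent_scores)
    return min(lowest, cap)


print(household_risk([4, 1]))  # 1
print(household_risk([5]))     # 3
print(household_risk([4, 5]))  # 3
```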
English proficiency tests
Several measures of English language proficiency were collected to assess different aspects of language competence. The tests included (a) the LITMUS sentence repetition task Footnote 5 to probe morpho-syntax, (b) four lexical–semantic tests of the Diagnostic Evaluation of Language Variation: verb contrasts, preposition contrasts, real verb mapping, and novel verb mapping (Seymour, Roeper, & de Villiers, 2005), and (c) a discourse–semantic test: the Diagnostic Evaluation of Language Variation articles task. The distribution of proficiency scores is shown for each test in Figure 1. The results of the monolingual children are included for reference.
A composite measure of English proficiency was derived through a principal component analysis (PCA) of the three proficiency scores (sentence repetition, lexical semantics, and discourse semantics). PCA is a standard method of dimensionality reduction, which allows the three proficiency scores to be mapped linearly into a lower-dimensional space that retains as much of the variance in the data as possible. In the bilingual children, the three proficiency scores were strongly correlated (as shown in Figure 2), indicating good potential for reduction to a single dimension. Footnote 6
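To make the reduction to a single dimension concrete, here is a minimal sketch in plain Python. It assumes the three scores are standardized first and takes the leading eigenvector of their correlation matrix, found by power iteration. Both choices are assumptions of this sketch, as are the function names and the made-up scores; the text does not specify the software or settings used.

```python
import math


def standardize(column):
    """Center a list of scores and scale it to unit (sample) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]


def first_component(columns, iterations=200):
    """Return (loadings, scores) for the first principal component.

    `columns` holds one list of scores per test, all of equal length.
    """
    z = [standardize(c) for c in columns]
    n, k = len(z[0]), len(z)
    # Correlation matrix of the standardized scores
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(k)]
            for i in range(k)]
    # Power iteration converges to the eigenvector with the largest eigenvalue
    vec = [1.0] * k
    for _ in range(iterations):
        vec = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    # Composite score: projection of each child's standardized scores on that vector
    scores = [sum(vec[j] * z[j][row] for j in range(k)) for row in range(n)]
    return vec, scores


# Made-up scores for five children on three tests (illustration only)
srep = [10, 14, 18, 22, 30]
lexical = [12, 15, 17, 25, 28]
discourse = [5, 6, 9, 10, 14]
loadings, composite = first_component([srep, lexical, discourse])
```

When the three tests are strongly and positively correlated, as reported for the bilingual group, the loadings share the same sign and the composite orders children in much the same way as each individual test.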
Cognitive variables
Two cognitive measures were included as control variables, in an attempt to account for the cognitive demands of the proficiency tasks. The Forward Digit Span measure was used as a proxy for children’s short-term memory capacity (Wechsler Intelligence Scale for Children III; Wechsler, 1991). Cognitive flexibility was indexed by performance in the Dimensional Change Card Sort task, which was administered and scored following the protocol described in Zelazo (2006).
Which aspects of SES best predict school language proficiency
I now turn to the first set of research questions, starting with a comparison of different SES operationalizations.
Operationalizations of SES
SES can be operationalized in different ways, either as a simple or a composite measure. From the three simple measures listed above, composite measures were derived to encompass two dimensions at once.
Comparison between monolingual and bilingual children from the same schools reveals some interesting patterns of interaction between SES measures (both simple and complex). Within the bilingual group, we will see that SES is also associated with sociocultural characteristics.
Simple measures
Figure 3 shows the distribution of our population sample according to two simple SES measures: parental occupation and parental education. A notable difference between bilinguals and monolinguals in this data set is that, while parental education and parental occupation are strongly correlated in both groups (linear regression: β = 3.14, p < .0001), higher levels of education tend to be less predictive of occupation level in the bilingual households (β = –1.06, p = .045). Among those without higher education, higher occupation levels appear more accessible to parents from monolingual households than those from bilingual households. SES was also significantly higher in monolinguals than bilinguals at group level (occupation: t = 3.56, p = .0005; education: t = 3.52, p = .0006), in spite of having recruited children from the same schools.
Composite SES measures
Composite measures of SES were derived by combining two simple measures. The aim was to obtain measures with no more than eight levels, to maximize the chance of a sufficiently even distribution of children across categories, while allowing (hopefully) sufficient granularity to test for a nonlinear interaction with English exposure.
The first composite measure combines parental occupation with parental education. Occupation was operationalized by reducing the NS-SEC score to a four-level variable as shown in (2).
The two variables were crossed into an ordered factor nesting education into occupation, yielding the levels listed in Table 2 by ascending SES level. Parental education was simplified into a binary factor (higher education vs. no higher education). In this population sample, there were no families with low education and an NS-SEC score between –4 and –1, so that level was removed. The resulting levels are listed in column Education × Occupation in the table. As an alternative, the two variables were also crossed into an ordered factor nesting occupation into education, yielding the levels listed in column Occupation × Education.
The second set of composite measures of SES combined the index of deprivation risk and parental occupation. The deprivation risk index was crossed with a binary indicator of the level of occupation (with a threshold of 6 on the NS-SEC scale, which corresponds to occupations characterized by semiroutine or routine operations, without a significant technical or supervisory component).
The crossing of these two factors resulted in 8 categories (risk 0 to 3+ in “higher” vs. “lower” professions). Two categories were not represented in our population sample: those corresponding to higher employment with low-SES risk score of 2 or 3+. These were therefore removed from the levels of the composite variable. An alternative measure was obtained by nesting the risk index within occupation. The different resulting order of levels is shown in Table 3.
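The effect of nesting order on the ordering of levels can be sketched in Python. The level labels and the ascending-SES orderings below are assumptions made for illustration (the orderings in Table 3 are not reproduced here), and the function name is hypothetical.

```python
# Assumed ordering from lowest to highest SES on each dimension
RISK_BY_ASCENDING_SES = ["3+", "2", "1", "0"]
OCCUPATION_BY_ASCENDING_SES = ["lower", "higher"]

# Combinations not represented in the sample (higher occupation with risk 2 or 3+)
UNOBSERVED = {frozenset(("higher", "2")), frozenset(("higher", "3+"))}


def nested_levels(outer, inner):
    """Ordered levels with `inner` nested within `outer`, dropping unobserved cells."""
    return [(o, i) for o in outer for i in inner
            if frozenset((o, i)) not in UNOBSERVED]


# Risk nested within occupation: occupation is the primary ordering dimension
occupation_first = nested_levels(OCCUPATION_BY_ASCENDING_SES, RISK_BY_ASCENDING_SES)
# Occupation nested within risk: risk is the primary ordering dimension
risk_first = nested_levels(RISK_BY_ASCENDING_SES, OCCUPATION_BY_ASCENDING_SES)

print(occupation_first)
print(risk_first)
```

Both variants contain the same six observed combinations. Only their rank order differs, which is what gives more weight to one dimension or the other.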
Figure 4 shows the distribution of our population sample according to two of the composite SES measures. It reveals a greater proportion of monolinguals in the high-SES groups, consistent from the fourth SES level upward in the left-hand figure (where SES is operationalized as Occupation × Education). The disparity is less distributed across SES levels in the Risk × Occupation operationalization, which gives much less weight to education. Consistent with the pattern detected from the simple measures in Figure 3, education seems to be a less robust predictor of professional occupation in parents from multilingual households.
Furthermore, the family’s SES was significantly higher if one of the parents was a native speaker of English (ordinal regression: β = 1.5, p = .001). There was also a modest but significant association between SES (operationalized as Occupation × Education) and self-reported maternal proficiency in English (Spearman’s rank correlation ρ = 0.25, p = .02). However, there was also a significant correlation between maternal English proficiency and frequency of English use by the mother when addressing her child (Pearson’s product–moment correlation r = .38, p = .0003). Children in this population sample were therefore not very likely to have been exposed to a substantial amount of “poor” English at home. Maternal proficiency in English was never a significant predictor of children’s English proficiency in this study. Footnote 7
Another noteworthy association is that between SES and ethnicity (Figure 5), which was significant whether ethnicity was operationalized as home language (Pearson’s chi-squared test with simulated p value [based on 2,000 replicates] = 272.71, p = .002) or as ethnic category Footnote 8 (Pearson’s chi-squared test with simulated p value [based on 2,000 replicates] = 37.39, p = .01).
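A simulated p value of this kind is useful when many cells of the contingency table are sparse, as is inevitable with 28 home languages. The sketch below shows one way to obtain it in plain Python, by permuting one variable's labels so that both margins stay fixed. It is an illustration under that assumption, not the code used for the analysis; the function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def chi_squared(xs, ys):
    """Pearson chi-squared statistic for two parallel lists of category labels."""
    n = len(xs)
    row_totals, col_totals = Counter(xs), Counter(ys)
    observed = Counter(zip(xs, ys))
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


def simulated_p_value(xs, ys, replicates=2000, seed=0):
    """Monte Carlo p value: share of label permutations at least as extreme as observed."""
    rng = random.Random(seed)
    observed_stat = chi_squared(xs, ys)
    shuffled = list(ys)
    exceed = 0
    for _ in range(replicates):
        rng.shuffle(shuffled)  # keeps both margins fixed
        # Small tolerance guards against floating-point noise in tied statistics
        if chi_squared(xs, shuffled) >= observed_stat - 1e-9:
            exceed += 1
    # The observed table counts as one of the replicates
    return observed_stat, (exceed + 1) / (replicates + 1)


# Toy data with a perfect association between two made-up categories
group = ["a"] * 10 + ["b"] * 10
level = ["x"] * 10 + ["y"] * 10
stat, p = simulated_p_value(group, level)
```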
Testing the interaction between cumulative language exposure and SES
The original analysis reported in De Cat (2020) identified a nonlinear interaction between cumulative English exposure and SES (operationalized as parental occupation on a fine-grained scale), as predictors of English proficiency (indexed by the LITMUS sentence repetition test). This was demonstrated through an item-analysis, using a generalized additive mixed model (GAMM; Wood, 2006) with the R-package “mgcv” (version 1.8–28). GAMMs model linear effects (through parametric terms) as well as nonlinear effects (through smooth terms). The nonlinear interaction is shown in Figure 6: at low levels of cumulative English exposure, high SES does not confer an advantage; the higher the cumulative English exposure, the stronger the SES advantage.
To ascertain the robustness of this original finding, the analysis is reproduced here on the same group of children, but using a global measure of English proficiency as outcome variable, and (in turn) each of the alternative SES measures derived for the present population sample. As there is a single proficiency score for each child, we fitted a GAMM without random effects for participant or item. All predictors were scaled and treated as numeric. To identify the optimal model, a bottom-up procedure was adopted: starting from the simplest model including only gender as a control variable, potential predictors were added one by one, and only those predictors that significantly enhanced the model fit (assessed through fREML comparison) were retained.
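The control flow of such a bottom-up procedure can be sketched in Python. The sketch is deliberately generic: `score` stands in for whatever fit criterion is compared (lower is better), a fixed `min_gain` threshold replaces the significance test on the fREML comparison, and the predictor names and score values are invented purely to exercise the function.

```python
def bottom_up(baseline, candidates, score, min_gain):
    """Add candidate predictors one by one, keeping those that improve the fit enough."""
    kept = list(baseline)
    best = score(tuple(kept))
    for predictor in candidates:
        trial = score(tuple(kept + [predictor]))
        if best - trial >= min_gain:  # improvement large enough: retain the predictor
            kept.append(predictor)
            best = trial
    return kept, best


# Invented fit scores for a handful of predictor sets (illustration only)
TOY_SCORES = {
    ("control",): 120.0,
    ("control", "x1"): 105.0,
    ("control", "x1", "x2"): 96.0,
    ("control", "x1", "x2", "x3"): 95.5,
    ("control", "x1", "x2", "x4"): 91.0,
}

kept, best = bottom_up(["control"], ["x1", "x2", "x3", "x4"],
                       score=TOY_SCORES.__getitem__, min_gain=2.0)
print(kept)  # ['control', 'x1', 'x2', 'x4']
print(best)  # 91.0
```

Here `x3` is dropped because it lowers the score by only 0.5, below the chosen threshold.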
This method was used to evaluate the effect of cumulative exposure to English, and of the different operationalizations of SES defined above, as well as age (in months) and cognitive abilities. The cognitive demands of the other two proficiency tests used to derive the global measure are greater than those of the sentence repetition test. In addition to short-term memory and working memory, I therefore tested for the effect of cognitive flexibility, operationalized as task switching (DCCS score). As a final step, a nonlinear interaction between SES and cumulative English exposure was included in the model.
Among the cognitive variables, only task switching was found to improve the model fit. Memory was therefore not included in the final model. Footnote 9 Age and cumulative English exposure are significantly correlated (Pearson’s product–moment correlation: r = .25, p = .02). The model excluding age had a better fit (fREML difference: 6) and was therefore preferred.
This first model (summarized in Table 4) confirms that SES, operationalized as a fine-grained measure of parental occupation, is a significant predictor of a general measure of English proficiency. Crucially, it also confirms the nonlinear interaction between SES and cumulative English exposure as an additive effect.
Note: Parametric terms include cumulative exposure to English (in months-equivalent), SES (as parental occupation), and cognitive flexibility (task switching score). Gender is included as a control variable. The smooth (nonlinear) term corresponds to the interaction of cumulative exposure to English and SES.
Next, models using the same formula were fitted using each of the alternative SES measures in turn, to ascertain whether these results are robust across the different operationalizations of SES. Table 5 shows that, across its different operationalizations, SES has a significant impact on English proficiency, both as a linear term (except in Models 2 and 3) and in nonlinear interaction with cumulative English exposure (including in Models 2 and 3).
Note: The parametric effect of SES is shown in columns 3–4, and the effect of its nonlinear interaction with cumulative English exposure is shown in columns 5–6.
Which of the alternative SES measures should be preferred? This can be estimated by comparing their informativity in the model of interest.
Comparing the informativity of SES measures
As expected, the six alternative measures of SES are strongly (and positively) correlated, which implies that there is substantial overlap in the information they each encapsulate. This is shown in Figure 7.
To assess the informativity of alternative predictors, the information-theoretic method described in Burnham and Anderson (2003) will be used. The Akaike weight of a model is based on the Akaike information criterion (AIC), which is an indicator of the trade-off between the accuracy and the complexity of a given model (i.e., a measure of the relative goodness of fit of the model to reality). AIC weights indicate the strength of the evidence in favor of a model in a particular set of competing models. The model with the highest AIC weight is taken as the one that best approximates the “true” process underlying the phenomenon under study, and the other models are evaluated in relation to that optimal model. The evaluation is based on delta values, which correspond to the difference in AIC between the best model in the set and a particular competitor model.
This method can be used for variable selection, that is, to determine which predictor variable has the greatest influence, among a set of competitors (Burnham & Anderson, 2003). This is done by summing the Akaike weights of variables across all the models where the variables occur. The competing variables are then ranked using these sums. The larger this sum of weights, the more important the variable is.
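The computation can be sketched in Python using the standard definitions: each model's delta is its AIC minus the lowest AIC in the set, and its weight is exp(-delta / 2) normalized to sum to 1 across the set. The AIC values, model names, and variable names below are invented for illustration.

```python
import math


def akaike_weights(aic_by_model):
    """Return (deltas, weights) for a set of competing models."""
    best = min(aic_by_model.values())
    deltas = {m: aic - best for m, aic in aic_by_model.items()}
    relative = {m: math.exp(-d / 2) for m, d in deltas.items()}
    total = sum(relative.values())
    return deltas, {m: r / total for m, r in relative.items()}


def variable_importance(weights, variables_by_model):
    """Sum the Akaike weights of all models in which each variable occurs."""
    importance = {}
    for model, weight in weights.items():
        for variable in variables_by_model[model]:
            importance[variable] = importance.get(variable, 0.0) + weight
    return importance


# Invented AIC values for three competing models
aics = {"m1": 100.0, "m2": 102.0, "m3": 110.0}
deltas, weights = akaike_weights(aics)

# Hypothetical mapping from models to the predictors they contain
contents = {"m1": ["ses_a"], "m2": ["ses_b"], "m3": ["ses_a", "ses_b"]}
importance = variable_importance(weights, contents)
```

A delta of 2 corresponds to a weight ratio of e (about 2.7) in favor of the better model, which is why small AIC differences translate into only modest differences in weight.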
As shown in Table 6, the model in which SES is operationalized as Education × Occupation has a 70% chance of being the best in the set. We can conclude that the most informative SES measure, as a predictor of English proficiency, is the one that gives the greatest weight to parental education.
Note: The weight indicates the probability that the model is the best one in the set. Delta is the AIC difference of a model compared with the best one.
This confirms the first hypothesis: if SES indexes input quality, parental education should be a significant dimension. Hypothesis 1a is however not supported: the single-component SES measure based on parental education was not the most informative as a predictor of English proficiency. Hypothesis 1b is supported: the composite measure assigning the greatest weight to parental education was the most informative as a predictor of English proficiency.
There was no significant effect of maternal or paternal proficiency in English on children’s English proficiency scores. This was the case whether SES was taken into account (β = 0.04, p = .24) or excluded from the model (β = 0.06, p = .13). Given the significant association between maternal proficiency in English and SES (Occupation × Education) reported above, the effect of suboptimal English input at home is likely to have been captured by the SES measure. However, it seems reasonable to conclude that parental English proficiency only had an indirect effect, if any.
Which aspects of language proficiency are affected by SES?
The second set of questions I will address is (a) whether SES affects different aspects of bilingual language acquisition differently, and (b) whether the effect of SES is only detectable where language exposure has a significant impact.
The design of the LITMUS Sentence Repetition test (SRep) lends itself well to the investigation of these questions (Marinis & Armon-Lotem, 2015). First, it comprises three levels of language difficulty, operationalized as syntactic complexity (such as embedding and syntactic movement) and semantic complexity (e.g., conditionals and negation). The levels are illustrated in Table 7. They correspond to three blocks in Experiment 1. I used the LITMUS scoring method, which focuses on the accuracy in reproduction of the target structure. This makes it possible to isolate the effect of structural complexity from other aspects of the task. Across levels, items are controlled for lexical complexity (in terms of word frequency and age of acquisition) and word length (except that Level 3 includes some slightly longer items). The levels are also comparable in terms of the number of full noun phrases and pronouns within each sentence type.
Second, errors are also coded independently of target structure accuracy. They focus on three domains: lexical, functional, and inflectional. Each word in each test item is classified as either lexical or functional. Lexical items include: nouns, verbs, adverbs, and adjectives. Functional items include: determiners, pronouns, wh-words, auxiliaries, modals, prepositions, complementizers, and conjunctions. Some words also feature inflectional morphemes, including: tense, finiteness, aspect (progressive), voice (participle –ed), and number. A single word can therefore be coded for two different aspects: selection accuracy (of the correct lexical or functional item) and inflection accuracy (if applicable). For each of the three domains above (lexical, functional, and inflectional), what counts as an error is any omission, substitution, or addition. The errors in (3)–(4) each highlight one of the domains (underlined), although some feature other errors too.
Performance on SRep items is therefore measured as accuracy of repetition of the target structure (scoring 0 or 1), and proportional accuracy in each of three domains: lexical, functional, and inflectional. Figure 8 shows the distribution of accuracy scores according to each of these domains in bilinguals and monolinguals.
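The domain scoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scoring script used in the study: the class and function names, the test item, and the child's response are all hypothetical, and errors by addition (which would need an extra convention) are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Word classes as listed in the text.
LEXICAL = {"noun", "verb", "adverb", "adjective"}
FUNCTIONAL = {"determiner", "pronoun", "wh-word", "auxiliary", "modal",
              "preposition", "complementizer", "conjunction"}


@dataclass
class CodedWord:
    """One target word as coded in a child's repetition (hypothetical format)."""
    category: str                         # word class of the target word
    selection_ok: bool                    # correct lexical/functional item produced?
    inflection_ok: Optional[bool] = None  # None if the word carries no inflection


def domain_accuracy(words: List[CodedWord]) -> Dict[str, Optional[float]]:
    """Proportion of correct codings per domain for a single test item."""
    correct = {"lexical": 0, "functional": 0, "inflectional": 0}
    total = dict.fromkeys(correct, 0)
    for w in words:
        if w.category in LEXICAL:
            domain = "lexical"
        elif w.category in FUNCTIONAL:
            domain = "functional"
        else:
            raise ValueError(f"unknown word class: {w.category}")
        total[domain] += 1
        correct[domain] += w.selection_ok
        # A word can also be coded for inflection, independently of selection.
        if w.inflection_ok is not None:
            total["inflectional"] += 1
            correct["inflectional"] += w.inflection_ok
    return {d: (correct[d] / total[d] if total[d] else None) for d in total}


# Hypothetical target: "The boy was chasing the dogs"
# Hypothetical response: "A boy was chasing the dog"
response = [
    CodedWord("determiner", selection_ok=False),               # The -> A
    CodedWord("noun", selection_ok=True),                      # boy
    CodedWord("auxiliary", selection_ok=True, inflection_ok=True),   # was
    CodedWord("verb", selection_ok=True, inflection_ok=True),        # chasing
    CodedWord("determiner", selection_ok=True),                # the
    CodedWord("noun", selection_ok=True, inflection_ok=False),       # dogs -> dog
]
scores = domain_accuracy(response)
```

In this made-up example, one of the three functional words and one of the three inflected words are wrong, while all lexical items are correctly selected; the target-structure score (0 or 1) would be assigned separately.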
A GAMM was fitted to bilingual children’s SRep target-structure-accuracy data. The model was built bottom-up, using model comparison (fREML) to identify the best fitting model at each step. The optimal model is summarized in Table 8. There was no statistical support for an interaction between difficulty level and cumulative English exposure. Footnote 11 There was a significant nonlinear interaction of SES and cumulative English exposure.
Note: Edf shows the estimated degrees of freedom (reflecting the wiggliness of the curve). All predictors are scaled and centered. Participant and item are included as random effects.
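As a rough guide to the general shape of such a model, a GAMM with the terms mentioned in the text could be written as below. This is a sketch under assumptions, not the fitted specification (which is the one summarized in Table 8): the logit link is assumed from the binary (0 or 1) scoring, and the symbols are notation introduced here for illustration, with y_ij the target-structure accuracy of child i on item j, Exp cumulative English exposure, f_1 to f_3 smooth functions, and u_i and v_j the participant and item random effects.

```latex
\operatorname{logit}\,\Pr(y_{ij} = 1) =
  \beta_0
  + \beta_{\mathrm{level}(j)}
  + f_1(\mathrm{Exp}_i)
  + f_2(\mathrm{SES}_i)
  + f_3(\mathrm{Exp}_i, \mathrm{SES}_i)
  + u_i + v_j
```

The term f_3 is what allows the interaction between SES and cumulative English exposure to be nonlinear, and the estimated degrees of freedom of each smooth indicate how far it departs from a straight line.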
The same model was fitted to the accuracy scores in the lexical, functional, and inflectional domains. A summary of the relevant effects is provided in Table 9, Footnote 12 showing that SES and cumulative English exposure are significant predictors of accuracy in the lexical and the functional domain, but not in the inflectional domain. The impact of difficulty level also varies according to language domain, with a significantly higher incidence of errors of each type at Level 3 (but not Level 2), compared to Level 1. The impact of difficulty level is more than compensated for by a significant interaction between difficulty level and SES (observed with respect to lexical and functional accuracy): children from higher SES households make fewer lexical and functional errors in structurally more demanding items. There is no significant interaction between difficulty level and cumulative English exposure. The nonlinear interaction between SES and cumulative English exposure is strongest in the lexical domain; it is also significant in the functional domain, but not in the inflectional domain. The robustness of these significance patterns was confirmed by fitting a model predicting accuracy of the target structure repetition from accuracy scores in each domain, alongside the environmental factors and memory scores. That model is reported in the online-only Supplemental material.
Note: The statistical significance threshold for t values is at the absolute value of 2. Underlined t and p values are significant at the ** level or higher.
In monolingual children, there was no evidence for an interaction between SES and age in months (as a proxy for cumulative English exposure). There was a significant effect of difficulty level (although only detectable at Level 3), and no evidence for an interaction between difficulty level and cumulative English exposure (i.e., age in months). SES was not significant as a main effect (β = 0.39, p = .07), but a significant interaction was detected between difficulty level and SES (albeit only between Level 1 and Level 2; β = 0.48, p = .017).
Discussion
The first part of the paper investigated the interaction between cumulative English exposure (as a proxy for input quantity) and SES (as a proxy for input quality) as predictors of proficiency in the school language in a socioeconomically diverse group of 5- to 7-year-old bilinguals schooled in English.
Hypothesis 1 was that parental education would be an important dimension of SES as a predictor of the child’s language proficiency (Hoff et al., Reference Hoff, Burridge, Ribot and Giguere2017). Six alternative measures of SES were considered: two simple measures (parental occupation and parental education) and four complex measures (two combining parental occupation and education, and two combining parental occupation and deprivation risk). All the alternative measures were shown to interact significantly with cumulative English exposure as predictors of English proficiency (indexed by a global measure derived through PCA). Out of the six alternatives, the composite measure obtained by nesting parental occupation in parental education was shown to be the most informative as a predictor of English proficiency. Hypothesis 1a was not confirmed: a simple SES measure based on parental education is not the most informative as a predictor of English proficiency. Hypothesis 1b was confirmed: among alternative composite SES measures, the one assigning a strong weight to parental education was more informative as a predictor of proficiency than alternative SES measures. While parental education is significantly associated with parental occupation, the two dimensions also contribute independently to predicting proficiency scores in this population sample.
Importantly, the association between parental education and occupation was modulated by “bilingualism” in our population sample (in spite of the fact that children had been recruited from the same schools). This is in line with other studies, which have found that SES is associated with ethnicity (Fairley et al., Reference Fairley, Cabieses, Small, Petherick, Lawlor, Pickett and Wright2014), and that the language input experienced by children varies within SES groups (Schwab & Lew-Williams, Reference Schwab and Lew-Williams2016). Composite measures of SES are therefore likely to be more informative (as also advocated by Gatt et al., Reference Gatt, Baldacchino and Dodd2020), to the extent that they are sensitive to variability in the association between dimensions of SES.
The second part of the paper explored whether the relationship between SES (as a proxy for input quality) and cumulative English exposure (as a proxy for input quantity) varies in relation to different language domains. This was done through the in-depth analysis of SRep accuracy data. Accuracy was examined across four dimensions: structural complexity of the target structure (syntax), use of functional words (syntax/discourse-semantics), inflectional morphology, and lexical choices.
Structural complexity was operationalized as the level of difficulty of the target structure. Cumulative English exposure and SES had a significant impact in the bilingual children (cumulatively as main effects and in nonlinear interaction). The effect of these environmental predictors did not vary across difficulty levels (either individually or in interaction). Importantly, difficulty Level 3 (which featured object relatives and conditional clauses) also remained challenging for monolinguals in this age group, suggesting that the acquisition of these structures extends beyond the ages investigated in this study.
Accuracy across language domains was strongly correlated with accuracy of repetition of the target structure. More errors were observed in the more structurally complex items, especially in the lexical and the functional domains. The impact of structural complexity was significantly reduced at higher SES, but not at higher levels of cumulative English exposure, suggesting a differential impact of these two predictors (to which I will return below). Environmental predictors also had a combined, global impact through a nonlinear interaction (detected in the lexical and functional domains, but not the inflectional domain).
When treating accuracy across language domains as outcome measures, a similarly varied picture emerged. Inflectional accuracy was the highest across the three language domains. No significant interaction of cumulative exposure and SES was observed in that language domain in the age group under study. This is in line with Schulz and Grimm (Reference Schulz and Grimm2019), who found that at the age of 4 years 4 months, the performance of bilinguals on phenomena such as subject–verb agreement and verb meaning (in German) was already on a par with the performance of age-matched monolinguals. The findings above suggest that the bilinguals in this age group had in general already experienced sufficient language exposure in that respect. This could be interpreted as a manifestation of an age of acquisition effect (which has been found to be robust in inflectional morphology but not derivational morphology; Veríssimo, Heyer, Jacob, & Clahsen, Reference Veríssimo, Heyer, Jacob and Clahsen2018).
Lexical accuracy remained susceptible to the influence of cumulative English exposure and SES in this age group. Similar influences of input quantity and quality on lexical competence have been observed in monolingual development (Fernald, Marchman, & Weisleder, Reference Fernald, Marchman and Weisleder2013; Hart & Risley, Reference Hart and Risley2003) with consequences for reading development (Merz, Maskus, Melvin, He, & Noble, Reference Merz, Maskus, Melvin, He and Noble2019). In bilinguals, the impact of SES on lexical development has been observed in relation to proficiency in the majority language (similarly to what was found in this study; see, e.g., Buac, Gross, & Kaushanskaya, Reference Buac, Gross and Kaushanskaya2014; Calvo & Bialystok, Reference Calvo and Bialystok2014). SES seems to have a more limited impact on lexical development in the home language (Bohnacker, Lindgren, & Öztekin, Reference Bohnacker, Lindgren and Öztekin2016; Leseman, Reference Leseman2000), except if the home language is a majority language (Gatt et al., Reference Gatt, Baldacchino and Dodd2020). Such divergences in findings are likely to be explained by sociocultural differences across bilingual populations.
Functional accuracy remained the most challenging for the bilinguals in this study, and it was strongly affected by the interplay between cumulative English exposure and SES. At low levels of cumulative exposure, higher SES did not confer any advantage; at higher levels of cumulative exposure, higher SES conferred a strong advantage.
Hoff, Quinn, and Giguere (Reference Hoff, Quinn and Giguere2018) provide evidence for “correlated but uncoupled growth” of vocabulary and grammar in bilinguals: both are related, but neither dimension predicts growth in the other dimension. They argue that the relation between vocabulary and grammar development is mediated by the effect of input properties. The results of the present study are consistent with that view, and suggest that different aspects of grammar are affected differently by input properties.
Hypothesis 2 was that the influence of input quality (operationalized as SES) on language proficiency would be necessarily modulated by the influence of input quantity (operationalized as cumulative English exposure). The robust, nonlinear interaction between cumulative English exposure and SES found across all the models in this study is consistent with that hypothesis. The impact of these two predictors did, however, vary depending on the aspect of language proficiency used as the outcome variable: (a) when global English proficiency was used as the outcome variable, higher SES conferred an advantage over and above its effect in interaction with cumulative English exposure; (b) when accuracy of repetition of the LITMUS target structure was used as the outcome variable (indexing morphosyntax), SES and cumulative English exposure conferred an advantage both as main effects and in interaction; (c) when repetition accuracy across language domains was used as the outcome variable, both environmental predictors conferred an advantage as main effects and in nonlinear interaction, but only SES interacted with difficulty level, and the strength of the interaction of cumulative English exposure and SES varied across language domains (and was absent in the inflectional domain).
This variability suggests that SES and language exposure index dissociable but interrelated properties of the child’s language environment, reflected in a differential effect across aspects of language competence. Further research will be necessary to ascertain whether age of acquisition effects can be distinguished from language exposure effects in bilinguals (as advocated by Tsimpli, Reference Tsimpli2014), and to confirm whether language exposure effects can be reliably broken down into qualitative and quantitative components. The robust impact of SES on selective aspects of language proficiency reported above is promising in that respect.
A notable limitation of the present study was the impossibility of assessing the quality of language exposure directly. Self-reported parental proficiency in English did not predict their children’s proficiency, but this could have been due to the fact that non-native speakers of English generally addressed their child in the home language, especially if their own proficiency in English was low. It is nonetheless possible that SES has an effect on the quality of home language use/interactions (Perkins et al., Reference Perkins, Finegood and Swain2013), and that this in turn has an indirect effect on English proficiency (mediated for instance by lower levels of global language proficiency or lower levels of metalinguistic awareness).
Finally, it was not possible to investigate whether SES is associated with different patterns of integration of the child’s community with the wider society. This could have a substantial effect on the quality of their language environment (e.g., depending on the frequency of interactions with native speakers of the societal language, and on the diversity of interaction opportunities available).
Conclusion
SES has a nontrivial impact on language development in monolingual and bilingual children, and it is often used as a proxy for input quality. Adopting a broad definition of input quality (which encompasses some quantitative aspects of language exposure), this study has explored the interplay between SES and cumulative language exposure as predictors of proficiency in the school language, in a diverse group of bilingual children. The findings of this secondary data analysis point to a related but differential impact of the two dimensions of the child’s language environment, affecting different aspects of language differently.
SES is associated with parental education, parental occupation, and ethnicity, but more research is needed to unveil the actual SES-related dimensions that affect children’s language development. This will allow the identification of optimally informative predictors that remain valid across cultures and social groups.
Acknowledgments and open access information
The original data was collected as part of a project funded by the Leverhulme Trust (RPG-2012-633), which is gratefully acknowledged. Thanks to the organizers of the workshop on “Capturing and Quantifying Individual Differences in Bilingualism” (Tromsø, September 2019) for the invitation to present the first version of this paper. Insightful comments from the audience and from three anonymous reviewers have greatly helped me sharpen the focus and enhance the clarity of the paper. The code and data are accessible from the Open Science Framework repository at osf.io/53va8.