Introduction
In usage-based theory, it is axiomatic that the precise characteristics of the language children hear are central to predicting the course of their language development. Research from this perspective has been extremely successful in showing how the frequencies of different forms can account not only for the course of children’s comprehension and production but also for the systematic errors that they make (see Ambridge, Kidd, Rowland & Theakston, 2015). However, less attention has been paid to the relationship between these forms and their functions, even though the central tenet of usage-based and allied theories is that children are acquiring mappings between form and function. In this paper, we address this central issue of form-function mapping by examining the acquisition of modals by English-learning children.
Modals are ideal for exploring the acquisition of form-function relations. They comprise a complex system in which there are many-to-one mappings of form to function and vice versa. Acquiring a modal is not simply a case of learning its one and only meaning. The modal can, for instance, may signal either a physical ability to perform a task or permission to do so (e.g., James can ride his bike). Different modals can also express the same meaning, for example permission (e.g., you can/may have dessert), or subtly different meanings (e.g., you must/should work). The acquisition of modals is important because they make our language nuanced. Their acquisition promotes children’s socio-pragmatic skills, including the ability to negotiate with others and form peer networks (Halliday, 1994; Hoyte, Torr & Degotardi, 2015). In this paper, we focus on whether modals’ form-function mappings in the input affect acquisition.
The functions of modals
Modals are typically classed as having one of two broad functions: an epistemic function whereby the speaker uses a modal to indicate their level of certainty towards a proposition (e.g., it must/might be raining) or a deontic function, defined as “concerning conditioning factors, which are external to the individual” (Palmer, 2001, p.9) such as obligation or permission (e.g., you must be quiet) (Papafragou, 2002). Modals convey numerous meanings and other functional categories have also been identified, including ability (e.g., she can dance), willingness (relating to the speaker’s or interlocutor’s desires, e.g., would you like a drink?) and intention (e.g., I will leave) (Coates, 1983; Sweetser, 1990).
However, differences arise in how researchers have analysed these meanings. Some scholars, for instance Sweetser (1990), contrast epistemic with root meanings, with the latter denoting obligation, permission, or ability. Coates’s (1983) definition of root is similar, but adds willingness and intention meanings to this category. ‘Root’ is therefore a broader term than ‘deontic’ (the latter focuses exclusively on permission and obligation). The label ‘dynamic’ has also been introduced to accommodate non-epistemic meanings such as ability and willingness. Due to these contrasting functional labels, we will simply refer to epistemic and non-epistemic functions when summarising literature. The latter refers to any function whereby the speaker is not indicating their level of certainty (i.e., obligation, permission, ability, intention and willingness functions).
In this paper, we investigate whether the nature of modal meanings and their mappings to modal forms in the input affect the order and rate of children’s acquisition. Our goal is to determine to what extent an input-based, constructivist account can explain the pattern of acquisition. We examine the role of the relative frequencies of different forms and form-function mappings in the input. Crucially, we also consider the role that different types of form-function mapping may play: how do children cope when many meanings are associated with a particular form, when many forms are associated with a particular meaning, or when the mapping is many-to-many? To harness the power of dense data and investigate the effects of input and child characteristics (e.g., age) on acquisition of the modal system as a whole, we code utterances according to specific modal forms and functions, and then analyse the data together. We first provide an overview of research into children’s sensitivity to form-function mappings. Following this, we summarise previous work on children’s modal acquisition, before developing hypotheses to test.
Form-function mappings in children’s input
There is considerable evidence that, other things being equal, a one-to-one mapping between a form and function in the input aids acquisition. Bates and MacWhinney (1987) argue that acquisition is greatly enhanced by one-to-one mappings, since children learn language to communicate their own interests and goals. If a form maps onto a single unambiguous communicative function, it will be more easily acquired than a form mapping onto several functions or, conversely, a function expressed through various forms. However, there are additional factors taking us beyond a straightforward conceptualisation of one-to-one mapping. First, there are children’s communicative interests: if a function is of no interest to children, they might learn it later despite a simple mapping. Secondly, we must consider frequency: however straightforward a mapping, children will presumably not learn it early if it is rather infrequent. Thirdly, there is a child’s ‘functional readiness’: children “will not acquire a complex form until they can assimilate it, directly or indirectly, to an underlying function” (Bates & MacWhinney, 1987, p.167). A three-year-old could likely produce a sentence referring to simple, observable concepts (e.g., the girl kicked the ball) but they may struggle with abstract sentences in which we express belief (epistemic uses) or another’s mental processes (e.g., it might rain or James thought Susan was unwell).
The influence of consistent form-function mappings on order of acquisition has been shown at the level of individual lexical items. Cameron-Faulkner, Lieven, and Theakston (2007) analysed the emergence of English multiword negation with zero-marked verbs (i.e., verbs with no overt tense or aspectual marking, e.g., no sleep, can’t reach) in the dense corpus of one child, Brian, between 2;3 and 3;4. Brian first (ungrammatically) combined unmarked verbs with no (the most frequent negator in the input, e.g., no reach), before producing not, followed by the contracted n’t negators (e.g., don’t). However, the speed with which he used the correct n’t negator was influenced by function-based input frequency. Don’t and can’t were the first n’t negators to emerge, at 2;9, and were overall the most frequent n’t negators in the input to signal PROHIBITION and INABILITY respectively. His initial use of these negators only conveyed these meanings. It was not until 3;3 that don’t was used to convey REJECTION, a less frequent form-meaning mapping. However, note that Brian used the more frequent no and, subsequently, not, to convey various meanings he wanted to express. This demonstrates the role of both the child’s own communicative needs and input frequency. Brian resorted to using highly frequent forms from the input in incorrect contexts to express these concepts (e.g., REJECTION such as I no want cheese) before he grasped how to correctly express them with lower frequency forms. Theakston, Lieven, Pine, and Rowland (2002) demonstrated similar findings, assessing children’s acquisition of go and its various form-meaning mappings between the ages of two and three. Meanings included movement (e.g., I’m going home), disappearance (e.g., the drink has gone) and belonging (e.g., does that piece go there?).
Children produced a form (e.g., go) for its most frequent input function (e.g., movement) before producing other less frequent functions with that form (e.g., disappearance).
Together, these findings illustrate how speed of acquisition is influenced not only by form frequency, but also by fine-grained form-meaning pairings interacting with the meanings that children wish to, or are socio-cognitively able to, use.
Modal acquisition
The order of acquisition of modal forms and functions
Foundational studies on children’s naturalistic modal use showed that children first produce modals from the age of two (Richards, 1990; Shepherd, 1982; Wells, 1979). Can and will are the first modals to appear, whereas shall and could are not uttered until the fourth year. Furthermore, forms such as must and might are very infrequent during this period (Fletcher, 1985; Wells, 1979).
Earlier uses of modals are non-epistemic. Epistemic uses do not emerge until at least the age of three, a year or so later than the observed non-epistemic instances. In line with these findings, researchers suggested that children’s epistemic modal use may relate to their socio-cognitive abilities (Moore, Pure & Furrow, 1990; Papafragou, 1998). Papafragou (1998) proposed that children’s epistemic modal use may depend on Theory of Mind development, in being able to reason about mental representations (“thinking about thinking”) and their differing levels of accuracy and speaker certainty. The shift in epistemic modal use in the fourth year coincides with the age at which children typically pass explicit false-belief tests (Wellman, Cross & Watson, 2001). However, whether the acquisition of epistemic modals depends on, or supports, children’s Theory of Mind (e.g., de Villiers, 2007) is not entirely clear. On the one hand, in comprehension experiments four- but not three-year-olds can reliably distinguish between the relative certainty of epistemic modals (e.g., Hirst & Weil, 1982), and Moore et al. (1990) found correlations between children’s epistemic understanding and performance on false-belief tests. One interpretation is that grasp of modals may depend on Theory of Mind. Conversely, language could underpin Theory of Mind developments. Studies have demonstrated links between caregivers’ use of mental terms (e.g., in questions posed to children such as Do you remember the card? which encourage them to reflect on their own thought processes) and children’s success on false-belief tests (Howard, Mayeux & Naigles, 2008).
In fact, the relationship is likely to be complex: Boeg Thomsen, Theakston, Kandemirci and Brandt (2021) found strong evidence of a bi-directional influence between language (knowledge of complement clauses and mental state verbs) and false-belief understanding in a longitudinal study of children aged between 2 and 3 years.
Recent work has illustrated that children produce other epistemic items including adverbs (e.g., maybe, probably) and adjectival phrases (e.g., It is possible/true that) before the age of three and during the so-called ‘epistemic gap’ (Bassano, 1996; Cournane, 2021; Veselinović & Cournane, 2020). These authors proposed that it is not the epistemic function that necessarily causes an issue but rather the more syntactically complex nature of modal auxiliaries relative to lexical expressions. Only modal auxiliaries require sentential embedding whereas adverbs can be flexible in their syntactic distribution (e.g., (Maybe) I will (maybe) visit) or can stand alone in an utterance (see Cournane, 2020, for a review). However, once children can cope with the syntactic contexts for modals, an input account would predict children to acquire the epistemic function for specific modals later if those modals display less consistent mappings to the epistemic function. Previous research has mostly ignored the distribution of different modal functions in caregivers’ speech (Fletcher, 1985; Moore et al., 1990; Wells, 1979). Over and above the predicted effects of input frequency, some epistemic meanings could hinder acquisition, given the diversity of epistemic modal uses (Palmer, 2001). Some epistemic uses are speculative (e.g., Katie might be in her office), whilst others rely on inferring from observation (e.g., Katie must be in her office having seen the light switch on) or assumptions based on what we know about others (e.g., Katie will be in her office as it is her typical working hours). These latter examples may indeed depend more on socio-cognitive skills than on their distributional characteristics.
It is also worth considering children’s general grasp of modal concepts and how these may influence the types of modal functions they produce. Research has shown that even 9-month-olds can interpret agent intentions and desires (Woodward, 1998), which could explain the early use of will to signal intentions in children’s speech (Wells, 1979). Relatedly, the fact that early modal usage is dominated by non-epistemic functions is consistent with children’s greater success on deontic (obligation and permission meanings) than epistemic reasoning tasks (Cummins, 1996). Two-year-olds can already reason appropriately about obligations, understanding that it is ‘bad’ to violate moral obligations (e.g., hurting another child) (Smetana & Braeges, 1990), whilst three-year-olds can distinguish between moral (e.g., one shouldn’t hurt others) and conventional norms (e.g., tidying away one’s belongings) (Smetana, 1981). The notions of possibility and uncertainty, which could underpin epistemic uses, do not develop until later. For instance, four-year-olds cover both possible exits for a ball that is dropped from an upside-down Y container (Redshaw & Suddendorf, 2016), whereas three-year-olds tend to guess by opening their hand to only one, suggesting only four-year-olds can represent multiple possibilities concerning a single event. Similarly, four-year-olds have greater awareness of uncertainty. In one study, an experimenter hid an object in one of two boxes out of children’s sight. Children were asked to report its location to a second experimenter or to rely on the first experimenter to inform them (Leahy & Carey, 2020). Four-year-olds outperformed three-year-olds by acknowledging their uncertainty and requesting the first experimenter’s input, though performance was still not adult-like.
These developmental breakthroughs may underpin children’s ability to use modals epistemically to signal possibility and, hence, uncertainty (e.g., it might rain). However, whether these concepts develop fully independently from, and prior to, the acquisition of modal language is debated (see Leahy & Carey, 2020).
The role of the input in modal acquisition
Of the earlier modal acquisition studies, only Wells’ (1979) corpus analysis of 60 children aged between 1;2 and 3;7 considered the input. However, these data were quite limited, focusing solely on form frequency. Wells found that the most frequent modals in mothers’ speech were typically the forms used most often by children. However, Wells only provided descriptive statistics indicating the total number and proportion of a form. Without accompanying statistical analyses, one cannot determine whether there were in fact strong form correlations between specific mothers and their children. Furthermore, speech samples were collected at three-monthly intervals. These sparse data may fail to represent children’s everyday language and age of acquisition may not be very reliable, particularly for lower frequency forms.
More recent studies have shown that epistemic modals are not very well attested in the input (<8% of modal utterances), particularly in Dutch (Van Dooren, Dieuleveut, Cournane & Hacquard, 2017; Van Dooren, Tulling, Cournane & Hacquard, 2019). Caregivers are more inclined to use adverbs epistemically, and moderate positive correlations in overall usage rates of epistemic adverbs have been evidenced between children and parents (Cournane, 2021). Van Dooren et al. (2017) analysed six mother-child dyads from the Manchester corpus with children aged two to three. They compared the overall frequency of epistemic vs. non-epistemic (using the term ‘root’ for the latter) modal uses and found that both mothers and children more frequently expressed non-epistemic functions. However, children showed a ‘root bias’ for must, even though this form was predominantly used epistemically by caregivers. Unfortunately, methodological considerations make it difficult to interpret these data. The researchers provided raw frequency counts and proportional functional usage information, but no statistical analysis, so it is unclear whether the observed differences between children and caregivers are statistically significant. This research also looked broadly at epistemic and non-epistemic functions, without differentiating between different types of non-epistemic functions (e.g., ability, permission etc.). Can, for instance, may convey epistemic (e.g., that can work), permission (e.g., you can eat dessert), and even obligation (can you sit down?) meanings. The authors did consider that some modals (e.g., must) are polysemous and analysed syntactic cues in the input to explain how children map a modal to both root and epistemic meanings.
However, the analyses were applied to these broad functional categories instead of fine-grained form-function mappings. To delve into acquisition of such a complex system, we need more information about nuanced form-meaning mappings in the input.
A further crucial point is that sample size needs to be controlled, since caregivers typically speak more than children. Van Dooren et al. found 43,189 modal utterances from parents compared with 7,694 from children. A larger sample is more likely to contain the less frequent epistemic utterances, even though the non-epistemic function is dominant (Tomasello & Stahl, 2004). Sample size therefore needs to be controlled before assuming that children struggle with the epistemic function. These important methodological controls will be applied in our study.
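The sample-size argument can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that each modal utterance is an independent draw with a fixed probability `rate` of instantiating some rare form-function mapping. The rate is hypothetical and the function name is ours; only the two sample sizes come from the text above.

```python
def detection_probability(rate: float, n_utterances: int) -> float:
    """Probability of observing at least one instance of a mapping that
    occurs at `rate` per modal utterance in a sample of `n_utterances`.

    Assumes utterances are independent, which is a simplification and not
    a claim about real conversational data.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must lie between 0 and 1")
    if n_utterances < 0:
        raise ValueError("n_utterances must be non-negative")
    return 1.0 - (1.0 - rate) ** n_utterances


# Sample sizes reported for Van Dooren et al.; the rate is hypothetical.
HYPOTHETICAL_RATE = 0.0001
PARENT_SAMPLE = 43_189
CHILD_SAMPLE = 7_694

p_parent = detection_probability(HYPOTHETICAL_RATE, PARENT_SAMPLE)
p_child = detection_probability(HYPOTHETICAL_RATE, CHILD_SAMPLE)
print(f"parents: {p_parent:.2f}, children: {p_child:.2f}")
```

Under these assumptions, a mapping used at the same underlying rate by both speakers is considerably more likely to surface in the larger parental sample, which is why raw presence or absence cannot be compared across samples of unequal size.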
The present study
In this study, we analysed two dense corpora of mother-child interaction to assess the influence of specific form-function mappings in the input on children’s modal acquisition. Some modal form-function mappings are infrequent in speech and therefore the use of dense databases, as opposed to sparse sampling across multiple dyads, was essential to provide a more reliable indicator of their order of acquisition (Lieven & Behrens, 2012; Tomasello & Stahl, 2004). We developed and used crucial controls for how frequency of form, function and the mappings between them are measured. These controls are completely novel in modal acquisition research. Besides form frequency, we included distributional properties of modal usage (i.e., the number of functions a modal maps onto and its bias towards one given meaning).
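As a rough illustration of the distributional properties just mentioned, the sketch below derives, for each modal form, its frequency, the number of distinct functions it maps onto, and a bias score. Operationalising bias as the share of a form's uses taken by its most frequent function is our assumption for illustration, not necessarily the measure used in the study; the function name and the toy data are also invented.

```python
from collections import Counter, defaultdict


def distributional_properties(coded_utterances):
    """Summarise each modal form's function distribution.

    `coded_utterances` is an iterable of (form, function) pairs.
    Bias is computed as the proportion of a form's uses accounted for by
    its most frequent function (an assumed operationalisation).
    """
    by_form = defaultdict(Counter)
    for form, function in coded_utterances:
        by_form[form][function] += 1

    summary = {}
    for form, counts in by_form.items():
        total = sum(counts.values())
        dominant_function, dominant_count = counts.most_common(1)[0]
        summary[form] = {
            "frequency": total,
            "n_functions": len(counts),
            "dominant_function": dominant_function,
            "bias": dominant_count / total,
        }
    return summary


# Invented toy data, not taken from the corpora.
toy_input = [
    ("can", "ability"), ("can", "ability"),
    ("can", "permission"), ("can", "suggestion"),
    ("might", "epistemic"), ("might", "epistemic"),
]
summary = distributional_properties(toy_input)
for form, properties in summary.items():
    print(form, properties)
```

In this toy sample, can maps onto three functions with half of its uses expressing ability, whereas might maps onto a single function, so it is the closer of the two to a one-to-one mapping.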
Similarly to previous work, we studied children’s modal use at three (Cournane, 2021; Wells, 1979), but also followed their development at four. This later age would reveal if children are more inclined to produce the epistemic function when they typically start passing explicit false-belief tasks (Wellman et al., 2001), and would test the predictive value of earlier experienced input on children’s later acquisition (thus removing the possible confound that similarities between caregivers and children simply reflect priming effects from being engaged in the same conversation). Usage-based approaches assume that children only gradually build up linguistic representations based on repeated exposure to patterns of usage in their input. Using earlier experienced input data to predict later acquisition is consistent with this approach as it incorporates developmental time for the distributional patterns to be acquired. We focus on the extent to which the input data may explain children’s acquisition. Any linguistic developments that cannot be explained by the input may reflect the child’s socio-cognitive abilities, their grasp of underlying modal concepts and/or pragmatic aims. Our research questions are as follows:
1. Which modal auxiliaries do caregivers use most often? How does this relate to the frequency of these forms in children’s speech?
2. Do mothers produce a significantly higher proportion of epistemic modals than their children at both age three and age four?
3. Are children more likely to use modals epistemically at four than three years of age?
4. Are modals associated with fewer functions in the input easier for children to acquire?
5. Are children more likely to use a modal for a greater number of functions at four than three years of age?
6. Do the most frequent modal form-function mappings in the input feature the most prominently in the children’s speech?
Based on the literature we derived the following predictions:
1. The raw frequency of use of specific modal forms in the input will correlate with their raw frequency of use in the children’s speech at both 3 and 4 years (Lieven & Tomasello, 2008; Wells, 1979).
2. Mothers will produce a significantly higher proportion of epistemic modal uses than their children at both 3 and 4 years, even when controlling for modal type and sample size (Van Dooren et al., 2017).
3. Children will produce a significantly higher proportion of epistemic modal forms at 4 than 3 years of age, even when controlling for effects of input frequency (Moore et al., 1990; Papafragou, 1998).
4. The number of distinct functions associated with a specific modal in the input will negatively predict its frequency of use in the children’s speech (as a proxy for ease of acquisition, Bates & MacWhinney, 1987) at both 3 and 4 years.
5. There will be a significantly larger number of functions associated with specific modal forms in children’s speech at 4 years in comparison to 3 years due to their greater experience with language and developing pragmatic skills.
6. The raw frequency of specific form-function mappings with individual modals in the input will predict the frequency of use of these same form-function mappings in the children’s speech at 3 and 4 years of age (Cameron-Faulkner et al., 2007; Tomasello, 2003).
Predictions 1 to 3 constitute replications of earlier studies, but include previously omitted methodological controls (sample size and the input frequency of a modal form and form-function mapping), essential to robustly test the reliability of previous findings. Predictions 4 to 6 test theoretical accounts of acquisition with direct relevance to the acquisition of form-function mappings. Prediction 4 considers the competition account, in which one-to-one mappings arguably facilitate the acquisition of individual forms, whilst Predictions 5 and 6 focus on children’s own use of form-function mappings. Prediction 5 investigates children’s use and development of functions associated with each modal form. Of particular importance is Prediction 6, which assesses whether, in accordance with the usage-based account, children’s acquisition of a given form-function mapping is predicted by its input frequency.
Methodology
Data
The data consist of speech samples obtained from two children (Thomas and Helen) on the Max Planck database (Lieven & Behrens, 2012; Lieven, Salomo & Tomasello, 2009). Both corpora are instances of a longitudinal naturalistic study of children’s speech with their mothers, audio recorded at home during regular play. In most recordings, the researcher is also present and engaging in play with the child. Both dyads are monolingual English speakers who live in Greater Manchester. The mothers are the children’s primary caregivers.
Each child’s data were analysed for up to two months from 3;0 and 4;0 to ensure a developmental gap between the ages. However, to control for the number of utterances, modal coding within each period ceased at the end of the transcript in which 500 modals had been captured. For Thomas, we used recordings from age 3;0.0 to 3;1.30 (36 hours of recordings) and 4;0.2 to 4;2.1 (10 hours of recordings). Data were collected very intensively at three years of age (one hour, five times each week) and slightly less intensively at four (five hours across one week in every month). For Helen, we analysed data from 3;0.2 to 3;0.24 (17 hours) and 4;0.2 to 4;0.19 (13 hours). Helen was recorded for one hour, five times a week, every week for these ages. Each recording lasted 60 minutes. Table 1 shows the mean and range of MLU (mean length of utterance) for each child and the number of modal utterances produced.
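On one reading of the stopping rule described above, transcripts are always coded to the end, and coding for a period stops after the transcript in which the running total of modals reaches 500. The sketch below illustrates that reading only; the function name, the data layout and the per-transcript counts are hypothetical.

```python
MODAL_CAP = 500  # cap stated in the text


def select_transcripts(transcripts, cap=MODAL_CAP):
    """Return the transcripts to code for one sampling period.

    `transcripts` is a chronologically ordered list of
    (transcript_id, n_modals) pairs. Each selected transcript is coded
    in full; selection stops after the transcript in which the running
    total of modals reaches `cap` (an assumed reading of the rule).
    """
    selected = []
    running_total = 0
    for transcript_id, n_modals in transcripts:
        selected.append(transcript_id)
        running_total += n_modals
        if running_total >= cap:
            break
    return selected, running_total


# Invented counts for illustration only.
period = [("t1", 180), ("t2", 210), ("t3", 150), ("t4", 90)]
selected, total = select_transcripts(period)
print(selected, total)
```

Because each transcript is coded to its end, the total for a period can exceed 500, and a period with fewer than 500 modals is simply coded in full.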
The input samples included 10 hours of data taken from the first two weeks of Thomas’s and Helen’s recordings at age three. Thomas’s and Helen’s mothers’ speech was then further analysed within the first 10 hours of data obtained from Thomas and Helen at age four. For research questions 1 and 2, which compared the use of modals between the input and children’s speech, we used both the age three and age four input samples for analysis [Footnote 1]. For research questions 3 to 6, which investigate which variables predict children’s use of a given modal form or mapping, we entered properties of the earlier, age 3, input sample into our predictive models. The rationale behind this is explained in the Analysis section.
Procedure
The transcripts were searched using the Computerized Language Analysis (CLAN) program (MacWhinney, 1995) for all utterances incorporating modals: can, could, may, might, must, shall, should, will and would. Non-modal auxiliary forms (be, have, do) and all quasi-modal infinitival forms such as want to, have to, ought to etc. were omitted from analysis as they do not encode the modal functions of interest. If utterances contained more than one modal (e.g., You can see if it will fit), a separate copy of the utterance was coded for each modal.
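The search-and-duplicate step can be sketched in Python. This is a stand-in for illustration and does not reproduce the CLAN commands used in the study. The mapping of negated forms to their base modal is our assumption, since the text does not say how contractions were handled, and all names below are hypothetical.

```python
import re

# The nine modal forms listed in the text.
MODAL_FORMS = ("can", "could", "may", "might", "must",
               "shall", "should", "will", "would")

# Assumption: negated forms count as instances of the base modal.
NEGATED_FORMS = {
    "can't": "can", "cannot": "can", "couldn't": "could",
    "mightn't": "might", "mustn't": "must", "shan't": "shall",
    "shouldn't": "should", "won't": "will", "wouldn't": "would",
}

TOKEN_PATTERN = re.compile(r"[a-z]+(?:'[a-z]+)?")


def find_modals(utterance):
    """Return the base form of every modal token, in order of occurrence.

    Purely string-based: homographs such as the noun 'can' or the month
    'May' would also match and would need manual checking.
    """
    normalised = utterance.lower().replace("\u2019", "'")
    found = []
    for token in TOKEN_PATTERN.findall(normalised):
        if token in MODAL_FORMS:
            found.append(token)
        elif token in NEGATED_FORMS:
            found.append(NEGATED_FORMS[token])
    return found


def coding_rows(utterance):
    """Create one (utterance, modal) row per modal, so that an utterance
    containing two modals is coded twice."""
    return [(utterance, modal) for modal in find_modals(utterance)]


for row in coding_rows("You can see if it will fit"):
    print(row)
```

Quasi-modals such as have to or want to are never matched because only the nine listed forms (and their negations) are searched for. Clitic forms such as 'll are not handled in this sketch.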
Table 1 shows the number of modal utterances that were analysed. Each modal utterance was coded for verb type and function. The modal was first coded in terms of its main function (i.e., epistemic, non-epistemic or ‘other’ if difficult to ascribe) and, if non-epistemic, its subcategory (provided below). The functions used to analyse the data were based on those used to characterise adult speech. However, from a constructivist perspective, children are assumed to learn the functions that are relevant for their language and how they map onto linguistic forms gradually through experience, so it is possible that these adult approximations were broader or narrower than the form-function mappings used by the children. A detailed analysis of the specific contexts of use and/or experimental studies would be required to assess children’s mappings in detail, but this was beyond the scope of the present study.
Motivations behind the coding scheme
Most categories included in the analysis were derived from previous literature, particularly the epistemic vs. non-epistemic distinction (Papafragou, 1998). In line with earlier modal definitions, only non-epistemic uses were further analysed by subcategory to provide a fine-grained analysis. Alongside the aforementioned subcategories common in the literature, we also included hypothetical statement/question, past habitual event, past tense ‘will’ and refusal to act, mainly to accommodate the range of meanings associated with would (Murphy, 2012; Ormal-Grenon & Rollin, 2007; Parrott, 2010). A subcategory of suggestion, to introduce a concept or activity, was also incorporated into the scheme as adopted by Wells (1979).
Context is crucial when analysing modal utterances, particularly due to modals’ polysemous nature. Therefore, if any function was difficult to determine, the five lines preceding and following the utterance were consulted for contextual information.
Coding Scheme
An abbreviated summary of the coding scheme is given here. The detailed scheme, with more examples and context, appears in Appendix A. Examples of modal utterances and their functions produced by the children are provided in Table 2.
1. Main function
We first coded whether the modal had an epistemic or non-epistemic function.
a) EPISTEMIC
The speaker uses the verb to reflect their degree of commitment towards the truth of the proposition (Papafragou, 1998), i.e., how certain or uncertain they are that what they are expressing is true (e.g., it must/might be the postman).
Other instances of epistemic modality may include a speaker’s assumption (Brown, 1973), i.e., hypothesising about a situation in the present, past or future (e.g., ‘I’m so pleased there’s nothing missing because it would have been a bit embarrassing’ (TM 3;0.0, i.e., Thomas’s mother in the 3;0.0 transcript) [Footnote 2] or ‘you will be tired today, won’t you?’ (HM 3;0.6, i.e., Helen’s mother in the 3;0.6 transcript)). Epistemic modals can also be used to infer (e.g., ‘He must not be feeling well’ (TM 3;0.2)).
b) NON-EPISTEMIC
Non-epistemic modality is defined as concerning conditioning factors, which are external to the individual (Palmer, 2001). The modal was coded as non-epistemic if it expressed one of the following functions (defined below): ability, futurity, hypothetical question, hypothetical statement, obligation, past tense will, past habitual event, permission, refusal to act, suggestion or willingness.
2. Non-epistemic subcategories
If a modal was coded as non-epistemic, we assigned its function to one of the following subcategories.
(i) ABILITY
The speaker expresses ability (or inability) to perform. This may be concerned with their own or others’ actions (e.g., ‘I can see Sue’ or, ‘You couldn’t see her but she was there shopping’ (TM 3;0.0)).
(ii) FUTURITY
The modal indicates an event in the future or the speaker’s own or others’ intention to act (e.g., ‘I shall have Cornflakes with milk’ (TM 3;0.2) or, ‘Who will you play with?’ (HM 4;0.11)).
(iii) HYPOTHETICAL STATEMENT
The modal is used in a statement to describe what may or may not happen in the future (or the past). It is hypothetical since the speaker is imagining an event that has not occurred (or may not occur), but without the assumption or prediction about the event that is associated with an epistemic reading (e.g., ‘I don’t think there would be an awful lot of room in a windmill actually, Thomas’ (TM 4;0.9)).
(iv) HYPOTHETICAL QUESTION
The purpose of the modal is to ask what may or may not happen in the future (or the past). This is deemed hypothetical because the speaker is imagining an event that has not occurred (or may not occur) (e.g., ‘What sort of people would live under the ground?’ (TM 4;0.9)).
(v) OBLIGATION
The modal expresses that the speaker or listener should (or should not) act. These utterances can vary in force (e.g., ‘you mustn’t go there’ (TM 4;0.4), or ‘I wonder if you should be wearing your Bob the Builder hat, Thomas, to do this’ (TM 3;0.3)).
(vi) PAST HABITUAL EVENT
The modal describes a habitual event in the past, i.e., an event that occurred on a regular basis (e.g., ‘As you got a little bit older sometimes you would have some cheese biscuits’ (TM 4;0.7)).
(vii) PAST TENSE WILL
The modal is the past tense form of will (i.e., ‘would(n’t)’), used to discuss a past event (e.g., ‘I was so frightened people would throw snowballs in my face’ (TM 4;0.7)). This category can also include reported speech (e.g., ‘He just said there wouldn’t be any trains running along the Burnage Line’ (TM 3;0.7)).
(viii) PERMISSION
The speaker uses the modal to grant or refuse someone permission to do something, or to ask for permission themselves (e.g., ‘You can draw on the picture but not on the table’ (TM 3;0.3) or ‘Could I give the birthday boy a kiss?’ (TM 3;0.0)).
(ix) REFUSAL TO ACT
The modal indicates how an individual, object or event did not comply with an action (e.g., ‘You wouldn’t sing’ (TM 3;0.0), or, ‘he was shy and he wouldn’t blow his candles out’ (TM 3;0.1)).
(x) SUGGESTION
The speaker uses a modal to suggest an idea (without the forceful nature associated with obligation). The speaker is not giving an order (as indicated by obligation), but solely introducing a concept/activity (e.g., ‘Shall I go upstairs and get the book?’ (TM 3;0.2), or, ‘we can perhaps do some playing later on’ (TM 4;0.7)).
(xi) WILLINGNESS
The modal is associated with the speaker’s (or their interlocutor’s) desires or preferences (e.g., ‘Would you like some orange?’ (HM 3;0.4)).
3. Other
If a modal could not be assigned to either an epistemic or non-epistemic category, we coded it as ‘Other.’ This only applied to a few utterances in which the modal was part of a formulaic phrase and we could not isolate the modal meaning (e.g., ‘We could do with a rubbish bag’ (HM 4;0.3)).
Reliability
Following the first author’s coding, a randomly selected ten percent of utterances from the children’s and mothers’ speech was coded by a second researcher, according to the coding scheme in Appendix A. This resulted in 76% agreement for the mothers (Cohen’s kappa = 0.75) and 89% agreement for the children (Cohen’s kappa = 0.85). Agreement was calculated in relation to whether the two coders assigned an utterance to the same broad category (epistemic or non-epistemic) and, if non-epistemic, to the same subcategory. We note that although the reliability for the input data is lower than for the child data, we were still able to achieve high levels of agreement (83-100%) across the vast majority of categories for the caregivers. The main area of disagreement was ‘Ability’, where our two coders agreed on 76% of all utterances coded by one or other as ‘Ability’. The discrepancies were largely due to the second coder allocating some of these utterances to ‘Permission’ and ‘Suggestion’. However, the same pattern was not seen in the children’s data, where the two coders agreed on 85% of all utterances coded by one or other coder as ‘Ability’. This suggests that the disagreement likely reflects the occasional difficulty of ascertaining the precise communicative intent from transcriptions of audio-recorded corpus data, rather than a lack of specificity in the coding scheme itself. Whereas the child’s utterances are often accompanied by contextual information and interpretation from the caregiver, this is not necessarily the case for the caregiver’s utterances, which often introduce new topics. Caregivers may also be perceived to be more likely to grant permission and/or make suggestions, meaning that coders may be inclined to interpret their utterances as having these meanings in ambiguous contexts more often than in the child data. Despite these issues, we regard the overall kappa value as indicative of a high level of agreement.
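To make the relation between percentage agreement and Cohen’s kappa concrete, the following Python sketch computes both from two coders’ labels for the same utterances. It is an illustration only: the labels are invented, the function name is hypothetical, and it is not the procedure used to obtain the reliability figures reported above.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Return (observed agreement, Cohen's kappa) for two equal-length label lists."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(coder_a)
    # Proportion of utterances given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal label frequencies.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both coders use a single label")
    return observed, (observed - expected) / (1 - expected)

# Invented labels for eight utterances (not data from the study).
first = ["ability", "ability", "permission", "epistemic",
         "suggestion", "ability", "epistemic", "permission"]
second = ["ability", "permission", "permission", "epistemic",
          "suggestion", "ability", "epistemic", "suggestion"]
observed, kappa = cohens_kappa(first, second)
print(f"agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

Because kappa discounts the agreement expected by chance, it is lower than raw agreement whenever chance agreement is above zero, and the gap narrows as the number of evenly used categories grows.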
Analysis approach (Research questions 3-6)
To determine what predicts children’s production of a modal form or form-function mapping while controlling for other variables, we conducted regression analyses in R. For each research question, to ascertain whether our predictor of interest (e.g., the number of input meanings exhibited by a modal) influenced the outcome measure (e.g., the child’s production of that modal), it was important to consider whether this variable was significant over and above other potential predictor variables in the input (e.g., form frequency) or child characteristics such as age or MLU. The control predictors used in these analyses are defined in Table 3. For each analysis, we state the outcome measure, the predictor of interest and the (relevant) control predictor variables included (see Table 4 for an overview). Definitions of the outcomes and predictors are provided in the relevant analysis section.
Each analysis was performed separately for each child. The data for the input variables were derived from the speech addressed to the children when they were three. This was done for two reasons. First, we needed to ensure as far as possible that any observed predictive relations between the children’s input and their own speech were not simply a reflection of being engaged in the same conversation but rather reflected the broader distributional characteristics of the input. Second, we wanted to avoid a confound between potential effects of input frequency and child socio-cognitive development. For example, in Figure 1, Thomas’s mother’s epistemic use appears rather stable across the two ages, whilst Helen’s mother’s usage increases. It is possible that caregivers may tailor their speech to their child’s socio-cognitive abilities over time. This would make it difficult to interpret effects of input frequency and child age in the models as the input may alter in response to changes in the child’s socio-cognitive abilities. We thus used the age three input sample to control for this possibility, meaning that ‘age’ was the sole predictor to capture potential changes in the children’s socio-cognitive development.
In line with previous studies (e.g., Rowland, Pine, Lieven & Theakston, 2003; Theakston, Lieven, Pine & Rowland, 2004), we carried out a series of correlational analyses to assess whether the relative frequency of our key input variables remained stable between 3 and 4 years. Strong similarities in the distributional properties of the input at the two ages would suggest two things. First, any reported relations between caregiver input and child speech are unlikely to be due to the partial overlap in our input and child data samples (in the 3-years data). Second, the exclusion of the input data at 4 years from our predictive analyses is unlikely to affect the properties of our input predictors (derived from the input samples at 3 years). The key predictor variable for research question 3 is the frequency of epistemic uses in the input. Correlations between the frequency of epistemic uses for each modal form at 3 years and 4 years are high and significant (Helen: r = .926, df = 9, p < .001; Thomas: r = .946, df = 12, p < .001), demonstrating that the relative frequency of epistemic uses across modals is highly consistent between the two ages for both children’s input.
For research questions 4 and 5, we investigated whether the relative number of functions found with each form in the input predicts its acquisition. We therefore ran correlations between the number of distinct functions produced with each modal in the input sample from 3 years and 4 years (Number of input functions). Again, correlations were high and significant (Helen: r = .811, df = 13, p < .001; Thomas: r = .897, df = 14, p < .001), demonstrating that the relative number of functions produced with each modal in the input is consistent over developmental time. Finally, for research question 6 the key variable of interest is the relative frequency of form-function mappings. This is a fine-grained version of the frequency data used to derive our other input variables (‘Input form frequency’ and ‘Input function bias’) and thus also serves as a measure of their stability over time. Again, correlations were high and significant (Helen: r = .948, df = 37, p < .001; Thomas: r = .932, df = 53, p < .001), demonstrating consistency in these input measures over development.
Our approach to model building was as follows. We first created a model including all relevant control variables, irrespective of each variable’s contribution to model fit. The predictor of interest was then added to the model of control predictors to form the base model. We also tested for theoretically motivated, two-way interactions between variables in the base model. Each interaction was independently added to the base model and the effect of this addition was compared to the base model by ANOVA. Any significant interactions were then collectively added to the base model.
If any interactions were non-significant when combined with the model, the least significant of these interactions was removed and an ANOVA was conducted between this reduced model and the full model (including the other interaction terms). If a given two-way interaction term did not improve the fit when compared to a reduced model, the interaction was removed. The equivalent process was followed for any remaining non-significant and then significant interactions (in order of contribution to model fit, i.e., the interaction that made the least contribution was removed first). Please see Appendix D for an example of how this model building process was applied. In upcoming sections, we report on the final models.
Results
Frequency of forms and broad functions (Research questions 1 & 2)
Appendix C provides details on which modals were used by the children and caregivers to convey which functions (see Footnote 3). We first tested our hypothesis that the raw frequency of specific modal forms in the input would correlate with their frequency in the children’s speech (research question 1). This analysis solely included modals produced by either of the children or the mothers across the samples and therefore excluded the negated forms of may and shall (see Appendix B). A Spearman’s rank-order correlation revealed that for both children at age three, there was a positive correlation with their mothers’ modal use (Thomas: r_s = .54, p = 0.02; Helen: r_s = .85, p < 0.001), with a stronger correlation observed at four (Thomas: r_s = .94, p < 0.001; Helen: r_s = .91, p < 0.001). Positive correlations were also found between the mothers’ frequency of modals at age three and the children’s subsequent modal use at age four (Thomas: r_s = .93, p < 0.001; Helen: r_s = .94, p < 0.001). The results reveal that, as predicted, the forms more common in caregivers’ speech are typically the forms used most frequently by children, even when controlling for whether dyads are engaged in the same conversation by relating children’s use at age four to their input at age three.
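For readers less familiar with rank-order correlation, the sketch below shows one way to compute Spearman’s coefficient in Python: each set of per-modal frequencies is converted to ranks (ties share their mean rank) and a Pearson correlation is then taken over the ranks. The frequency counts and function names are invented for illustration and are not the values in Appendix B.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length sequences with at least 3 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Invented frequencies for five modal forms (mother vs. child).
mother_counts = [170, 44, 60, 12, 5]
child_counts = [125, 10, 30, 8, 9]
rho = spearman_rho(mother_counts, child_counts)
print(f"r_s = {rho:.2f}")
```

Working on ranks rather than raw counts keeps a single very frequent form, such as can, from dominating the coefficient.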
Research question 2 tested whether caregivers were more likely to use modals epistemically than their children by focussing on use of epistemic vs. non-epistemic functions. The children’s use of these functions was compared with their input (see Figure 1) using chi-squared analyses in R (R Core Team, 2014) (analysis 2a). Chi-squared analyses indicated that for both children, at both ages, in line with our prediction, the mothers were significantly more likely to use modals epistemically than their children (Thomas 3;0: χ² = 34.33, df = 1, p < 0.001; Thomas 4;0: χ² = 8.66, df = 1, p = 0.02; Helen 3;0: χ² = 8.66, df = 1, p = 0.003; Helen 4;0: χ² = 23.37, df = 1, p < 0.001).
When taking all the data into account, however, it is unclear whether the children are less capable of producing modals for an epistemic purpose than their mothers, or whether the mothers are simply using forms epistemically that are not yet in the children’s lexicon. To control for this possibility, we carried out a further analysis (analysis 2b) restricted to verbs that (i) were produced by both caregivers and children at least five times per dataset (to provide a reliable indicator of their epistemic vs. non-epistemic distribution) and (ii) showed both non-epistemic and epistemic functions. This dataset enabled us to compare the relative non-epistemic-epistemic distributions of the mothers’ and children’s use of a modal. The remaining modals for this analysis were can, can’t, should, will and won’t.
After applying these controls, Thomas’s mother was still significantly more likely to use modals epistemically than Thomas at three (χ² = 5.42, df = 1, p = 0.02) and four (χ² = 10.5, df = 1, p = 0.001). For Helen, her mother was significantly more likely to use epistemic modals at three (χ² = 12.9, df = 1, p < 0.001) but not four (χ² = 0.0003, df = 1, p = 0.9). These controls, however, still fail to account for differences in epistemic/non-epistemic usage that may result from the children and mothers using the modals with differing frequencies. Both mothers used the forms should, will and won’t more often than their children, perhaps due to differing pragmatic goals. Caregivers with more world knowledge are more likely than their children to discuss events outside of the here and now and to hypothesize about future events (Rowe, 2012), thus requiring epistemic modals (e.g., ‘Then it’s half term and the boys will be home’ uttered by Helen’s mother). They also use their knowledge to advise their child on their surroundings (e.g., ‘It won’t be the dustbin man now, Thomas’, uttered in the age three sample). These differences in pragmatic goals could result in greater epistemic modal use from mothers, but do not necessarily indicate that children are unable to use these epistemic forms in the same way should they wish to convey the same pragmatic goal.
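The comparison in analysis 2a amounts to a chi-squared test on a 2 × 2 table (speaker by function). A minimal Python sketch of that test is given below, using invented counts. Whether the reported statistics include Yates’ continuity correction is not stated in the text, so the correction is exposed here as an optional, assumed parameter; the function name is likewise hypothetical.

```python
import math

def chi_squared_2x2(table, yates=False):
    """Pearson chi-squared test (df = 1) for a 2 x 2 table of counts.

    table: ((a, b), (c, d)), e.g. rows = mother / child,
           columns = epistemic / non-epistemic uses.
    Returns (statistic, p_value).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                diff -= min(0.5, diff)  # continuity correction, never below zero
            statistic += diff * diff / expected
    # With one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Invented counts: (epistemic, non-epistemic) uses for a mother and a child.
stat, p = chi_squared_2x2(((40, 160), (10, 190)))
print(f"chi-squared = {stat:.2f}, df = 1, p = {p:.2g}")
```

The closed-form p-value holds only for df = 1, which is the case for every comparison of two speakers on two function categories.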
To overcome the issue of unmatched distributions, we carried out a further control analysis (analysis 2c) on the five prior modals (can, can’t, should, will, won’t). In this analysis each verb was matched in quantity across the child and input samples at each age (e.g., Thomas aged three and his mother’s speech at this age) by taking all uses of a given modal in the smaller sample and randomly selecting the same number of modal instances from the larger sample. For example, can was used more frequently by Thomas’s mother than Thomas when he was three. We therefore included all Thomas’s can uses, but reduced his mother’s instances of can to the same number by randomly sampling from her can utterances (125 utterances, see Appendix B). For each verb, the data were randomly reduced in this way five times to ensure the samples were representative of overall use. For each of these five samples, the number of non-epistemic and epistemic uses were summed across all considered verbs.
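The matching procedure in analysis 2c can be sketched in Python as follows. For each modal, the larger of the two samples is randomly reduced to the size of the smaller one, and epistemic and non-epistemic counts are then summed across modals to give one 2 × 2 table per random draw. The data structure, the ‘E’/‘N’ codes, the seed and the toy counts are all assumptions made for illustration, not the study’s data or code.

```python
import random

def matched_tables(child, mother, n_samples=5, seed=0):
    """Build frequency-matched epistemic/non-epistemic tables.

    child, mother: dicts mapping a modal to a list of codes,
                   'E' (epistemic) or 'N' (non-epistemic), one per use.
    Returns a list of n_samples dicts of the form
    {'child': [epistemic, non_epistemic], 'mother': [epistemic, non_epistemic]}.
    """
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    tables = []
    for _ in range(n_samples):
        totals = {"child": [0, 0], "mother": [0, 0]}
        for modal in sorted(set(child) & set(mother)):
            size = min(len(child[modal]), len(mother[modal]))
            for speaker, uses in (("child", child[modal]), ("mother", mother[modal])):
                # Keep the smaller sample whole; subsample the larger one.
                picked = uses if len(uses) == size else rng.sample(uses, size)
                epistemic = sum(code == "E" for code in picked)
                totals[speaker][0] += epistemic
                totals[speaker][1] += size - epistemic
        tables.append(totals)
    return tables

# Toy data: the mother uses each modal more often than the child.
child_uses = {"can": ["N"] * 9 + ["E"], "will": ["N"] * 4 + ["E"] * 2}
mother_uses = {"can": ["N"] * 18 + ["E"] * 2, "will": ["N"] * 6 + ["E"] * 6}
tables = matched_tables(child_uses, mother_uses)
for table in tables:
    print(table)
```

Each resulting table has equal row totals for child and mother, so a chi-squared test on it compares proportions of epistemic use without the imbalance in sample size.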
Of the 20 separate chi-squared analyses (five random samples for each child at each age), 18 returned non-significant results, suggesting a similar distribution of epistemic uses between the children and caregivers (χ² range = 0 to 2.64, p-value range = 0.1 to 1). Only two (from Helen at three) showed a significant difference whereby the proportion of the caregiver’s epistemic uses was higher (χ² = 4.32, p = 0.04 for both analyses). On balance, these data suggest that, when necessary controls are implemented for both modal form and sample size, the caregivers did not use epistemic modals significantly more often than their children. Instead, the observed, proportionally more frequent, use of epistemic forms in caregiver speech, overall, reflects a larger sample of utterances with specific modals that may reflect the different pragmatic goals of caregivers and children. Epistemic uses are fairly uncommon but they are more easily detected in a large sample from the mothers.
To understand any difficulties children might face in acquiring some modals, we need to look beyond their broad epistemic function. Of course, since our analysis focused only on verbs frequently produced by the children, it could be that other forms are not yet in the children’s lexicon, perhaps because they are relatively infrequent in their input and take longer to learn, and/or because they are struggling with their function. Other modals (e.g., would) may express a more complex, assumptive type of epistemic modality (e.g., Sarah would like that film) that could rely on children’s perspective-taking skills. The key point is that it is only possible to identify where children face difficulties in acquiring modal functions by applying appropriate methodological controls to compare what they produce to what they hear.
Predicting children’s modal use (Research questions 3-6)
Similarly to previous work, the above analyses compared the mothers’ and children’s modal usage. However, prior work has not considered what predicts children’s use of a given modal or function, which we cover in research questions 3 to 6.
What predicts children’s epistemic modal use? (Research question 3)
We illustrated above that both children produce epistemic modals at three. In this section, we target research question 3 to test the prediction that children will more frequently use epistemic modals at four than at three, even when controlling for effects of input frequency. We fitted a logistic regression model using the glm function in R with Age (a categorical variable, three vs. four years of age) as our predictor of interest. We also added input variables as control predictors (‘Input Epistemic Frequency’, ‘Input Form Frequency’). An input account would predict children’s epistemic modal usage to be boosted by frequent forms in the input that consistently map onto the epistemic function. Controlling for the input is crucial to determine to what extent cognitive development influences acquisition rather than children simply taking longer to learn forms/functions that they hear less often. We also included the child’s MLU on each recording as a control variable to see whether epistemic modal use may relate to changes in the child’s language proficiency (see Table 3 for control predictor definitions). We analysed all modal utterances produced by the children. The binary outcome variable was ‘Function’ (0 for non-epistemic and 1 for epistemic).
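As a simplified illustration of the kind of model described here, the Python sketch below fits a logistic regression with a binary outcome (‘Function’: 0 non-epistemic, 1 epistemic) and a single binary predictor (Age: 0 for three, 1 for four) by Newton-Raphson. It deliberately omits the control predictors and interactions of the actual analysis, which was run with glm in R; the data and the function name are invented.

```python
import math

def fit_logistic(x, y, iterations=25, tol=1e-10):
    """Fit logit(P(y = 1)) = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # entries of X'WX (negative Hessian)
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0:
            raise ValueError("predictor has no variation; model is not identifiable")
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1

# Invented data: 2 of 10 modal uses are epistemic at three, 6 of 10 at four.
age = [0] * 10 + [1] * 10
function = [1] * 2 + [0] * 8 + [1] * 6 + [0] * 4
intercept, age_effect = fit_logistic(age, function)
print(f"intercept = {intercept:.3f}, age coefficient = {age_effect:.3f}, "
      f"odds ratio = {math.exp(age_effect):.2f}")
```

With a single binary predictor the estimates have a closed form (the intercept is the log-odds at age three and the coefficient is the log odds ratio between the two ages), which gives a convenient check on the fitting routine. In the full model, the question is whether such an age coefficient remains once the input and MLU controls are included.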
For both children, ‘Age’ was not a significant predictor of their epistemic modal use (see Tables 5 and 6). ‘MLU’, however, was significant for Thomas. His improvement in language proficiency across the ages (see Table 1) co-occurred with his use of the arguably more complex epistemic function. For both children, ‘Input Epistemic Frequency’ was significant. They were more likely to produce an epistemic function if a particular verb frequently occurred with this function in the input. However, ‘Input Epistemic Frequency’ interacted with ‘Input Form Frequency’ such that the effect of ‘Input Epistemic Frequency’ was boosted for generally less frequent modals. The frequent input modals were typically dominated by non-epistemic uses (e.g., can was the most frequent modal in Thomas’s input (N = 170) yet there was only one epistemic instance). Less frequent modals, however, typically showed a stronger bias towards the epistemic function. For example, all uses of might (a lower 44 instances) were epistemic. The children were therefore sensitive to the epistemic use of these less frequent modals because it was not masked by high-frequency non-epistemic use.
We also found an interaction between ‘Input Form Frequency’ and ‘Age’ for both children but in opposite directions. At age four, Thomas produced more epistemic functions with lower frequency modals. Yet, for Helen, this pattern was already observed at three. By four, she produced an epistemic function regardless of the form’s overall frequency. Helen appears developmentally more advanced than Thomas and produces more modals at three. This may explain why Helen’s epistemic modal use is not as dependent on input frequency at four as observed for Thomas.
To summarise, the regression analyses confirmed that age was not a significant main factor in children’s epistemic modal usage. Properties of the input, however, mattered, specifically the frequency of forms and how consistently they mapped onto the epistemic function. These findings demonstrate the need to include methodological control variables within the analysis.
Acquisition of modal forms with fewer input functions (Research question 4)
Our hypothesis for research question 4 was that the number of distinct functions associated with a specific modal in the input would predict its frequency of use in the children’s speech. Modals with fewer functions ought to promote acquisition (see Appendix C for function distributions per modal and Table 2 for examples of form-function mappings the children produced). To assess this, we fitted a linear regression model using R’s lm function. Our predictor of interest was ‘Input Number of Functions’, i.e., the number of functions associated with the caregivers’ use of a modal. The outcome measure was ‘Child Modal Production’, i.e., the number of instances of a modal in the child’s speech. We added ‘Input Form Frequency’, ‘Input Function Bias’ and ‘Age’ as controls (see Table 3) to ascertain whether children’s modal production is independently influenced by a modal’s distinct number of functions in caregiver speech, over and above other factors. ‘Input Function Bias’ may also influence a child’s production of a modal. A modal that is biased to one particular use may promote children’s understanding if they form a strong association between this form and its meaning. Since the outcome measure was derived from the child’s total number of instances of a modal at age three or four, we did not include MLU as a predictor. When collapsing the child’s modal use at one particular age, their average MLU (see Table 1) would not inform us of any additional variance in the model beyond the age predictor alone (see Footnote 5).
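The input predictors in this model can be derived from the coded caregiver utterances along the following lines. This Python sketch counts, for each modal, its total frequency and its number of distinct functions. The measure labelled function_bias is an assumption made here for illustration (the share of a modal’s uses taken by its most frequent function); the definition actually used is the one given in Table 3. The utterance codes are invented.

```python
from collections import Counter, defaultdict

def input_predictors(coded_utterances):
    """Summarise caregiver modal use.

    coded_utterances: iterable of (modal, function) pairs, one per modal use.
    Returns {modal: {'form_frequency', 'number_of_functions', 'function_bias'}}.
    """
    by_modal = defaultdict(Counter)
    for modal, function in coded_utterances:
        by_modal[modal][function] += 1
    summary = {}
    for modal, functions in by_modal.items():
        total = sum(functions.values())
        summary[modal] = {
            "form_frequency": total,
            "number_of_functions": len(functions),
            # Assumed operationalisation: proportion of uses with the top function.
            "function_bias": max(functions.values()) / total,
        }
    return summary

# Invented coding of caregiver speech.
coded = ([("can", "ability")] * 6 + [("can", "permission")] * 3
         + [("can", "suggestion")] + [("might", "epistemic")] * 4)
predictors = input_predictors(coded)
for modal, values in sorted(predictors.items()):
    print(modal, values)
```

In this toy sample, can has three functions and a moderate bias, whereas might has one function and a bias of 1, the kind of contrast the regression is designed to separate from sheer form frequency.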
For both children, ‘Input Number of Functions’ was not a significant predictor of their modal production (see Tables 7 and 8). ‘Input Form Frequency’, however, was, mirroring our earlier correlation findings. For Thomas, ‘Age’ was also significant, showing he produced more modals at four. We also tested for two-way interactions, but none were significant.
Distribution of modal meanings (Research question 5)
In the previous section, we investigated whether modals with more complex form-to-function mappings were acquired later and showed that this was not the case: only form frequency predicted age of acquisition for modal forms. Research question 5 concerns the prediction that children will use a significantly greater number of functions with specific modals at age 4 compared to age 3. We fitted a linear regression model with ‘Child Number of Functions’ as the outcome (the number of functions associated with each child’s modal use). The child’s age was our predictor of interest. To isolate the effect of age on the outcome, we also included ‘Child Form Frequency’, ‘Input Function Bias’ and ‘Input Number of Functions’ as controls (see Footnote 6). ‘Input Form Frequency’, although likely predictive of the child’s modal use, was not added based on the observed correlations between the mothers’ and children’s use of modals (RQ1). This predictor would thus be highly correlated with ‘Child Form Frequency’. ‘Child Form Frequency’ was included since a greater number of meanings could be detected from frequent forms the child produces. The child’s broader use of a modal is also likely to be heavily influenced by a high ‘Input Function Bias’ where the modal is strongly biased towards one meaning. Similarly, ‘Input Number of Functions’ was incorporated since the number of functions associated with a modal in the input will conceivably affect the number of functions the child produces.
For Thomas, ‘Age’ was a significant predictor. He produced a higher number of functions at four (see Table 9). ‘Input Number of Functions’ was also significant, suggesting Thomas was more likely to use a modal for meanings he was exposed to. We can also see an effect of ‘Input Function Bias’. Thomas produced more functions with a modal if it had a strong bias towards one function. These predictors, in isolation, were not significant for Helen (see Table 10). However, ‘Child Form Frequency’ was significant for both children. More meanings were produced with verbs that they frequently used. This predictor also interacted with ‘Input Function Bias’. A low input function bias was only facilitative with forms that they used frequently. We will return to these findings in the Discussion.
Form-Function Mappings (Research question 6)
In the previous section, we investigated whether age predicted the number of functions the child expressed. The following analyses concern what may govern children’s production of a particular form-function mapping (e.g., can-permission). We were particularly interested in whether, as hypothesized, the raw frequency of specific form-function mappings with individual modals in the input would predict the frequency of these same form-function mappings in the children’s speech, and if this was affected by the child’s age (research question 6). We fitted a linear regression model with two predictors of interest: ‘Input Form Function Frequency’, i.e., the number of instances of a particular form-function mapping in the mothers’ speech (e.g., must-obligation), and the child’s age. The outcome measure was ‘Child Form Function Frequency’, which relates to the frequency of a form-function mapping in the child’s speech (e.g., must-obligation). We also added the following controls: ‘Input Form Frequency’ and ‘Input Form Function Weighting’, which could both influence form-function mapping use (see Footnote 7). A form-function mapping with a high ‘Input Form Function Weighting’ may be more easily acquired if this form is consistently mapped to this meaning in the input. However, this is likely moderated by form frequency. Even if a form is consistently mapped to a particular function, it does not necessarily mean that this relationship will become entrenched if the form is rarely heard.
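To show how the variables in this model line up, the Python sketch below builds one row per form-function mapping found in the input, pairing its input frequency with the child’s frequency for the same mapping. Two points are assumptions for illustration only: the weighting is computed as the mapping’s share of all input uses of that form (the definition used in the study is not reproduced here), and only mappings attested in the input are given rows. The names and counts are invented.

```python
from collections import Counter

def mapping_rows(input_pairs, child_pairs):
    """Return one dict per input form-function mapping.

    input_pairs, child_pairs: lists of (form, function) pairs, one per modal use.
    """
    input_mapping = Counter(input_pairs)
    child_mapping = Counter(child_pairs)
    input_form = Counter(form for form, _ in input_pairs)
    rows = []
    for (form, function), freq in sorted(input_mapping.items()):
        rows.append({
            "form": form,
            "function": function,
            "input_form_function_frequency": freq,
            "input_form_frequency": input_form[form],
            # Assumed operationalisation: share of the form's uses with this function.
            "input_form_function_weighting": freq / input_form[form],
            # Mappings the child never produced get a frequency of zero.
            "child_form_function_frequency": child_mapping.get((form, function), 0),
        })
    return rows

# Invented coded utterances.
mother = ([("must", "obligation")] * 4 + [("must", "epistemic")]
          + [("can", "ability")] * 5)
child = [("must", "obligation")] * 2 + [("can", "ability")] * 3
rows = mapping_rows(mother, child)
for row in rows:
    print(row)
```

Keeping raw frequency and proportional weighting as separate columns matters because a mapping can be highly weighted yet rare (a consistent but infrequent form), which is the situation the form-frequency control is meant to capture.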
For Thomas, no main effects were significant. There was, however, a significant interaction between ‘Input Form Function Frequency’ and ‘Input Form Frequency’ (see Table 11). Thomas was more likely to produce the high frequency, form-function mappings of his input, particularly if the form was relatively frequent overall. This indicates Thomas’s overuse of high frequency form-function mappings, relative to his input, for typically high frequency forms, but some underuse of high frequency form-function mappings with the low frequency forms.
For Helen, the frequency of a given form-function mapping in the input did influence the likelihood that she produced a modal form for this specific meaning (see Table 12). Unlike for Thomas, however, the influence of this predictor was not moderated by form frequency. ‘Input Form Function Weighting’ was an additional significant predictor with a negative coefficient. Helen was more prone to produce a specific mapping if the given form did not show a strong weighting towards this meaning. We will return to the roles of frequency and proportional weighting in the Discussion.
Summary
We found significant correlations in modal form frequency between the children and their caregivers. However, overall, both mothers were more inclined to use epistemic modals than their children (analysis 2a). Once the modals that appeared in both the mothers’ and children’s samples were matched for frequency (analysis 2c), though, this conclusion did not hold: the children and their mothers did not differ in the proportion of epistemic uses. This suggests that children may not be less capable of producing the epistemic function than their caregivers, all other things being equal. We also investigated the children’s development of epistemic modal use from three to four. Age did not determine epistemic modal use, although the epistemic frequency of the form in the input and/or the child’s general language proficiency did.
Furthermore, we demonstrated the role of the child’s own linguistic experience. Both children used a modal for a greater number of functions with forms that they more frequently produced. This was, however, moderated by the modal’s distributional properties in the input. Children developed a more versatile use of a modal with frequent forms that were not strongly biased towards one meaning. They also showed sensitivity to the frequency of fine-grained, form-function mappings of their input. With a given modal, they were more prone to use this verb for its most common input function. Most of these findings held for both children; however, there were some individual differences, which we take up in the Discussion (see Table 13 for an overview).
Note: Predictors are ticked if the predictor was significant in isolation or as part of an interaction
Discussion
In this paper, we looked at the relationship between modal forms and their functions and investigated this in relation to (i) the association between epistemic modal use and age, since previous researchers have proposed a link between the acquisition of the epistemic function and Theory of Mind (Moore et al., 1990; Papafragou, 1998) and (ii) usage-based approaches which predict children’s acquisition to be aided by form frequency and their associated functions (Tomasello, 2003). In relation to the latter, previous work on children’s acquisition of ambiguous lexical items had successfully shown how acquisition was impacted not only by sheer form frequency but also nuanced form-meaning pairings in the input (Cameron-Faulkner et al., 2007; Theakston et al., 2002). The latter approach, however, had not yet been applied to the study of English modals, a complex system in which there are many-to-one mappings of form to function and vice versa.
Production of modal forms and the epistemic function (Research questions 1-3)
Similarly to previous research, we found positive correlations between the raw frequency of specific modals in the input and children’s speech (research question 1) (Van Dooren et al., Reference Van Dooren, Dieuleveut, Cournane and Hacquard2017; Wells, Reference Wells1979). Caregivers and children were more likely to produce non-epistemic functions (Wells, Reference Wells1979), and caregivers produced significantly more epistemic uses than children (research question 2) (Van Dooren et al., Reference Van Dooren, Dieuleveut, Cournane and Hacquard2017). However, an important question we aimed to answer was whether children are less cognitively capable of producing an epistemic function than their caregivers. Our findings suggest not. Once we controlled for modal forms (including forms only capable of both non-epistemic and epistemic functions) and matched for their quantity across mothers and children (analysis 2c), we found no significant difference in epistemic usage. This shows that children can produce some epistemic functions, supporting more recent work in which children produce epistemic adverbs early on (Cournane, Reference Cournane2021). However, children might still struggle with some epistemic uses such as inferences (e.g., it must be broken)Footnote 8 or those in which we take another’s perspective (e.g., Sam would like that book), given that related work on mental state terms shows correlations between third (but not first) person complements and false belief (Boeg Thomsen et al., Reference Boeg Thomsen, Theakston, Kandemirci and Brandt2021). Ideally, future work should introduce measures of the child’s socio-cognitive development as independent predictors of the different types of epistemic functions they produce, although experiments would be needed to examine children’s grasp of nuanced epistemic meanings.
We also demonstrated that age was not a significant predictor of epistemic modal use (research question 3). This suggests that, even if socio-cognitive development is relevant to the epistemic function, it is not necessarily equivalent to the child turning four, the critical age at which children start to reliably pass explicit false-belief tests (Wellman et al., 2001). Our study highlights the importance of controlling for modals’ input characteristics before assuming an independent role of socio-cognitive development, as epistemic modal usage was driven by the input. If, in accordance with the usage-based approach, the children are working out how to convey an epistemic function from their input, they will likely mirror the forms with which their caregivers express this meaning.
Form-function mappings
Production of modal forms (Research question 4)
The usage-based approach suggests that language acquisition is aided not only by a form’s frequency in the input, but also its associated functions, given that children learn language, at least partially, in terms of their communicative intent. We found that the number of functions associated with a modal did not predict the children’s frequency of use of these forms (research question 4). For both children, however, the frequency of the form itself in the input was a significant predictor of their production. Thus, somewhat contrary to the suggestion that a one-to-one mapping between a form and its function promotes acquisition (Bates & MacWhinney, 1987; Slobin, 1985), here it seems that the sheer frequency with which the two children heard the form predicted its emergence in their speech. That still leaves open the question of the function(s) for which they used the form and whether this was related to its input usage.
Number of functions produced (Research question 5)
We therefore explored which functions children used modals for and whether they demonstrated a wider distribution in modal functions with a particular form at four than at three given greater exposure to language (research question 5). Both children used a greater number of functions with modals they used frequently and that were not strongly biased towards one meaning (i.e., demonstrating a low ‘Input Function Bias’). Age, however, was a significant factor only in the number of functions that Thomas produced. The children’s own modal usage therefore affected what they learned, and greater diversity in the caregiver’s use of these modals encouraged greater diversity in the children’s use. However, for less frequent modals, which typically exhibit inconsistent form-function mappings, a relatively higher function bias was facilitative, first encouraging the children’s production of the modal’s most frequent mapping and only later of its other meanings. The impact of proportional bias was therefore mediated by form frequency, but not all modals can be extended to different functions (e.g., all might uses were epistemic). Relatedly, a modal’s number of functions in the input was a significant predictor of the number of functions Thomas produced, suggesting that he was sensitive to each modal’s individual usage patterns.
Form-function mappings produced (Research question 6)
Another novel aim of our study was to explore whether the children’s use of a particular form-function mapping (e.g., can-permission) was predicted by its input frequency (research question 6). We found that this was the case for both children. So, although the number of meanings mapped to a particular form did not predict the children’s use of that form, the frequency of a given form-function mapping did. For Thomas, this was moderated by form frequency. The frequency of a specific form-function mapping in the input influenced his use, provided that the form was relatively frequent in his mother’s speech. These findings mirror previous research on acquisition of ambiguous lexical items in which the most frequent form-function mapping in the input is learned first (Cameron-Faulkner et al., 2007; Theakston et al., 2002).
Alongside the raw frequency of a given form-function mapping in the input as a predictor for this analysis, we also investigated whether the extent to which a form is weighted to one of these input functions, relative to others, affects the children’s production of specific form-function mappings over and above frequency alone. For Thomas, the weighting of a verb (e.g., must) towards a specific meaning (e.g., obligation) in the input did not influence his production of that mapping (e.g., must-obligation). Helen, however, was more prone to use a mapping, over and above its form-function frequency, if the given form did not exhibit a strong weighting towards this meaning. This deviates from our prediction (research question 4) that modals with a dominant meaning would be easier to learn than forms exhibiting a more equal distribution of different functions, due to lower competition of other mappings associated with that same form (Bates & MacWhinney, 1987). One possibility is that a lower form-to-function weighting for less frequent mappings may aid acquisition by encouraging the child to pay more attention to the form itself when used for a diverse range of functions.
This raises the question of the relative importance of form-function mapping frequency vs. its proportional use in acquisition. For Thomas, high frequency seemed to afford acquisition of a mapping. For Helen, however, though frequency of a particular mapping did predict usage, variability in a modal’s use seemed beneficial, particularly in learning to map the modal to less frequent functions. The complexity of the input should also be considered. Thomas’s mother had a far higher MLU than Helen’s mother in the age three samples (6.22 vs. 4.51, respectively), potentially making it harder for Thomas to keep track of the different uses he was hearing. The influence of these input characteristics may also vary according to linguistic and/or socio-cognitive development, age, and children’s grasp of the underlying modal concepts. Both children frequently produced non-epistemic meanings such as ability, futurity, permission, and obligation from age three, consistent with children’s early success on experimental tasks involving deontic reasoning and awareness of others’ intentions and desires (Cummins, 1996; Woodward, 1998). Other functions, including hypothetical statement, hypothetical question, and past habitual event, were extremely rare in the children’s speech at both ages. Though beyond the scope of this paper, it is possible that, regardless of the input, children will not produce a function if they have not grasped the underlying concept. Posing hypothetical questions, for example, could rely on children’s ability to represent different worlds and, hence, possibility, which is typically acquired later (Redshaw & Suddendorf, 2016). Despite this, we cannot rule out a possible role of the input in these cases since very few instances of these functions also appeared in the caregivers’ speech.
The form-function mappings that children produce are also affected by their communicative goals and the surrounding context. In all samples of the children’s speech, the ability function was dominant (see Appendix C). Can’t to denote inability was relatively more frequent than the can-ability counterpart for both children at age three (e.g., ‘can’t reach it’ and ‘can’t get it out’ uttered by Thomas). The children may have learned that producing these types of utterances elicits help from others and therefore provides high reward. In addition, our data were gathered through recordings in the child’s home during play activities, a context that may have biased children to produce more ability meanings as opposed to more abstract epistemic or hypothetical functions. For instance, children frequently used the can-ability mapping within a pretend play context (e.g., ‘Bertie can fly’, ‘the bus can help’). Other activities such as shared book-reading amongst caregivers and children could encourage more abstract functions (e.g., epistemic) if discussing characters’ knowledge and belief states. This would be an interesting avenue for future work. However, we should note that the modals in the input samples were also strongly weighted to the ability function, perhaps also as a function of context, which likely promotes its acquisition. Moreover, this finding in relation to adult speech is not confined to child-directed speech. Other research, which has analysed English adult-directed speech in the British National Corpus, has found that modal verbs are predominantly used for an ability meaning relative to epistemic, obligation and permission functions (Collins, 2009; Kennedy, 2002). In line with our findings, this function is most typically conveyed through the use of can. This suggests that talking about ability is something that has particular importance for speakers in general.
In sum, even for such a complex system of form-function mappings as the English modals, the frequency of a particular form and form-function mapping in the input predicted both children’s usage. Our findings support functionalist approaches to language acquisition such as the usage-based approach in which construction frequency (i.e., the pairing of a form with a specific function) in the input predicts how well, and how early, the child acquires this construction (Goldberg, 2006; Tomasello, 2003). However, our findings question the importance of one-to-one mappings between forms and their functions in acquisition, as proposed by the competition model (Bates & MacWhinney, 1987). A form’s bias towards a given function in the input did not predict children’s use of that form nor did a form’s strong weighting to one function over others facilitate children’s production of that function, at least when the frequency of form-function mapping is simultaneously considered.
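To make the input measures discussed in this section concrete, the following minimal sketch shows how form frequency, number of functions per form, mapping frequency, and a form’s proportional weighting towards a function could be derived from coded tokens. It is an illustration under stated assumptions and not the analysis pipeline used in the study: the function name `input_measures`, the dictionary keys, and the toy tokens are hypothetical, and `function_bias` is assumed here to be the share of a form’s tokens taken by its most frequent function, which may differ from the study’s own operationalisation of ‘Input Function Bias’.

```python
from collections import Counter

# Hypothetical coded input tokens as (modal form, function) pairs.
# Toy data for illustration only; these are not counts from the corpora.
coded_input = [
    ("can", "ability"), ("can", "ability"), ("can", "permission"),
    ("can", "ability"), ("might", "epistemic"), ("might", "epistemic"),
    ("will", "futurity"), ("will", "epistemic"),
]


def input_measures(tokens):
    """Return, for each form, its frequency, number of functions,
    mapping frequencies, proportional weightings, and function bias."""
    form_freq = Counter(form for form, _ in tokens)
    mapping_freq = Counter(tokens)
    measures = {}
    for form, total in form_freq.items():
        # Frequency of each function with which this form is paired.
        functions = {fn: n for (f, fn), n in mapping_freq.items() if f == form}
        measures[form] = {
            "form_frequency": total,
            "n_functions": len(functions),
            "mapping_frequency": functions,
            # Share of the form's tokens used for each function.
            "mapping_weighting": {fn: n / total for fn, n in functions.items()},
            # Assumption: bias = share taken by the most frequent function.
            "function_bias": max(functions.values()) / total,
        }
    return measures


measures = input_measures(coded_input)
```

In this sketch, a form used for a single function (such as might in the toy data) has the maximum bias of 1, whereas a form split evenly between two functions has a bias of 0.5. Keeping raw mapping frequency and proportional weighting as separate values mirrors the distinction drawn above between how often a mapping is heard and how dominant it is for its form.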
Conclusion
This paper is the first to take a usage-based approach to the acquisition of English modals. We provide the most comprehensive analysis of the influence of modal forms and functions in the input on children’s acquisition of modals to date, using novel controls that represent frequency of form, function, and their mappings. Modals are highly complex with some forms exhibiting one-to-one form-function mappings and other forms mapping onto numerous meanings. The children’s use of modals was shaped by experience. In particular, the children were more likely to produce the frequent modals and form-function mappings of their input. This supports usage-based theories of language acquisition in which function, and how frequently this is mapped onto a given form, predicts acquisition (Tomasello, 2003). We did, however, find individual differences regarding the children’s sensitivity to the modals’ distributional properties, potentially reflecting differences in their stage of linguistic development and/or the complexity of the input they received. Acquisition of modals is crucial in developing children’s pragmatic skills but further research, which controls for the modals’ input properties in tandem with the child’s linguistic and cognitive development, is required to best tap into their acquisition and knowledge of these complex verbs.
Declarations of Competing interest
Declarations of interest – none.
Acknowledgements
This work was supported by the International Centre for Language and Communicative Development (LuCiD). We are very grateful for the support of the Economic and Social Research Council [Grant numbers: ES/L008955/1, ES/S007113/1 and PhD studentship reference: 10054247].
Appendices
Appendix A: Modal Coding Scheme
The modal auxiliaries you will be focusing on are: can, could, may, might, must, shall, should, will and would (both affirmative and negated forms, e.g., can’t).
1. Function
You are first going to code whether the modal verb has an epistemic or non-epistemic function. If any modal function appears ambiguous, please consult the transcript in which it appears and read the five lines prior to and following the utterance to gain contextual information.
a) EPISTEMIC
Code the modal verb as epistemic if the speaker is using the verb to reflect their degree of commitment to the truth of the following sub-clause (Papafragou, 1998, p.370), i.e., how certain or uncertain they are that the content they are expressing is true.
E.g., “That must/will be the postman” (on hearing the doorbell at an expected delivery period) to reflect certainty that this is the case, otherwise opting for a less forceful modal such as “may” or “might” (during a potential delivery period when expecting other guests) to express possibility that this conclusion may be either true or false.
Other instances of epistemic modality may include (Brown, 1973):
• Making an assumption, i.e., predicting or hypothesizing about a situation (either based on available evidence or what you typically know about a person or an event). This may refer to an event in the present, past or future.
E.g., “Laura will enjoy the music”, “That would be nice”, “We could be waiting here for a long time”, “They would have been scared”.
Note: This does not include questions relating to this meaning, e.g., “Would Daddy be angry?” This is because when framing questions using a hypothetical modal such as ‘would’, the speaker is asking a hypothetical question (see this subcategory below), not making assumptions about future events (which would be indicative of speaker belief).
• To infer/draw a conclusion (this may or may not be based on direct evidence)
E.g., “You must have left the house later than usual to have missed your train”, (baby cries)> “Jamie might be hungry”.
b) NON-EPISTEMIC
Epistemic modality is subjective in that the speaker chooses a modal verb in order to reflect their beliefs or attitudes towards a proposition. Non-epistemic modality, on the other hand, is often defined as concerning conditioning factors, which are external to the individual (Palmer, 2001, p.9), typically (but not limited to) permission and obligation.
Code the modal verb as non-epistemic if it carries out one of the following functions (defined below): ability, futurity, hypothetical question, hypothetical statement, obligation, past tense ‘will’, past habitual event, permission, refusal to act, suggestion or willingness.
2. Non-epistemic subcategory
You will first need to label the verb as non-epistemic, then in the following column, assign its meaning to one of these subcategories.
a) ABILITY
Code the modal verb as relating to ability if the speaker is expressing ability (or inability) to carry out a task. This may be concerned with their own or others’ actions and may also include questions relating to this meaning.
E.g., “I can reach the bottle”, “He couldn’t catch the bus”.
b) FUTURITY
Code the modal verb as relating to futurity if its sole purpose is to indicate an event occurring in the future. This will often include the speaker referring to their intention to carry out an act but may be focused on another individual or an event. This also includes questions relating to this meaning.
However, be careful to consider whether the verb is being used epistemically (e.g., predicting/hypothesizing). For example, “That dress will fit you”, “Daddy won’t be home until at least 6 o’clock with the traffic”.
Examples of futurity:
“I will go to the shops in an hour”, “You will have to make sure that you remember to pack your PE kit before school”, “Will you be seeing your grandparents later?”, “I shall walk the dog this afternoon”.
c) HYPOTHETICAL STATEMENT
Code the modal verb as a form of hypothetical statement if it is a statement used to describe what may or may not happen in the future (or the past, though this is less common). It is hypothetical in that the speaker is imagining an event which has not taken (or may not take) place, but without the assumption or prediction about that event that is associated with an epistemic reading.
E.g., “If my boss would let me, I would take more holidays”, “We can go to the cinema, if you would like that”, “Say the names of the teachers you would miss when you left school”.
This can be contrasted with epistemic instances of ‘would’. E.g., “You would not remember that, you are too young”, “They would not be pleased if we didn’t pay them”.
d) HYPOTHETICAL QUESTION
Code the modal verb as a form of hypothetical question if its sole purpose is to ask what may or may not happen in the future (or the past). Again, this is deemed hypothetical as the speaker is imagining an event which has not taken (or may not take) place.
E.g., “What would the children do?” “What would daddy have said?”
Note: For questions solely focused on the future (and not hypothetical by use of ‘could’ or ‘would’), this would be classed as futurity. E.g., “Will Hannah be at the party?” “Will you be going to the shops later with grandad?”
e) OBLIGATION
Code the modal verb as relating to obligation if its main function is to express that the speaker or listener should (or should not) carry out an action.
i.) These utterances are usually expressed in a forceful manner and may also include questions relating to this meaning.
E.g., with context: Mother looks at their messy living room and feels the need to tidy it before their guests arrive that evening. She then says: “I must clean up this room”.
E.g., with context: Mother’s child is misbehaving and throws their cup on the floor over dinner. She becomes angry and says: “You should pick that up right now”.
E.g., with context: Mother’s child keeps shouting and she wants them to behave. She utters: “Can you be quiet, please?”.
E.g., with context: Mother is growing frustrated when their child is choosing to draw with crayon on their kitchen table. She then tells them off, reminding them that they should be drawing on paper instead of ruining her furniture by saying: “Shall we draw on the paper and not the table?”.
ii.) There may also be occasions on which modals are used less forcefully; these would still be regarded as a form of obligation if the mother is giving an order to their child. For instance, “Would you like to put that wrapper in the bin for mummy?”, “Will you remind me to pack your socks?”
f) PAST HABITUAL EVENT
Code the modal verb as relating to a past habitual event if it is used to describe a habitual event in the past, i.e., something that occurred more than once/on a regular basis. This may also include questions relating to this meaning.
E.g., “You would sleep for hours when you were a baby”, “We would go to France every year for our summer holidays”.
g) PAST TENSE ‘WILL’
Code the modal verb as the past tense of ‘will’ (i.e., ‘would(n’t)’) if its sole purpose is to discuss this event in the past. This may also include questions relating to this meaning.
E.g., “I thought we would go to the shops”, “Amy promised that she wouldn’t be late”.
This would also incorporate instances of reported speech in which the speaker is specifically describing what an individual said previously.
E.g., “Daddy said he would be home by 9 o’clock”, “Kelly said she would come to the party”.
h) PERMISSION
Code the modal verb as relating to permission if its meaning is associated with a speaker granting/refusing someone permission to do something or expressing their own allowance. This may also include questions relating to this meaning.
E.g., “You can go play when you have finished your tea”, “Could I watch the television?”, “May I have a drink?”.
i) REFUSAL TO ACT
Code the modal verb as relating to refusal to act if the speaker is describing how, on a particular occasion, an individual, object or event did not comply with an action. This may also include questions relating to this meaning.
E.g., “We tried to cheer you up but you wouldn’t smile”, “My car engine wouldn’t start this morning”.
j) SUGGESTION
Code the modal verb as relating to a suggestion if the aim of the sentence is to suggest an idea (without the forceful nature associated with obligation). This can also be distinguished from obligation in that the speaker is not giving an order, but solely introducing a concept or activity. This may also include questions relating to this meaning.
E.g., with context: Mother is thinking about what she and her child could do on their free afternoon together. She then says to him: “We can go for a nice walk later”.
E.g., with context: Mother is wondering what story to read her son before bedtime. She picks up a book from the shelf and says to him: “Shall we read this book next?”.
E.g., with context: Mother and their daughter are in the child’s bedroom. The mother picks up a pretty dress from her wardrobe and tells her: “You could wear this to the party later, couldn’t you?”.
k) WILLINGNESS
Code the modal verb as relating to willingness if it is associated with the speaker (or their interlocutor)’s desires or preferences. This may also include questions relating to this meaning.
E.g., “Would you like some milk?” “I would like a sandwich”.
3. Other
Finally, if you feel you cannot assign the modal to either category (i.e., it is not epistemic and does not fall into one of the aforementioned non-epistemic subcategories), please code it as ‘other’. This should really only apply to a couple of utterances in the corpus, i.e., if the modal verb is part of a fixed, formulaic phrase where you cannot isolate the modal meaning. For instance, “I could do with a good nap”.
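The category structure of the coding scheme above can be summarised as a small data structure with a consistency check. This is a minimal sketch for illustration, not a tool used in the study: the names `validate_code`, `MODAL_FORMS`, `NEGATED_FORMS`, `FUNCTIONS`, and `NON_EPISTEMIC_SUBCATEGORIES` are hypothetical, the list of negated contractions is illustrative rather than exhaustive, and straight apostrophes are assumed in the transcribed forms.

```python
# Target modal forms and categories as listed in the coding scheme.
MODAL_FORMS = {"can", "could", "may", "might", "must",
               "shall", "should", "will", "would"}

# Illustrative (not exhaustive) map from negated contractions to base forms.
NEGATED_FORMS = {"can't": "can", "couldn't": "could", "mightn't": "might",
                 "mustn't": "must", "shan't": "shall", "shouldn't": "should",
                 "won't": "will", "wouldn't": "would"}

FUNCTIONS = {"epistemic", "non-epistemic", "other"}

NON_EPISTEMIC_SUBCATEGORIES = {
    "ability", "futurity", "hypothetical question", "hypothetical statement",
    "obligation", "past tense 'will'", "past habitual event", "permission",
    "refusal to act", "suggestion", "willingness",
}


def validate_code(form, function, subcategory=None):
    """Return a normalised (form, function, subcategory) triple,
    or raise ValueError if the code does not fit the scheme."""
    lowered = form.lower()
    base = NEGATED_FORMS.get(lowered, lowered)
    if base not in MODAL_FORMS:
        raise ValueError(f"not a target modal: {form!r}")
    if function not in FUNCTIONS:
        raise ValueError(f"unknown function: {function!r}")
    if function == "non-epistemic":
        # Non-epistemic codes must name one of the eleven subcategories.
        if subcategory not in NON_EPISTEMIC_SUBCATEGORIES:
            raise ValueError(f"unknown subcategory: {subcategory!r}")
    elif subcategory is not None:
        raise ValueError("only non-epistemic codes take a subcategory")
    return base, function, subcategory
```

For example, `validate_code("can't", "non-epistemic", "ability")` returns the triple `("can", "non-epistemic", "ability")`, while a non-epistemic code without a subcategory is rejected. A check of this kind only enforces the two-step structure of the scheme; deciding which category an utterance belongs to still depends on the coder’s reading of the context.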
Appendix B: Raw Modal Frequencies
Appendix C: The frequency of specific modal form-function mappings in the corpora
Appendix D: Model building process
An example of the process followed to build a logistic regression model as applied to research question 3 (whether Helen’s epistemic modal use was predicted by her age).
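As a rough illustration of the kind of comparison involved, the sketch below contrasts an intercept-only logistic model with a model that adds age as a single binary predictor (three vs. four), using a likelihood-ratio test. It is a simplified sketch under stated assumptions and does not reproduce the model-building process followed in the study: it assumes no other predictors or random effects, the function names `log_likelihood` and `age_model_comparison` are hypothetical, and any counts passed to it are the reader’s own. With one binary predictor, the fitted probabilities of the age model equal the observed proportion of epistemic uses at each age, so no iterative fitting is needed.

```python
import math


def log_likelihood(successes, total, p):
    """Binomial log-likelihood (without the constant term) of observing
    `successes` epistemic uses out of `total` modal uses under probability p."""
    failures = total - successes
    ll = 0.0
    if successes:
        ll += successes * math.log(p)
    if failures:
        ll += failures * math.log(1 - p)
    return ll


def age_model_comparison(epi_3, total_3, epi_4, total_4):
    """Compare an intercept-only model with a model adding binary age.

    epi_*  : number of epistemic modal uses at age three / four
    total_*: number of all coded modal uses at age three / four
    """
    # Null model: one epistemic probability shared by both ages.
    p_null = (epi_3 + epi_4) / (total_3 + total_4)
    ll_null = (log_likelihood(epi_3, total_3, p_null)
               + log_likelihood(epi_4, total_4, p_null))

    # Age model: fitted probabilities equal the observed proportions.
    p_3, p_4 = epi_3 / total_3, epi_4 / total_4
    ll_age = (log_likelihood(epi_3, total_3, p_3)
              + log_likelihood(epi_4, total_4, p_4))

    # Likelihood-ratio statistic, chi-square distributed with 1 df.
    lr = max(2 * (ll_age - ll_null), 0.0)
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(lr / 2))
    return {"ll_null": ll_null, "ll_age": ll_age,
            "lr_statistic": lr, "p_value": p_value}
```

Under these assumptions, a large `p_value` would indicate that adding age does not improve on the intercept-only model, which is the pattern the discussion reports for epistemic modal use.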