German children’s processing of morphosyntactic cues in wh-questions

ATTY SCHOUWENAARS; PETRA HENDRIKS; ESTHER RUIGENDIJK

doi:10.1017/S0142716418000334

Published online by Cambridge University Press: 08 October 2018

ATTY SCHOUWENAARS*: Affiliation:
University of Groningen and University of Oldenburg
PETRA HENDRIKS: Affiliation:
University of Groningen
ESTHER RUIGENDIJK: Affiliation:
University of Oldenburg
*: ADDRESS FOR CORRESPONDENCE Atty Schouwenaars, University of Oldenburg, Institute of Dutch Studies, Ammerländer Heerstraße 114-118, 26111 Oldenburg Germany. E-mail: [email protected]

Abstract

Two experiments investigated the effects of case and verb agreement cues on the comprehension and production of which-questions in typically developing German children (aged 7–10) and adults. Our aims were to determine (a) whether they make use of morphosyntactic cues (case marking and verb agreement) for the comprehension of which-questions, (b) how these questions are processed, and (c) whether the presence and position of morphosyntactic cues available for the listener influence the speaker’s production of which-questions. Performance on a picture selection task with eye tracking shows that children with low working memory make less use of morphosyntactic cues than children with high working memory and adults when interpreting object questions. Gaze data of both groups reveal garden-path effects and revisions for object and passive questions, which can be explained by a constraint-based account. Furthermore, children’s difficulties with object questions are related to the type of disambiguation cue. In a question elicitation task with patient-initial items, children overall prefer production of passives, whereas adults’ productions depend on the availability of disambiguation cues for the listener.

Keywords

eye tracking; incremental processing; language acquisition; morphosyntax; wh-questions; working memory

Type: Original Article
Information: Applied Psycholinguistics, Volume 39, Issue 6, November 2018, pp. 1279–1318

DOI: https://doi.org/10.1017/S0142716418000334
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: © Cambridge University Press 2018

It is well known that children have problems with thematic-role assignment. Understanding who is doing what to whom is crucial for the correct interpretation of complex sentences such as wh-questions. Many studies report that object questions, in which the object precedes the subject, are difficult for children to comprehend (e.g., for English, O’Grady, 1997; for Italian, De Vincenzi, Arduino, Ciccarelli, & Job, 1999; for Hebrew, Friedmann, Belletti, & Rizzi, 2009; Friedmann & Novogrodsky, 2011; for Dutch, Metz, van Hout, & van der Lely, 2010; Schouwenaars, van Hout, & Hendriks, 2014; for German, Biran & Ruigendijk, 2015; Roesch & Chondrogianni, 2015). These studies often report offline accuracy scores, investigating the final interpretation of wh-questions. Online self-paced-reading studies with adults report longer reading times for object questions than for subject questions (Meng & Bader, 2000; Schlesewsky, Fanselow, Kliegl, & Krems, 2000). Longer reading times are interpreted as a reflection of a revision necessary for the correct interpretation. In the current study, we will investigate online processing of wh-questions using eye tracking to find out whether gaze patterns reflect such revisions, not only for adults but also for children. Furthermore, children’s working memory is measured, as processing wh-questions may involve keeping in mind several possible interpretations or maintaining the dislocated object in memory for some time, both of which require sufficient working memory capacity (e.g., Fiebach, Schlesewsky, & Friederici, 2002).
German is chosen as the language of investigation because of its different morphosyntactic cues, such as case and verb agreement.

German allows for variation in word order, which makes the following sentence structurally ambiguous:

(1) Welche Schüler begrüßen die Lehrer?

Which pupils are greeting the teachers?

A native speaker of German, when reading this sentence out of context, will likely interpret pupils as the subject and teachers as the object of the sentence. This interpretation is guided by a preference for canonical word order in which the subject precedes the object. Nevertheless, the reversed interpretation is also possible, namely, teachers greeting pupils. Whereas German declarative sentences often start with the subject, wh-questions usually start with the wh-phrase. Accordingly, when the wh-phrase functions as the subject, the subject precedes the object, resulting in a subject question. In contrast, when the wh-phrase is the object, the object precedes the subject, resulting in noncanonical word order in object questions.

How does a listener know whether pupils or teachers is the subject of the sentence? In English, the position of the verb differs between subject and object questions (see the English translations of [2]). This does not hold for German, where the order is always noun phrase–verb–noun phrase (NP–V–NP) and hence does not help the listener in establishing the subject and the object of the sentence. Often context, prosody, or semantic cues such as definiteness and animacy help the listener to correctly interpret wh-questions (especially in globally ambiguous sentences like [1]; see, e.g., Bouma, 2008). Moreover, morphosyntactic cues can disambiguate subject and object questions. For example, in German, case on the wh-word or article can mark an NP as the subject or the object and lead to a single possible interpretation (see [2]).

(2a) Welcher Schüler begrüßt den Lehrer?

Which_NOM pupil greets the_ACC teacher?

“Which pupil is greeting the teacher?”

(2b) Welchen Schüler begrüßt der Lehrer?

Which_ACC pupil greets the_NOM teacher?

“Which pupil is the teacher greeting?”

Singular masculine nouns in German have distinctive case marking that can indicate the subject and the object of a sentence. Nominative case in (2a) on the wh-phrase, welcher “which,” marks the first NP as the subject. Accusative case on the article of the second NP, den “the,” marks the second NP as the object. Therefore, (2a) is a subject question. Likewise, accusative case in (2b) on the wh-phrase, welchen “which,” marks this NP as the object and nominative case on the article of the second NP, der “the,” marks this NP as the subject. Therefore, (2b) is an object question.

Verb agreement can also disambiguate subject and object questions. If only one NP agrees in number with the verb, only that NP can be the subject. In (3a) only the first NP, welche Schülerin “which pupil,” corresponds in number with the singular inflection on the verb, begrüßt “greets,” and therefore is the subject. This leads to a subject question. In (3b), only the second NP, die Lehrer “the teachers,” corresponds in number with the plural inflection on the verb, begrüßen “greet,” and therefore is the subject. This leads to an object question.

(3a) Welche Schülerin begrüßt die Lehrer?

Which pupil_SG greets_SG the teachers_PL?

“Which pupil is greeting the teachers?”

(3b) Welche Schülerin begrüßen die Lehrer?

Which pupil_SG greet_PL the teachers_PL?

“Which pupil are the teachers greeting?”

In (3), case does not disambiguate between subject and object, as the determiners of feminine and plural nouns have the same form for nominative and accusative case. These examples are therefore disambiguated by verb agreement only.
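
How the two cue types in (2) and (3) narrow down the reading of an NP–V–NP question can be made concrete with a small illustrative sketch in Python. It is not part of the study: the function name `classify_question`, its parameters, and the encoding of syncretic (non-distinctive) case forms as `None` are assumptions introduced here for illustration only.

```python
def classify_question(np1_case, np2_case, np1_number, np2_number, verb_number):
    """Classify an NP-V-NP which-question as 'subject', 'object', or 'ambiguous'.

    Case values are 'NOM', 'ACC', or None when the form is the same for
    nominative and accusative (feminine, neuter, and plural determiners).
    Number values are 'SG' or 'PL'. All names are hypothetical.
    """
    # Start with both readings: first NP is the subject, or first NP is the object.
    readings = {"subject", "object"}

    # Case cue: nominative marks the subject, accusative marks the object.
    if np1_case == "NOM" or np2_case == "ACC":
        readings.discard("object")
    if np1_case == "ACC" or np2_case == "NOM":
        readings.discard("subject")

    # Agreement cue: only an NP that agrees in number with the verb can be the subject.
    if np1_number != verb_number:
        readings.discard("subject")
    if np2_number != verb_number:
        readings.discard("object")

    if len(readings) == 1:
        return readings.pop()
    return "ambiguous" if readings else "ill-formed"


# Examples (1), (2a), (2b), (3a), and (3b) from the text.
print(classify_question(None, None, "PL", "PL", "PL"))    # (1)
print(classify_question("NOM", "ACC", "SG", "SG", "SG"))  # (2a)
print(classify_question("ACC", "NOM", "SG", "SG", "SG"))  # (2b)
print(classify_question(None, None, "SG", "PL", "SG"))    # (3a)
print(classify_question(None, None, "SG", "PL", "PL"))    # (3b)
```

Under these assumptions, (1) remains ambiguous, (2a) and (3a) come out as subject questions, and (2b) and (3b) as object questions. The sketch treats the cues as categorical filters; it says nothing about how listeners weigh them during processing.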

The meaning of sentence (3b) could also be realized as a passive question. In passives, unlike active sentences, not the thematic role of agent, but that of patient is realized as the subject. Hence the patient Welche Schülerin “which pupil” is the object in the active question (3b), but the subject in the passive question (4). Therefore, unlike the object questions (2b) and (3b), the passive question (4) starts with the subject.

(4) Welche Schülerin wird von den Lehrern gegrüßt?

Which pupil_SG is-being_SG by the teachers_PL greeted_PPART.?

“Which pupil is being greeted by the teachers?”

Subject-first structures are acquired earlier and are easier to process than object-first structures. This difference is generally referred to as the subject–object asymmetry. Passives are generally considered to be acquired relatively late in comprehension (Borer & Wexler, 1987; Maratsos, Fox, Becker, & Chalkley, 1985). Nevertheless, in passive questions thematic role assignment may be easier than in object questions, as passive morphology (the verb werden “to be,” the by-agent, and the past participle) may be more noticeable and reliable than case or verb agreement. One reason to include passive questions in our study is to compare two different types of noncanonicity: object-before-subject and patient-before-agent. In object questions, these syntactic functions and thematic roles go together (as the subject is the agent), but in passive questions they do not. Passive questions are a viable alternative to object questions for expressing a question about the patient.

In addition, we examine the production of questions. Comprehension may affect production. When in production multiple forms express the same meaning, the speaker’s choice may be influenced by the listener’s ease of comprehension. If speakers take the listener’s perspective into account, we expect them to produce the form that is easier to comprehend for the listener. The presence and position of morphosyntactic cues may therefore not only influence comprehension but also indirectly production.

Thus, the research questions we address in this study are (a) whether German children and adults make use of morphosyntactic cues (case marking and verb agreement) for the comprehension of which-questions, (b) how which-questions are processed, and (c) whether the presence and position of morphosyntactic cues available for the listener influence the speaker’s production of which-questions. These questions will be investigated in a picture-selection task using eye tracking, and a corresponding question-elicitation task with the same participants. We will first review previous explanations for the subject–object asymmetry in children’s comprehension of which-questions. Next, we will review a potential account of children’s production of which-questions. Predictions of this constraint-based account will be formulated for the final interpretations, online gaze patterns, and produced forms by adults and children, and for active as well as passive questions. Then, we will describe our experiment to test these predictions and present our behavioral results, gaze data, and production results. Finally, we will discuss the results and draw conclusions.

EXPLAINING CHILDREN’S SUBJECT–OBJECT ASYMMETRY IN COMPREHENSION

German-speaking children’s ability to use case marking for sentence comprehension starts to develop around the age of 5 (e.g., Lindner, 2003; Roesch & Chondrogianni, 2015). Nevertheless, even older children still make many mistakes (Biran & Ruigendijk, 2015). Whereas children interpret subject questions correctly, they often incorrectly interpret object questions as subject questions. It is argued that 3-year-old children are sensitive to differences in case marking, but are not yet able to use this for building the correct underlying syntactic structure (Schipke, Knoll, Friederici, & Oberecker, 2012). Children seem to be even less able to use verb agreement, as they still misinterpret object questions disambiguated solely by verb agreement until the age of 8 or 9 (for Dutch, Metz et al., 2010; Schouwenaars et al., 2014; for Italian, De Vincenzi et al., 1999), even though 5-year-old children seem sensitive to verbal inflection (Brandt-Kobele & Höhle, 2014). Object-first sentences disambiguated by verb agreement also seem to cause greater processing difficulties for German-speaking children than sentences disambiguated by case marking (Arosio, Yatsushiro, Forgiarini, & Guasti, 2012). The same holds for adults (Friederici, Steinhauer, Mecklinger, & Meyer, 1998; Meng & Bader, 2000).
It has been argued that this is caused by the fact that case marking appears directly on the NPs, whereas agreement markers on the verb are indirect (Clahsen, 1986), meaning that for agreement, number marking on the NP and number marking on the verb have to be linked to one another.

Various explanations have been proposed for children’s subject–object asymmetry in comprehension. One explanation is a processing explanation known as the active filler hypothesis (AFH; Frazier & Flores d’Arcais, 1989), which has been extended to acquisition (Avrutin, 2000; Deevy & Leonard, 2004). When parsing a sentence, children (like adults) take the first NP to be the subject, which is assigned the agent role. For subject questions, this is the correct interpretation, but for object questions, it is incorrect. Once this misinterpretation is noticed, the parser has to go back to the beginning of the sentence and reinterpret the sentence. It is argued that children do not have enough working memory resources or cognitive control to do so (Choi & Trueswell, 2010; Deevy & Leonard, 2004). This explanation accounts for adults’ and children’s difficulties in processing object-first structures. It also makes predictions for incremental interpretation: in both subject and object questions, initially the first NP will be interpreted as the subject and hence agent. In the literature on AFH, no explicit predictions have been formulated on the processing of passive questions or on the production of wh-questions.

Another prominent explanation is a syntactic explanation derived from Rizzi’s (1990, 2004) relativized minimality approach (RM; Friedmann et al., 2009; Friedmann & Novogrodsky, 2011; Jakubowicz, 2011). RM posits that wh-questions involve syntactic movement operations. In object questions, a relation or dependency needs to be formed between the sentence-initial object wh-phrase and its trace in its original position. This becomes harder if there is an intervener (here the subject) that is a potential candidate for this dependency. Therefore, object questions are harder to process than subject questions, in which there is no intervener. Children experience difficulties especially when the object wh-phrase and the subject intervener are of the same structural type: for example, when they both have a determiner (article, wh-word) and a noun (see Friedmann et al., 2009, for details). According to RM, in passives there is no intervener (Contemori & Belletti, 2014). Instead, the internal argument is first “smuggled” inside the moved verb phrase beyond the position of the external argument. Then the internal argument is extracted from the verb to a higher position. Thus, the internal argument is closest to the subject position without directly crossing over the external argument (see Collins, 2005, for details). Assuming this smuggling hypothesis, no interpretation problems are predicted for passive questions. Furthermore, it is argued that children prefer passive constructions over object-first constructions in production (Jensen de López, Sundahl Olsen, & Chondrogianni, 2014). Therefore, RM predicts difficulties in comprehension and production of object questions, but not subject and passive questions.
RM as a theoretical account has been used to predict slower sentence processing for intervention effects, but it does not make predictions about the exact locus of processing difficulty.

A third prominent explanation for children’s subject–object asymmetry is a cue-based explanation based on the competition model (CM). The CM posits that people compute the interpretation of a sentence on the basis of various linguistic cues, eventually choosing the interpretation with the highest likelihood. Initially introduced for sentence processing, this performance model was later applied to language acquisition (Bates & MacWhinney, 1989; MacWhinney, 2005). According to the CM, language acquisition requires detecting surface cues in the language and determining the relative strength of these cues, which is based on the reliability and availability of the cues. Whereas there is consensus that case cues are more reliable than word order cues in German, there is no agreement on the validity (the product of reliability and availability) of these cues. According to Kempe and MacWhinney (1998), the validity for word order is higher than for case, whereas Dittmar, Abbot-Smith, Lieven, and Tomasello (2008) argue that the validity for case is higher than for word order in German. Regarding acquisition, the CM predicts that children acquire cues with a higher validity before those with a lower validity (Bates & MacWhinney, 1987). Furthermore, children’s interpretations initially seem to depend on cue availability, and only later cue reliability is used. This could be an explanation for children’s difficulties interpreting object questions. When they base their interpretation on cues that are high in availability, such as word order, instead of high in reliability, such as case, they interpret object questions as subject questions. The CM also makes explicit predictions about children’s comprehension of passives.
Due to language-specific properties, such as less reliance on constituent order in German than in English, it is predicted that German children understand passive sentences 1 year earlier than English children (Aschermann, Gülzow, & Wendt, 2004). To our knowledge, however, there are no studies within CM directly comparing comprehension of passives and object-first structures.
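
The notion of cue validity as the product of reliability and availability can be illustrated with a short calculation. The sketch below assumes a common operationalization in which availability is the proportion of sentences in which a cue is present, and reliability is the proportion of those cases in which the cue points to the correct interpretation. The function name and all counts are hypothetical; they are not taken from any corpus or from the studies cited above.

```python
def cue_validity(n_sentences, n_cue_present, n_cue_correct):
    """Return (availability, reliability, validity) for one cue.

    availability = share of sentences in which the cue is present
    reliability  = share of cue-present sentences in which the cue is correct
    validity     = availability * reliability
    """
    availability = n_cue_present / n_sentences
    reliability = n_cue_correct / n_cue_present
    return availability, reliability, availability * reliability


# Hypothetical counts out of 100 sentences (invented for illustration only).
word_order = cue_validity(n_sentences=100, n_cue_present=100, n_cue_correct=80)
case = cue_validity(n_sentences=100, n_cue_present=50, n_cue_correct=50)

for name, (avail, rel, val) in {"word order": word_order, "case": case}.items():
    print(f"{name}: availability={avail:.2f} reliability={rel:.2f} validity={val:.2f}")
```

With these invented counts, the more reliable cue (case) ends up with the lower validity because it is available less often. Whether the ordering comes out this way for German is precisely the point on which the two cited positions disagree.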

Regarding sentence processing, earlier work argues that interpretations do not change when new information comes in (MacWhinney, Bates, & Kliegl, 1984), but later work argues that thematic role assignment is updated at each point in sentence processing and therefore interpretations can change (Bates & MacWhinney, 1989). As for sentence production, according to the CM this is determined by function and frequency of grammatical forms (Bates & MacWhinney, 1989). Therefore, predictions about the production of wh-questions cannot directly be derived from the model itself.

Another explanation for children’s subject–object asymmetry in comprehension is a constraint-based explanation in terms of optimality theory (OT; see Prince & Smolensky, 2004). In OT, the realization and interpretation of linguistic expressions is determined by the interaction between the constraints of the grammar, which express general tendencies of the language that can be in conflict. The realized form or selected interpretation is the form or interpretation that optimally satisfies these interacting constraints. Children’s interpretation of wh-questions may result from the interaction between conflicting constraints (Schouwenaars et al., 2014). A first relevant constraint is Wh-First, which holds that a wh-constituent comes first in a sentence. When the wh-constituent is the patient, this constraint is in conflict with the constraint Agent-First, which holds that the agent comes first in a sentence (cf. Bouma, 2008; de Hoop & Lamers, 2006). Because, in German, Wh-First is ranked higher than Agent-First (Bouma, 2008; Zeevat, 2006), a violation of the weaker constraint Agent-First is allowed in order to satisfy the stronger constraint Wh-First. Other morphosyntactic constraints outranking Agent-First are Case, which holds that the subject is marked with nominative case and the object is marked with accusative case, and Agreement, which holds that the verb agrees with the subject (de Hoop & Lamers, 2006). As a result of these interacting constraints, the optimal interpretation of object questions and passive questions, satisfying the constraints best, is a patient-first interpretation.

In OT, children are argued to initially entertain a different constraint ranking than adults (e.g., Fikkert & de Hoop, 2009; Smolensky, 1996). This explains children’s non-adultlike patterns of production and interpretation. For example, unlike adults, children may give more importance to Agent-First than to Agreement (Schouwenaars et al., 2014) and Case. This non-adultlike ranking leads to a different optimal interpretation for object questions, namely, an agent-first interpretation.
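
The effect of the two rankings on the final interpretation of an active object question can be sketched in a few lines of Python. This is an illustrative toy model, not the authors’ formalization. Several choices are assumptions made here: the relative order of Case and Agreement (the text only states that both outrank Agent-First for adults), the specific child ranking, the omission of Wh-First (both interpretations of a given wh-initial string satisfy it equally), and all identifiers. Strict constraint domination is modeled as lexicographic comparison of violation counts.

```python
# Assumed rankings, highest-ranked constraint first (illustrative only).
ADULT = ["Case", "Agreement", "Agent-First"]
CHILD = ["Agent-First", "Case", "Agreement"]


def violations(interpretation, q):
    """Count constraint violations of one interpretation of an active NP-V-NP question."""
    first_is_subject = interpretation == "agent-first"  # in actives, subject = agent
    subj, obj = ("np1", "np2") if first_is_subject else ("np2", "np1")
    return {
        # Case: the subject should not be accusative, the object not nominative.
        "Case": int(q[subj + "_case"] == "ACC") + int(q[obj + "_case"] == "NOM"),
        # Agreement: the verb agrees in number with the subject.
        "Agreement": int(q[subj + "_number"] != q["verb_number"]),
        # Agent-First: the agent comes first.
        "Agent-First": int(not first_is_subject),
    }


def optimal_interpretation(q, ranking):
    """Pick the interpretation whose violation profile is lexicographically smallest."""
    candidates = ["agent-first", "patient-first"]
    return min(candidates, key=lambda c: [violations(c, q)[name] for name in ranking])


# (3b) Welche Schülerin begrüßen die Lehrer?  (case syncretic, encoded as None)
Q3B = {"np1_case": None, "np1_number": "SG", "verb_number": "PL",
       "np2_case": None, "np2_number": "PL"}

print(optimal_interpretation(Q3B, ADULT))  # patient-first
print(optimal_interpretation(Q3B, CHILD))  # agent-first
```

Under the assumed adult ranking, the Agreement violation rules out the agent-first reading of (3b); under the assumed child ranking, Agent-First decides before Agreement is consulted, which yields the agent-first misinterpretation described above.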

OT is also able to make empirically testable predictions about the interpretation of incomplete sentences, and thus about incremental word-by-word processing (see de Hoop & Lamers, 2006; Stevenson & Smolensky, 2006). As some constraints only become relevant later in the sentence, when linguistic cues become available that allow potential outputs to be evaluated on the basis of these constraints, intermediate interpretations may differ from final interpretations.

CHILDREN’S PRODUCTION OF WH-QUESTIONS

It is unclear whether the subject–object asymmetry found for comprehension also extends to production. For English, Stromswold (1995) found that children started producing object and subject questions at the same age (between age 1 year, 8 months [1;8] and 3;8) in spontaneous speech. Schouwenaars et al. (2014), in a wh-question elicitation task, found that Dutch 6- and 7-year-olds did not make mistakes in their production of object questions, although they, like adults, preferred to produce passive questions (70%). For Italian, no differences were found between 3- to 5-year-old children’s productions of subject and object which-questions in a wh-question-elicitation task (Guasti, Branchini, & Arosio, 2012). Nevertheless, besides object questions (~30%), children produced alternative questions with clefts, putting the subject in dislocated position (~20%) or dropping the argument (~45%), which can be explained as a strategy to avoid object questions. This avoidance may indicate that children have problems producing object questions. In Hebrew, 3- and 4-year-old children avoided object relative clauses and produced more subject relatives in a relative-clause-elicitation task (Friedmann et al., 2009), which the authors argue is similar to the subject–object asymmetry in comprehension. Likewise, Italian children (as well as adults) produce passives instead of object relatives (Belletti & Contemori, 2010). No wh-question elicitation study has so far been reported with German children. Biran and Ruigendijk (2015) report that German children repeated fewer which-object questions correctly than which-subject questions on a repetition task. Often children changed the object-first sentence into a subject-first sentence.
Note, however, that the repetition method does not purely test production; to repeat a sentence, it must be understood to a certain degree as well.

As mentioned above, the AFH, RM, and the CM have been proposed to explain subject–object asymmetries in children’s comprehension of wh-questions. However, these accounts do not make explicit predictions about children’s production of wh-questions and require additional mechanisms to explain children’s performance in production. OT, in contrast, makes explicit predictions about production as well. Constraints in OT can be applied to a set of potential meanings to select the optimal meaning for a given form (as in comprehension). However, these constraints can also be applied to a set of potential forms to select the optimal form for a given meaning (as in production). A general assumption in OT is that comprehension and production are explained by the same grammar (i.e., the same set of constraints under the same ranking). Although the constraints are the same, they can nevertheless have different effects in comprehension and production because they apply to different potential outputs (meanings and forms, respectively; see Hendriks, 2014; Smolensky, 1996). Thus, OT may make different predictions for the comprehension and production of which-questions (see Schouwenaars et al., 2014).

PREDICTIONS FOR OUR STUDY

To examine how adults and children interpret, process, and produce which-questions, in an eye-tracking experiment we collect responses indicating final interpretations as well as gaze data revealing midsentence interpretations, and in a production experiment we elicit questions. In this section, predictions about the outcomes of the experiment are presented based on the constraint-based OT account, as this allows us to formulate specific predictions about adults’ and children’s final interpretations, their incremental processing, as well as their production of wh-questions. As some of the OT constraints reflect well-accepted views on wh-questions, these predictions are not necessarily incompatible with the other three models discussed above.

Regarding children’s final interpretations of which-questions, if children incorrectly have ranked the Agent-First constraint highest and thus prefer the first NP to be the agent, this results in an adultlike agent-first interpretation of subject questions. Object questions, in contrast, are predicted to receive an incorrect agent-first interpretation. For passive questions, an incorrect agent-first interpretation is predicted too. Nevertheless, the interpretation of passive questions may be less affected than that of object questions, as passive questions contain multiple cues for interpretation (the verb werden “to be,” the by-phrase, and the past participle). As these cues may be targeted by other constraints, not discussed here for reasons of space, children may base their interpretations on these other constraints and thus interpret passive questions correctly.

Turning to adults’ incremental processing of which-questions, there are three important moments in the sentence. Consider an object question disambiguated by verb agreement such as (3b). First, when the singular wh-phrase welche Schülerin “which pupil” is encountered, the agent-first interpretation is the optimal interpretation: it satisfies Agent-First, and Case and Agreement cannot be evaluated at this point, since there is no overt case marking and no verb yet. Then, when the plural verb begrüßen “greet” is encountered, the patient-first interpretation becomes the optimal interpretation: the agent-first interpretation now violates Agreement because the sentence-initial wh-phrase and the finite verb do not agree in number, and although the patient-first interpretation violates Agent-First, this interpretation is nevertheless optimal because Agreement is ranked higher than Agent-First. Finally, when the second NP, the plural die Lehrer “the teachers,” is encountered, the patient-first interpretation remains optimal: this interpretation satisfies Agreement because the finite verb agrees with the second NP. Thus, in object questions disambiguated by verb agreement, a shift is predicted from an agent-first interpretation at the sentence-initial wh-phrase to a patient-first interpretation at the finite verb.

Also in subject questions, the initial interpretation is guided by Agent-First. As the initial agent-first interpretation does not violate any further constraints, the intermediate and final interpretations are the same and no shift in interpretation is predicted.

For passive questions, the initial interpretation at the which-phrase is also determined by Agent-First, resulting in an agent-first interpretation. Next, the verb wird “is being” is encountered, indicating a passive question.¹ In passives, the patient is the subject, and therefore the verb must agree with the patient and not with the agent. Due to a violation of Agreement by the agent-first interpretation, now the patient-first interpretation becomes the optimal interpretation and remains optimal. Therefore, in passive questions a shift is predicted from an agent-first interpretation to a patient-first interpretation at the finite verb.

The constraint-based OT account predicts a shift in interpretation midsentence for object questions disambiguated by verb agreement and for passive questions, but not for subject questions disambiguated by verb agreement. Further, the constraint-based OT account predicts no intermediate shifts in interpretation for object questions disambiguated by case on the first NP. Because Case is ranked higher than Agent-First, the patient-first interpretation is the optimal interpretation already at the wh-phrase and remains optimal when encountering the next words. If children have ranked Agent-First higher than adults have, they will show initial agent-first interpretations. Then, in contrast to adults, children may not overcome their initial misinterpretation of object questions and passive questions if neither Agreement nor Case outranks Agent-First.
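
The word-by-word predictions for active questions can be traced with a small Python sketch in which a constraint counts as violated only once the cues it needs have been heard. This is a toy illustration under stated assumptions, not the authors’ model: the rankings, the three-step segmentation (wh-phrase, finite verb, second NP), the encoding of syncretic case as `None`, and all identifiers are introduced here. Passive questions are left out because modeling them would require further constraints that the text does not spell out.

```python
# Assumed rankings, highest-ranked constraint first (illustrative only).
ADULT = ["Case", "Agreement", "Agent-First"]
CHILD = ["Agent-First", "Case", "Agreement"]


def profile(interpretation, cues):
    """Violations of an interpretation given only the cues heard so far."""
    first_is_subject = interpretation == "agent-first"
    subj, obj = ("np1", "np2") if first_is_subject else ("np2", "np1")
    verb = cues.get("verb_number")
    subj_number = cues.get(subj + "_number")
    return {
        # Case is violated only by overt, already-heard case marking.
        "Case": int(cues.get(subj + "_case") == "ACC") + int(cues.get(obj + "_case") == "NOM"),
        # Agreement can be evaluated only once both verb and candidate subject are known.
        "Agreement": int(verb is not None and subj_number is not None and subj_number != verb),
        "Agent-First": int(not first_is_subject),
    }


def incremental_trace(words, ranking):
    """Return the optimal interpretation after each incoming word or phrase."""
    cues, trace = {}, []
    for new_cues in words:
        cues.update(new_cues)
        best = min(("agent-first", "patient-first"),
                   key=lambda c: [profile(c, cues)[name] for name in ranking])
        trace.append(best)
    return trace


# (3b) Welche Schülerin | begrüßen | die Lehrer?
Q3B = [{"np1_number": "SG", "np1_case": None},
       {"verb_number": "PL"},
       {"np2_number": "PL", "np2_case": None}]
# (2b) Welchen Schüler | begrüßt | der Lehrer?
Q2B = [{"np1_number": "SG", "np1_case": "ACC"},
       {"verb_number": "SG"},
       {"np2_number": "SG", "np2_case": "NOM"}]

print(incremental_trace(Q3B, ADULT))  # ['agent-first', 'patient-first', 'patient-first']
print(incremental_trace(Q2B, ADULT))  # ['patient-first', 'patient-first', 'patient-first']
print(incremental_trace(Q3B, CHILD))  # ['agent-first', 'agent-first', 'agent-first']
```

Under the assumed adult ranking, the trace for (3b) shows the predicted shift at the finite verb, whereas (2b) is patient-first from the wh-phrase onward. Under the assumed child ranking, the agent-first interpretation is never abandoned.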

As mentioned above, OT also makes specific predictions about the production of which-questions. When speakers wish to express a question about the agent (i.e., a question in which the wh-constituent is the agent), the optimal form is a subject question. This form satisfies all constraints mentioned, as the wh-constituent as well as the agent is in the first position. When speakers wish to express a question about the patient, there are two optimal forms: object questions and passive questions. Both forms violate the Agent-First constraint, but as subject questions violate higher ranked constraints, this form is suboptimal. Therefore, optionality is predicted: speakers can use two different forms to express the same meaning, namely, object questions and passive questions.

If the speaker takes into account the listener’s perspective, the speaker is expected to choose the form that is easiest to understand for the listener. For example, the object question form Welchen Schüler begrüßt der Lehrer? “Which pupil-ACC is the teacher-NOM greeting?” starts with a masculine NP carrying accusative case. As case is unambiguously specified, leading to a correct object-initial interpretation already at the wh-phrase, no shift in interpretation occurs, and hence the sentence should be relatively easy to understand for listeners. In contrast, with feminine, neuter, or plural NPs, case morphology does not unambiguously specify whether the first NP is subject or object. In that situation, a passive question is predicted to be easiest to understand, because passive questions contain more disambiguating cues, in the form of passive morphology, than object questions. We therefore predict that when case is available as an early disambiguation cue, speakers who take into account their listener will more likely produce an object question, whereas when case is not available, speakers will more likely produce a passive question. A key question is whether children as speakers are capable of taking into account the perspective of the listener.

Summarizing, the constraint-based OT account predicts the following: (a) adults initially incorrectly interpret, and subsequently revise their interpretation of, passive questions and object questions disambiguated by verb agreement, whereas no revisions are predicted for subject questions and for object questions disambiguated by case; (b) children incorrectly interpret object questions as subject questions; and (c) speakers produce subject questions when the wh-constituent is the agent, object questions when the wh-constituent is the patient and case marking is available, and passive questions when the wh-constituent is the patient and no case marking is available, as these latter forms are assumed to be easiest to understand for the listener.
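Prediction (c) can be written down as a small decision rule. The sketch below is only an illustration of the predicted pattern for a listener-oriented speaker; the function names and string labels are hypothetical, and the case rule encodes the statement above that only masculine singular NPs carry unambiguous case.

```python
def case_is_unambiguous(gender: str, number: str) -> bool:
    # Only masculine singular NPs distinguish nominative from accusative.
    return gender == "masculine" and number == "singular"

def predicted_form(wh_role: str, case_marked: bool) -> str:
    """Form predicted for a speaker who takes the listener into account."""
    if wh_role == "agent":
        return "subject question"
    if wh_role == "patient":
        return "object question" if case_marked else "passive question"
    raise ValueError(f"unknown role: {wh_role}")

# Example: a patient wh-constituent realized as a feminine singular NP.
print(predicted_form("patient", case_is_unambiguous("feminine", "singular")))
```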

CURRENT STUDY

To examine these predictions, we conducted an eye-tracking experiment and a production experiment. To avoid an effect of syntactic priming, the experiments were carried out in two sessions with at least 3 days between them. In the first session comprehension was tested, and in the second session production. We will first present the comprehension experiment and then the production experiment.

EXPERIMENT 1: COMPREHENSION

We investigate how German children and adults understand and process which-questions, and to what extent and when they make use of case and verb agreement cues in their interpretation of which-questions.

Method

Participants

Thirty-six typically developing children with no diagnosed language, hearing, or speech pathologies (as reported by the parents) between the ages of 7 and 10 were tested (22 male, 7;05–10;09, M=9;01 years old, SD=12.7 months). As a control group, 30 adults were tested (14 male, M=24 years old, SD=31.5 months). Participants were recruited at and around the University of Oldenburg. They gave written informed consent prior to the experiment. The study was approved by the Ethical Committee of the University of Oldenburg and was conducted in accordance with the Declaration of Helsinki.

Screening tests.

AUDITORY DISCRIMINATION OF CASE

In a first screening test, children’s discrimination of nominative and accusative case marking on determiners was tested in an auditory discrimination test. Stimuli were presented auditorily and consisted of pairs of question words or determiners (as in [4]) and pairs of NPs (as in [5]), which were either the same (4) or different with respect to case (5).

(4) der-der

(5) welcher Hund-welchen Hund

The participants had to press a button marked with gleich (the same) or nicht gleich (not the same) depending on whether the two words or NPs were the same or not. In total 16 pairs were presented, 8 per condition (same vs. different). The criterion for passing was 14 or more out of 16 correct; 1 of the 36 children did not pass (M=97.4%, SD=4.51; 25 children made no mistakes, 8 children made one mistake, 2 children made two mistakes, and 1 child made three mistakes).
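The reported summary statistics can be recomputed from the error distribution given above. The following sketch is a plain arithmetic check; the variable names are illustrative, and the use of the population standard deviation is an assumption (the text does not say which formula was used), chosen because it reproduces the reported value.

```python
from statistics import mean, pstdev

N_ITEMS = 16
CRITERION = 14  # minimum number correct to pass

# Number of errors -> number of children, as reported in the text.
errors_to_children = {0: 25, 1: 8, 2: 2, 3: 1}

scores = [N_ITEMS - e for e, n in errors_to_children.items() for _ in range(n)]
percent = [100 * s / N_ITEMS for s in scores]

mean_pct = mean(percent)
sd_pct = pstdev(percent)  # assumption: population SD
n_failed = sum(s < CRITERION for s in scores)

print(len(scores), round(mean_pct, 1), round(sd_pct, 2), n_failed)
```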

VERB AGREEMENT

To ensure that children understood verb agreement in declarative sentences in which word order does not play a role, a second screening test involving a picture-selection task was carried out. A pair of pictures was presented on the screen while a prerecorded sentence was presented auditorily. The children were asked to select the picture that best matched the sentence (see [6] and Figure 1).

Figure 1 Example of a picture pair, one matching the single-subject interpretation (left), and the other matching the plural-subject interpretation (right) of sentence (6) Sie malt/malen die Prinzessin “She/they paint(s) the princess.”

(6) Sie malt/malen die Prinzessin.

pronoun_SG/PL paint_SG/paint_PL the princess

“She/They paint(s) the princess.”

The German pronoun sie is ambiguous and can refer to a singular feminine referent (“she”) or a plural referent (“they”). In these sentences, therefore, the number of the subject referent is exclusively determined by the number marking on the finite verb. Each picture pair consisted of one picture corresponding to the singular interpretation of the subject and another corresponding to the plural interpretation of the subject (see Figure 1). The position of the target picture (left or right) and of the agent referent on the pictures was balanced over four lists. We used a total of 16 items, 8 per condition (singular vs. plural), with four reversible transitive verbs (filmen “to film,” fangen “to catch,” malen “to paint,” and waschen “to wash”). The third-person singular forms of the verbs filmen and malen are formed by stem+t, and those of the verbs fangen and waschen by a vowel change in the stem+t. The latter may be more salient and therefore better distinguishable from the plural form. Performance on both types of verbs was at ceiling level (no vowel-change: M=96.7%, SD=1.79; vowel-change: M=96.2%, SD=1.91). Only 1 of the 36 children did not pass this screening test on a criterion of scoring at least 14 out of 16 items correct (19 children made no mistakes, 12 children made one mistake, 4 children made two mistakes, and 1 child made three mistakes).

One child failed the auditory discrimination of case screening test and another child failed the verb agreement screening test. These children were excluded from further analysis. For the remaining 34 children (21 male, 7;05–10;09, M=9;01 years old, SD=12.7 months), we can be sure that they perceive the differences in case morphology on determiners and wh-words and are sensitive to the number information provided by verbal inflection.

DIGIT SPAN TEST

To examine the role of processing capacity in the comprehension of wh-questions, children’s working memory was tested with a digit span test (HAWIK-IV; Petermann & Petermann, 2007) in two conditions: forward and backward. The child was asked to repeat a sequence of digits from 1 to 9, which was read out loud by the experimenter, in the given order (forward) or in the reversed order (backward). The forward session started with a sequence of three digits, the backward session with a sequence of two. For each sequence length, there were two trials, after which the number of digits in a sequence increased by one. The test ended when both trials of the same length were recalled incorrectly. For the analyses, we used the backward digit span (number of digits of the longest sequence correctly recalled in reversed order), because besides temporary storage (remembering the digits) it also requires manipulation of information (reordering the digits) and hence is considered a more complete measure of working memory (Baddeley, 2003).
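The stopping rule and scoring just described can be sketched as follows. This is a hedged illustration rather than the test's official scoring procedure: the function names, the `recall_ok` callback (a stand-in for the child's response), and the `max_len` cap are hypothetical, since the text does not state a maximum sequence length.

```python
from typing import Callable, List

def expected_response(presented: List[int], backward: bool) -> List[int]:
    """Correct answer: same order (forward) or reversed order (backward)."""
    return list(reversed(presented)) if backward else list(presented)

def digit_span(recall_ok: Callable[[int, int], bool], start_len: int, max_len: int = 9) -> int:
    """Longest sequence length with at least one correct trial.

    recall_ok(length, trial) returns True if the sequence on that trial was
    repeated correctly. max_len is an assumed upper bound for the sketch.
    """
    longest = 0
    for length in range(start_len, max_len + 1):
        results = [recall_ok(length, trial) for trial in (1, 2)]
        if not any(results):
            break  # both trials of this length failed: the test ends
        longest = length
    return longest

# Simulated child who manages up to four digits backward (start length 2).
print(digit_span(lambda length, trial: length <= 4, start_len=2))
```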

Comprehension of which-questions.

STIMULI

A picture selection task with eye tracking was used to test the comprehension of three different types of which-questions: subject which-questions, object which-questions, and passive which-questions (see [7]–[15] in Table 1). The subject and object questions were disambiguated by only case, only agreement, or both, resulting in six conditions in total. The differences between these conditions were realized by the gender and number of the nouns. Determiners of German singular masculine nouns differ between nominative (der) and accusative case (den), while no such distinction is present in determiners for feminine or plural nouns (for both cases die). In the first condition, Case, masculine noun pairs provided the case disambiguation cue on both the initial wh-phrase and the second NP. Both nouns were singular, so verb agreement was not available as a cue (see [7] and [10]). In the second condition, Agr, feminine noun pairs were used, so case was not available as a cue. The first noun was singular and the second noun was plural to provide the subject–verb agreement disambiguation cue (see [8] and [11]). To examine whether a case disambiguation cue in addition to an agreement disambiguation cue helps the listener to revise a first interpretation, a third condition was tested. In this condition, AgrCa, questions were disambiguated by subject–verb agreement and case on the second NP. Of these noun pairs, the first noun was masculine plural (thus ambiguously case marked) and the second noun was masculine singular, thus providing the subject–verb agreement disambiguation cue and a case marking cue on the second NP (see [9] and [12]). With respect to the timing of the disambiguation cues, the Case condition has an early disambiguation cue on the first NP, whereas in the other two conditions, Agr and AgrCa, disambiguation takes place later in the sentence (see [10] vs. [11] and [12]).

Table 1 Example of test sentences

For passive questions the same noun pairs were used as for active questions. In Pas(a) the first and the second NP are both masculine singular (see [13]). In Pas(b) the first NP is feminine singular and the second NP is feminine plural (see [14]). In Pas(c) the first NP is masculine plural and the second NP is masculine singular (see [15]). Nevertheless, for passive sentences these different nouns do not lead to a distinction with respect to type of disambiguation cue, as in active sentences. The passive questions were always disambiguated by passive morphology instead.

There were four lists that differed in the order of the items and in the position of the target picture (left or right). In total 54 test items were presented: 6 for every condition in Table 1, leading to 18 items per question type. For each trial two pictures were presented side by side. The pictures depicted the correct interpretation or the incorrect interpretation resulting from a role reversal. For example, the left picture in Figure 2 represents the correct patient-first interpretation of sentence (12). In the right picture the thematic roles are reversed, representing the incorrect agent-first interpretation.
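The item counts follow directly from the design. As a quick check, the sketch below enumerates the nine conditions and six items per condition; the labels follow the condition names used in the text, while the variable names are illustrative.

```python
from collections import Counter
from itertools import product

ACTIVE_TYPES = ("subject", "object")
CUES = ("Case", "Agr", "AgrCa")
PASSIVE_VARIANTS = ("Pas(a)", "Pas(b)", "Pas(c)")
ITEMS_PER_CONDITION = 6

# Six active conditions (question type x cue) plus three passive variants.
conditions = list(product(ACTIVE_TYPES, CUES)) + [("passive", p) for p in PASSIVE_VARIANTS]
items = [(q, c, i) for (q, c) in conditions for i in range(1, ITEMS_PER_CONDITION + 1)]
per_type = Counter(q for q, _, _ in items)

print(len(conditions), len(items), dict(per_type))
```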

Figure 2 Example of a picture pair, with one picture matching the patient-first interpretation (left) and the other picture matching the agent-first interpretation (right) of sentence (12): Welche Füchse wäscht der Schwan “Which foxes is the swan washing?” Depending on the nouns used in the test sentences, the number of animals in the picture differs between two (one of each kind, in the Case condition) and three (one of one kind and two of the other kind in the Agr and AgrCa conditions; see this example).

Procedure

In the familiarization phase, the participants were presented with a picture pair for 2500 ms to get used to the pictures. Next, a fixation cross appeared on the screen. After fixating the cross for 500 ms, the picture pair reappeared on the screen, and 50 ms later the prerecorded sentence was presented auditorily, after which the participants had to press the button corresponding to the picture they thought best fitted the sentence (see Appendix A for task instructions). There was no response time limit. The test items were divided into two blocks of 27 items each, both preceded by a 9-point calibration in Tobii and by two practice items (e.g., “Which bird is building a nest?”). Furthermore, in total 7 filler items with one animate noun (e.g., “Which kangaroo is shooting the ball?”) were included. Between the blocks, the verb agreement screening test described above was carried out. The digit span task and the case screening test were carried out in the second session, before and after the first block of the production task, respectively. Both sessions took around 30–45 min.

The participants sat in front of a 23-inch Tobii TX300 eye tracker with a resolution of 1920 × 1080 pixels and a screen response time of 5 ms. The eye tracker was connected to two computers. One computer ran the experiment with the software E-Prime 2.0 (Psychological Software Tools, Inc.) and collected the behavioral data. With the use of TET-calls in E-Prime the participants’ eye movements at a sample rate of 300 Hz were collected from the second computer.

Analysis

Accuracy data.

GENERALIZED LINEAR MIXED-EFFECTS REGRESSION MODELING (GLMER)

We used GLMER with the software R (version 3.1.2) to analyze the accuracy data. As a model building strategy, we chose parsimonious mixed models (Bates, Kliegl, Vasishth, & Baayen, 2015), as these models are more suitable for the typical sample sizes of psycholinguistic research (Matuschek, Kliegl, Vasishth, Baayen, & Bates, 2017). Our accuracy models include a binomial dependent variable with a logit link function of Item accuracy and random intercepts for Participant and Item. The necessity of taking into account random slopes was assessed. The inclusion of factors was assessed by comparing the Akaike information criterion scores (Akaike, 1974). A decrease of at least 2 in the Akaike information criterion scores means that the inclusion of a factor significantly improves the goodness of fit of the model. Of the fixed factors, the first level is taken as the reference and each other level is contrasted with this baseline level. The order of the levels and the coding for group is children (baseline level) coded as –1 and adults as 1; for type of question the order is subject (baseline, –1), object (0), and passive (1); for position of target, left (baseline, –1) and right (1); and for type of cue, AgrCa (baseline level, –1), Agr (0), and Case (1). To compare the second with the third level and so on, multiple comparisons were made with the use of the glht function of the “multcomp” package (Hothorn, Bretz, & Westfall, 2008), which corrects for multiple comparisons and gives adjusted p values.
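The model-building criterion can be made concrete with a short sketch. It uses the standard definition AIC = 2k − 2 ln L and the threshold of 2 stated above; the function names and the numeric values in the example are hypothetical and are not taken from the study's models.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

def factor_improves_fit(aic_without: float, aic_with: float, min_decrease: float = 2.0) -> bool:
    """True if adding the factor lowers AIC by at least min_decrease."""
    return (aic_without - aic_with) >= min_decrease

# Hypothetical log-likelihoods for a model without and with an extra factor.
base = aic(log_likelihood=-100.0, n_params=3)
extended = aic(log_likelihood=-97.5, n_params=4)
print(base, extended, factor_improves_fit(base, extended))
```

Note that the added parameter itself costs 2 AIC points, so in this example the log-likelihood has to rise by at least 2 for the factor to be retained.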

Gaze data.

PREPROCESSING OF THE GAZE DATA

The eye tracker rates the validity of each gaze sample; a value of 0 or 1 means that the system is certain that all relevant data for both eyes, or highly probable estimations for one eye, were recorded. Only data points with these validity values were included. No participants or trials had to be removed due to insufficient (<75%) valid data points. No selection was made based on offline accuracy of the trials. Instead, gaze data from both correct and incorrect trials were included to present a more complete picture of how cues are processed in general. Gaze data were limited to 3000 ms after the onset of the stimulus to cover the complete range of time from onset until the average response time. Areas of interest (AOIs) were defined over target interpretation (target picture), competitor interpretation (competitor picture), and not on AOI. For the statistical analysis, the sum of looks to a specific AOI was calculated per participant per trial and per time bin of 200 ms from the raw data file. For the gaze plots, time bins of 50 ms were used for a more detailed picture.
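For one trial, the binning step can be sketched as below. This is a minimal illustration under stated assumptions: the sample layout (time, AOI label, validity code), the AOI labels, and the function name are hypothetical, and treating the 3000 ms limit as exclusive is a choice made for the sketch. At a 300 Hz sampling rate, each 200 ms bin holds at most 60 samples.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (time in ms from stimulus onset, AOI label, validity code): assumed layout
Sample = Tuple[float, str, int]

def looks_per_bin(samples: Iterable[Sample], bin_ms: int = 200, max_ms: int = 3000,
                  valid_codes: Tuple[int, ...] = (0, 1)) -> Dict[int, Dict[str, int]]:
    """Count valid gaze samples per AOI within each time bin of one trial."""
    counts: Dict[int, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for t, aoi, validity in samples:
        if validity not in valid_codes:
            continue  # drop samples the tracker did not rate as valid
        if not (0 <= t < max_ms):
            continue  # keep only the first 3000 ms after stimulus onset
        counts[int(t // bin_ms)][aoi] += 1
    return {b: dict(c) for b, c in counts.items()}

trial = [(0, "target", 0), (100, "competitor", 1), (150, "target", 4),
         (250, "target", 0), (3100, "target", 0)]
print(looks_per_bin(trial))
```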

GENERALIZED ADDITIVE MIXED MODELING (GAMM)

The gaze data were analyzed in R with GAMM (Wood, 2006, 2011) using the package mgcv 1.8.4 (Wood, 2006) and the package itsadug (van Rij, Baayen, Wieling, & van Rijn, 2015). GAMM is a nonlinear regression analysis and therefore particularly useful for time course data such as eye tracking (Nixon, van Rij, Mok, Baayen, & Chen, 2016; van Rij, Hollebrandse, & Hendriks, 2016). Like generalized linear mixed-effects regression modeling, GAMM allows for inclusion of both fixed and random factors. The crucial difference is that GAMMs can model nonlinear relations: the relations between the factors and the dependent variable are modeled as smooth functions.² Smooth functions and parameters are determined by estimation procedures in order to avert overfitting and overgeneralization of the data (van Rij et al., 2016; Wood, 2006). For the model predictions we used difference plots from the itsadug package (van Rij et al., 2015). For example, the function get_differences and difference plots were used to calculate differences between children’s and adults’ looking behavior for subject, object, and passive questions.

Results

We will present both offline accuracy scores and online gaze data. The offline accuracy scores inform us about the final interpretation the participants give to the which-questions. The online gaze data inform us about the processing during sentence presentation, namely, about the interpretations given to which-questions at different moments in time.

Accuracy

Figure 3 shows the percentage of correct interpretations of which-questions for children (left) and adults (right). A GLMER model was made to compare the groups (children and adults). One by one, the following fixed factors were included to see whether they improved the goodness of fit of the model: group (adults vs. children), type of question (subject vs. object vs. passive), and type of cue (Case vs. Agr vs. AgrCa). The inclusion of type of cue (valid factor for subject and object questions only) did not improve the model. In addition, no interactions for this variable with group or type of question were found. We examined the possible effects of the material-related variables, such as verb, pair of nouns, session, direction of action, and position of target. Of these variables, only position of target (left vs. right) significantly improved the model. As position of target was balanced over type of question and type of cue, and changed for each item over the different lists, no interactions were found.

Figure 3 Percentages of correct interpretations of subject questions, object questions, and passive questions with their different cues. Case means disambiguated by case on the wh-phrase and the second NP, Agr means disambiguated by verbal agreement, AgrCa means disambiguated by verbal agreement and by case on the second NP, and Pas means passive construction. Error bars indicate standard error.

Table 2 shows the final model for the overall analyses. With this model we can further investigate the effects of type of question and its interaction with group, which contain more than two levels and therefore require multiple comparisons. The only factor with two levels is the position of target. As shown in Table 2, items with the target picture on the right are interpreted better than those where it is on the left.

Table 2 Fixed effects of best fitting generalized mixed effects model to fit the accuracy scores of the which-questions

A multiple comparison reveals that there is a significant difference in accuracy between object questions and subject questions and between object questions and passive questions, but not between subject questions and passive questions as can be seen in Table 3.

Table 3 Multiple comparisons of means for accuracy of the interpretation of the three different types of which-questions (Tukey contrasts)

The multiple comparison in Table 4 shows that the difference between the groups only holds for object questions and not for subject questions or passive questions: children score significantly worse than adults on object questions (β=1.45, z=5.978, p<.05). Specifically, only for children there is a significant difference between subject and object questions (β=–2.39, z=–5.951, p<.001), but not for adults (β=–0.93, z=–1.928, p=.359). In contrast, the difference between object questions and passive questions is significant for both children (β=2.35, z=5.978, p<.001) and adults (β=1.90, z=2.881, p<.05).

Table 4 Multiple comparisons of means for accuracy of the interaction between three different types of which-questions and group (Tukey contrasts)

A closer examination of children’s accuracy scores for object questions revealed that most children (23 out of 34) made only one or no errors (out of 18 object question items). Four children scored at chance level or below when the object question was disambiguated by verb agreement only, but made only one or no errors when case or both case and agreement cues were available. Another 4 children scored at chance level or below for all types of cues. Three other children made two to six errors spread over all cue conditions.

To further unravel children’s accuracy scores for object questions, we investigated the influence of two more factors: digit span backward and age. There was no correlation between digit span and age, r(32)=.06, p=.73. Raw backward digit span was used to make three groups: low (digit span of 3, n=11, 7-year-olds n=1, 8-year-olds n=4, 9-year-olds n=3, and 10-year-olds n=3), medium (digit span of 4, n=13, 7-year-olds n=5, 8-year-olds n=1, 9-year-olds n=4, and 10-year-olds n=3), and high (digit span of 5–6, n=10, 7-year-olds n=1, 8-year-olds n=4, 9-year-olds n=2, and 10-year-olds n=3). Groups, rather than the raw scores, were used so that the results would not depend strongly on extreme values (in these data only one child had a digit span score of 6). To see whether there were differences between ages, age was divided into four groups: 7-year-olds (n=6), 8-year-olds (n=9), 9-year-olds (n=10), and 10-year-olds (n=9). Figure 4 shows the mean accuracy scores of object questions by children per digit span group (left) and per age group (right).

Figure 4 Children’s mean accuracy scores (in percentages) on object questions per digit span group (left) and per age group (right).

A new model is made with children’s accuracy scores on object questions as a dependent variable. Because there was no correlation between children’s age and their digit span scores, both digit span and age were included as fixed factors in the model. Only item was included as a random factor and not participant, because each participant had a single score of digit span and of age.

Table 5 shows the final model for the analysis of children’s scores on object questions. Like the other models, this model contains variables with more than two levels. Therefore, multiple comparisons are made for the factors digit span (see Table 6) and age (see Table 7).

Table 5 Fixed effects of best fitting generalized mixed effects model to fit the accuracy scores of children’s object questions

Table 6 Multiple comparisons of means for children’s accuracy scores on object questions in the three different digit span groups (Tukey contrasts)

Table 7 Multiple comparisons of means for children’s accuracy scores on object questions in the four different age groups (Tukey contrasts)

The multiple comparisons in Table 6 confirm significant differences between the low digit span group and the two other groups. Children with a low digit span made more errors on the comprehension of object questions than children with a medium digit span (β=0.87, z=2.941, p<.01) and children with a high digit span (β=1.87, z=–4.359, p<.001). Between the group of children with a medium and a high digit span no significant differences were found (β=1.00, z=–2.228, p=.0643).

The multiple comparisons in Table 7 confirm that 7-year-old children made significantly more errors on the comprehension of object questions than 8-year-old children (β=1.43, z=3.476, p<.01) and 9-year-old children (β=1.29, z=3.585, p<.01). The difference between the 7-year-old and 10-year-old children was not significant (β=0.75, z=2.177, p=.1287). Also between the 8-, 9-, and 10-year-olds no significant differences were found.

Summarizing, the offline data show that children made significantly more errors than adults in their comprehension of object questions, but not of subject or passive questions. Children’s comprehension of object questions was affected by digit span (children with a low digit span misinterpreted object questions significantly more often than children with a medium or high digit span) and age (7-year-olds misinterpreted object questions significantly more often than 8- and 9-year-olds). No differences were found with respect to the different disambiguation cues.

Gaze data

Sentence interpretation is an incremental process, which means that interpretation need not wait until the end of the sentence but can already take place while words are encountered one by one. Crucially, the optimal interpretation can change over time. This is exactly what we will see in the gaze patterns for object and passive questions.

The gaze plots in Figure 5 show that for subject questions, children and adults look increasingly toward the target picture. For object questions, we first see an increase of looks toward the competitor picture, followed by an increase of looks toward the target picture. The increase of looks toward the target picture seems to be earlier for adults than for children. A similar pattern appears for passive questions.

Figure 5 Children’s (dashed line) and adults’ (solid line) online gaze behavior for subject, object, and passive questions. The plots show separate lines for looks toward the target picture (red lines) and competitor picture (blue lines), for children (dashed lines) and adults (solid lines). The vertical lines indicate the mean onset of the verb, the mean onset of the second NP, and the mean offset of the sentence. The horizontal gray lines indicate a significant difference between children’s and adults’ gaze patterns analyzed with the statistical model described in the GAMM section.

A GAMM model is made to investigate differences between the two groups. In a later analysis, we will look at differences with respect to type of cue.

For our overall model we used TCDiff (the sum of looks toward the target minus the sum of looks toward the competitor picture) for time bins of 200 ms as the dependent variable. All interactions between group (adults vs. children) and type of question (subject vs. object vs. passive) were combined into one predictor to see whether there were differences between the groups with respect to the different types of questions. As random effect factors increase the time needed to run a model (which was already 12 hr), item was not included as a random effect factor. Instead, participant and type of question were combined into one random effect factor (ParticipantQuestion) and added to the model. A summary of the model is given in Appendix B (Table B.1). As this summary merely indicates whether the smooth of each variable is linear or not, further calculations are made in the following paragraphs.
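The dependent variable can be illustrated with a small sketch that computes TCDiff per time bin from counts of looks. The input layout (bin index mapped to counts per AOI), the AOI labels, and the function name are assumptions made for illustration; the counts in the example are invented.

```python
from typing import Dict

def tcdiff(bin_counts: Dict[int, Dict[str, int]]) -> Dict[int, int]:
    """Looks to target minus looks to competitor, per time bin.

    Looks that fall on neither picture do not enter the difference.
    """
    return {
        b: counts.get("target", 0) - counts.get("competitor", 0)
        for b, counts in sorted(bin_counts.items())
    }

# Invented counts for two 200 ms bins of one trial.
example = {
    0: {"target": 10, "competitor": 50},
    1: {"target": 40, "competitor": 20, "none": 5},
}
print(tcdiff(example))
```

A negative value indicates more looks toward the competitor picture in that bin, which is the pattern expected while an agent-first interpretation of an object or passive question is still being entertained.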

The difference plots (see Figure B.1 in Appendix B) reveal differences between adults’ and children’s gaze patterns for object and passive questions, but not for subject questions. Children’s looks toward the correct picture increase later than adults’ for object and passive questions. The differences between children and adults lasted longer for object questions than for passive questions. This indicates that children needed more time than adults to revise the incorrect interpretation, and even more so in object questions than in passive questions.

To see whether different disambiguation cues lead to different gaze patterns for children and adults, we ran a second analysis. We visualized the gaze patterns for the object questions per type of cue for children and for adults (see Figure 6). For both children and adults, we clearly see a preference for the incorrect initial interpretation (more looks toward the competitor picture than toward the target picture) for the AgrCa and Agr conditions, but not for the Case condition.

Figure 6 Children’s (left plot) and adults’ (right plot) online gaze behavior for object questions. The plots show separate lines for looks toward the target picture (red lines) and competitor picture (blue lines) per type of cue: AgrCa (dotted lines), Agr (dashed lines), and Case (solid lines). The vertical lines indicate the onset of the verb, the onset of the second NP, and the offset of the sentence. The gray horizontal lines indicate a significant difference between the types of cues.

To analyze whether these observed differences between the cues are significant, we made a second GAMM model. Now we included solely the data of the object questions. The input was again TCDiff for timebins of 200 ms. All interactions between group (adults vs. children) and type of cue (Case vs. Agr vs. AgrCa) were combined into one predictor. Participant was used as a random effect factor. A summary of the model is given in Appendix C (Table C.1).

Again, difference plots were made to see whether the observed differences were significant (see Figures C.1 and C.2 in Appendix C). For children, there were significant differences in looks between object questions disambiguated by Case and the other two conditions (AgrCa and Agr). This is shown by the increasing proportion of looks toward the target picture for Case, whereas for AgrCa and Agr children initially showed an increasing proportion of looks toward the competitor picture, followed by an increasing proportion of looks toward the target picture. The same pattern and differences were found for adults. We take this to be an indication that case, in contrast to agreement, is used early in processing of which-questions. For children, an additional difference was found between the AgrCa and Agr conditions: the proportion of looks toward the competitor picture for object questions in the AgrCa condition was lower and dropped earlier than in the Agr condition. Thus, children, but not adults, seem to benefit from the extra case cue on the second NP.

Summarizing, the online gaze data show that both children’s and adults’ interpretation changes from an agent-initial interpretation to a patient-initial interpretation during the processing of object questions and passive questions. Children were slower in revising their initial interpretation than adults. Furthermore, whereas object questions disambiguated by verb agreement, or by verb agreement and case on the second NP, were initially interpreted as subject questions, object questions disambiguated by case on the first NP were not. These differences with respect to disambiguation cue may have implications for production when ease of comprehension is taken into account.