
Listening like a native: Unprofitable procedures need to be discarded

Published online by Cambridge University Press:  29 May 2023

Laurence Bruggeman*
Affiliation:
The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Penrith, Australia; ARC Centre of Excellence for the Dynamics of Language, Western Sydney University, Penrith, Australia
Anne Cutler
Affiliation:
The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Penrith, Australia; ARC Centre of Excellence for the Dynamics of Language, Western Sydney University, Penrith, Australia; Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands
*Corresponding author: Laurence Bruggeman; Email: [email protected]

Abstract

Two languages, historically related, both have lexical stress, with word stress distinctions signalled in each by the same suprasegmental cues. In each language, words can overlap segmentally but differ in placement of primary versus secondary stress (OCtopus, ocTOber). However, secondary stress occurs more often in the words of one language, Dutch, than in the other, English, and largely because of this, Dutch listeners find it helpful to use suprasegmental stress cues when recognising spoken words. English listeners, in contrast, do not; indeed, Dutch listeners can outdo English listeners in correctly identifying the source words of English word fragments (oc-). Here we show that Dutch-native listeners who reside in an English-speaking environment and have become dominant in English, though still maintaining their use of these stress cues in their L1, ignore the same cues in their L2 English, performing as poorly in the fragment identification task as the L1 English do.

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

Introduction

The efficiency of listening to speech is based on our ability to adjust the processing mechanisms involved to ensure that they function optimally in the language under use. Different languages deploy different acoustic cues to distinguish between phonemes and hence between spoken words, and listeners learn to process speech in the most efficient manner; together these situations produce language-specific listening, with native users of each language listening in a way that is tailored to the particular properties of the language they have been exposed to (L1; Cutler, 2012).

As a result, the cues used during speech processing can differ from one listener group to another. This can even hold true with two languages that are historically closely related and in which many structural features are highly similar, such as Dutch and English. These two Germanic languages have broadly comparable syntactic and phonological systems. For instance, both languages use lexical stress, and as a result, the syllables of the words of each language differ in the way they are realised suprasegmentally (i.e., in the syllable's duration, and the intensity and the fundamental frequency producing its vocalic portion). The placement of primary stress in English and Dutch is not rule-governed. Stress may fall at any word position (PRImary, poSItion, fundaMENtal, withIN; upper case letters indicate primary stress); also, in every word there is one and only one syllable that may bear primary stress. In both languages, the location of stress within a word may shift under influence of sentence rhythm (e.g., thirTEEN becomes THIRteen when it is followed by MEN; Gussenhoven, 1983; Kager & Visch, 1985; Liberman & Prince, 1977). Nonetheless, research in recent years has shown Dutch and English to differ quite reliably in the way their listeners handle the various kinds of phonetic cue available for identifying spoken words, with the markers of lexical stress playing the leading role (e.g., Cooper et al., 2002; Tremblay et al., 2021).

In English, lexical stress is cued suprasegmentally by duration, intensity and pitch. The most important cue to lexical stress, however, occurs at the segmental level and is provided by the quality of the vowel (e.g., Chrabaszcz et al., 2014; Lin et al., 2014; Zhang & Francis, 2010): stressed syllables always contain a full vowel, but vowels in unstressed syllables are frequently reduced towards schwa (Fourakis, 1991), so that minimal pairs such as PREsent (noun) and preSENT (verb) are segmentally and suprasegmentally distinct. In Dutch, reduction of vowels in unstressed syllables occurs much less frequently than in English (Sluijter & van Heuven, 1996), leaving duration, intensity and pitch as the most important acoustic correlates of lexical stress (van Heuven & de Jonge, 2011). In contrast to other studies that have compared listeners’ weighting of all acoustic cues to stress (i.e., both segmental and suprasegmental cues), in the present study we focus specifically on listeners’ use of suprasegmental cues to lexical stress only.

In principle, stress in both English and Dutch can be contrastive and serve to distinguish between segmentally identical word pairs such as INsight and inCITE – although in fact such minimal pairs are rare in all stress languages (Cutler & Jesse, 2021), with neither English nor Dutch defying this rule. What is particularly useful about such minimal stress pairs, of course, is how well they show the availability of the suprasegmental cues for listeners. Figure 1 shows the English pair PERvert (noun)/perVERT (verb), and the Dutch pair VOORnaam (noun: “first name”)/voorNAAM (adjective: “respectable”). In duration, amplitude and pitch, each primary-stressed syllable clearly outdoes its segmentally matched but suprasegmentally mismatched companion.

Figure 1. Waveforms and spectrograms for the English minimal stress pair PERvert – perVERT (left) and the Dutch minimal stress pair VOORnaam – voorNAAM (right). Blue lines represent pitch contours.

Even without many such minimal pairs, the simple fact that stress patterns vary from word to word should make suprasegmental stress cues useful for listeners engaged in spoken-word recognition. Word pairs with segmentally identical first syllables, such as PRImary versus priMEval, or OCtopus versus ocTOber, could surely be distinguished more rapidly if a listener takes the stress cues into consideration as well as the segmental differences later in the word.

Indeed, there is evidence that Dutch listeners use these cues very efficiently. An early demonstration of this (van Heuven, 1988) used a gating task and sentences in which both words from a pair with versus without initial primary stress were equally plausible (e.g., ORgel, “organ”, versus orKEST, “orchestra”). Listeners heard these sentences truncated so that only a short fragment of the final word was audible, and had to guess which word it was; 76% of their guesses from just the initial vowel were correct, and this could only have been due to use of the suprasegmental differences. Other Dutch listeners in a similar study using minimal stress pairs also achieved high correct identifications of the source word (this time in 86% of cases; Cutler & van Donselaar, 2001).

Although both of these results are from ‘offline’ tasks (with decision responses collected after speech processing has concluded), they certainly indicate that Dutch listeners exploit not only segmental but also suprasegmental information. Investigations using ‘online’ tasks measuring processing speed confirmed these findings. In a priming task with minimal stress pairs, Dutch listeners were quicker to accept words primed with their initial syllable only if the prime had the correct suprasegmental cues (Cutler & van Donselaar, 2001), and quicker to accept a visually presented word when it was primed by a spoken bisyllabic fragment of the same word as long as, again, the suprasegmental cues were correct (van Donselaar et al., 2005). Likewise, incorrectly applied stress patterns proved to affect word recognition in Dutch, in that mis-stressing impeded word recognition (Koster & Cutler, 1997; van Leyden & van Heuven, 1996). Clearly, suprasegmental stress cues aid listeners of Dutch to quickly distinguish between differently stressed Dutch words.

Figure 1 suggests that the strength of suprasegmental cues to lexical stress in spoken English words is no less than that of Dutch words. It is thus on the face of it surprising that the Dutch results above have no match in English. Mis-stressing does not prevent English word recognition in noise as long as vowels are intact (Slowiaczek, 1990), and it fails to affect the speed with which English words are recognised (Small et al., 1988), the acceptability of spoken words in sentences (Slowiaczek, 1991) or the judged naturalness of spoken words (Fear et al., 1995). In English, minimal stress pairs even prime each other's associates (Cutler, 1986). As for the fragment priming results from Dutch, these too do not replicate in English; segmental overlap does prime matching word forms, but whether the segments are accompanied by matching suprasegmental features as well makes no difference to listeners’ responses (Cooper et al., 2002, Experiment 1a; Fear et al., 1995; Small et al., 1988). Native listeners of Dutch and English thus appear to differ in the extent to which they exploit suprasegmental stress cues during spoken-word recognition, despite the similarity between the two languages and their close relatedness. In both languages, the information is there in the signal; in one language, the information is used, in the other it is not. As proposed by Cooper et al. (2002), listeners’ use (or otherwise) of suprasegmental stress information depends on whether it is useful. That, in turn, depends on the structure of the lexicon (Cutler & Pasveer, 2006).

The vocabularies of English and Dutch differ in the distribution and the frequency of occurrence of speech fragments that are ambiguous on a segmental level yet can be disambiguated when suprasegmental stress patterns are taken into account. In English, such fragments occur relatively infrequently, since the vowel in a syllable which itself is preceded by a stressed syllable is frequently reduced, leading to a pair of segmentally differing rather than a pair of segmentally identical syllables. English listeners are therefore not confronted with segmental ambiguity at all; the first two syllables of words such as ocTOber (with a stressed and therefore full vowel in the second syllable) and OCtopus (with a reduced vowel in the second syllable) can be disambiguated on segmental differences alone. There is no additional information to be gained by taking suprasegmental stress cues into account.

The Dutch lexicon, on the other hand, contains many words of three syllables or more that have full vowels in the first two syllables, and as a result, many pairs that are temporarily ambiguous (such as okTOber and OKtopus). For Dutch listeners, the use of suprasegmental stress cues is thus efficient, indeed essential, as it provides disambiguating information that is not available on a segmental level. The vocabulary asymmetry results in native speakers of English and Dutch developing differently weighted models of segmental and suprasegmental information and, in consequence, quite different listening strategies. In both languages, the suprasegmental information is there in the signal; but whether listeners use it depends on whether it is useful in speeding the recognition of their words. The asymmetry in this case of otherwise highly similar languages simply reflects the efficiency of the speech processing system.

The question at issue in the present study is what consequences the asymmetry may have for those who fully command both languages. Previous research on lexical stress has shown that listeners’ use of acoustic cues to lexical stress in a second language (L2) is strongly influenced by their use of these cues in the L1 (e.g., Choi, 2022; Cooper et al., 2002; Dupoux et al., 2008; Kim & Tremblay, 2021; Qin et al., 2017; Tremblay et al., 2021). While some listeners of languages without lexical stress may struggle to perceive English lexical stress (e.g., Lin et al., 2014), others may be able to perceive it by exploiting acoustic cues that they rely on for other aspects of lexical access in their native language. For instance, Cantonese listeners, experienced in the use of F0 as a cue to lexical tones, and listeners of Gyongsang-Korean, a dialect with lexical pitch accents, can both successfully discriminate minimal stress pair words in the L2 English despite the lack of lexical stress in their native language (Choi et al., 2019; Kim & Tremblay, 2021). Listeners whose L1 does have lexical stress tend to transfer their cue use from the L1 to the L2, leading to non-native-like stress perception (Cooper et al., 2002; Cutler, 2009; Tremblay et al., 2021). Dutch listeners presented with segmentally identical but suprasegmentally distinct word fragments (such as oc-/OC) in their L2, English, actually outdo native listeners in their ability to correctly classify the source word (Cooper et al., 2002; Cutler, 2009). Thus, when they process L2 speech they draw upon skills induced by their L1 which are not in the possession of L1 listeners of English, whose previous English input of course has not induced any such skills. But with substantial experience in the same L2, might Dutch-native listeners learn to listen like the English do, and ignore those features which are useful in their L1 but actually are not appropriate for their L2? Of particular interest then is the kind of learning involved. With a few notable exceptions (e.g., Tremblay & Spinelli, 2014; Weber & Cutler, 2006), existing studies of phonological structure in L2 listening have tended to focus on the acquisition of L2-appropriate strategies, but the present question amounts to whether L2 listeners can learn that their perceptual performance could be improved by dropping an L1 strategy.

The appropriate population for such a question is one immersed in an L2 environment and predominantly using the L2 in daily life. Our study involves a population of native Dutch-speaking emigrants in Australia. Dutch emigrants tend to quickly adopt the language of their new environment (Clyne & Pauwels, 1997), with the result that Dutch emigrants in Australia typically use English, their L2, for everyday communication. In Experiment I, these Dutch emigrants living in Australia completed a replication of Experiment 3 from Cooper et al.'s (2002) study. If the emigrants exploit suprasegmental cues to lexical stress in English, their accuracy is predicted to be high and resemble that of the Dutch L2 listeners in the original study by Cooper and colleagues. If, on the other hand, the emigrants have stopped using suprasegmental stress cues as they are not useful for the L2, accuracy is predicted to be lower than that of Cooper et al.'s Dutch listeners and more similar to the accuracy of the English L1 listeners in that same experiment. Experiment II aimed to establish the validity of new Dutch stimulus materials that we constructed in parallel to the English stimuli from Cooper et al. (2002), and was conducted with native Dutch listeners in the Netherlands; Experiment III then used these new materials to examine the L1 identification accuracy available to the same group of Dutch emigrants who had completed Experiment I.

Experiment I – use of suprasegmental stress cues in L2 listening

Method

Participants

Twenty-four participants were recruited from the Dutch emigrant community in the wider Sydney area (aged 27–73 years, M = 48.8, SD = 14.9; 14 females). All participants were native speakers of Dutch, who grew up in the Netherlands and had migrated to Australia as adults (mean age at migration: 28.4 years, SD = 7.7, range: 18–52). Their mean length of residence in Australia was 20.5 years (SD = 15.2). Participants were highly proficient in their L2, English, as indicated by their mean score of 93.6 (SD = 5.3) on the Lexical Test for Advanced Learners of English (LexTALE; Lemhöfer & Broersma, 2012). To measure their frequency of L1 and L2 use, the question “Please indicate to what extent you use Dutch and English in the situations listed” was included as part of a background questionnaire participants completed prior to the start of the experiment. All participants reported using the L2, English, more frequently than the L1, Dutch, which was mostly restricted to use with family members. See Appendix S1 (Supplementary Materials) for the full list of situations and a tally of responses to this question. No participant reported any hearing problems. All participants provided written informed consent prior to the start of the experiment and were paid for their participation.

Materials

Stimulus materials were taken from Experiment 3 of Cooper et al. (2002) and consisted of truncated recordings of 21 pairs of English words, spoken by a male native speaker of Australian English (see Appendix S2, Supplementary Materials). Words in each pair differed in their stress pattern, so that in each case one word had primary stress on the first syllable (e.g., RObot), while primary stress for the other word fell on the second syllable (e.g., roBUST). To ensure the truncated words in each pair were segmentally the same and differed only suprasegmentally, the first syllable of all words always contained a full vowel. Mean log word frequencies in the CELEX lexical database of English (Baayen et al., 1995), as reported by Cooper et al. (2002), were 2.18 for first-syllable stress words, and 1.88 for second-syllable stress words. Each word had been recorded twice and was truncated at the end of the first syllable, resulting in a total of 84 spoken word fragments, each of which was presented twice (making 168 trials). Mean durational, F0 and amplitude measures for the syllable fragments are shown in Table 1, averaged across all fragments with the same stress type. All measures were computed over the voiced portion of a fragment only, with the exception of duration, which was measured over the entire fragment. In conformity with the study by Cooper et al., different pseudo-randomised stimulus lists were created for all participants, and fragments from the same word pair never occurred in successive trials.
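The trial structure and ordering constraint described above can be illustrated with a short sketch. It is not the authors' randomisation procedure, which the text does not specify; the function names, the shuffle-and-repair strategy with restarts, and the seed are all assumptions made for illustration.

```python
import random


def build_trials(n_pairs=21, recordings=2, presentations=2):
    """One trial per (pair, stress type, recording, presentation) combination."""
    return [
        (pair, stress, rec, rep)
        for pair in range(n_pairs)
        for stress in ("first", "second")
        for rec in range(recordings)
        for rep in range(presentations)
    ]


def pseudo_randomise(trials, rng, max_restarts=1000):
    """Order trials so that successive trials never share a word pair.

    Hypothetical strategy: shuffle, then repeatedly take the first remaining
    trial whose pair differs from the previous one; restart on a dead end.
    """
    for _ in range(max_restarts):
        remaining = trials[:]
        rng.shuffle(remaining)
        order = []
        while remaining:
            prev_pair = order[-1][0] if order else None
            idx = next(
                (i for i, t in enumerate(remaining) if t[0] != prev_pair), None
            )
            if idx is None:
                break  # only same-pair trials are left; try a new shuffle
            order.append(remaining.pop(idx))
        else:
            return order
    raise RuntimeError("no valid order found within max_restarts")


trials = build_trials()                      # 21 x 2 x 2 x 2 = 168 trials
order = pseudo_randomise(trials, random.Random(0))
print(len(order))
```

Passing a differently seeded `random.Random` per participant would give each listener a different list, in the spirit of the per-participant lists mentioned in the text.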

Table 1. Mean values on six acoustic measures of the stimuli of Experiment I. Values were averaged across all fragments from source words with first-syllable (left) or second-syllable stress (right).

Procedure

Participants were tested individually in a sound-attenuated booth. Auditory stimuli were presented over Beyerdynamic DT770 PRO headphones at a comfortable sound level, kept constant for all participants. Instructions in English were displayed on the computer screen and were repeated and clarified orally (in Dutch) by the experimenter. Participants were instructed to listen carefully to each word fragment and decide whether the fragment they heard formed the beginning of the word displayed on the left of the screen or of that on the right. The screen position (left or right) of the word that was the correct response was counterbalanced across presentations of the same word fragment. At the start of each trial, the response words were displayed on the computer screen for a preview period of 2000 ms. The truncated word fragment was then played and participants gave their response. There was no time-out period and the next trial started 500 ms after a response was received. Participants responded using the shift keys, pressing the left shift key to select the word printed on the left of the screen and the right shift key to choose the word printed on the right. Upon completion of the experiment, participants completed the English version of the LexTALE (Lemhöfer & Broersma, 2012) to assess their English proficiency.

Results and discussion

One trial had a response time of less than 100 ms and was therefore excluded from all analyses reported below. The results of the remaining trials are displayed in Figure 2a. For comparison, Figure 2b contains the mean results of both the English and Dutch listener groups tested by Cooper et al. (2002; henceforth referred to as L1 controls and L2 controls, respectively). Overall, the emigrants correctly identified the source word for 61.9% of truncated fragments. They assigned fragments more accurately to their source words when they had first-syllable stress (72.3%) than when they originated from words with second-syllable stress (51.5%). This asymmetry may be the result of the fact that listeners selected the response option with first-syllable stress more often than the other option. Indeed, in 60.4% of all trials, participants judged a word with first-syllable stress to be the source of the fragment they had heard, and this percentage is very similar to the first-syllable-stress judgments on these same materials made by the L2 (58.5%) and L1 listeners (62.9%) of Cooper et al. (2002). This bias towards words with first-syllable stress may reflect differences in word frequency (of the source words used in the present experiment, those with first-syllable stress had higher word frequencies than those with second-syllable stress), in acoustic clarity (syllables with primary stress tend to be articulated more precisely; Scarborough et al., 2009), and/or in the lexical statistics of stress patterns (first-syllable stress is the most frequently occurring stress pattern in English; Clopper, 2002; Cutler & Carter, 1987).
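The link between the two accuracy figures and the reported response bias can be made explicit with a small calculation. The sketch below assumes that the two fragment types were presented equally often (the single excluded trial is ignored); the function name is hypothetical.

```python
def first_stress_response_rate(acc_first, acc_second, prop_first_trials=0.5):
    """Proportion of trials on which the first-syllable-stress word was chosen.

    A first-stress response is correct on first-stress trials and an error
    on second-stress trials, so the overall rate combines both.
    """
    correct_on_first = prop_first_trials * acc_first
    errors_on_second = (1 - prop_first_trials) * (1 - acc_second)
    return correct_on_first + errors_on_second


# Accuracies reported for the emigrants in Experiment I
rate = first_stress_response_rate(0.723, 0.515)
print(f"{rate:.1%}")  # prints 60.4%
```

Under these assumptions the two accuracies alone reproduce the 60.4% figure, which shows that the accuracy asymmetry and the response bias are two views of the same response pattern.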

Figure 2. Mean percentage of correct responses from Experiment I (panel A) and from Cooper et al. (2002; panel B). Error bars represent standard errors.

The emigrants’ overall identification accuracy was statistically compared to that of the L1 (mean accuracy = 59.2%) and the L2 controls (mean accuracy = 72.3%) from Experiment 3 of Cooper et al.'s (2002) study by fitting a generalised linear mixed-effects model to the combined data from the study by Cooper et al. and the present experiment. This was done in R (R Core Team, 2019), using family ‘binomial’ and the logit-link function from the lme4 package (Bates et al., 2015). Listener group (emigrants, L2 controls, L1 controls) was entered into the model as a fixed categorical predictor. This predictor was coded using Helmert contrasts, such that the beta value of Group1 represents the difference between the mean of the L1 listeners on one hand, and that of both groups of L2 listeners combined on the other, whereas the beta value of Group2 represents the difference between the means of those latter two groups (see Table 2 for the contrast matrix). Random intercepts were added to the model for participants and items. Results of the model fit are displayed in Table 3, and showed significant effects of Group1 and Group2. Post-hoc analyses with Tukey-adjusted α-levels were conducted with the emmeans package (Lenth, 2019) and revealed that the emigrants’ accuracy was significantly different from that of the L2 controls (p < .001) but not from the accuracy of the L1 controls (p = .58). This suggests that the emigrants no longer use suprasegmental stress cues to the same extent as their compatriots who remained in the Netherlands.

Table 2. Helmert contrast coding for the predictor Listener group.

Table 3. Results of the generalised linear mixed-effects model on the responses of Experiment I and of Experiment 3 from Cooper et al. (2002).
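To illustrate how Helmert-style contrast coding yields the two comparisons described for Group1 and Group2, the sketch below uses one common scaling of the contrast weights. The weights, the sign conventions and the function names are assumptions and need not match Table 2. The cell means are simply the logits of the overall accuracies quoted in the text, so the resulting values are illustrative and are not the fitted coefficients of the mixed model.

```python
import math

# Assumed contrast weights (Group1, Group2); the published matrix may differ
CONTRASTS = {
    "L1 controls": (-2 / 3, 0.0),
    "L2 controls": (1 / 3, -0.5),
    "emigrants": (1 / 3, 0.5),
}


def logit(p):
    """Log-odds of a proportion."""
    return math.log(p / (1 - p))


def contrast_coefficients(cell_means, contrasts):
    """Coefficients b_k in: cell mean = intercept + sum_k weight_k * b_k.

    Because the contrast columns are centred and mutually orthogonal, each
    coefficient is a simple projection of the cell means onto its column.
    """
    n_cols = len(next(iter(contrasts.values())))
    betas = []
    for k in range(n_cols):
        num = sum(contrasts[g][k] * cell_means[g] for g in cell_means)
        den = sum(contrasts[g][k] ** 2 for g in cell_means)
        betas.append(num / den)
    return betas


# Logits of the accuracies quoted in the text (illustration only)
means = {
    "L1 controls": logit(0.592),
    "L2 controls": logit(0.723),
    "emigrants": logit(0.619),
}
group1, group2 = contrast_coefficients(means, CONTRASTS)
print(round(group1, 3), round(group2, 3))
```

With this scaling, `group1` equals the mean of the two Dutch-native groups minus the L1 controls, and `group2` equals the emigrants minus the L2 controls, which is the interpretation given in the text.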

We then compared the emigrants’ response accuracy to chance level (i.e., 50%) with a two-sided binomial test. Since the aforementioned bias towards first-syllable-stress responses prevents a meaningful interpretation of participants’ accuracy for fragments with this stress pattern, this comparison was only carried out with participants’ judgments for items with second-syllable stress (cf. Cooper et al., 2002). While the L2 controls had performed significantly better than chance, this was not the case for the emigrants, who performed neither better nor worse than chance level (z = 1.34, p = .181).
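A comparison against chance of this kind can be sketched with the normal approximation to the binomial. Whether the authors used this approximation is not stated, and the trial count below (24 participants × 84 second-syllable-stress trials, ignoring the one excluded trial) is an assumption; the function name is hypothetical.

```python
import math


def binomial_z_test(successes, n, p0=0.5):
    """Two-sided test of an observed proportion against p0 (normal approximation)."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


n_trials = 24 * 84                      # assumed number of second-stress trials
n_correct = round(0.515 * n_trials)     # 51.5% correct, as reported
z, p = binomial_z_test(n_correct, n_trials)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the result lands close to the reported z = 1.34, p = .181, which suggests the reported statistic is consistent with pooling all second-syllable-stress trials across participants.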

In sum, the results from this experiment clearly show that the Dutch emigrants do not exploit suprasegmental information to the same extent as Dutch L2 listeners living in the Netherlands, and that their use of this information is more in line with that of English L1 listeners. This indicates that after an extended period of daily L2 use, the emigrants have learned the properties of the English lexicon and adjusted the way they listen accordingly to optimise processing efficiency. This finding can be interpreted in different ways. On one hand, the emigrants may have expanded their strategy repertoire to include not only an L1-specific way of suprasegmental cue use, but also an extra, L2-specific way. Alternatively, under influence of their L2, the emigrants may have lost the L1-specific ability to exploit suprasegmental cues in favour of a new strategy that is more efficient for the L2, essentially replacing one strategy with another. Under this interpretation, the emigrants would only have the new L2-specific strategy at their disposal, even when listening to their L1.

To determine which of these two interpretations is the most likely, we decided to examine the emigrants’ use of suprasegmental stress cues in their L1, Dutch. However, in contrast to Experiment I, for which stimuli and control data were readily available from the literature, neither suitable stimuli nor pre-existing control data were available for this Dutch experiment. Previous studies of suprasegmental stress cue use in Dutch (e.g., Cutler & van Donselaar, 2001; van Donselaar et al., 2005; van Heuven, 1988) could not provide a direct comparison as they had used paradigms that differed from the present study. Therefore, our new stimulus materials were first tested with a group of Dutch L1 listeners living in the Netherlands (Experiment II), before the emigrants’ use of suprasegmental stress cues in Dutch was assessed using the same materials (Experiment III).

Experiment II – validation of the Dutch materials with L1 controls

Method

Participants

Participants were 20 native speakers of Dutch (aged 18–67 years, M = 28.1, SD = 15.6; 15 females), recruited from the participant pool of the Centre for Language Studies at Radboud University in Nijmegen, the Netherlands. None reported any hearing problems. Data from a further five participants were excluded because it was revealed after testing had been completed that they did not meet participation requirements. Participants were given the choice between a gift voucher or course credit in return for their participation. Written informed consent was obtained from each participant before the experiment.

Materials

Twenty-one pairs of bisyllabic Dutch words (see Appendix S3, Supplementary Materials) were recorded by a 29-year-old female native speaker of Dutch. First syllables in each pair were segmentally the same but suprasegmentally different, in that one word in each pair had primary stress on the first syllable (e.g., GIEter “watering can”), while primary stress for the other word fell on the second syllable (e.g., giTAAR “guitar”). Mean log word frequencies in the CELEX lexical database of Dutch (Baayen et al., 1995) were 0.79 for first-syllable stress words, and 0.60 for second-syllable stress words. Each word was recorded individually, and then truncated at the end of the first syllable, giving 42 spoken word fragments. Syllable fragments contained full vowels only. Mean durational, F0 and amplitude measures for these syllable fragments are shown in Table 4, averaged across all fragments with the same stress type. As for Experiment I, duration was calculated across the entire syllable, whereas all other measures were computed over the voiced portion of the fragment only. Each fragment was presented four times, for a total of 168 trials. As in Experiment I, each participant was presented with a different pseudo-randomised stimulus list, and fragments from the same word pair never occurred in successive trials.

Table 4. Mean values on six acoustic measures of the stimuli of Experiments II and III. Values were averaged across all fragments from source words with first-syllable (left) or second-syllable stress (right).

Procedure

The procedure was identical to that of Experiment I, with the following exceptions. Participants were tested individually in a quiet room at the Centre for Language Studies at Radboud University, using Sennheiser HD215 headphones. Written and oral instructions were in Dutch.

Results and discussion

There were no responses faster than 100 ms, so none were excluded. Overall response accuracy was 71.6%, with participants correctly selecting the source word for 71.4% (SD = 10.5) of fragments from words with first-syllable stress, and for 71.9% (SD = 9.7) of fragments from words with second-syllable stress (see Figure 3). Unlike the results of Experiment I, the response pattern was symmetric across fragment types, and there was no response bias; participants selected the first-syllable-stress response in 49.7% of all trials. As in Experiment I, we then compared participants’ judgments to chance level (i.e., 50%). The absence of response bias allowed us to do this for both types of fragments. Two-sided binomial tests showed that participants correctly identified both fragment types above chance level (first-syllable stress: z = 17.54, p < .001; second-syllable stress: z = 17.98, p < .001).
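As an illustrative consistency check on the reported statistics, the first-syllable-stress result can be worked through by hand. The calculation assumes a normal approximation to the binomial and n = 20 × 84 = 1680 trials per fragment type; neither assumption is stated explicitly in the text.

```latex
z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}
  = \frac{0.714 - 0.5}{\sqrt{0.25 / 1680}}
  \approx 17.54
```

The same calculation with an accuracy of 0.719 gives z ≈ 17.95, slightly below the reported 17.98; a gap of that size is what rounding the accuracy to one decimal place would produce.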

Figure 3. Mean percentage of correct responses from the Dutch control participants of Experiment II. Error bars represent standard errors.

The high response accuracy and above-chance performance found here are in line with previous findings regarding Dutch listeners’ use of suprasegmental cues to lexical stress and thus confirm the appropriateness of our set of Dutch stimulus materials.

Experiment III – use of suprasegmental stress cues in L1 listening

Method

Participants

Twenty of the emigrants who had previously participated in Experiment I also completed the present experiment. The remaining four emigrants were unavailable for participation. Participants were aged 27–73 years (M = 49.1, SD = 15.3; 11 females) and had resided in Australia for an average of 20.5 years (SD = 15.6). Their mean age at migration was 29.2 years (SD = 8.19, range: 18–52). Participants’ mean score on the Dutch version of the LexTALE was 91.7, indicating that they maintained high proficiency in their L1, Dutch, despite migration to an English-speaking environment. Participants provided written informed consent before the start of the experiment and were paid for their participation.

Materials and procedure

Stimulus materials and procedure were as in Experiment II, with the following exceptions. The emigrants were tested in a quiet room at our lab, their house or their workplace, using Sennheiser HD280 headphones. Written and oral instructions were in Dutch. To assess their Dutch proficiency, participants completed the Dutch version of the LexTALE (Lemhöfer & Broersma, 2012) once the experiment had finished.

Results and discussion

Data for one participant were lost due to experimenter error. Thus, the results from 19 emigrants were included in the analyses reported below. There were no responses faster than 100 ms, so none were excluded. The results are shown in Figure 4a; for comparison, the mean results of the Dutch listeners tested in Experiment II are shown in Figure 4b. Participants correctly responded in 66.7% of all trials. As in Experiment II, there was no response bias – first-syllable-stress responses were given on 51.1% of trials – and the response pattern was symmetrical: the emigrants correctly assigned 67.8% (SD = 10.4) of fragments from words with first-syllable stress, and 65.6% (SD = 12.0) of fragments from words with second-syllable stress. Participants’ judgments were once again compared to chance (i.e., 50%) with two-sided binomial tests. This comparison showed that participants performed above chance level assigning fragments from words with first-syllable stress (z = 14.24, p < .001), as well as from words with second-syllable stress (z = 12.52, p < .001) to their source words.
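
As a rough check on the size of the z statistics reported above, the comparison with chance can be sketched with a normal approximation to the binomial. This sketch assumes that trials were pooled across the 19 participants, that every trial produced a response (19 × 84 = 1,596 trials per fragment type), and that the rounded percentages from the text are used as input; the exact procedure used in the analysis is not specified here, so the function below is illustrative only.

```python
import math


def binomial_z(prop_correct, n_trials, chance=0.5):
    """Normal-approximation z for an observed proportion against chance."""
    standard_error = math.sqrt(chance * (1 - chance) / n_trials)
    return (prop_correct - chance) / standard_error


N_PARTICIPANTS = 19        # emigrants retained in Experiment III
TRIALS_PER_TYPE = 21 * 4   # 21 fragments per stress type, four presentations each
n = N_PARTICIPANTS * TRIALS_PER_TYPE  # assumes no missing responses

z_first = binomial_z(0.678, n)   # fragments from first-syllable-stress words
z_second = binomial_z(0.656, n)  # fragments from second-syllable-stress words
print(round(z_first, 2), round(z_second, 2))
```

Under these assumptions the values come out at roughly 14.2 and 12.5, in the same range as the reported statistics; small discrepancies are to be expected from rounding of the input percentages.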

Figure 4. Mean percentage of correct responses from the Dutch emigrants of Experiment III (panel A) and the Dutch control participants of Experiment II (panel B). Error bars represent standard errors.

The performance accuracy of the emigrants was compared statistically to that of the Dutch control participants of Experiment II with a generalised linear mixed-effects model, again using family ‘binomial’ and the logit-link function from the lme4 package. Listener group (with Emigrants coded as −0.5 and Controls as 0.5) was entered into the model as a deviation-coded fixed categorical predictor. Random intercepts were added to the model for participants and items. Results of the model fit are displayed in Table 5, and showed no significant effect of Listener group, suggesting that the emigrants do not differ from the control participants in the extent to which they exploit suprasegmental stress cues in Dutch.
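
The effect of deviation coding on the interpretation of the fixed effects can be shown with a small sketch. It deliberately ignores the random intercepts for participants and items, so it is not the fitted model and its numbers will not match Table 5; it only shows that, with codes of −0.5 and 0.5, the intercept is the mean of the two group logits and the Listener group coefficient is their difference. The accuracies are the overall percentages reported in the text, and the function names are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a proportion."""
    return math.log(p / (1 - p))


# Deviation codes as described in the text.
CODES = {"Emigrants": -0.5, "Controls": 0.5}
# Overall accuracies reported for Experiments III and II.
ACCURACY = {"Emigrants": 0.667, "Controls": 0.716}


def fixed_effects(accuracy, codes):
    """Solve logit(p_group) = intercept + slope * code for two groups.

    Fixed effects only: no random intercepts, so this is an illustration
    of the coding scheme rather than a mixed-effects fit.
    """
    g1, g2 = list(codes)
    slope = (logit(accuracy[g2]) - logit(accuracy[g1])) / (codes[g2] - codes[g1])
    intercept = logit(accuracy[g1]) - slope * codes[g1]
    return intercept, slope


intercept, slope = fixed_effects(ACCURACY, CODES)
```

Because the two codes sum to zero, the intercept reflects performance averaged over both groups rather than the performance of one reference group, which is what treatment (0/1) coding would give.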

Table 5. Results of the generalised linear mixed-effects model on the responses of Experiments II and III.

This experiment investigated the ability of Dutch-English bilingual emigrants to exploit suprasegmental stress cues in their L1, Dutch. While the emigrants gave slightly fewer correct responses than a control group of L1 listeners residing in the Netherlands, statistical comparisons indicated that this difference was not significant. This suggests that when it comes to the use of suprasegmental stress during L1 listening, the emigrants still apply L1-appropriate lexical procedures, despite the fact that they no longer live in an L1 environment and predominantly use the L2 in daily life.

General discussion

Efficient listening is tailored to the language of input. Our results show that listeners who competently use two languages can adjust their speech processing separately for each language. This holds even when the languages share a particular phonological feature which is realised similarly in each; if the vocabulary structures are such that attention to that feature speeds lexical recognition in one language but not in the other, listeners indeed apportion their attention differently in accord with this contrast in utility. Also notably, this processing asymmetry occurs even in cases where the feature is useful in the L1 but not in the L2; a known and well-used L1 processing operation that can easily be applied in just the same way to the L2 will nonetheless eventually be abandoned if the return that it offers on word recognition speed is low.

Dutch and English are both lexical stress languages, with stressed and unstressed syllables suprasegmentally differing from one another in the same ways in each language. It is known that the amount of lexical competition is significantly reduced by taking suprasegmental information into account in Dutch, and that listeners do avail themselves of this assistance in listening. In our second experiment here we have extended evidence for this processing behaviour to a task not previously tested with listeners to Dutch. It is further known that this facilitatory effect of competition reduction for recognition does not appear in English, and that English listeners generally ignore the suprasegmental dimensions in recognising words.

Also previously known was that Dutch listeners with English as their L2 would indeed attend to the relevant suprasegmental cues when presented with English stimuli, and in consequence would actually outperform English-native listeners in a simple fragment identification task. What was not previously known was what we have shown in Experiments I and III of our present study: that when such listeners were no longer living in their L1 environment, but instead were exposed to the L2 on a daily basis, this greater experience would lead them to abandon the suprasegmental processing for English (despite maintaining it in their L1 Dutch). Thus, sufficient experience in L2 listening can, at least in this dimension, cause a listener to listen like a native.

The listener's ability to adapt to the conditions under which listening occurs is well documented; it is quite often our lot to have to compensate for noisy listening environments, or for talkers with unfamiliar speech patterns, and when the language involved is our L2 rather than our L1, the task is known to become even harder (Garcia Lecumberri et al., 2010). Notwithstanding this variation in the resulting difficulty, the flexibility of listening is expressed in L2 as clearly as in L1, and our present results confirm that this can result in language-by-language adjustment differences.

Note that the emigrant community we have studied here is already known to consider the language as a factor in fine-tuning phoneme categorisation decisions separately for individual talkers. This fine-tuning, vital to successful speech processing, can be elicited in the laboratory using a two-part procedure in which listeners first hear a slightly unusual phoneme presented within a word that supports a clear phonemic interpretation (e.g., an unusual [f] at the end of autogra-); this is then followed by a phoneme categorisation test, with materials spoken by the same talker. The latter test reveals that (in this case) the listeners’ phoneme category for [f] as uttered by that talker has expanded to include the unusual sound that was heard. The emigrants’ adaptation processes were found, in such a test, to be active in their (dominant) L2 English, but not in their L1 Dutch (Bruggeman & Cutler, 2020). That is, their speech perception was subject to language-specific constraints.

This L1/L2 adaptation asymmetry was ascribed to differences in the talker populations that provided the emigrants’ conversation partners. Although they reported using both English and Dutch extensively and regularly, the interlocutors involved in their English conversations were many and varied, while their Dutch interactions were mainly with family members. The proposed explanation was that such adaptation processes need not be called upon with highly familiar interlocutors, whose particular speech patterns will be long-known. Interestingly, the emigrant group was not alone in showing a differential listening pattern across interlocutor groups; just such a pattern also appeared in heritage learners with differing interlocutor groups for their languages (Cutler et al., 2019). These participants, born into Mandarin-speaking families but living in English-speaking Sydney, showed, like the emigrants, strong perceptual learning only in English (the environmental language which was also their language at work and among their friends). In Mandarin, substantially less learning was observed. And, like the emigrants, the heritage learners largely confined their use of their earliest language to interactions with family members. Note that the same Mandarin materials used in that study had produced robust perceptual learning with other participant groups, as had the Dutch materials used by the emigrant listeners. Thus, the flexibility of speech perception as displayed in perceptual learning also leads, where necessary, to outcomes that differ across a bilingual's languages.

The present finding significantly extends the scope of these earlier findings in that it involves a particular level of processing which, as the evidence shows, can be switched on or switched off. Suprasegmental realisation of syllables is noted and, together with phonemic information, is taken into account in lexical recognition decisions. We have seen evidence of this processing in our listeners’ L1, but not in their L2. In failing to apply such processing in the L2, the listeners have succeeded in listening in just the way that native L1 English listeners do.

Now it might seem sensible (and has been staunchly held to be so: van Heuven, 1988) that if you as a listener are accustomed to taking account of suprasegmental cues to stress in lexical processing in L1, and the same cues are present in your L2, then you would use them there too. They might even provide a little reduction in the overall disadvantage that is the lot of L2 listeners. But there is clear evidence from the statistics of the English lexicon that accounts for why native listeners to English avoid using these cues. The added information provides, on average, less than a one-word difference in the number of competitor words that the listener needs to consider (Cutler & Pasveer, 2006, comparing word overlap statistics with versus without taking into account syllabic stress match). In languages in which experimental evidence indicates regular use of such cues by native listeners, analyses like this have shown much larger competition effects than are seen in English. In Spanish, for instance, use of the cues leads to competition being reduced by two-thirds (Cutler et al., 2004), and in Dutch there is a 50% reduction (Cutler & Pasveer, 2006).
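
The logic of comparing competitor counts with versus without a stress match can be illustrated with a toy sketch. The four-word lexicon, the segment labels and the function below are hypothetical and vastly simplified (only first syllables are compared); this is not the procedure or the data of the lexical analyses cited above, which were computed over full vocabularies.

```python
# Hypothetical toy lexicon: word -> (first-syllable segments, first-syllable stress).
TOY_LEXICON = {
    "octopus": ("ok", "primary"),
    "octagon": ("ok", "primary"),
    "october": ("ok", "secondary"),
    "music": ("mju", "primary"),
}


def mean_competitors(lexicon, use_stress):
    """Average number of other words that match each word's first syllable.

    With use_stress=False only the segments must match; with use_stress=True
    the stress level of the first syllable must match as well.
    """
    total = 0
    for word, (segments, stress) in lexicon.items():
        for other, (other_segments, other_stress) in lexicon.items():
            if other == word or other_segments != segments:
                continue
            if use_stress and other_stress != stress:
                continue
            total += 1
    return total / len(lexicon)


segmental_only = mean_competitors(TOY_LEXICON, use_stress=False)
with_stress = mean_competitors(TOY_LEXICON, use_stress=True)
reduction = segmental_only - with_stress
```

In a vocabulary where few segmentally overlapping words differ in stress, the difference between the two averages stays small, which is the situation the text describes for English; where many such words differ in stress, the reduction is larger.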

Note that both the Dutch and English vocabularies have a strong tendency for words to bear stress on the first syllable. Comparing the two complete vocabularies, however, reveals that medial syllables are significantly more likely to bear secondary stress – i.e., have a full vowel but not be marked for primary stress – in Dutch than in English. In English, the most likely vowel option for such medial syllables is the unstressed vowel schwa. If a lexical manipulation is undertaken, wherein those syllables in Dutch words are deemed to contain schwa instead of their actual full vowel, then recalculated competition statistics for such a version of Dutch now resemble those reported by Cutler and Pasveer (2006) for English (Bruggeman & Cutler, 2016).

Thus, it is actually sensible behaviour for English L1 listeners to take no account at all of the suprasegmental cues, because their usefulness is so limited that, one can only conclude, it simply does not warrant taking on the processing load that would result from adding such a calculation into the process of word recognition. As the empirical evidence described in the introduction of this study reveals, taking no account of this suprasegmental dimension is exactly what L1 users do when processing English words. The present results are thus encouraging for L2 learners; when an L1 listening strategy fails to provide us with increased processing efficiency for the L2, it may be abandoned in favour of a strategy more suitable for the language in question.

The findings of the present study are more heartening for L2 listeners than those of several previous phonetic examinations of late-bilingual emigrants, which have suggested that native-like L2 speech perception may be hard to achieve for those who, like the Dutch emigrants of the present study, move to the L2 environment as adults. Native Italian-speaking and native Catalan-speaking emigrants in Canada still perceive L2 English vowels differently from native English listeners more than 20 years after emigration and despite using English as the dominant language (Cebrian, 2006; Flege & MacKay, 2004). Spanish–Swedish bilingual emigrants do not categorise L2 Swedish voice onset times in a native-like way even after residing in Sweden for many years (Abrahamsson & Hyltenstam, 2009). The language pairs in these studies were not as closely related as Dutch and English, so their typological distance may have played a role in listeners’ difficulty attaining native-like L2 perception. However, a subset of the same Dutch emigrants who participated in the present study were previously found to be insensitive to the transitional cues in English that native listeners use to identify /f/ and /s/ (Bruggeman, 2016; Cutler et al., 2016), suggesting that typological proximity cannot be the sole factor enabling native-like L2 perception. The level of processing involved may also play an important role: in contrast to the lexical task used in the present study, all aforementioned studies concerned the lower-level perception of segmental differences; prosody was not investigated. Conceivably, lower-level speech processing may be less malleable than the higher level of processing examined here.

Previous research on the use of lexical stress and other prosodic cues during L2 listening has mostly focused on the acquisition of new listening strategies (i.e., listeners’ ability to exploit cues that are useful for the L2 but are not used in their L1), and has found varying degrees of success for such acquisition (e.g., Choi et al., 2019; Dupoux et al., 2008; Gilbert et al., 2021; Gilbert et al., 2016; Lin et al., 2014; Tremblay, 2008; Tremblay et al., 2021). There is also some evidence suggesting that listeners are influenced by listening strategies from both their L1 and L2 when processing unfamiliar languages, without committing to a single strategy, even if strategies conflict with one another and one of them would be most appropriate for the unfamiliar language (Tremblay et al., 2017).

This does not occur for all L2 listeners under all circumstances, as evidenced by the results of Cooper et al. (2002). The listeners in that study differed from those in the present study on several dimensions: they were proficient enough in their L2 to complete listening experiments in that language, but they were not immersed in the L2 and did not use it as often, nor in as many facets of daily life, as the emigrants in the present study. As with many other aspects of L2 processing, a certain level of proficiency, usage, and/or dominance may thus be necessary for listeners to achieve a result such as we report here: being able to successfully ignore cues in the L2 though the same cues are useful in the L1. We do not know how large the L2 lexicon has to be to achieve this result, or even whether actually knowing many words is the crucial trigger. Future research should probe this further.

In the L2 version of the experiment, the emigrants gave a percentage of first-syllable-stress responses (60%) similar to that of the L1 controls from the study by Cooper et al. (2002). This buttresses our conclusion that they have acquired a sense of the lexical statistics of English, and of the frequency of occurrence of stressed and unstressed full vowels. Interestingly, in the L1 version of the experiment, the emigrants gave fewer first-syllable-stress responses (51%) than in the L2 version; indeed, their percentage was similar to that of the L1 controls. The emigrants thus appear to have retained earlier acquired knowledge about the Dutch lexicon and its statistics.

In the paradigm used here, listeners respond once their processing of the auditory stimulus has finished, leading to a measure of their use of the suprasegmental stress cues that may be termed ‘offline’. Thus, although the emigrants’ use of suprasegmental cues to lexical stress proved native-like in both their languages, we do not yet know exactly how they analyse such cues during spoken-word recognition – for instance, we do not know whether bilinguals’ knowledge of L2 suprasegmental properties modulates lexical competition in L2 listening, as L1 knowledge does for L1 listeners (Jesse et al., 2017; Reinisch et al., 2010; Sulpizio & McQueen, 2012).

Future studies could also examine the exact (segmental and suprasegmental) acoustic cues used by each listener population. The stimulus materials of the present study were specifically designed not to contain segmental cues to lexical stress. So, while our findings clearly indicate that the Dutch emigrants use suprasegmental cues to English stress to a similar extent as L1 listeners of English, they only allow indirect inferences regarding their use of segmental cues. Additionally, since we used naturally produced speech and did not manipulate duration, amplitude and pitch independently of one another, the results of the present study do not allow us to draw conclusions about the exact weighting listeners apply to the individual suprasegmental cues to stress. Dutch listeners to L2 English prioritise pitch cues (Cutler et al., 2007; Tremblay et al., 2021). However, since proficient Dutch L2 listeners seem to pay more attention to vowel quality in English than those who are less proficient (Tremblay et al., 2021), it appears that L2 cue weighting becomes more native-like with increasing proficiency. To further explore this possibility, we examined which acoustic correlates to stress were used by the highly proficient emigrants tested here in Experiment I, and compared this to the findings of Cutler et al. (2007), who analysed the cue use of the less-proficient Dutch L2 listeners of English who participated in the study by Cooper et al. (2002). This showed a mixed pattern: like Cooper et al.'s L2 listeners, the emigrants identified fragments with primary stress more accurately with higher mean rms amplitude. However, no significant correlations were found between the emigrants’ response accuracy and any of the other cues used by the less-proficient L2 listeners tested by Cooper et al. (2002). This supports the notion that cue weighting may change with increasing proficiency. For L2-dominant, highly proficient L2 listeners like the emigrants tested here, this may then potentially have follow-on effects on the native-likeness of their cue weighting in the L1, Dutch.

As noted, Dutch and English are closely related languages that are highly similar in many respects. Another thing we do not know is whether the present findings might extend to L2 listeners whose L1 and L2 are more typologically distant. On the one hand, such listeners may be more likely to abandon L1 strategies that do not improve listening efficiency during L2 listening, since the greater difference between their languages may have lowered their expectations regarding the cross-language applicability of listening strategies. On the other hand, the lack of overlap between their languages may make it harder for these listeners to abandon L1 strategies. The acquisition of new listening strategies for the L2 that increase proficiency may have to be prioritised over the abandonment of L1 strategies that impede efficiency but not performance. But this is for the future; for now, we are sure that processing strategies may be deployed in a language-by-language fashion. We further know that familiar L1 strategies that may have been applied to the L2 when the L2 was a newer experience can, once that L2 has become a more familiar and even dominant communication medium, simply be jettisoned if they do not pay off.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/S1366728923000305

Appendix S1: Tally of Dutch emigrants’ answers to the question “Please indicate to what extent you use Dutch and English in the situations listed”.

Appendix S2: English words used in Experiment I.

Appendix S3: Dutch words used in Experiment II and III.

Acknowledgments

We thank Mandy Visser for recording the Dutch stimuli, Margret van Beuningen and Mirjam Broersma for facilitating data collection for Experiment II, and Mirjam Broersma for helpful discussions. This work is dedicated to the memory of Anne Cutler, who passed away on 7 June 2022. At the time of her death, we had completed most of the requested revisions for this article. Sadly, Anne never got to see the final version.

Competing interest declaration

Competing interests: The author(s) declare none.

Footnotes

deceased author

References

Abrahamsson, N., & Hyltenstam, K. (2009). Age of onset and nativelikeness in a second language: Listener perception versus linguistic scrutiny. Language Learning, 59, 249–306. doi:10.1111/j.1467-9922.2009.00507.x
Baayen, R. H., Piepenbrock, R., & Gulikers, L. (1995). The Celex lexical database (CD-rom).
Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67, 1–48. doi:10.18637/jss.v067.i01
Bruggeman, L. (2016). Nativeness, dominance, and the flexibility of listening to spoken language. [Doctoral dissertation], Western Sydney University. https://pure.mpg.de/rest/items/item_2332366_5/component/file_2334720/content
Bruggeman, L., & Cutler, A. (2016). Lexical manipulation as a discovery tool for psycholinguistic research. In C. Carignan & M. D. Tyler (Eds.), Proceedings of the 16th Australasian International Conference on Speech Science and Technology (pp. 313–316). Parramatta.
Bruggeman, L., & Cutler, A. (2020). No L1 privilege in talker adaptation. Bilingualism: Language and Cognition, 23, 681–693. doi:10.1017/S1366728919000646
Cebrian, J. (2006). Experience and the use of non-native duration in L2 vowel categorization. Journal of Phonetics, 34, 372–387. doi:10.1016/j.wocn.2005.08.003
Choi, W. (2022). Theorizing positive transfer in cross-linguistic speech perception: The Acoustic-Attentional-Contextual hypothesis. Journal of Phonetics, 91, 101135. doi:10.1016/j.wocn.2022.101135
Choi, W., Tong, X., & Samuel, A. G. (2019). Better than native: Tone language experience enhances English lexical stress discrimination in Cantonese-English bilingual listeners. Cognition, 189, 188–192. doi:10.1016/j.cognition.2019.04.004
Chrabaszcz, A., Winn, M., Lin, C. Y., & Idsardi, W. J. (2014). Acoustic cues to perception of word stress by English, Mandarin, and Russian speakers. Journal of Speech, Language, and Hearing Research, 57, 1468–1479. doi:10.1044/2014_JSLHR-L-13-0279
Clopper, C. G. (2002). Frequency of stress patterns in English: A computational analysis. Indiana University Linguistics Club Working Papers Online, 2, 1–9.
Clyne, M., & Pauwels, A. (1997). Use, maintenance, structures, and future of Dutch in Australia. In J. Klatter-Folmer & S. Kroon (Eds.), Dutch Overseas: Studies in maintenance and loss of Dutch as an immigrant language (pp. 34–49). Tilburg University Press.
Cooper, N., Cutler, A., & Wales, R. (2002). Constraints of lexical stress on lexical access in English: Evidence from native and non-native listeners. Language and Speech, 45, 207–228. doi:10.1177/00238309020450030101
Cutler, A. (1986). Forbear is a homophone: Lexical prosody does not constrain lexical access. Language and Speech, 29, 201–220. doi:10.1177/002383098602900302
Cutler, A. (2009). Greater sensitivity to prosodic goodness in non-native than in native listeners. The Journal of the Acoustical Society of America, 125, 3522–3525. doi:10.1121/1.3117434
Cutler, A. (2012). Native listening: Language experience and the recognition of spoken words. The MIT Press.
Cutler, A., Bruggeman, L., & Wagner, A. (2016). Use of language-specific speech cues in highly proficient second-language listening. The Journal of the Acoustical Society of America, 139, 2161. doi:10.1121/1.4950402
Cutler, A., Burchfield, L. A., & Antoniou, M. (2019). A criterial interlocutor tally for successful talker adaptation? In Calhoun, S., Escudero, P., Tabain, M., & Warren, P. (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (pp. 1485–1489). Australasian Speech Science and Technology Association Inc.
Cutler, A., & Carter, D. M. (1987). The predominance of strong initial syllables in the English vocabulary. Computer Speech & Language, 2, 133–142. doi:10.1016/0885-2308(87)90004-0
Cutler, A., & Jesse, A. (2021). Word stress in speech perception. In Pardo, J. S., Nygaard, L. C., Remez, R. E., & Pisoni, D. B. (Eds.), The handbook of speech perception (2nd ed.). Wiley-Blackwell.
Cutler, A., Norris, D., & Sebastián-Gallés, N. (2004). Phonemic repertoire and similarity within the vocabulary. In Kin, S. & Bae, M. J. (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 65–68). Sunjijn Printing Co.
Cutler, A., & Pasveer, D. (2006). Explaining cross-linguistic differences in effects of lexical stress on spoken-word recognition. In Hoffmann, R. & Mixdorff, H. (Eds.), Proceedings of Speech Prosody 2006. TUD Press.
Cutler, A., & van Donselaar, W. (2001). Voornaam is not (really) a homophone: lexical prosody and lexical access in Dutch. Language and Speech, 44, 171–195.
Cutler, A., Wales, R., Cooper, N., & Janssen, J. (2007). Dutch listeners' use of suprasegmental cues to English stress. In Trouvain, J. & Barry, W. J. (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (pp. 1913–1916). Pirrot.
Dupoux, E., Sebastián-Gallés, N., Navarrete, E., & Peperkamp, S. (2008). Persistent stress ‘deafness’: The case of French learners of Spanish. Cognition, 106, 682–706. doi:10.1016/j.cognition.2007.04.001
Fear, B. D., Cutler, A., & Butterfield, S. (1995). The strong/weak syllable distinction in English. The Journal of the Acoustical Society of America, 97, 1893–1904. doi:10.1121/1.412063
Flege, J. E., & MacKay, I. R. A. (2004). Perceiving vowels in a second language. Studies in Second Language Acquisition, 26, 1–34. doi:10.1017/S0272263104026117
Fourakis, M. (1991). Tempo, stress, and vowel reduction in American English. The Journal of the Acoustical Society of America, 90, 1816–1827. doi:10.1121/1.401662
Garcia Lecumberri, M. L., Cooke, M., & Cutler, A. (2010). Non-native speech perception in adverse conditions: A review. Speech Communication, 52, 864–886. doi:10.1016/j.specom.2010.08.014
Gilbert, A. C., Honda, C. T., Phillips, N. A., & Baum, S. R. (2021). Near native-like stress pattern perception in English-French bilinguals as indexed by the mismatch negativity. Brain and Language, 213, 104892. doi:10.1016/j.bandl.2020.104892
Gilbert, A. C., Itzhak, I., & Baum, S. (2016). A cross-language investigation of word segmentation by bilinguals with varying degrees of proficiency: Preliminary results. In Barnes, J., Brugos, A., Shattuck-Hufnagel, S., & Veilleux, N. (Eds.), Proceedings of Speech Prosody 2016 (pp. 36–39).
Gussenhoven, C. (1983). Stress shift and the nucleus. Linguistics, 21, 303–340. doi:10.1515/ling.1983.21.2.303
Jesse, A., Poellmann, K., & Kong, Y-Y. (2017). English listeners use suprasegmental cues to lexical stress early during spoken-word recognition. Journal of Speech, Language, and Hearing Research, 60, 190–198. doi:10.1044/2016_JSLHR-H-15-0340
Kager, R. W. J., & Visch, E. (1985). Rhythmic stress phenomena in English and Dutch. Formal Parameters of Generative Grammar, 1, 58–63.
Kim, H., & Tremblay, A. (2021). Korean listeners’ processing of suprasegmental lexical contrasts in Korean and English: A cue-based transfer approach. Journal of Phonetics, 87, 101059. doi:10.1016/j.wocn.2021.101059
Koster, M., & Cutler, A. (1997). Segmental and suprasegmental contributions to spoken-word recognition in Dutch. In Proceedings of EUROSPEECH 97 (pp. 2167–2170). International Speech Communication Association.
Lemhöfer, K., & Broersma, M. (2012). Introducing LexTALE: A quick and valid Lexical Test for Advanced Learners of English. Behavior Research Methods, 44, 325343. doi:10.3758/s13428-011-0146-0CrossRefGoogle ScholarPubMed
Lenth, R. (2019). emmeans: Estimated Marginal Means, aka Least-Squares Means (Version R package version 1.4.2.). https://CRAN.R-project.org/package=emmeansGoogle Scholar
Liberman, M., & Prince, A. (1977). On stress and linguistic rhythm. Linguistic Inquiry, 8, 249–336.
Lin, C. Y., Wang, M. I. N., Idsardi, W. J., & Xu, Y. I. (2014). Stress processing in Mandarin and Korean second language learners of English. Bilingualism: Language and Cognition, 17, 316–346. doi:10.1017/S1366728913000333
Qin, Z., Chien, Y-F, & Tremblay, A. (2017). Processing of word-level stress by Mandarin-speaking second language learners of English. Applied Psycholinguistics, 38, 541–570. doi:10.1017/S0142716416000321
R Core Team (2019). R: A language and environment for statistical computing (Version 3.6.1) [Computer Software]. R Foundation for Statistical Computing. https://www.R-project.org/
Reinisch, E., Jesse, A., & McQueen, J. M. (2010). Early use of phonetic information in spoken word recognition: Lexical stress drives eye movements immediately. The Quarterly Journal of Experimental Psychology, 63, 772–783. doi:10.1080/17470210903104412
Scarborough, R., Keating, P., Mattys, S. L., Cho, T., & Alwan, A. (2009). Optical phonetics and visual perception of lexical and phrasal stress in English. Language and Speech, 52, 135–175. doi:10.1177/0023830909103165
Slowiaczek, L. M. (1990). Effects of lexical stress in auditory word recognition. Language and Speech, 33, 47–68. doi:10.1177/002383099003300104
Slowiaczek, L. M. (1991). Stress and context in auditory word recognition. Journal of Psycholinguistic Research, 20, 465–481. doi:10.1007/BF01067638
Sluijter, A. M. C., & van Heuven, V. J. (1996). Acoustic correlates of linguistic stress and accent in Dutch and American English. In Proceedings of the 4th International Congress on Spoken Language Processing (ICSLP96). Philadelphia, PA.
Small, L. H., Simon, S. D., & Goldberg, J. S. (1988). Lexical stress and lexical access: Homographs versus nonhomographs. Perception & Psychophysics, 44, 272–280. doi:10.3758/BF03206295
Sulpizio, S., & McQueen, J. M. (2012). Italians use abstract knowledge about lexical stress during spoken-word recognition. Journal of Memory and Language, 66, 177–193. doi:10.1016/j.jml.2011.08.001
Tremblay, A. (2008). Is second language lexical access prosodically constrained? Processing of word stress by French Canadian second language learners of English. Applied Psycholinguistics, 29, 553–584. doi:10.1017/S0142716408080247
Tremblay, A., Broersma, M., Zeng, Y., Kim, H., Lee, J., & Shin, S. (2021). Dutch listeners' perception of English lexical stress: A cue-weighting approach. The Journal of the Acoustical Society of America, 149, 3703–3714. doi:10.1121/10.0005086
Tremblay, A., Namjoshi, J., Spinelli, E., Broersma, M., Cho, T., Kim, S., Martínez-García, M. T., & Connell, K. (2017). Experience with a second language affects the use of fundamental frequency in speech segmentation. PLoS ONE, 12, e0181709. doi:10.1371/journal.pone.0181709
Tremblay, A., & Spinelli, E. (2014). English listeners’ use of distributional and acoustic-phonetic cues to liaison in French: Evidence from eye movements. Language and Speech, 57, 310–337. doi:10.1177/0023830913504569
van Donselaar, W., Koster, M., & Cutler, A. (2005). Exploring the role of lexical stress in lexical recognition. The Quarterly Journal of Experimental Psychology, 58A, 251–273. doi:10.1080/02724980343000927
van Heuven, V. J. (1988). Effects of stress and accent on the human recognition of word fragments in spoken context: Gating and shadowing. In Proceedings of the 7th FASE/Speech-88 Symposium (pp. 811–818). Edinburgh.
van Heuven, V. J., & de Jonge, M. (2011). Spectral and temporal reduction as stress cues in Dutch. Phonetica, 68, 120–132. doi:10.1159/000329900
van Leyden, K., & van Heuven, V. J. (1996). Lexical stress and spoken word recognition: Dutch vs. English. In Cremers, C. & den Dikken, M. (Eds.), Linguistics in the Netherlands 1996 (Vol. 13, pp. 159–170). John Benjamins.
Weber, A., & Cutler, A. (2006). First-language phonotactics in second-language listening. The Journal of the Acoustical Society of America, 119, 597–607. doi:10.1121/1.2141003
Zhang, Y., & Francis, A. (2010). The weighting of vowel quality in native and non-native listeners’ perception of English lexical stress. Journal of Phonetics, 38, 260–271. doi:10.1016/j.wocn.2009.11.002

Figure 1. Waveforms and spectrograms for the English minimal stress pair PERvert – perVERT (left) and the Dutch minimal stress pair VOORnaam – voorNAAM (right). Blue lines represent pitch contours.

Table 1. Mean values on six acoustic measures of the stimuli of Experiment I. Values were averaged across all fragments from source words with first-syllable (left) or second-syllable stress (right).

Figure 2. Mean percentage of correct responses from Experiment I (panel A) and from Cooper et al. (2002; panel B). Error bars represent standard errors.

Table 2. Helmert contrast coding for the predictor Listener group.

Table 3. Results of the generalised linear mixed-effects model on the responses of Experiment I and of Experiment 3 from Cooper et al. (2002).

Table 4. Mean values on six acoustic measures of the stimuli of Experiments II and III. Values were averaged across all fragments from source words with first-syllable (left) or second-syllable stress (right).

Figure 3. Mean percentage of correct responses from the Dutch control participants of Experiment II. Error bars represent standard errors.

Figure 4. Mean percentage of correct responses from the Dutch emigrants of Experiment III (panel A) and the Dutch control participants of Experiment II (panel B). Error bars represent standard errors.

Table 5. Results of the generalised linear mixed-effects model on the responses of Experiments II and III.

Supplementary material: PDF

Bruggeman and Cutler supplementary material

Appendices S1-S3