INTRODUCTION
Bilingual children are commonly categorized as either simultaneous or sequential (Goldstein, 2004), although the age criteria defining each group vary among researchers. Children who learn two languages from birth are uncontroversially designated simultaneous bilinguals (Padilla & Lindholm, 1984), but the label has also been applied to those who learn two languages within the first year after birth (Genesee, Paradis & Crago, 2004), or even up to age three (McLaughlin, 1978). Following McLaughlin, we identify children exposed to both languages before three years of age as ‘simultaneous bilinguals’, or, simply, bilingual children.
Phonetic category formation in the sense of this paper refers to the processes by which bilingual or second language (L2) speakers come to distinguish phonetic details of shared phonemes in each language (Flege & Eefting, 1987; Flege, 1995; Yeni-Komshian, Flege & Liu, 2000). For example, Spanish and English share the phoneme /p/, but /p/ in each language is produced with different voice-onset-time (VOT) values. Thus, it is an interesting question whether bilingual or L2 speakers exhibit accurate phonetic realizations of Spanish /p/ and English /p/ when speaking the two languages. While phonetic category formation in production has been investigated extensively in adult bilingual and L2 speakers (see Flege, 1995; Bohn & Munro, 2007, for more information), little such research has been carried out with bilingual children, although more robust findings have been reported in the perceptual domain.
The purpose of this study was to determine whether three-year-old Korean–English bilingual (KEB) children fully distinguish stop and vowel categories in their productions of the two languages, and how these two systems interact with each other. Whether bilingual children develop one versus two linguistic systems in the learning of their respective languages has long been of interest to researchers of bilingualism (Swain, 1972; Padilla & Liebmann, 1975; Volterra & Taeschner, 1978; Genesee, 1989; Goodz, 1989). Although most findings support the notion that bilingual children do operate with two systems, the question has not been fully investigated in the phonetic domain among young bilingual children (i.e. three-year-olds) due to methodological limitations (e.g. sample size). In particular, most such studies examined the two-way voiced and voiceless stop contrasts of English–Spanish or English–German bilingual children. But it is not clear whether similar characteristics are found among KEB children because Korean has a three-way laryngeal manner contrast in stops. Moreover, previous studies examining the phonetic categories of young bilingual children investigated either stops or vowels, not both systems simultaneously, leaving open the question of whether these develop in tandem or in sequence.
More recent attention has been paid to examining the degree of interaction between the two languages of bilingual children (Paradis, 2000), and several hypotheses have been proposed. Positing three types of interdependence (transfer, acceleration, delay), Paradis and Genesee (1996) hypothesized that the grammars of bilingual children are acquired autonomously during the acquisition process. Conversely, Johnson and Lancaster (1998) concluded that the features of one language may influence those of the other. While most research on this issue has dealt with the lexical, syntactic, and phonological domains, the present study pursues the question with respect to phonetic properties.
English and Korean stops and vowels
Before turning to the existing literature in phonetic category formation, we briefly review the phonetics of stops and front vowels in Korean and English. The Korean and English stop systems are compared in Table 1. Korean contrasts three manners of stops, all of which are voiceless in word-initial position, but these differ with respect to degree of aspiration and f0 in the following vowel (Kang & Guion, 2006). The three Korean stop types are often called ‘lenis’, ‘aspirated’, and ‘fortis’. The VOT periods of lenis and aspirated stops are longer than for fortis stops, while the f0 values of aspirated and fortis stops are higher than for lenis stops (Cho, Jun & Ladefoged, 2002; Silva, 2006; Oh, 2011). Among adult male speakers (Oh, 2011), Korean lenis stops are produced with moderate-lag VOT (44–66 ms) and low f0 (84–184 Hz), aspirated with long-lag VOT (74–97 ms) and high f0 (105–216 Hz), and fortis with short-lag VOT (10–24 ms) and high f0 (93–204 Hz). Recent research shows that the VOT values for phrase-initial lenis and aspirated stops are converging among contemporary, younger Seoul speakers, who distinguish these now only by f0 (Silva, 2006; Iverson & Park, 2008). Unlike Korean, in word/phrase initial position English contrasts just two stop categories, usually characterized as ‘voiced’ versus ‘voiceless’ (see Iverson & Salmons, 1995, for interpretation of these as ‘lax’ versus ‘aspirated’). These contrasts are also reliably differentiated by VOT. Thus, Lisker and Abramson (1964) reported that English voiceless stops are produced with long-lag VOT (80 ms) and voiced stops with short-lag VOT (15 ms).
Similar to Korean stops, moreover, f0 plays a role in differentiating voiceless from voiced stops in English (Whalen, Abramson, Lisker & Mody, 1993) in that the former are often associated with higher f0 than the latter.
notes: Korean fortis stops are marked with the apostrophe diacritic; English voiceless stops, typically aspirated (especially in word-initial position), are indicated with an inverted apostrophe.
There are five front vowels in English, all unrounded: /i ɪ e ε æ/. These subcategorize as either tense (/i e/) or lax (/ɪ ε æ/) (Ladefoged, 2006). Korean monophthongal vowels, on the other hand, include just two front unrounded vowels: /i ε/. An additional mid front unrounded monophthongal vowel, /e/, has merged with /ε/ in the speech of most Koreans, resulting in a contemporary system of seven monophthongs overall (see Lee & Iverson, 2012, for more information). Unlike English, the Korean vowel system does not make a tense–lax distinction (e.g. /i/ vs. /ɪ/).
The acoustic features of corresponding Korean and English vowels are manifested differently (Yang, 1996). In general, F2 values of English vowels are higher than those of Korean vowels for both male and female speakers. For male speakers, the five English front vowels occupy vowel space distinct from the two Korean front vowels /i ε/. For female speakers, however, English and Korean /ε/ are produced in the same vocalic space, though Korean /i/ remains distinct from the other English vowels.
Stop and vowel development in monolingual children
Stop and vowel development in monolingual English- and Korean-learning children is well documented. The literature reports that English-learning children stabilize voiced and voiceless stop phonemes at 21–22 months and 23–24 months, respectively (Macken & Barton, 1980; Lowenstein & Nittrouer, 2008). After reviewing ten studies of phonological development, Bernthal, Bankson, and Flipsen (2009) concluded that English voiced and voiceless phonological stop categories are fully developed by age three. In comparison to English, age of acquisition for Korean stop contrasts appears to vary. Kim and Stoel-Gammon (2009) reported that accuracy for stops produced by monolingual Korean children was less than 70% at four years of age, arguing that Korean children distinguished stops mainly by VOT, with f0 classification emerging at around 2;6. The accuracy of Korean stops reported in Kim and Stoel-Gammon was lower than in previous studies (Kim, 1996; Kim & Pae, 1995), where 75% of stops were accurately produced by two to three years of age. These discrepancies may be due to sample size: Kim and Stoel-Gammon (2009) included only ten children per age group.
Unlike stops, the literature consistently reports that both English and Korean vowel distinctions are fully implemented by age three. Larkins (1983) collected 200 spontaneous utterances from each of twenty children between 34 and 38 months of age. All English monophthongs, except for /ɪ/ and [ɚ] (< /ər/), were produced with 100% accuracy. Even these two vowels were produced with 99% accuracy. Pollock and Berni (2003) also reported a high percentage correct (97%) of non-rhotic vowels for 36–47-month-old normally developing children. For Korean, Kwon (1982) found that all vowels were produced with 94% accuracy at age 3;3. At age 5;5, /a i u ε/ were produced with 100% accuracy, the others (/o ɨ ʌ/) with better than 95% accuracy. Thus, the findings of previous studies indicate that both monolingual English- and Korean-learning children were able to produce the stops and vowels of their native languages with high accuracy by three years of age.
Although cross-linguistic studies may provide a basis for understanding the speech of bilingual children, they have mainly been limited to babbling (Boysson-Bardies, Halle, Sagart & Durand, 1989; Whalen, Levitt & Goldstein, 2007; Rvachew, Alhaidary, Mattock & Polka, 2008). Only a few studies have examined early word production cross-linguistically. Chung, Kong, Edwards, Weismer, Fourakis, and Hwang (2012) conducted a cross-linguistic examination of the three ‘corner’ vowels /i a u/ in English, Cantonese, Korean, Japanese, and Greek children and adults in order to identify cross-linguistic differences in the acoustic realization of the three shared vowels. They found that children as young as two years of age demonstrated language-specific characteristics in their vowel spaces. Specifically, Cantonese /i/ and /u/ vowels were more peripherally located in the vowel space whereas those of English and Japanese were more centralized.
Lee and Iverson (2009, 2011) also investigated whether cross-linguistic differences appear in the vowel and stop productions of monolingual English-speaking and Korean-speaking children at ages five and ten, and reported that both vowels and stops reflected differences between English and Korean to some degree. English-speaking children as young as five years demonstrated higher F2 values for the back vowel /u/ than Korean-speaking children, similar to adults (Yang, 1996). Stops produced by the five- and ten-year-old monolingual English- and Korean-speaking children also showed some distinctive characteristics. For example, five-year-old children produced Korean aspirated stops with longer VOT than English voiceless stops; however, they produced English voiced stops and Korean fortis stops similarly in terms of both VOT and f0. In addition, Lee and Iverson (2011) found that in five-year-old monolingual children, the five stop categories across English and Korean were not fully separated acoustically, whereas ten-year-old monolingual children distinguished all five categories. It is of interest, then, to investigate the cross-linguistic acoustic characteristics of bilingual children younger than five years of age.
Stop and vowel development in bilingual children
A number of production accuracy studies of bilingual toddlers have been undertaken, but the majority of these have dealt with Spanish–English bilingual (SEB) children. Goldstein and Washington (2001) examined English and Spanish stops produced by four-year-old SEB children and found that production accuracy of English stops (97%) and Spanish stops (93%) was similar to that of monolingual English-speaking (96·8%) or Spanish-speaking children (90%). High accuracy of stop production was also reported in a subsequent study (Goldstein, Fabiano & Washington, 2005). In a recent study, Fabiano-Smith and Goldstein (2010) examined consonant accuracy among three-year-old SEBs, and reported accuracy values of 77% for Spanish and 73% for English, which were lower than the accuracy values found for four-year-old SEBs in the Goldstein and Washington (2001) study. Unlike in previous studies, monolingual children were significantly more accurate than bilinguals for Spanish stops, though not for English stops. With respect to vowels, English and Spanish tokens produced by three- to four-year-old SEB children were produced with over 80% accuracy (Gildersleeve-Neumann, Kester, Davis & Pena, 2008; Gildersleeve-Neumann, Pena, Davis & Kester, 2009). These studies suggest that overall consonant and vowel accuracy in both English and Spanish became similar to that of monolingual English-speaking children by four years of age.
Inasmuch as many three- to four-year-old bilingual children demonstrate well-established phonological stop categories, and are able to produce them with relatively high accuracy, the question arises as to whether bilingual children can form phonetically detailed categories as well. Fabiano-Smith and Goldstein (2010) hypothesized that “bilingual children perceive phonetically similar sounds as common between their two languages and categorize them into the same phonemic category” (p. 163), though there may be reduced use of allophonic variants of a shared phoneme. An acoustic investigation of phonetic category formation allows us to test this hypothesis.
Few studies have examined phonetic category development in bilingual toddlers. Deuchar and Clark (1996) collected data on English and Spanish stop consonants produced by a Spanish–English bilingual child, reporting that adult-like English stop distinctions were established at 2;3, a point at which Spanish stop distinctions were just beginning to emerge. When English voiceless (aspirated) and Spanish voiceless (unaspirated) stops were compared, the VOT difference was significant. However, the VOT values of English voiced and Spanish voiced stops were not significantly different. In another study, Johnson and Wilson (2002) observed two Japanese–English bilingual children, one at 2;10 and the other at 4;8. While the older child differentiated English and Japanese in terms of the VOT of prevocalic voiceless stops, the younger child was not able to differentiate them. Kehoe, Lleo, and Rakow (2004) examined four German–Spanish bilingual children longitudinally from 1;9 to 3;0, and reported great individual variation. Only one child demonstrated a significant difference between Spanish unaspirated and German aspirated voiceless stops, but not between voiced stops across languages.
In a later study, Fabiano-Smith and Bunta (2012) included a relatively larger number of three-year-old Spanish–English bilingual children (n = 8) in their evaluation of stop productions. The English and Spanish voiceless stops were compared to voiceless stops produced by eight monolingual Spanish- and eight monolingual English-speaking children. It was found that bilingual children produced the English stops with markedly less aspiration than their monolingual peers, i.e. with reduced VOT values, but they produced unaspirated Spanish stops similarly to Spanish-speaking children. In terms of cross-language contrasts, then, the bilingual children did not produce distinctively different VOT values for the voiceless stops of English and Spanish, rendering them both with short-lag VOT, whereas VOT values for English (aspirated) and Spanish (unaspirated) voiceless stops among monolingual children did differ, as in adult speech.
These previous investigations of early phonetic category formation thus present mixed results in that small-scale studies reported that bilingual children showed distinctive phonetic categories for voiceless stops across languages (specifically, aspirated versus unaspirated), whereas a group study found that bilingual children did not evidence distinctive categorization of these across the two languages. Phonetic category formation in very young bilingual children is still open to inquiry, therefore, and further studies are warranted to investigate whether bilingual toddlers can implement laryngeal timing distinctions among stops across languages.
It is well known, however, that after five years of age bilingual children are able to distinguish the laryngeal details of similar stop phonemes (Watson, 1982; Lee & Iverson, 2011, 2012). Watson examined stop productions of five-, six-, eight-, and ten-year-old French–English bilingual children. By the age of six years, children had developed two sets of distinctions, whereas the five-year-old children had not. In recent studies, Lee and Iverson (2011, 2012) examined Korean and English stops and vowels produced by Korean–English bilingual (KEB) children at five and ten years of age. The ten-year-old KEBs were found to distinguish all stop categories across the two languages, whereas the five-year-olds failed to differentiate details between phonetically similar English and Korean stops. Specifically, the five-year-old bilingual children produced English voiced stops and Korean fortis stops with similar VOT and f0 values. Furthermore, their English voiceless and Korean lenis stop pairs and English voiceless and Korean aspirated stop pairs were also not differentiated. With respect to vowels, however, both age groups of KEB children had fully distinctive English and Korean vowel productions. Lee and Iverson examined all English and Korean vowels in their study, but front vowels were discussed only briefly. Both ages of KEB children produced Korean /i/ significantly differently from English /ɪ/, and Korean /ε/ was significantly different from English /æ/. Lee and Iverson stressed the importance of differential development in phonetic category formation between vowel and stop categories in bilingual children, noting that vowels are typically acquired earlier than consonants. But as they examined the speech of only five- and ten-year-old KEB children, the question remains as to just when phonetic category differentiation begins to emerge.
Autonomy versus interdependence
Assuming that a bilingual child possesses two separate systems, the next logical question is whether the two linguistic systems interact with each other. Paradis and Genesee (1996) argue for the two-system hypothesis of ‘autonomous’ acquisition: “… the bilingual children show no evidence of transfer, acceleration, or delay in acquisition, and support the hypothesis that their grammars are acquired autonomously” (p. 1). Conversely, the notion that two systems interact so that speech production differs from that of monolingual children is referred to as ‘interdependence’ (Johnson & Lancaster, 1998). In support of the interdependence hypothesis, previous studies have consistently shown that speech production is not always the same between bilingual and monolingual children (Mack, 1990; Khattab, 2000; Paradis, 2000; Johnson & Wilson, 2002; Baker & Trofimovich, 2005; Lee & Iverson, 2011, 2012; Fabiano-Smith & Bunta, 2012). For example, Lee and Iverson found that ten-year-old KEB children produced Korean aspirated stops with longer VOT as compared to monolingual Korean children, indicating a dissimilation effect in which they tended to maximize the VOT differences among Korean lenis, Korean aspirated, and English voiceless stops. For vowels, ten-year-old KEB children produced higher F2 values in the Korean back vowels /u o/, whereas they produced English /æ/ with higher F1, which indicates both assimilation and dissimilation. Based on previous studies, it is still not clear whether and/or how the two linguistic systems interact. Further study is warranted.
Purpose of the study
The present study has three goals: first, to investigate whether three-year-old KEB children form distinctive phonetic categories in production; second, to examine whether distinctive phonetic production categories appear in both stop and vowel systems; and third, to investigate how the two systems interact. We first compare stops and vowels produced by monolingual English- and Korean-speaking children in order to identify cross-linguistic production patterns in monolinguals. Then we compare stops and vowels across English and Korean produced by KEB children to identify whether distinctive phonetic categories appear in stops and vowels. Finally, we compare stops and vowels produced by KEB children with those of their monolingual counterparts in order to identify similarities and differences in stop and vowel productions among the two groups.
METHODS
Participants
A total of forty-two children (15 monolingual English-speaking, 15 monolingual Korean-speaking, 12 KEB), whose ages ranged from 3;1 to 3;10, participated in the study. The monolingual English-speaking children (M = 3;5) were recruited from Texas in the US and the Korean-speaking children (M = 3;6) from Seoul metropolitan areas in South Korea. These children were raised in monolingual families where Southwestern American English or standard Korean (Seoul dialect) was spoken. All participating monolingual children were from families of middle or higher socioeconomic status and all mothers had received a high school or higher education. None of the participating children had a history of speech or hearing impairment, based on reports from both teachers and parents. In order to establish further their normal development of communication skills, monolingual children were examined using the Preschool Language Scale 4-Screening Test (PLS-4 Screening; Zimmerman, Steiner & Pond, 2005) for English, and the Communication section of the Korean Age and Stage Questionnaire (K-ASQ; Heo, Squires, Lee & Lee, 2006) for Korean. These screening tests only indicate whether a child demonstrates age-appropriate development. Test results confirmed that all monolingual children were within normal limits.
The bilingual children (M = 3;5), who lived in Texas, were recruited on the basis that a candidate child must satisfy four criteria: (a) be exposed to both English and Korean for at least 20 months; (b) have one parent who is a native speaker of English so that both languages are spoken at home, or attend English-speaking daycare centers at least three times per week if only Korean is spoken at home; (c) have a mother who received at least a high school education; and (d) be from families of median household income. Bilingual participant characteristics are illustrated in Table 2.
PLS-4 Screening and K-ASQ were also used to evaluate overall language/communication skills, although these assessments were not designed specifically for bilingual children. All bilingual children were within normal range for the K-ASQ and PLS-4. Based on the screening tests as well as interaction with the primary investigator, who is a Korean–English bilingual speaker, each child's proficiency was rated for each language on a scale from 0 (child could not speak the indicated language at all) to 4 (child had native-like proficiency in the language). These scales have been used in previous bilingual studies (Pena, Bedore & Rappazzo, 2003; Fabiano-Smith & Goldstein, 2010). The language proficiency for each child was discussed with parents. Based on the test results, speech and language skills during assessment, and parent reports, all of the bilingual children were rated as either 3 or 4 in both Korean and English.
Stimuli
Tables 3 and 4 show the words containing target stops and monophthongal front vowels for English and Korean. These words were selected because of similar (if not always identical) context, viz., non-high vowels following prevocalic stops, and because they are likely to be familiar to children as young as three years of age, with a few exceptions.
Data-collection procedure
Picture-naming tasks were used to elicit target stops and vowels. In the event that a three-year-old child did not know some target words, a delayed imitation technique was employed. For example, the facilitator produced the target word (e.g. ‘crying’) in a sentence and then asked the child for the target word: “This boy is crying. There are tears on his face. What is he doing?” Some studies examining the speech of young children have employed direct imitation using a recorded audio stimulus naming the picture played via speakers, or the experimenter's voice prompt (Lee, Potamianos & Narayanan, 1999; Chung et al., 2012). The delayed imitation technique allows us to obtain target word productions while also preventing children from directly imitating auditory cues, although Goldstein, Fabiano, and Iglesias (2004) as well as others (Bankson & Bernthal, 1982; Andrews & Fey, 1986) reported that the vast majority of words were produced identically in both imitative and spontaneous speech.
The monolingual and bilingual data were collected at daycare centers or in participants’ homes in a quiet room in a natural play setting. PLS-4 and/or K-ASQ were given first; then, target sounds were elicited using the ‘fishing game’, in which each of the target pictures was placed on a fish. When a child caught a fish, the child was asked to name the picture. Each word was elicited three times by asking “Say once more”. Most monolingual and bilingual children were able to produce the target words for vowels spontaneously. English got and cop were elicited using the delayed imitation technique for all children. English dot and Korean /tʰal/ ‘mask’ were also elicited by the delayed imitation technique for bilingual children.
A digital flash recorder (Marantz Model PMD670) and a wireless microphone (Sennheiser Model EW100) were used to record at a sampling rate of 44·1 kHz. Each child wore the wireless microphone clipped to clothing at the shoulder. A Korean–English bilingual researcher collected all Korean and English data from the bilingual children. English and Korean tokens were elicited separately. When English words were being elicited, conversation with the experimenter was in English only, and when Korean words were elicited, all conversation was in Korean.
Acoustic analysis
Computerized Speech Lab (model 4300, Kay Elemetrics) was used to analyze the recordings. Speech recordings were downsampled to 22·05 kHz. Of the three productions of each target word, two were selected for acoustic analysis. The selected tokens were correctly produced, acoustically measurable productions with similar pitch or amplitude. Tokens with a missing burst and devoiced tokens were excluded from acoustic analysis. Some children tended to produce the third repetition with either rising or falling pitch; choosing two tokens with relatively consistent pitch was expected to provide more reliable data.
For stops, VOT and f0 were obtained for each target word. VOT was measured from the beginning of the stop release to the onset of voicing in the following vowel, using both waveforms and wide-band spectrograms. f0 was measured at voicing onset with a 25 ms window from the first harmonic value in the Fast Fourier Transform (FFT) in the following vowel. For vowels, a spectrogram of each word containing a target vowel was made using a 512-point discrete Fourier transform analysis with a 20 ms Hamming window. First and second formant frequency (F1 & F2) values were taken at the mid-point of the steady-state portion of the vowel between vowel onset and offset points and were computed automatically by Linear Predictive Coding (LPC) analysis with an order of 24 and visually verified using the spectrographic display. Formant frequencies obtained were converted to the Bark scale in order to normalize for any possible gender differences (Traunmüller, 1988). Vowel onset was defined as the onset of regular periodicity on the acoustic waveform corresponding to a visible F1 trace on the spectrogram. Vowel offset was defined as the point where waveform periodicity ceases and waveform amplitude decreases markedly.
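The two simplest measurements described above can be illustrated with a minimal sketch. It assumes that the release and voicing-onset times have already been hand-marked in seconds, and that the Bark conversion follows the commonly cited Traunmüller approximation z = 26.81f/(1960 + f) − 0.53; the source does not state which equation was applied, and the low- and high-frequency corrections sometimes added to that formula are omitted. The function names are hypothetical.

```python
def vot_ms(release_s: float, voicing_onset_s: float) -> float:
    """Voice onset time in ms: interval from stop release to voicing onset.

    Both arguments are hand-marked time points in seconds (an assumption
    about the annotation format). A negative result means voicing lead.
    """
    return (voicing_onset_s - release_s) * 1000.0


def hz_to_bark(f_hz: float) -> float:
    """Convert a frequency in Hz to Bark.

    Assumed formula (commonly attributed to Traunmüller):
        z = 26.81 * f / (1960 + f) - 0.53
    """
    if f_hz <= 0:
        raise ValueError("frequency must be positive")
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


if __name__ == "__main__":
    # Illustrative values only; these are not data from the study.
    print(vot_ms(0.100, 0.180))   # a long-lag token
    print(hz_to_bark(500.0))      # a hypothetical F1
    print(hz_to_bark(2500.0))     # a hypothetical F2
```

Because the Bark transform is monotonic, it preserves the ordering of formant values while compressing the higher frequencies, which is what makes it useful for comparing speakers with different vocal tract sizes.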
For reliability, measurements of randomly selected tokens (10%) were made independently by another researcher. A Pearson correlation coefficient between the original and new values was obtained using Statistical Package for the Social Sciences (SPSS). The correlation between the two sets of measurements was significant (r(328) = ·98, p < ·001). A correlation coefficient greater than ·75 indicates that the variables are highly associated. Thus, acoustic values were measured consistently.
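For readers who wish to reproduce this kind of reliability check outside SPSS, the Pearson coefficient can be computed directly from paired measurements. The sketch below is a plain-Python illustration, not the authors' procedure; the function names and the example values are hypothetical. The degrees of freedom reported with the coefficient are the number of pairs minus two.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def degrees_of_freedom(n_pairs: int) -> int:
    """Degrees of freedom reported alongside r: number of pairs minus two."""
    return n_pairs - 2


if __name__ == "__main__":
    # Hypothetical VOT values (ms) from two measurers; not data from the study.
    first = [12.0, 75.0, 48.0, 90.0, 20.0]
    second = [14.0, 73.0, 50.0, 88.0, 19.0]
    print(pearson_r(first, second), degrees_of_freedom(len(first)))
```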
Statistical analysis
Statistical comparisons of vowel formants, duration, and the VOT and f0 values of stops were completed using SPSS (v.20). For between-group comparisons, mixed analyses of variance (ANOVAs) and independent t-tests were used; paired t-tests were conducted for within-subjects comparisons. A significance level of p < ·05 was adopted. Effect size was calculated using partial eta squared (ηp²), interpreting the effect as follows: 0·00–0·09 = negligible; 0·10–0·29 = small; 0·30–0·49 = moderate; and 0·50 and greater = large (Rosenthal & Rosnow, 1984).
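The effect-size convention above can be expressed compactly. The following sketch assumes the standard definition of partial eta squared as the effect sum of squares divided by the sum of the effect and error sums of squares, and it treats the lower bound of each reported band (0·10, 0·30, 0·50) as the cut-off, which is an assumption about how values falling between the printed bands would be classified. The function names are hypothetical.

```python
def partial_eta_squared(ss_effect: float, ss_error: float) -> float:
    """Partial eta squared: SS_effect / (SS_effect + SS_error)."""
    if ss_effect < 0 or ss_error < 0 or ss_effect + ss_error == 0:
        raise ValueError("sums of squares must be non-negative and not both zero")
    return ss_effect / (ss_effect + ss_error)


def effect_label(eta_p2: float) -> str:
    """Map partial eta squared onto the bands used in the text.

    Cut-offs at 0.10, 0.30, and 0.50 are an assumption about the band edges.
    """
    if not 0.0 <= eta_p2 <= 1.0:
        raise ValueError("partial eta squared must lie between 0 and 1")
    if eta_p2 < 0.10:
        return "negligible"
    if eta_p2 < 0.30:
        return "small"
    if eta_p2 < 0.50:
        return "moderate"
    return "large"


if __name__ == "__main__":
    # Hypothetical sums of squares; not values from the study.
    value = partial_eta_squared(30.0, 70.0)
    print(value, effect_label(value))
```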
RESULTS
Production accuracy of stops
Before an acoustic analysis was conducted, accuracy of the stop productions was evaluated. Two native speakers, one of English and one of Korean, who were blind to the study, each listened independently to all target word productions in their respective language. They were asked to listen to tokens saved as separate digital files and to transcribe what they heard. The speech tokens were played using a waveform editor so that each token could be played as often as necessary. Ten percent of the data were re-transcribed by a second native speaker of each language for inter-transcriber reliability. The phoneme-by-phoneme inter-rater reliability was 95% for English stops and 90% for Korean stops.
Of 486 English stops, 95·4% and 97·1% were correctly produced by monolingual English (ME) and KEB children, respectively. An independent t-test revealed no significant difference on accuracy of English stops between ME and KEB children (t(25) = –0·397, p = ·695). The small number of English stop errors produced by monolingual English and KEB children were related to place of articulation, not voicing or aspiration. Two ME children and one KEB child produced cop for top or top for cop once. Accuracy on the 729 Korean stops was lower than for English. Ninety-one percent of Korean stops were correctly produced by monolingual Korean (MK) children, whereas 90% produced by KEB children were phonemically accurate. An independent t-test revealed no significant difference on accuracy of Korean stops between MK and KEB children (t(25) = –0·103, p = ·919). All errors in Korean were related to VOT; accuracy of fortis stops was higher than lenis and aspirated stops by both monolingual and bilingual children. The detailed percentage of accuracy for Korean and English stops is shown in Table 5.
Acoustic analysis of stops
Comparisons between monolingual English- and Korean-speaking children
Figure 1 (top panel) shows VOT (x-axis) and f0 (y-axis) for monolingual English- and Korean-speaking children. VOT and f0 values for the three places of articulation were combined because the focus of the study is on phonetic category formation in stops with respect to laryngeal contrast, not place of articulation. In addition, patterns were similar among the three places of articulation when the VOT and f0 values of English and Korean stops were compared separately for each place of articulation. Thus, six independent t-tests were conducted to examine whether each English–Korean pair differed significantly. The six comparisons were: English voiced–Korean lenis; English voiced–Korean aspirated; English voiced–Korean fortis; English voiceless–Korean lenis; English voiceless–Korean aspirated; and English voiceless–Korean fortis. The alpha level was adjusted to ·008 (·05/6) because six comparisons were made. The significant pairs are listed in Table 6.
note: Adjusted alpha level = 0·008.
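The alpha adjustment used for the cross-language comparisons is a Bonferroni correction: the familywise alpha is divided by the number of comparisons. A minimal sketch follows; the function names and the p-values in the usage example are hypothetical and do not reproduce Table 6.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha under a Bonferroni correction."""
    return alpha / n_comparisons


def flag_significant(p_values, alpha=0.05):
    """Mark each comparison as significant at the corrected threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return {pair: p < threshold for pair, p in p_values.items()}


print(round(bonferroni_alpha(0.05, 6), 3))  # 0.008 (six stop comparisons)
print(round(bonferroni_alpha(0.05, 5), 3))  # 0.01 (five vowel comparisons)

# Hypothetical p-values for three of the six stop comparisons
example = {"voiced-lenis": 0.001, "voiced-fortis": 0.40, "voiceless-lenis": 0.02}
print(flag_significant(example, alpha=0.05))
```

Note that a p-value of ·02 would count as significant at the unadjusted ·05 level but not at the corrected threshold, which is the purpose of the adjustment.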
English voiced and Korean fortis stops were produced with shorter VOT, whereas English voiceless, Korean lenis, and Korean aspirated stops were produced with longer VOT. Each of the stop types showed considerable deviation along this continuum. Independent t-tests indicated that VOT values of English voiced stops were significantly different from those of Korean lenis and aspirated stops. VOT values of English voiceless stops were also significantly different from those of Korean fortis stops. However, the other English and Korean pairs (voiced–fortis, voiceless–lenis, voiceless–aspirated) were not significantly different. In terms of f0 values, none of the English–Korean stop pairs showed a significant difference.
Comparisons between English and Korean among bilingual children
Figure 1 (bottom panel) shows the VOT (x-axis) and f0 (y-axis) values of stops produced by KEB children. Results of paired t-tests using the Bonferroni correction are given in Table 6. As with the monolingual children, English voiced and Korean fortis stops were produced with shorter VOT, but English voiceless, Korean lenis, and Korean aspirated stops were produced with longer VOT. Deviation in f0 was greater for the Korean fortis stop than for the English voiced stop, while f0 values for the English voiceless and Korean aspirated stops nearly overlapped. The t-test results of VOT for each comparison were similar to those of monolingual children. VOT values of English voiced stops were significantly different from those of Korean lenis and aspirated stops, and VOT values of English voiceless stops were significantly different from those of Korean fortis stops, but the differences in VOT values for the other pairs were not significant. Similarly, none of the f0 values of English and Korean stop pairs were significantly different in KEB children.
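Because the same bilingual children produced both languages, the within-group comparisons rest on paired t-tests, which operate on per-child differences and have n – 1 degrees of freedom. The sketch below illustrates the computation only; the function name is hypothetical and the VOT values are invented, with twelve entries chosen to mirror the number of KEB children.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Paired t statistic on per-participant differences, and its df."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    spread = stdev(diffs)
    if spread == 0:
        raise ValueError("differences have zero variance; t is undefined")
    return mean(diffs) / (spread / math.sqrt(len(diffs))), len(diffs) - 1


# Hypothetical mean VOT (ms) per child: English voiceless vs. Korean fortis
english_voiceless = [78, 85, 92, 70, 88, 95, 81, 76, 90, 84, 79, 87]
korean_fortis = [18, 22, 15, 25, 20, 17, 23, 19, 21, 16, 24, 20]

t_value, df = paired_t(english_voiceless, korean_fortis)
print(df)  # 11
```

Each resulting p-value would then be compared with the Bonferroni-adjusted alpha rather than with ·05.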
Comparisons between monolingual and bilingual children in English
Figure 2 (top panel) shows means and standard deviations of VOT values for voiced and voiceless English stops produced by ME and KEB children. Since the present study focuses on laryngeal contrast in monolingual and bilingual children, VOT values of the three places of articulation were combined. A mixed ANOVA revealed no significant two-way interaction between voicing type and group (F(1,25) = 0·95, p = ·34, ηp² = 0·04). The main effect for group (F(1,25) = 0·05, p = ·83, ηp² = 0·002) was not significant, either. However, there was a significant main effect for voicing type (F(1,25) = 190·24, p < ·001, ηp² = 0·88). As expected, VOT of voiceless stops was significantly longer than that of voiced stops in both monolingual and bilingual children.
Figure 2 (bottom panel) shows means and standard deviations of f0 values for voiced and voiceless English stops produced by ME and KEB children. A mixed ANOVA revealed a significant voicing type * group interaction (F(1,25) = 6·09, p = ·02, ηp² = 0·19), as well as a significant main effect for voicing type (F(1,25) = 12·21, p = ·002, ηp² = 0·33). Post-hoc comparison (using α = ·05) indicated that f0 values for English voiceless stops produced by KEB children were significantly higher than those produced by ME children (p = ·02). f0 values for English voiced stops did not differ significantly between the two groups (p = ·45). The main effect for group was not significant (F(1,25) = 3·61, p = ·07, ηp² = 0·13).
Comparisons between monolingual and bilingual children in Korean
Figure 3 (top panel) shows VOT values for three types of Korean stops produced by MK and KEB children. KEB children produced longer VOT for lenis stops than MK children, whereas both groups of children produced aspirated and fortis stops with similar VOT values, resulting in a less clear distinction between lenis and aspirated stops in KEB children. A mixed ANOVA revealed neither a significant two-way interaction (F(2,50) = 1·40, p = ·26, ηp² = 0·05) nor a main effect for group (F(1,25) = 1·08, p = ·31, ηp² = 0·04). However, there was a significant main effect for voicing type (F(2,50) = 103·68, p < ·001, ηp² = 0·81). These results indicated that both KEB and MK children demonstrated distinctive phonemic categories among the three-way Korean stops. Non-significance between the two groups may be due to the greater variability of VOT values in both groups of children. Post-hoc comparison (using α = ·05) indicated that VOT values for aspirated stops were significantly longer than those of lenis (p = ·006) and fortis (p < ·001) stops. VOT values for lenis stops were significantly longer than those of fortis stops (p < ·001).
Figure 3 (bottom panel) shows f0 values for three types of Korean stops produced by MK and KEB children. Lenis stops were produced with lower f0 values than aspirated and fortis stops. KEB children showed more variability than MK children for all types of Korean stops. A mixed ANOVA indicated that the two-way interaction effect was not significant (F(2,50) = 0·23, p = ·78, ηp² = 0·009), nor was a main effect for group found (F(1,25) = 2·68, p = ·11, ηp² = 0·09). However, there was a significant main effect of voicing (F(2,50) = 17·76, p < ·001, ηp² = 0·41). Post-hoc comparison (using α = ·05) indicated that f0 of aspirated (p < ·001) and fortis (p < ·001) stops was significantly higher than that of lenis stops; however, there was no significant difference in the f0 values between aspirated and fortis stops.
Production accuracy of vowels
All 162 Korean vowels were produced accurately; that is, both MK and KEB children produced all Korean vowels correctly. Of the 324 English vowels, 93% and 92% were produced accurately by monolingual and bilingual children, respectively. An independent t-test revealed no significant difference in accuracy of English vowels between ME and KEB children (t(25) = 0·41, p = ·069). The detailed percentage of accuracy for each English vowel is shown in Table 7.
Acoustic analysis of vowels
Comparisons between monolingual English- and Korean-speaking children
Figure 4 (top panel) shows F1 (x-axis) and F2 (y-axis) values of English and Korean vowels as produced by monolingual English- and Korean-speaking children. All were produced with similar deviations, except for Korean /i/, for which deviation was greater in F2. Independent t-tests were conducted between monolingual English- and Korean-speaking children (see Table 8). The Korean vowel /i/ was compared with English /i/ and English /ɪ/. Korean /ε/ was compared with English /ɪ/, /ε/, and /æ/. The alpha level was adjusted to ·01 (·05/5). Results showed that three of the five pairs (Korean /i/ and English /i/, Korean /ε/ and English /ε/, and Korean /ε/ and English /ɪ/) were not significantly different. However, Korean /i/ and English /ɪ/, as well as Korean /ε/ and English /æ/, differed from each other in terms of both F1 and F2.
note: Adjusted alpha level = 0·01.
Comparisons between English and Korean among bilingual children
Figure 4 (bottom panel) shows F1 (x-axis) and F2 (y-axis) values of English and Korean vowels produced by bilingual children. English /i/ and Korean /i/ were produced with similar deviations, whereas English /ε/ showed greater deviations in F1 and F2. Results of paired t-tests using the Bonferroni correction are shown in Table 8. Findings were similar to those for monolingual children in that Korean /i/ and English /ɪ/, as well as Korean /ε/ and English /æ/, were significantly different. However, other vowel pairs were similar to each other in terms of both parameters.
Comparisons between monolingual and bilingual children in English
Figure 5 shows the English vowels produced by ME and KEB children. In terms of F1, a mixed ANOVA indicated no significant interaction effect for vowel type * group (F(3,75) = 0·53, p = ·66, ηp² = 0·02) nor main effect of group (F(1,25) = 0·097, p = ·76, ηp² = 0·004). However, there was a significant main effect of vowel type (F(3,75) = 185·50, p < ·001, ηp² = 0·88). Post-hoc comparison (using α = ·05) indicated that English vowels were significantly different from each other (p < ·001 for all comparisons) except for /ɪ/ and /ε/ (p = ·63). In terms of F2 values, there was no significant interaction effect (F(3,75) = 0·27, p = ·86, ηp² = 0·01), nor main effect for group (F(1,25) = 0·02, p = ·88, ηp² = 0·01). However, there was a significant main effect of vowel type (F(3,75) = 116·16, p < ·001, ηp² = 0·82). Post-hoc comparison (using α = ·05) indicated that F2 values of all English front vowels were significantly different from each other in both groups (p < ·001 for all comparisons).
Comparisons between monolingual and bilingual children in Korean
Figure 6 shows the two Korean vowels produced by MK and KEB children. For F1, a mixed ANOVA indicated a significant main effect of vowel type (F(1,25) = 152·59, p < ·001, ηp² = 0·86). Thus, monolingual Korean and bilingual children produced different F1 values for Korean /i/ and /ε/. However, no significant group effect (F(1,25) = 0·023, p = ·88, ηp² = 0·00) and no significant vowel type * group interaction (F(1,25) = 0·18, p = ·67, ηp² = 0·00) were found. The same pattern was obtained for F2 values. A significant main effect of vowel type was found (F(1,25) = 111·21, p < ·001, ηp² = 0·82). However, there was no significant main effect of group (F(1,25) = 0·51, p = ·48, ηp² = 0·02), and no significant interaction effect of vowel type * group (F(1,25) = 0·74, p = ·40, ηp² = 0·03).
DISCUSSION
English and Korean stops and vowels produced by three-year-old monolingual children
Consistent with previous studies (e.g. Kim & Pae, 1995; Bernthal et al., 2009), both three-year-old ME and MK children produced stops with high accuracy. Our acoustic analyses confirmed that ME children made clear distinctions between voiced and voiceless English stop categories, whereas MK children demonstrated separate fortis, lenis, and aspirated phonemic stop categories in Korean. Similar to stops, four English front vowels and two Korean front vowels were produced by monolingual children with high accuracy, consistent with previous studies (Kwon, 1982; Larkins, 1983; Pollock & Berni, 2003). Vowel phonemes in each language were acoustically different in terms of either F1, F2, or both. Thus, the results of this study confirm that phonemic categories for stops and front vowels are well established by three years of age in monolingual children.
When the acoustic features of English and Korean vowels were compared cross-linguistically among ME and MK children, Korean /i/ and /ε/ were produced with lower F1 and higher F2 values than English /ɪ/ and /æ/, respectively. Identifying formant differences between Korean /i/ and English /ɪ/, as well as Korean /ε/ and English /æ/ in monolingual children, provides an important basis for investigating vowels produced by bilingual children. Without understanding the cross-linguistic similarities and differences of these vowel pairs in age-equivalent monolingual children, it is difficult to establish whether bilingual children produce English and Korean vowels distinctively.
The formant differences between Korean /i/ and English /ɪ/, as well as between Korean /ε/ and English /æ/, were similar to those reported for monolingual English- and Korean-speaking adults; however, the cross-linguistic similarities between English /i/ and Korean /i/, as well as between English /ε/ and Korean /ε/, differed from the adult pattern (Yang, 1996). Yang reported that English /i/ and Korean /i/ were produced differently by both male and female adult speakers, and English /ε/ and Korean /ε/ were produced distinctively by adult males. The patterns of three-year-old children were consistent with those of five- and ten-year-old children in a previous study (Lee & Iverson, 2009), in that English and Korean /i/ and English and Korean /ε/ were produced similarly by these older children. Thus, the findings suggest that monolingual children may need more time until they produce fully distinctive front vowel systems across English and Korean.
Unlike the vowels, language-specific characteristics were not obvious in stops produced by three-year-old monolingual children. Although the Korean fortis and English voiced stops (which fall in the short-lag VOT range) differed in VOT from the English voiceless and the Korean lenis and aspirated stops (which fall in the long-lag VOT range), these stop pairs were not further distinguished from each other in terms of f0 values. Furthermore, the English voiceless stops were produced similarly to the Korean lenis or aspirated stops with respect to both VOT and f0 values. These results differed from those of older monolingual children at five and ten years of age (Lee & Iverson, 2011), for whom five stop pairs of English and Korean differed in terms of either VOT, f0, or both. The findings suggest that phonological stop distinctions within a language (e.g. voiced vs. voiceless in English) may be fully established in children as young as three years; however, the phonetic details for each language may not yet be completely implemented at that point.
One versus two phonetic systems in three-year-old bilingual children
Two earlier studies (Lee & Iverson, 2011, 2012) have reported that ten-year-old KEB children demonstrate fully distinctive stop and vowel systems, whereas five-year-old KEB children show distinctive systems for vowels, but not stops. Thus, it is of interest to examine whether younger KEB children distinguish stops and vowels differently. The current study provides a comprehensive picture of the developmental pattern of phonetic category formation in three-year-old KEB children.
Similar to monolingual children, three-year-old KEB children produced both stop and vowel phonemes in English and Korean with high accuracy. Acoustic analyses confirm distinctive phonological stop and vowel categories in KEB children for both English and Korean. These results are consistent with previous studies (Gildersleeve-Neumann et al., 2009; Fabiano-Smith & Goldstein, 2010).
In stops, we did not find clear distinctions among the phonetic categories in bilingual children, parallel to the productions of monolingual children. Among the six stop comparisons across English and Korean, only stop pairs that fall into different VOT categories (i.e. long-lag vs. short-lag VOT) were significantly different in VOT values. However, none of these pairs differed in f0 values. Thus, our findings provide little evidence of detailed phonetic distinctions of stop categories among bilingual toddlers. Previous studies (Deuchar & Clark, 1996; Johnson & Wilson, 2002; Kehoe et al., 2004) have reported that one toddler per study made phonetic distinctions for voiceless stops across languages (English and Spanish voiceless stops or Spanish and German voiceless stops). The differences may be attributed to type of investigation in conjunction with sample size. With a relatively larger sample, Fabiano-Smith and Bunta (2012) also found that bilingual children did not distinguish English and Spanish voiceless stops. By employing a substantial number of bilingual participants, the current study is better able to assess whether bilingual toddlers are sensitive to the acoustic properties of stops. It should be noted, however, that our findings could be colored by the fact that Korean has a more complex stop system than Spanish or German. Though our previous study (Lee & Iverson, 2011) did not find fully distinctive stop categories in five-year-old KEB children, it is still not known to what extent three-year-old bilingual children may distinguish these three-way categories; further studies should examine other groups of bilingual toddlers involving complex stop systems (e.g. Thai or Hindi) in order to verify the findings.
Unlike stops, we found that the three-year-old bilingual children employed distinctive vowel categories across languages in their productions; in particular, they were able to distinguish Korean /i/ and English /ɪ/, as well as Korean /ε/ and English /æ/. The same patterns appeared in the monolingual children of the present study as well as in the five- and ten-year-old KEB children in the previous investigation (Lee & Iverson, 2012). This finding suggests that distinctive vowel categories emerge at a young age in KEB children. Previous studies on adolescent or adult L2 learners reported merged vowel categories across languages (Bohn & Flege, 1992; Guion, 2003; Baker & Trofimovich, 2005). Present findings align with our previous study (Lee & Iverson, 2012) in suggesting that early exposure to the English vowel system leads to more distinctive vowel categories.
Based on consideration of both vowels and stops, our findings confirm the notion that simultaneous bilingual children operate with two separate phonetic systems (Padilla & Liebmann, 1975; Genesee, 1989; Goodz, 1989). Similar to previous studies on the semantic (Goodz, 1994; Genesee, Nicoladis & Paradis, 1995), pragmatic (Meisel, 1989; De Houwer, 1990; Nicoladis, 1994), and syntactic levels (Meisel, 1989, 1990; Kaiser, 1994), which reported that bilingual children around two years of age show early language separation, we find that bilingual children also separate phonetic systems at an age as early as three years.
The results of the present study, however, suggest that the two separate phonetic systems may not emerge holistically in an across-the-board fashion. That is, two distinctive systems may be realized in one sound category (vowels) but not in the other (stops). Most previous studies examining phonetic categories of bilingual children have investigated stops (Deuchar & Clark, 1996; Johnson & Wilson, 2002; Kehoe et al., 2004; Fabiano-Smith & Bunta, 2012), but only a few examined vowels in young bilingual children (e.g. Kehoe, 2002). The fact that vowel distinctions are typically acquired earlier than consonants has been widely reported in the literature on both English- and Korean-speaking monolingual children (Kwon, 1982; Hare, 1983; Bernthal et al., 2009; Kim & Stoel-Gammon, 2009), and our current study found this as well. But we did not find consistent support for the claim of Fabiano-Smith and Goldstein (2010) that “bilingual children perceive phonetically similar sounds as common between their two languages and categorize them into the same phonemic category” (p. 163), i.e. that bilingual children basically do not distinguish phonetic details across languages at this young age.
Autonomous acquisition versus interdependence
On the understanding that bilingual children possess two linguistic systems, the way in which these systems interact becomes of interest. Previous studies examining speech production in bilingual children reported influence patterns. For example, VOT values were not the same as those of their monolingual counterparts (e.g. Lee & Iverson, 2011; Fabiano-Smith & Bunta, 2012), vowel length contrast acquisition was delayed (e.g. Kehoe, 2002), and formants of vowels differed from those of their monolingual counterparts (e.g. Lee & Iverson, 2012). Following Flege's (1995) assumption, this pattern may be understood as ‘assimilation or dissimilation’, or as ‘transfer’ in the terms of Paradis and Genesee (1996). However, there is also emerging evidence that the two languages may develop independently without interaction among bilingual speakers who acquired two languages early (e.g. Flege, Munro & MacKay, 1995; Munro, Flege & MacKay, 1996; Flege, MacKay & Meador, 1999; Kang & Guion, 2006). These studies examined mainly adult bilingual speakers, not bilingual children.
In the present study, we found that most stop and vowel productions of simultaneous KEB children were similar to those of monolingual children, consistent with a claim that bilingual adult speakers who acquired two languages simultaneously demonstrate speech production without much interaction between the two languages. An exception to the independent systems observation relates to the f0 values of English stops. KEB children produced English voiceless stops with higher f0 values compared to monolingual children. The higher f0 in the bilingual children seems to be influenced by the higher f0 of Korean aspirated stops.
Paradis and Genesee (1996) hypothesized that the grammars of bilingual children are acquired autonomously because they found that bilingual children showed the same patterns of acquisition as well as a similar developmental rate as monolinguals. Although some transfer effect may appear in bilingual toddlers, the findings of the current study suggest that in general the phonetic systems of simultaneous bilingual children may develop independently. The discord in findings between the current and previous studies may be attributed to differing methodology and the type of bilingual children (i.e. simultaneous vs. sequential). Most previous studies examining phonetic categories included bilingual children who had varying onset of exposure to L2, and some did not specify the onset of exposure. For example, Fabiano-Smith and Bunta (2012) included eight Spanish–English bilingual children who were either simultaneous or L2 learners. Similarly, Lee and Iverson (2011, 2012) reported that the KEB children had at least two years of exposure for five-year-olds and five years of exposure for ten-year-olds, leaving open the question of whether the children were simultaneous or sequential bilinguals. By contrast, the bilingual children in the current study maintained homogeneity in that all twelve KEB children were exposed to both languages at 18 months of age. Since very few group experiments have been conducted to examine the production patterns of simultaneous bilingual toddlers, future studies are warranted to identify differences in phonetic category formation and the degree of interaction between simultaneous and sequential bilingual children.
Limitations and future directions
As the findings of the current study are based only on speech production, it has not been established here whether KEB children demonstrate similar characteristics in the perceptual domain. Moreover, it is not yet known whether the same kinds of production patterns can be obtained among other types of simultaneous bilingual children, such as Thai–English or Hindi–English, whose L1 phonological systems differ significantly in other ways from L2 English. Future studies are thus warranted to verify our observations and their implications. By the same token, subsequent studies may pursue an examination of individual participant characteristics, in addition to group characteristics, among bilingual children. Finally, with respect to vowels, the current study examined only front vowels, and these were not always in the same phonetic context. Thus, further work should compare the full English and Korean vowel inventories after selecting target words with a more similar phonetic context.