
An acoustic analysis of Oromo vowels of the northern dialect

Published online by Cambridge University Press:  27 February 2023

Feda Negesse*
Affiliation:
Department of Linguistics, Addis Ababa University

Abstract

This study presents the results of an acoustic analysis of Oromo vowels of the northern dialect. The data for the study were collected from 19 speakers (nine female, 10 male) who produced the vowels in the same phonetic environment. Such acoustic measures as duration, fundamental frequency and the first three formant frequencies were extracted for the analysis. In a linear mixed-effects model, each acoustic parameter was modelled as a function of the fixed effects such as gender and vowel quality, and of participants’ random effects. The model shows a main effect of gender on all the acoustic measures, with the female speakers producing the vowels with significantly greater duration and formant frequencies. The model also indicates the main effect of vowel quality on all the acoustic measures with the exception of duration. The proportion of variances explained by gender and vowel quality is found to be large for fundamental frequency and the first formant frequency respectively. As regards the classification of the vowels, Support Vector Machine reveals that the time-varying frequency does not have an advantage over a steady state in separating vowels of both genders. However, it is generally fairly effective in classifying vowels of the dialect of the language.

Type
Research Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of The International Phonetic Association

1 Introduction

A survey of online phonetic studies indicates that the acoustic characteristics of vowels of some languages are well investigated, having attracted the attention of relatively many phonetic researchers (Peterson & Barney 1952, Clopper, Pisoni & de Jong 2005, Fox & Jacewicz 2009, Williams, Escudero & Gafos 2018). On the other hand, vowels of other languages seem understudied, lacking adequate data on their acoustic features. For instance, Oromo, a Cushitic language widely spoken in the Horn of Africa, has vowels that are not well researched (Tujube Amansa 2018). There is a paucity of data on the acoustic characteristics of vowels of the language and thus the extent to which these characteristics are affected by vowel quality and gender is not clearly known. The current study, therefore, examines the effects of vowel quality and gender on acoustic properties of Oromo vowels using sets of acoustic measures extracted at more than one time point.

1.1 Acoustic characteristics of vowels

Acoustic description of vowels involves a few parameters such as duration, fundamental frequency (f0) and the first three formant frequencies (F1, F2, F3). Vowel duration may vary with vowel quality, age, gender, dialect, phonetic environment and speech style (Clopper et al. 2005, Fox & Jacewicz 2009). For instance, low vowels can have the longest duration across languages due partly to the long time it takes for articulators to reach their targets (Solé & Ohala 2010, Derib Ado 2011). Studies of vowel duration in different languages indicate that vowels could be longer when they are produced by child and female speakers, before voiced consonants and in clear speech or slow speech (Lee, Potamianos & Narayanan 1999, Hillenbrand, Clark & Nearey 2001, Ferguson & Kewley-Port 2007). On the other hand, fundamental frequency is derived from the vibration of the vocal folds and affected by age, gender, dialect, and vowel quality (Tujube Amansa 2018, Williams et al. 2018). Children and female speakers have comparatively higher fundamental frequency due to the anatomy of their vocal folds, which are thinner and shorter (Simpson 2001, Pisanski et al. 2014). While previous studies reported significant effects of such factors on vowel acoustic attributes, they seem to have paid less attention to the proportion of variance explained by these factors (Abelson 1995).

F1, F2 and F3 are linked to vowel quality and previous studies have used these acoustic measures to conduct acoustic analyses of languages, namely English, Dutch, Greek, German and Siwu (Peterson & Barney 1952, Jongman, Fourakis & Sereno 1989, Hillenbrand et al. 1995, Adank, van Hout & Smits 2004, Kpodo 2013). Their use is motivated by the close link between the acoustic features and articulatory parameters such as vowel height, backness and lip rounding. F1 and F2 correspond to height and backness respectively, with the high and back vowels having the lowest F1 and F2. In other words, F1 is inversely related to height while F2 is inversely related to backness but the association between F1 and height is comparatively stronger (Ximenes, Shaw & Carignan 2017, Williams et al. 2018, Lawson, Stuart-Smith & Rodger 2019). The formant frequencies are influenced by gender, age, dialect, language and vowel quality. Female speakers, children and clear speech show higher formant frequencies and this variation is known to occur cross-linguistically (Simpson 2001, Pisanski et al. 2014, Leung et al. 2016). Though the general patterns of formant frequencies are similar across languages and easily predictable, the same vowel in different languages may have different formant values (Bradlow 1995, Yang 1996, Strange et al. 2007, Chung et al. 2012).

1.2 Vowel classification

Hillenbrand et al. (1995) classified vowels of North American English using the first two formant frequencies measured at a steady state and reported a 95.4% accurate classification. This result is slightly greater than that of an early acoustic study of the same vowels (94.4%) by Peterson & Barney (1952). The accuracy showed an 11.5% improvement when acoustic parameters measured at 20%, 50% and 80% of vowel duration were used. The spectral change (time-varying frequencies) had an advantage over the steady state in classifying vowels of the language. Similarly, a study of vowels of Northern and Southern Standard Dutch reveals that the varieties show a small variation as a function of formant frequencies measured at a steady state for the nine monophthongal vowels. Yet, they exhibit large variations with respect to formant frequencies measured at the 30% and 70% points of the three long mid vowels as well as the three diphthongal vowels (Adank, van Hout & Smits 2004). There are also research results that corroborate the advantage of spectral change over the steady state in bearing more information, particularly for separating confusable vowels (Morrison & Nearey 2007, Williams et al. 2018). The current study also attempts to investigate the role of spectral change (time-varying frequencies) in classifying Oromo vowels of the northern dialect.

Fundamental frequency and duration are also important acoustic parameters that could contribute to the classification of vowels in different languages. Their contribution is observed particularly in separating neighbouring sounds. For instance, the f0 of vowels of American English has an attested role in distinguishing front vowels along their heights (DiBenedetto 1989). In classifying American English, Australian English and Dutch vowels, f0 has a role slightly comparable to that of the third formant (Hillenbrand et al. 1995, Adank, van Hout & Smits 2004, Williams et al. 2018). An acoustic analysis of dynamic acoustic properties of monophthongs and diphthongs in Western Sydney Australian English indicates that duration together with the first three formants accurately separated 74.9% of the tokens. However, the removal of duration from a parameter set entered into the classifier decreased the classification rate by 14.8% (Elvin, Williams & Escudero 2016). The addition of vowel duration to the formant frequencies in quadratic discriminant analysis resulted in an improvement in classification accuracy for American English (Hillenbrand et al. 1995). It is also reported that duration improved the classification rate of Dutch vowels and the contribution of duration was found to be larger in separating vowels of Northern Standard Dutch as compared to those of the other varieties (Adank, van Hout & Smits 2004). The current study also aims to determine the contribution of fundamental frequency and duration in distinguishing Oromo vowels of the northern dialect.

1.3 Previous studies of Oromo vowels

The exact number of languages spoken in Ethiopia is unknown but the usual estimate is eighty and above, as reported by the 2007 National Population Census (CSA 2007). The classification of the languages into Semitic, Cushitic, Omotic and Nilo-Saharan is relatively well established, with some debate on whether or not Cushitic and Omotic should constitute separate families (Tosco 2000). Oromo belongs to the Cushitic family and is widely spoken in Ethiopia and to some extent in Kenya, where small communities living scattered in the northern part of the country speak different varieties of the language (Lamberti 1991). The language is classified as Lowland East Cushitic together with Somali, Afar and other small languages in East Africa. It has attested regional varieties but the number of such varieties is debatable, varying with studies. The number of the regional varieties in previous studies ranges from three to six (Stroomer 1987, Kebede 2009). For example, in Kebede’s (2009) genetic classification, such varieties as Western, Eastern, Northern, Central and Waata are identified, and the current study is based on this classification since it is the most comprehensive study on Oromo dialectology.

According to previous studies, Oromo has five vowels, all of which contrast in length (Owens 1985, Stroomer 1987, Lloret-Romanyach 1988). The studies have classified the vowels into high, mid and low in terms of their height, and their phonemic status is well established. The high vowels are /i u/, the mid ones are /e o/, and /a/ is the only low vowel in the language. All of them can be lengthened, having their own long counterparts (Table 1). A recent acoustic study of the vowels has confirmed this traditional classification of the vowels (Tujube Amansa 2018). The study investigated the extent to which acoustic properties of Oromo vowels would vary with vowel quality, gender and dialect. The acoustic measures in the study were extracted from the midpoint of vowel duration and the phonetic environment was not strictly controlled, with positions of vowels differing in the carrier words. The number of speakers (32 female, 32 male) who participated in the study was large. Repeated measures ANOVA was used to conduct statistical analysis but a linear mixed model would have been preferable. In spite of its limitations, the study found a significant variation in classification rates with vowel quality, gender and dialect. It showed significant differences in formant frequencies of short and long vowels. In addition, the acoustic classification indicated that the first two formant frequencies measured at the midpoint correctly separated 80% of tokens of the long vowels. The formants correctly classified 64.5% of tokens of long vowels of the northern dialect; this score was very low compared with those of the other dialects. However, no other study has been carried out to validate the above findings by using a different method (Section 2).

Table 1 An inventory of Oromo short vowel phonemes and their corresponding long ones (Owens 1985).

The main objective of the current study is to validate and extend the findings of the acoustic study cited in the preceding paragraph. The study focuses on vowels of the northern dialect of the language for various reasons. This dialect has the smallest number of speakers (457,278) as compared to the other dialects of the language (CSA 2007). The dialect is spoken in the Oromia Zone, which is found in the Amhara Regional State (Figure 1). It is surrounded by Afar and Amharic speakers, being isolated from the other dialects. Amharic is a working language of Ethiopia and stands second to Oromo in terms of the number of native speakers (CSA 2007). The dialect has influenced and has been influenced by Amharic (Tosco 2000, Baye 2016). Presumably, there could be a danger of gradual shift to Amharic as the children are speakers of Amharic, having acquired it from their parents who are bilingual speakers of Amharic and Oromo. The other warning sign is that Rayya, which is part of the northern dialect, has shifted to the neighbouring Semitic languages (Kebede 2009). In addition, it is a common observation that the interaction of language and politics has been so strong in Ethiopia that languages or their varieties could be the beneficiaries or the victims of such interaction. This study is significant in documenting the current phonetic properties of vowels of the dialect so that it can be used as a reference for future studies.

Figure 1 Location of Oromia Zone where the northern dialect of Oromo is spoken (Baker 2012).

2 Method

2.1 Participants

Participants of the study were 19 native speakers (9 female, 10 male) of the northern dialect who had normal speech and hearing, with their ages ranging from 20 to 35 years. They also speak Amharic, having acquired it from their community and learned it at school as a subject. They were doing college courses at Kemise College of Teachers’ Education to be teachers for primary schools and their consent was obtained before they took part in the study. They reported no travel history as they were born, brought up and educated in the study site (Figure 1).

2.2 Recording procedure

The stimuli were five long vowels of Oromo embedded in real words in the same phonetic environment (Table 2). The long vowels were selected as they are long enough to take the measurement at three points, which are not much influenced by the flanking consonants. The words containing the vowels were embedded in a carrier phrase ‘____ say’, following the word order of the language attested in Owens (1985) and Stroomer (1987). For instance, one of the carrier phrases was ‘Dhaabuu jedhi’ for recording the vowel /aː/ and it literally means ‘/ɗaːbuː/ say’.¹ The stimuli were recorded with a Zoom H4n Handy Recorder in a quiet room while the participants were saying them at their usual speech rates. They were recorded at 44.1 kHz and digitised at 16 bits. They were randomly presented in PowerPoint on an HP laptop and the presentation rate was adjusted based on the speech rates comfortable to all participants. Instructions for the recording were written in Oromo and the recording started only when the participants were ready after reading and understanding the instructions. The recording took place in two rounds and the first brief round was intended to familiarise participants with the recording procedure. The second round was the actual recording session in which each stimulus was recorded five times in a random order (Schoormann, Heeringa & Peters 2019).

Table 2 List of real words used for recording vowels.

Note: No item analysis was conducted and thus the results of the study could not be generalised to other phonetic environments or population of items.

The recording yielded a total of 475 tokens (19 speakers × 5 vowels × 5 repetitions) but badly recorded tokens, which had poorly resolved formant frequencies and exaggerated pitches, were discarded. Those tokens were mainly found in the first and the last recordings, and thus the first and the last tokens of each participant were not included in the analysis. The remaining 285 tokens (19 speakers × 5 vowels × 3 repetitions) were used to extract acoustic parameters for the study. The southern variety of Oromo is claimed to be tonal (Voigt 1985) but such a claim has not been made of the northern dialect. However, a study on Oromo phonology claimed that a disyllabic word that ends in a long vowel has primary stress and a high pitch in the last syllable (Wako 1981). Accordingly, all the words in the current study have the same pitch and stress patterns because all of them are disyllabic, ending in a long vowel (Table 2).
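
The token counts above can be checked with a short sketch. It assumes that "the first and the last tokens" means the first and the last repetition of each vowel for each speaker, which is the reading implied by the 19 × 5 × 3 design; the function name is illustrative only.

```python
# Token bookkeeping for the recording design described in the text.
SPEAKERS = 19
VOWELS = 5
REPETITIONS = 5


def kept_repetitions(n_reps):
    """Return 1-based repetition indices after dropping the first and last."""
    return list(range(2, n_reps))


recorded = SPEAKERS * VOWELS * REPETITIONS      # all recorded tokens
kept = kept_repetitions(REPETITIONS)            # repetitions retained per vowel
analysed = SPEAKERS * VOWELS * len(kept)        # tokens entering the analysis

print(recorded, kept, analysed)
```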

2.3 Measurement procedures

Praat version 6.20, a free speech analysis program (Boersma & Weenink 2001), was used to extract duration, fundamental frequency and the first three formant frequencies of the vowels. A Praat script was used to automatically extract the acoustic parameters from the TextGrid annotations (Lennes 2003). Duration was automatically measured in milliseconds as the length of the intervals labelled with the vowel symbols on an interval tier. The intervals were placed on the vocalic segment, with the start points and the endpoints demarcated by the onset (where there is also a noticeable increase in intensity) and offset (where there is also a noticeable decrease in intensity) of the quasi-periodic wave (Kirtley et al. 2016). The start and endpoints were set to the nearest positive zero-crossings. Identifying the boundaries of the vowels in the speech software was easy and straightforward as all carrier words have the vowels in between implosive and plosive sounds (Table 2). The added advantage of such a phonetic context is that it could remove the confounding effect of consonantal environments which may occur when different environments are used for the target sounds (Hillenbrand et al. 2001, Strange et al. 2007).
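
The boundary placement described above can be illustrated outside Praat. The following is a minimal sketch, not the script used in the study: it moves hand-placed boundary indices to the nearest positive-going zero-crossing of a sampled signal and converts the interval to milliseconds. The synthetic 100 Hz sine, the boundary positions and all function names are assumptions made for illustration.

```python
import math


def positive_zero_crossings(samples):
    """Indices where the signal passes from non-positive to positive."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] <= 0.0 < samples[i]]


def snap_to_crossing(index, crossings):
    """Return the crossing index closest to a hand-placed boundary index."""
    if not crossings:
        return index
    return min(crossings, key=lambda c: abs(c - index))


def duration_ms(start, end, sample_rate):
    """Interval length in milliseconds between two sample indices."""
    return (end - start) / sample_rate * 1000.0


# Synthetic stand-in for a vowel: half a second of a 100 Hz sine at 44.1 kHz.
sample_rate = 44100
signal = [math.sin(2 * math.pi * 100 * n / sample_rate)
          for n in range(sample_rate // 2)]

crossings = positive_zero_crossings(signal)
start = snap_to_crossing(1000, crossings)    # rough onset mark (hypothetical)
end = snap_to_crossing(15000, crossings)     # rough offset mark (hypothetical)
dur = duration_ms(start, end, sample_rate)
print(start, end, round(dur, 1))
```

Snapping to zero-crossings avoids cutting the waveform at a non-zero amplitude, so the measured interval always spans a whole number of pitch periods in a periodic signal such as this one.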

2.4 Acoustic and statistical analyses

The acoustic parameters extracted include duration, f0, F1, F2 and F3 of the five vowels. Formant frequencies that had extreme values were identified and measured again manually. Duration, fundamental frequency and the first three formants measured at the midpoint of the vowel duration were used for an acoustic description. The ggplot2 package (Wickham 2016) was employed for the graphical representations of the data.

The lme4 package in R version 4.1 (Bates et al. 2015) was used to assess the study results with a linear mixed-effects model, which was fitted with maximum likelihood. A linear mixed model was chosen to overcome the violation of the assumption of independence, which was caused by multiple measures taken from a single speaker (Winter 2020, Brown 2021). Each acoustic parameter was modelled as a function of the fixed effects, gender (with two levels) and vowel quality (with five levels), together with participants’ random effects. To solve the model convergence issue, some of the derivative computation that took place after the model had reached a solution was omitted by using a control parameter (Brown 2021). In addition, the random intercepts and slopes were specified as uncorrelated when their correlation made the model fail to converge² (Bates et al. 2015, Winter 2020). Post-hoc comparisons of contrasts of significant effects were carried out using the emmeans package in R (Lenth et al. 2022). Such contrasts compared the difference between each pair of means with a Tukey adjustment for multiple testing. The mixed function in the afex package (Singmann et al. 2016) was used to conduct likelihood ratio tests for the fixed effects, with the argument method set to ‘LRT’. The r.squaredGLMM function in the MuMIn package (Bartoń 2017) was employed to get the marginal $R^{2}$ for the mixed models. The $R^{2}$ is a useful metric to determine the proportion of variance explained by the fixed effects alone, and by both the fixed and random effects together, and to examine the model fit (Harrison et al. 2018).
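
The final step of a likelihood ratio test can be sketched as follows. This is not the afex implementation; it only shows how a test statistic, computed as twice the difference in log-likelihood between a full and a reduced model, is referred to a chi-square distribution. The log-likelihood values are hypothetical, and the closed-form tail probability is restricted to 1 or an even number of degrees of freedom, which covers the 1 and 4 degrees of freedom of gender and vowel quality.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or even df)."""
    if x < 0:
        raise ValueError("the test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df > 0 and df % 2 == 0:
        half = x / 2.0
        term, total = 1.0, 1.0
        # exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    raise ValueError("only df = 1 or even df are handled in this sketch")


def lrt_statistic(loglik_full, loglik_reduced):
    """Twice the log-likelihood gain of the full model over the reduced one."""
    return 2.0 * (loglik_full - loglik_reduced)


# Hypothetical log-likelihoods for models with and without a 5-level factor.
stat = lrt_statistic(-1500.0, -1503.395)
p_value = chi2_sf(stat, df=4)
print(round(stat, 2), round(p_value, 3))
```

The degrees of freedom equal the number of parameters removed from the full model: four for a five-level factor such as vowel quality and one for a two-level factor such as gender.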

In addition to the acoustic description, the study aims at classifying the vowels by using sets of acoustic parameters measured at 20%, 50% and 80% of each vowel’s duration. The parameters were normalised with Lobanov’s procedure (Adank 2003) before they were employed in a Support Vector Machine classifier (Kong, Mullangi & Kokkinakis 2014). Of the four kernel functions, the Radial Basis Function (RBF) kernel was selected because of its good performance. After the selection of the kernel type, the parameter C and the kernel parameter gamma (best C and best gamma) were determined based on a grid search with five-fold cross-validation (Olson & Delen 2008). Duration was excluded from the classification of the vowels because it did not significantly vary with vowel quality.
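
The two preprocessing steps named above can be sketched as follows: locating the 20%, 50% and 80% points of a vowel interval and applying Lobanov normalisation, which z-scores each formant within a speaker. The token values, speaker labels and function names are hypothetical, and the use of the sample standard deviation is an assumption rather than a detail stated in the text.

```python
from statistics import mean, stdev


def sample_times(start, end, proportions=(0.2, 0.5, 0.8)):
    """Time points (s) at the given proportions of a vowel interval."""
    return [start + p * (end - start) for p in proportions]


def lobanov(tokens, measures=("F1", "F2")):
    """Z-score each measure within speaker: (value - mean) / SD."""
    by_speaker = {}
    for tok in tokens:
        by_speaker.setdefault(tok["speaker"], []).append(tok)
    normalised = []
    for toks in by_speaker.values():
        stats = {}
        for m in measures:
            values = [t[m] for t in toks]
            stats[m] = (mean(values), stdev(values))
        for t in toks:
            row = dict(t)  # copy so the raw Hz values stay untouched
            for m in measures:
                mu, sd = stats[m]
                row[m] = (t[m] - mu) / sd
            normalised.append(row)
    return normalised


# Hypothetical midpoint formants (Hz) for two speakers, three vowels each.
tokens = [
    {"speaker": "F01", "vowel": "i", "F1": 320.0, "F2": 2600.0},
    {"speaker": "F01", "vowel": "a", "F1": 850.0, "F2": 1500.0},
    {"speaker": "F01", "vowel": "u", "F1": 360.0, "F2": 900.0},
    {"speaker": "M01", "vowel": "i", "F1": 280.0, "F2": 2200.0},
    {"speaker": "M01", "vowel": "a", "F1": 700.0, "F2": 1300.0},
    {"speaker": "M01", "vowel": "u", "F1": 310.0, "F2": 800.0},
]

points = sample_times(0.100, 0.420)   # hypothetical vowel from 100 to 420 ms
normalised = lobanov(tokens)
print(points)
print(normalised[0])
```

Because each speaker’s values are centred and scaled separately, differences in overall formant range between female and male speakers are largely removed before classification, while the relative positions of the vowels within each speaker’s space are kept.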

3 Results

Table 3 presents the average duration and formant frequencies of Oromo long vowels produced by speakers of the northern dialect. The acoustic parameters in Table 3 and Figure 2 were extracted at the midpoint of each vowel.

Table 3 Duration, f0, F1, F2 and F3 averages and standard deviations (SD) for the vowel tokens.

F = female; M = male

Figure 2 Distribution of female (left) and male (right) speakers’ vowel tokens on an F1 × F2 vowel space. The formants are measured at 50% of vowel duration and normalised with the Lobanov procedure. Vowel tokens of the female speakers exhibit a greater variation and this is also evident in Table 3.

3.1 Duration

One of the research objectives is to determine if the duration of vowels of the northern variety of Oromo varies as a function of vowel quality and gender. The vowel /aː/ has the longest duration (322 ms) while /iː/ has the shortest duration (329.7 ms), which implies that vowel duration is inversely related to vowel height (Figure 3). The effect of vowel quality on duration is not significant ($\chi^{2}(4) = 6.79$, p = .15).

Figure 3 Raincloud plot of vowel duration (ms) by vowel quality. The error bars represent 95% confidence intervals.

However, the main effect of gender on vowel duration is significant as female speakers produced vowels with longer duration than male speakers ($\chi^{2}(1) = 8.5$, p < .004). The average duration of male speakers’ vowels is less than 300 ms while that of female speakers is greater than this value (Figure 4). The ratio is 1.32:1, which means that female speakers’ vowels are 1.32 times longer than male speakers’ vowels. Gender accounts for 36% ($R^{2}$ = .36) of the variance in the duration of the vowels. The result shows that the interaction between vowel quality and gender is not significant ($\chi^{2}(4) = 9.16$, p = .06).

Figure 4 Raincloud plot of vowel duration (ms) by gender. The error bars represent 95% confidence intervals.

3.2 Fundamental frequency

One of the acoustic parameters used for describing vowels is f0 because the sounds are known to have their own intrinsic f0 independently of the influence of sociolinguistic and context variables. The vowels /oː/ and /aː/ have the highest and lowest mean f0 respectively while the front vowels have almost equal means. Mean f0 thus roughly increases with vowel height in this dialect of Oromo and the difference is statistically significant ($\chi^{2}(4) = 28.72$, p < .001). Multiple contrasts show that /aː/ significantly differs from /iː/ and /uː/ at p < .02, and from /oː/ at p < .001.

The effect of gender on f0 is well established in previous acoustic studies but the current study is more concerned with the magnitude of the effect of such a factor. Female speakers’ f0 (235 Hz) is significantly greater than that of male speakers (129 Hz) ($\chi^{2}(1) = 50.23$, p < .001). Gender accounts for 93% ($R^{2}$ = .93) of the variance of f0, suggesting a strong relationship between the two variables. The ratio of the two averages is 1.8:1, which implies that this acoustic parameter is a robust acoustic correlate of gender in this dialect of Oromo (Figure 5). The interaction of gender and vowel quality is not significant ($\chi^{2}(4) = 5.11$, p = .28).

Figure 5 Raincloud plots of fundamental frequency (Hz) by gender. The error bars represent 95% confidence intervals.

3.3 Formant frequencies

As expected, /aː/ has the highest F1 and /iː/ has the lowest F1. The vowel /iː/ is typically a high vowel in the dialect and has a big height difference with the mid vowel /eː/ but such a difference is reduced in the case of the back vowels. The main effect of vowel quality on F1 is significant ($\chi^{2}(4) = 57.18$, p < .001) and it accounts for 83% ($R^{2}$ = .95) of the variance. Post-hoc comparisons indicate that all possible pairs of vowels significantly differ from each other at p < .001. F2 of the vowels increases from the posterior to the anterior of the vocal tract, with the front vowel /iː/ having the highest average followed by the mid-front vowel /eː/. The central low vowel /aː/ has an intermediate F2 between the front and back vowels (Figure 6). As expected, F2 significantly varies as a function of vowel quality ($\chi^{2}(4) = 62.58$, p < .001). Vowel quality explains 78% ($R^{2}$ = .78) of the variance in F2, which is less than its contribution to F1.

Figure 6 Raincloud plots of the first three formant frequencies (Hz) by vowel quality. The error bars represent 95% confidence intervals.

Post-hoc comparison tests indicate that, except for /oː/ and /uː/, all other pairs of vowels significantly differ from each other at p < .001. Such vowels as /aː/ and /iː/ have the lowest and highest F3 respectively while /uː/ and /eː/ have almost equal F3. Furthermore, the vowel /uː/ has a lower F3 than does /aː/. Vowel quality significantly affects F3 ($\chi^{2}(4) = 21$, p < .001) but its contribution to the variance is small ($R^{2}$ = .26). Post-hoc tests also indicate that /iː/ significantly differs from the back vowels and the central vowel at p < .001.

Past studies reported a significant effect of gender on formant frequencies, with female speakers having higher F1, F2 and F3, but few of them reported effect sizes. In the current study, the female speakers produced all vowels with higher F1 (520 Hz), F2 (1705 Hz) and F3 (2910 Hz) (Figure 7). As a result, the formant frequencies significantly vary as a function of gender (F1: $\chi^{2}(1) = 11.90$, p < .001; F2: $\chi^{2}(1) = 11.43$, p < .001; F3: $\chi^{2}(1) = 32.06$, p < .001). It is also observed that in addition to f0, F3 is a good correlate of gender in this dialect of Oromo. The comparison of the contribution of gender to the variance of the formant frequencies (F1: $R^{2}$ = .46; F2: $R^{2}$ = .45; F3: $R^{2}$ = .82) confirms such a strong association between gender and the third formant frequency. The interaction of gender with vowel quality is significant only for the first formant (F1: $\chi^{2}(4) = 11.56$, p = .02; F2: $\chi^{2}(4) = 0.8$, p = .94; F3: $\chi^{2}(4) = 1.29$, p = .86). The significant interaction seems to arise from the fact that female speakers have a higher value of F1 for the vowel /uː/.

Figure 7 Raincloud plots of the first three formant frequencies (Hz) by gender. The error bars represent 95% confidence intervals.

3.4 Classification

A Support Vector Machine classifier was used with the objective of establishing how well the vowels could be classified based on different sets of acoustic parameters sampled at the midpoint (t50) of the vowel duration. The classification results indicate that there is a clear difference between the classification accuracy of vowel tokens of the two genders (Table 4). When F1 and F2 were used, 80% and 93% of tokens of vowels of female and male speakers respectively were correctly classified. The addition of F3 resulted in a lower classification rate for both female (77%) and male vowels (87%). When f0 was entered along with F1 and F2, the classification rates of the female and male vowels also decreased (Table 4). The first two formant frequencies seem to play a great role in the classification of male vowels while those of female vowels could slightly benefit from F3.

Table 4 Percentages of correct classification of vowels of the dialect based on acoustic parameters measured at the midpoint (t50), two points (t20 and t80) and three points (t20, t50 and t80) of vowel duration. The mean represents the averages of the correct classification of female (F) and male (M) tokens.

Support Vector Machine was also employed to determine how well the vowels could be separated based on different sets of acoustic parameters sampled at two and three time points of the vowel duration. The classification rates showed some improvement when the parameters sampled at more than one time point were used in the classifier. The range for the two-point samples is 81% to 86% while it is 82.5% to 86.5% for the three-point samples. Tokens of female speakers’ vowels were more accurately classified when the formant frequencies and fundamental frequency measured at more than one point in the vowel duration were entered into the classifier. For instance, the highest rate of correct classification (84%) for this group was obtained when the acoustic parameters were sampled at two (t20 and t80) or three points (t20, t50 and t80) of the vowel duration. Again, F1 and F2 largely contribute to the correct classification of the vowels while the inclusion of f0 and F3 results in a reduced classification rate for the tokens of male vowels. Overall, tokens of male vowels were more accurately classified in all cases, namely at t50, at t20 and t80, and at t20, t50 and t80.
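
The mean rates in Table 4 are described as averages of the female and male rates. A minimal sketch of that arithmetic is given below, using only the midpoint (t50) rates quoted in this section; treating the mean as an unweighted average of the two groups is an assumption, since the groups differ slightly in size (nine female, 10 male speakers).

```python
def mean_accuracy(female, male):
    """Unweighted mean of female and male classification rates (%)."""
    return (female + male) / 2.0


# Midpoint (t50) rates quoted in the text: (female %, male %).
midpoint_rates = {
    "F1+F2": (80.0, 93.0),
    "F1+F2+F3": (77.0, 87.0),
}

means = {name: mean_accuracy(f, m) for name, (f, m) in midpoint_rates.items()}
gaps = {name: m - f for name, (f, m) in midpoint_rates.items()}
print(means)
print(gaps)
```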

4 Discussion

The current study found that vowel duration differs significantly with gender, with female speakers producing longer vowels. Previous studies also reported significantly longer duration for female speakers of different languages such as American English, Dutch and Amharic (Hillenbrand et al. 1995, Adank, van Hout & Smits 2004, Derib Ado 2011). However, Tujube Amansa (2018) observed that female speakers had longer duration only for /aː/ and /uː/ among the long vowels, though these speakers produced all short vowels with longer duration. The difference between the two studies could be attributed to a methodological disparity. In Tujube Amansa (2018), vowels were embedded in different phonetic environments and their durations were not normalised to reduce the effect of such environments. The durational variation of vowels with gender could be attributed to the tendency of female speakers to produce clear speech, which is known to have a longer duration for vowels relative to plain speech (Leung et al. 2016).

Similarly, the study has shown that mean f0 varies significantly with vowel quality and gender, and this finding is consistent with the results in Tujube Amansa (2018). Fundamental frequency (R² = .93) is the major acoustic correlate of gender, accounting for the lion’s share of the variance, while formant frequencies, particularly F3 (R² = .36), can also indicate gender in the dialect under investigation. The fundamental frequency is derived from the vibration of the vocal folds, which are known to differ anatomically between female and male speakers. One major difference that is relevant here is that female speakers’ vocal folds are thinner and shorter; consequently, they produce vowels with a higher pitch than male speakers (Simpson 2001, Pisanski et al. 2014). Social factors can enhance the pitch difference (Cartei et al. 2014) as gender roles are clearly differentiated in the speech community under study.

Consistent with the findings of previous studies on different languages including Oromo (Hillenbrand et al. 1995, Behne, Moxness & Nyland 1996, Adank, van Hout & Smits 2004, Tujube Amansa 2018), formant frequencies in the current study exhibit a significant variation with vowel quality. Vowel quality contributes greatly to the variances observed in the means of the first (R² = .75) and the second formant frequencies (R² = .68). These formants greatly contribute to the acoustic description of vowel quality of the dialect but the contribution of the third formant is relatively small (R² = .14). Instrumental investigations of the association between tongue position and formants indicated that vowel height is strongly correlated with F1 and vowel backness with F2 (Ximenes et al. 2017, Lawson et al. 2019). Given these empirical accounts, it is not surprising that the low vowel /aː/ has the highest mean for F1 while the front vowel /iː/ has the largest mean for F2 in the current study. The magnitude of F3 is linked to lip rounding because this articulatory gesture gives rise to the lowering of F3 for the back high vowel and to the raising of F3 for the front high vowel (Lawson et al. 2019). Consistent with this, the current study demonstrates that the front high vowel /iː/ has a higher F3 while the back high vowel /uː/ has a lower value for the same formant frequency.

As expected, female and male speakers differ significantly in the second and the third formant frequencies. The contribution of gender to the variance in the third formant frequency is not very small (R² = .36) and this is consistent with the contribution of this acoustic measure noted in correctly classifying tokens of female vowels. The observed difference could be related to the shape of the resonant cavity and the size of the larynx. The larynx grows and the shape of the vocal tract changes at puberty in males, causing a significant variation in the acoustic features of their vowels (Simpson 2001, Pisanski et al. 2014). The variation can also have a behavioural basis, whereby girls tend to speak like girls and boys like boys (Pépiot 2012, Cartei et al. 2014).

The study also demonstrates that spectral change separates vowels almost as effectively as the steady state. This finding is consistent with the results of past studies on vowels of North American English, Dutch, and Western Sydney Australian English dialects. Vowels of these dialects were well separated when parameters measured at different time points were employed (Hillenbrand et al. 1995, Adank, van Hout & Smits 2004, Elvin et al. 2016). However, spectral change does not have an advantage over the midpoint in classifying Oromo vowels of the dialect. The reason could be that too few samples were taken to capture the spectral dynamics of the vowels. In other words, more samples may be needed to characterise formant trajectories of vowels of the dialect. Acoustic features of the vowels may be better represented by taking measurements at multiple points, rather than by sampling a few static targets (Jenkins, Strange & Miranda 1994). Formant frequencies sampled at 30 time points of vowels of Western Sydney Australian English yielded a satisfactory result in classifying the sounds of the dialect (Elvin et al. 2016). Future studies on the dialect need to consider taking measurements at several time points of vowel duration to determine if spectral change has an advantage over the steady state in classifying vowels of the dialect.

Finally, the current study shows that there is a difference between the classification rates of vowel tokens of female and male speakers (Table 4). The difference might not be attributable to anatomical differences because the normalisation procedure used (Z-score) can retain phonemic variation effectively while reducing physiological differences (Adank, Smits & van Hout 2004). A factor that differentially affects the classification rates of vowels of female speakers may be responsible for the gender difference. This could be a possible area of investigation for future studies on the dialect.
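The Z-score (Lobanov) normalisation mentioned above rescales each formant within a speaker by that speaker’s own mean and standard deviation. The following minimal Python sketch illustrates the idea. The function name, the token layout, the use of the sample standard deviation and the formant values are assumptions made for illustration and are not taken from the study’s data or scripts.

```python
from collections import defaultdict
from statistics import mean, stdev


def lobanov_normalise(tokens, formants=("F1", "F2")):
    """Z-score each formant within speaker and return new token dicts."""
    # Collect every raw value per (speaker, formant) pair.
    values = defaultdict(list)
    for tok in tokens:
        for f in formants:
            values[(tok["speaker"], f)].append(tok[f])

    # Per-speaker mean and sample standard deviation for each formant
    # (assumes at least two tokens per speaker).
    stats = {key: (mean(v), stdev(v)) for key, v in values.items()}

    normalised = []
    for tok in tokens:
        new_tok = dict(tok)  # leave the input untouched
        for f in formants:
            mu, sd = stats[(tok["speaker"], f)]
            new_tok[f] = (tok[f] - mu) / sd
        normalised.append(new_tok)
    return normalised


# Hypothetical midpoint values (Hz) for one female and one male speaker.
tokens = [
    {"speaker": "F01", "vowel": "i", "F1": 300.0, "F2": 2600.0},
    {"speaker": "F01", "vowel": "e", "F1": 500.0, "F2": 2200.0},
    {"speaker": "F01", "vowel": "a", "F1": 700.0, "F2": 1800.0},
    {"speaker": "M01", "vowel": "i", "F1": 250.0, "F2": 2200.0},
    {"speaker": "M01", "vowel": "e", "F1": 400.0, "F2": 1900.0},
    {"speaker": "M01", "vowel": "a", "F1": 550.0, "F2": 1600.0},
]

normalised = lobanov_normalise(tokens)
```

In this constructed example the two speakers differ in raw frequency range but receive identical normalised values for each vowel, which is the sense in which the procedure reduces physiological differences while keeping the contrast between vowels.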

5 Conclusion

This study investigates the acoustic characteristics of Oromo vowels of the northern dialect. The results show that, with the exception of duration, all acoustic parameters vary significantly with vowel quality, while f0, F1, F2, F3 and duration all differ significantly with gender. The classification rate of the vowels seems to vary with gender, with vowel tokens of male speakers more accurately classified. The spectral change (time-varying frequency) does not seem to have an advantage over the midpoint in classifying vowels of both genders. The acoustic parameters sampled at the midpoint may be enough to classify the vowels of the dialect but sampling acoustic features at multiple points may be needed to capture formant trajectories of the vowels. The number of participants in the current study is not large and thus a comprehensive acoustic study involving a larger number of participants may be needed to address the gender difference in classification rates of the vowels.

Acknowledgements

This research was fully supported by the Linguistic Capacity Building Project financed by the Norwegian government. The author is grateful to the anonymous reviewers and Bodo Winter for helpful feedback.

Footnotes

1 Alternatively, the speakers could have read the real words containing the target vowels in isolation, but words are rarely used in isolation in everyday language. The main problem with using isolated words is the confounding effect of phrasal accent. A word pronounced in isolation is also a phrase in which both lexical and phrasal accents accumulate. As a result, it is very difficult to tease apart these two accent types (Roettger & Gordon 2017). The findings obtained from the analysis of such recordings may not genuinely reflect the real acoustic features of speech sounds.

2 Various reasons for a model’s failure to converge and their remedies are discussed by different writers (e.g. Winter 2020, Brown 2021). The major reason in the current study might be that the model was fitted to sparse data as the number of data points is small. When the random slope was removed, the model converged easily but fitting the model without it could lead to Type I error (Matuschek et al. 2017). So, a control parameter was used to remedy the convergence failure. Had this not worked, the model would have been fitted with the random slope and intercept uncorrelated, with any possible failure to estimate the slope being checked subsequently (Winter 2020).

References

Abelson, R. Robert. 1995. Statistics as principled argument. Hillsdale, NJ: Erlbaum.
Adank, Patti. 2003. Vowel normalisation: A perceptual-acoustic study of Dutch vowels. Wageningen: Ponsen & Loosen.
Adank, Patti, Smits, Roel & van Hout, Roeland. 2004. A comparison of vowel normalisation procedures for language variation research. The Journal of the Acoustical Society of America 116(5), 3099–3107. doi:10.1121/1.1795335
Adank, Patti, van Hout, Roeland & Smits, Roel. 2004. An acoustic description of vowels of Northern and Southern Standard Dutch. The Journal of the Acoustical Society of America 116(3), 1729–1738. doi:10.1121/1.1779271
Baker, Jonathan. 2012. Migration and mobility in a rapidly changing small town in northeastern Ethiopia: Environment & urbanisation. International Institute for Environment and Development (IIED) 24(1), 345–367.
Bartoń, Kamil. 2017. MuMIn: Multi-model inference. R package version 1.40.0. https://cran.r-project.org/package=MuMIn.
Bates, Douglas, Mächler, Martin, Bolker, Ben & Walker, Steve. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67(1), 1–48. doi:10.18637/jss.v067.i01
Yimam, Baye. 2016. Phonological features of the Amharic variety of South Wollo. In Binyam Sisay Mendisu & Janne Bondi Johannessen (eds.), Multilingual Ethiopia: Linguistic challenges and capacity building efforts: Special issue of Oslo Studies in Language 8(1), 9–30.
Behne, Dawn, Moxness, Bente & Nyland, Anne. 1996. Acoustic-phonetic evidence of vowel quantity and quality in Norwegian: Quarterly Progress and Status Report. http://www.speech.kth.se/qpsr.
Boersma, Paul & Weenink, David. 2001. Praat: Doing phonetics by computer (version 6.20).
Bradlow, Ann R. 1995. A comparative acoustic study of English and Spanish vowels. The Journal of the Acoustical Society of America 97, 1916–1924. doi:10.1121/1.412064
Brown, Violet A. 2021. An introduction to linear mixed-effects modelling in R. Advances in Methods and Practices in Psychological Science 4(1), 1–19. doi:10.1177/2515245920960351
Cartei, Valentina, Cowles, Wind, Banerjee, Robin & Reby, David. 2014. Control of voice gender in pre-pubertal children. British Journal of Developmental Psychology 32, 100–106. doi:10.1111/bjdp.12027
Chung, Hyunju, Eun Jong Kong, Jan Edwards, Weismer, Gary, Fourakis, Marios & Hwang, Youngdeok. 2012. Cross-linguistic studies of children’s and adults’ vowel spaces. The Journal of the Acoustical Society of America 131(1), 442–454. doi:10.1121/1.3651823
Clopper, Cynthia G., Pisoni, David B. & de Jong, Kenneth. 2005. Acoustic characteristics of the vowel systems of six regional varieties of American English. The Journal of the Acoustical Society of America 118(3), 1661–1676. doi:10.1121/1.2000774
CSA [Central Statistical Agency]. 2007. Statistical report and housing census. Addis Ababa: Central Statistical Agency.
Ado, Derib. 2011. An acoustic analysis of Amharic vowels, plosives and ejectives. Ph.D. thesis, Addis Ababa University.
DiBenedetto, Maria-Gabriella. 1989. Frequency and time variations of the first formant: Properties relevant to the perceptions of vowel height. The Journal of the Acoustical Society of America 86, 67–77. doi:10.1121/1.398221
Elvin, Jaydene, Williams, Daniel & Escudero, Paola. 2016. Dynamic acoustic properties of monophthongs and diphthongs in Western Sydney Australian English. The Journal of the Acoustical Society of America 140, 576–581. doi:10.1121/1.4952387
Ferguson, Hargus S. & Kewley-Port, Diane. 2007. Talker differences in clear and conversational speech: Acoustic characteristics of vowels. Journal of Speech, Language and Hearing Research 50, 1241–1255. doi:10.1044/1092-4388(2007/087)
Fox, Robert Allen & Jacewicz, Ewa. 2009. Cross-dialectal variation in formant dynamics in American English vowels. The Journal of the Acoustical Society of America 126, 2603–2618. doi:10.1121/1.3212921
Hillenbrand, James, Clark, Michael & Nearey, Terrance. 2001. Effects of consonant environment on vowel formant patterns. The Journal of the Acoustical Society of America 109, 748–763. doi:10.1121/1.1337959
Hillenbrand, James, Getty, Laura A., Clark, Michael J. & Wheeler, Kimberlee. 1995. Acoustic analysis of American English vowels. The Journal of the Acoustical Society of America 97, 3099–3111. doi:10.1121/1.411872
Harrison, Xavier A., Lynda Donaldson, Maria Eugenia Correa-Cano, Julian Evans, David N. Fisher, Cecily E. D. Goodwin, Beth S. Robinson, David J. Hodgson & Inger, Richard. 2018. A brief introduction to mixed effects modelling and multi-model inference in ecology. PeerJ 6:e4794, https://doi.org/10.7717/peerj.4794.
Jenkins, J. James, Strange, Winifred & Miranda, Salvatore. 1994. Vowel identification in mixed-speaker, silent-centre syllables. The Journal of the Acoustical Society of America 95, 1030–1043. doi:10.1121/1.410014
Jongman, Allard, Fourakis, Marios & Sereno, Joan A. 1989. The acoustic vowel space of Modern Greek and German. Language and Speech 32(3), 221–248. doi:10.1177/002383098903200303
Kebede, Hordofa. 2009. Towards genetic classification of Oromo dialects. Ph.D. thesis, University of Oslo.
Kirtley, M. Joelle, Grama, James, Drager, Katie & Simpson, Sean. 2016. An acoustic analysis of the vowels of Hawai’i English. Journal of the International Phonetic Association 46(1), 79–89. doi:10.1017/S0025100315000456
Kong, Ying-Yee, Mullangi, Ala & Kokkinakis, Kostas. 2014. Classification of fricative consonants for speech enhancement in hearing devices. PLoS ONE 9(4). doi:10.1371/journal.pone.0095001
Kpodo, Pascal. 2013. An acoustic analysis of Siwu Vowels. Nordic Journal of African Studies 22(3), 177–195.
Lamberti, Marcello. 1991. Cushitic and its classifications. Anthropos 86, 552–561.
Lawson, Eleanor, Stuart-Smith, Jane & Rodger, Lydia. 2019. A comparison of acoustic and articulatory parameters for the GOOSE vowel across British Isles Englishes. The Journal of the Acoustical Society of America 146(6), 4363–4381. doi:10.1121/1.5139215
Lee, Sungbok, Potamianos, Alexandros & Narayanan, Shrikanth. 1999. The acoustics of children’s speech: Developmental changes of temporal and spectral parameters. The Journal of the Acoustical Society of America 105(3), 1455–1468. doi:10.1121/1.426686
Lenth, Russell V., Buerkner, Paul, Giné-Vázquez, Iago, Maxime Herve, Maarten Jung, Jonathon Love, Fernando Miguez, Riebl, Hannes & Singmann, Henrik. 2022. emmeans: Estimated marginal means, aka least-squares means. https://CRAN.R-project.org/package=emmeans.
Leung, K. W Keith, Jongman, Allard, Wang, Yue & Sereno, Joan A. 2016. Acoustic characteristics of clearly spoken English tense and lax vowels. The Journal of the Acoustical Society of America 140(1), 45–58. doi:10.1121/1.4954737
Lloret-Romanyach, Maria-Rosa. 1988. Gemination and vowel length in Oromo morphophonology. Ph.D. dissertation, Indiana University.
Matuschek, Hannes, Reinhold Kliegl, Shravan Vasishth, Baayen, Harald & Bates, Douglas. 2017. Balancing Type I error and power in linear mixed models. Journal of Memory and Language 94, 305–315. doi:10.1016/j.jml.2017.01.001
Morrison, Geoffrey S. & Nearey, Terrance M. 2007. Testing theories of vowel inherent spectral change. The Journal of the Acoustical Society of America 122, 15–22. doi:10.1121/1.2739111
Nearey, Terrance M. 2013. Vowel inherent spectral change in vowels in North American English. In Stewart Morrison, Geoffrey & Assmann, Peter F. (eds.), Vowel inherent spectral change, 49–85. Heidelberg: Springer. doi:10.1007/978-3-642-14209-3_4
Olson, David L. & Delen, Dursun. 2008. Advanced data mining techniques. Berlin: Springer.
Owens, Jonathan. 1985. A grammar of Harar Oromo. Hamburg: Helmut Buske.
Pépiot, Erwan. 2012. Voice, speech and gender: Male–female acoustic differences and cross-language variation in English and French speakers. Presented at XVèmes Rencontres Jeunes Chercheurs de l’ED 268, Paris, France.
Peterson, Gordon E. & Barney, Harold L. 1952. Control methods used in a study of the vowels. The Journal of the Acoustical Society of America 24, 175–184. doi:10.1121/1.1906875
Pisanski, Katarzyna, Fraccaro, Paul J., Tigue, Cara C., O’Connor, Jillian J. M., Susanne Röder, Paul W. Andrews, Bernhard Fink, DeBruine, Lisa M., Jones, Benedict C. & Feinberg, David R. 2014. Vocal indicators of body size in men and women: A meta-analysis. Animal Behaviour 95(4), 89–99. doi:10.1016/j.anbehav.2014.06.011
Roettger, Timo & Gordon, Matthew. 2017. Methodological issues in the study of word stress correlate. Linguistics Vanguard 3(1), 20170006. doi:10.1515/lingvan-2017-0006
Schoormann, Heike E., Heeringa, Wilbert & Peters, Jörg. 2019. Standard German vowel production by monolingual and trilingual speakers. International Journal of Bilingualism 23(1), 138–156. doi:10.1177/1367006917711593
Simpson, Adrian P. 2001. Dynamic consequences of differences in male and female vocal tract dimensions. The Journal of the Acoustical Society of America 109, 2153–2164. doi:10.1121/1.1356020
Singmann, Henrik, Ben Bolker, Jake Westfall, Frederik Aust, Søren Højsgaard, John Fox, Michael A. Lawrence, Ulf Mertens, Love, Jonathon, Lenth, Russell & Haubo Bojesen Christensen, Rune. 2016. afex: Analysis of factorial experiments. R package version 0.16–1. https://cran.r-project.org/web/packages/afex/index.html.
Solé, Maria-Josep & Ohala, John J. 2010. What is and what is not under the control of the speaker: Intrinsic vowel duration. In Cécile Fougeron, Barbara Kühnert, D’Imperio, Mariapaola & Vallée, Nathalie (eds.), Papers in Laboratory Phonology 10, 607–655. Berlin: de Gruyter. doi:10.1515/9783110224917.5.607
Strange, Winifred, Andrea Weber, Erika S. Levy, Valeriy Shafiro, Hisagi, Miwako & Nishi, Kanae. 2007. Acoustic variability within and across German, French, and American English vowels: Phonetic context effects. The Journal of the Acoustical Society of America 122, 111–129. doi:10.1121/1.2749716
Stroomer, Harry. 1987. A comparative study of three southern Oromo dialects in Kenya: Phonology, morphology and vocabulary. Hamburg: Helmut Buske.
Tosco, Mauro. 2000. Is there an “Ethiopian language area”? Anthropological Linguistics 42(3), 329–365.
Amansa, Tujube. 2018. An acoustic analysis of Oromo vowels. Ph.D. thesis, Addis Ababa University.
Voigt, Rainier M. 1985. Tone types of nouns in Borana. Journal of African Languages and Linguistics 7, 59–62. doi:10.1515/jall.1985.7.1.59
Wako, Tola. 1981. The phonology of Mecha Oromo. MA thesis, Addis Ababa University.
Wickham, Hadley. 2016. ggplot2: Elegant graphics for data analysis. New York: Springer. doi:10.1007/978-3-319-24277-4_9
Williams, Daniel, Escudero, Paola & Gafos, Adamantios. 2018. Spectral change and duration as cues in Australian English listeners’ front vowel categorization. The Journal of the Acoustical Society of America 44(3), 215–225. doi:10.1121/1.5055019
Winter, Bodo. 2020. Statistics for linguists: An introduction to using R. New York: Routledge, Taylor & Francis.
Ximenes, Arwen B., Shaw, Jason A. & Carignan, Christopher. 2017. A comparison of acoustic and articulatory methods for analysing vowel differences across dialects: Data from American and Australian English. The Journal of the Acoustical Society of America 142(1), 363–377. doi:10.1121/1.4991346
Yang, Byunggon. 1996. A comparative study of American English and Korean vowels produced by male and female speakers. Journal of Phonetics 24, 245–260. doi:10.1006/jpho.1996.0013

Table 1 An inventory of Oromo short vowel phonemes and their corresponding long ones (Owens 1985).

Figure 1 Location of Oromia Zone where the northern dialect of Oromo is spoken (Baker 2012).

Table 2 List of real words used for recording vowels.

Table 3 Duration, f0, F1, F2 and F3 averages and standard deviations (SD) for the vowel tokens.

Figure 2 Distribution of female (left) and male (right) speakers’ vowel tokens on an F1 × F2 vowel space. The formants are measured at 50% of vowel duration and normalised with the Lobanov procedure. Vowel tokens of the female speakers exhibit a greater variation and this is also evident in Table 3.

Figure 3 Raincloud plot of vowel duration (ms) by vowel quality. The error bars represent 95% confidence intervals.

Figure 4 Raincloud plot of vowel duration (ms) by gender. The error bars represent 95% confidence intervals.

Figure 5 Raincloud plots of fundamental frequency (Hz) by gender. The error bars represent 95% confidence intervals.

Figure 6 Raincloud plots of the first three formant frequencies (Hz) by vowel quality. The error bars represent 95% confidence intervals.

Figure 7 Raincloud plots of the first three formant frequencies (Hz) by gender. The error bars represent 95% confidence intervals.
