1 Introduction
In the Autosegmental Metrical (AM) model, intonational contours are broken down into high (H) and low (L) tones and their combinations (Bruce 1977, Pierrehumbert 1980, Beckman & Pierrehumbert 1986, Ladd 1996). Intonational contours consist of these tonal targets and the interpolations between them. Tonal events are divided into pitch accents and boundary tones. Pitch accents are prominence-lending events, while boundary tones mark the edges of phrases. Pitch accents are marked with a * to indicate which tone is associated with the stressed syllable. Boundary tones are marked with %, signalling the end of an Intonational Phrase, and can be preceded by a phrase accent, marked by -, which signals the end of an Intermediate Phrase.
1.1 Intonational adjustment
Research on intonation has demonstrated that languages exhibit intonational adjustment when it is not possible for intonational contours to be realized with the required phonetic space. This occurs, for example, phrase-finally, or on monosyllabic words, or on words with voiceless consonants, since these reduce the available voiced space. Thus far, languages have been found to adjust to this complication in two ways: tune-to-text adjustment, through compression and truncation, or text-to-tune adjustment, such as schwa-epenthesis. The current investigation seeks to determine what type of adjustment is used in Lebanese Arabic.
Tune-to-text adjustment appears to be more common, or at least has been documented more often. Compression occurs when the tonal contour is fully realized on a smaller amount of material, meaning that no tonal targets are removed but their timing is adjusted (Bannert & Bredvad-Jensen 1975, 1977; Grabe et al. 2000). Truncation occurs when, instead of the entire contour being realized, some of the tonal targets are deleted (Erikson & Alstermark 1972). This means that the contour begins as if the segmental material were available and simply ends at the point where material is no longer available, so the original contour is not fully realized. In Stockholm Swedish, Erikson & Alstermark (1972) found that accent 2 words with phonologically short vowels exhibited truncation, where the expected intonational fall was cut off early. In cases where a short vowel was followed by a voiceless consonant, this fall was almost completely unrealized. In Danish, in an analysis of f0 patterns on short and long stress groups, that is, a stressed syllable and the following unstressed syllables, Grønnum (1989) found that short stress groups were truncated versions of the long stress groups. Figure 1 shows an f0 rise being truncated versus compressed.
An examination of this question in British English was conducted by Grabe et al. (2000), who used three surnames in phrase-final position, Sheafer, Sheaf and Shift, to gradually reduce the scope for voicing by reducing the number of syllables and vowel length. This study showed that varieties of the same language can employ different methods of pitch contour accommodation. Cambridge and Newcastle English were found to exhibit compression, while Leeds English and Belfast English were shown to have truncation. Likewise, in an examination of varieties of Swedish, Bannert & Bredvad-Jensen (1975, 1977) note that different varieties can use different methods. Grabe et al. (2000) found that each of the varieties of British English examined employed one method, regardless of pitch contour type. In contrast, an examination of Northern Standard German (with speakers from Braunschweig) showed that this variety used truncation for falls and compression for rises (Grabe 1998, Grabe & Post 2002).
In Dutch, Hanssen, Peters & Gussenhoven (2007) also used surnames in phrase-final position, and gradually reduced the amount of sonorant material by using the target words /loːm, loːf, lɔm, lɔf/. Similar to Grabe (1998) and Grabe et al. (2000), Hanssen et al. (2007) used the measure rate of f0 change, where the difference between the f0 maximum and minimum is divided by the temporal distance between them. As the sonorant material gets shorter, an increased rate of change is indicative of compression, while a decreased rate, or no change, is indicative of truncation. The authors found that in fall-rises, rate of f0 change decreased. For both intonational falls and rises, as the amount of sonorant material decreased, rate of f0 change increased, indicating compression. However, they also note that for falls and rises there was a reduction in f0 change, which is indicative of truncation. They conclude that since, strictly speaking, truncation suggests that the original contour ends abruptly, rather than that the difference between the f0 maximum and minimum is reduced, perhaps the term ‘undershoot’ should be used in this case. Undershoot has generally been defined as tones not reaching their targets (Bruce 1977; Arvaniti, Ladd & Mennen 1998, 2006).
Recent findings on Luganda also use the terms compression and undershoot, with undershoot referring to f0 excursion size (Myers, Selkirk & Fainleib 2018). Myers et al. (2018) measured rise excursion, fall excursion and the timing of these relative to syllable duration. The authors compared long and short high-tone spans caused by high tone spreading. In the short span, the rise and fall occurred on the same syllable, while in the long span, the rise and fall were a number of syllables apart, so in the short span, the time pressure was greater and the rise and fall affected one another. The results showed no evidence of undershoot (no difference in f0 excursion) but evidence of compression. Short high-tone spans had an earlier end of the f0 rise due to the time pressure of having to reach the high tone earlier. The time pressure also resulted in short spans having a later f0 fall with respect to their relevant syllable than long spans, because of the presence of the rise in the same syllable.
Rathcke (2016) investigated pitch accommodation strategies in Russian and German. In this study, the type of consonant preceding and following the stressed vowel, and the location of the stressed, nuclear-accented syllable in relation to the end of the phrase, were systematically varied in order to induce pressure. It should be noted that truncation is defined by Rathcke as undershoot, rather than as necessarily meaning that the underlying contour (the contour found when there is enough material to be realized) is only partially realized. Different pitch contour types behaved differently, with H*L% showing a combination of compression and truncation for both languages. This meant that one token involving this contour could simultaneously involve both an increased rate of change and tonal targets being undershot. The author notes that Russian was more sensitive to whether the accented syllable was followed by another syllable, while German was more sensitive to the type of consonant following the accented vowel. For example, for H*L%, German was found to have a difference between the final and penultimate syllable only in the condition where both had a voiceless fricative on either side of the vowel, and not in the condition where sonorants flanked the vowel. The author concludes that it is too simplistic to divide languages into ‘compressing’ or ‘truncating’, since the same language may do both, and pitch contour type also plays a role.
It also appears that high and low tones may behave differently under time pressure. With the finding that Northern Standard German has truncation in falls and compression in rises, Grabe (1998: 141) notes: ‘Rather than relating to intonational phrase boundary specifications, the results may reflect a more general asymmetry between high and low tones’. The difference between high and low tones is also manifested more generally in the finding that low tones tend to be more stable in terms of alignment than high tones, as found in Mexican Spanish (Prieto, van Santen & Hirschberg 1995), English (Arvaniti & Garding 2007) and Romani (Arvaniti 2016). Arvaniti & Garding (2007) and Arvaniti (2016) note that low tones tend to be less stable in scaling, since they are found to be truncated or undershot more than high tones (Ladd 2008). This asymmetry between high and low tones is discussed by Pierrehumbert (1980), who notes that when being manipulated in terms of scaling, for example in focus, L tones are more constrained than H tones, because L tones are already near the baseline of the speaker’s range. In contrast, speakers generally do not speak near the ceiling of their range, so this gives more space for H tones to be realized higher (Pierrehumbert 1980: 69).
Other factors have also been found to play a role in pitch contour accommodation. Ohl & Pfitzinger (2009) undertook a similar experiment to that of Grabe (1998), also on German, but rather than just having /f/ as the final sound in the syllable, they also had a condition with /ʃ/. They found that these two voiceless fricatives did not induce the same effect. Instead, in falling contours, truncation was more extreme before /f/ than before /ʃ/. They conclude that it is too simple to classify languages or even dialects as exhibiting truncation and/or compression, because different word groups in their experiment showed different patterns. Niebuhr (2012) notes that voiceless sounds at the end of a phrase induce particular intonation/pitch impressions and that the results from Ohl & Pfitzinger (2009) may have to do with an interaction between these fricatives and f0. In another experiment on German, Niebuhr (2012) found that the fricatives /f s ʃ x/ had a higher mean center of gravity and higher noise energy levels in phrase-final position when in rising contours (questions) than when in falling ones (statements). This indicates that even voiceless segments have the ability to have a certain type of ‘intonation’ and, by extension, could therefore interact with f0 in different ways.
In contrast, some languages have been found to exhibit text-to-tune adjustment. In Bari Italian, Grice et al. (2015) found schwa-epenthesis on rising intonational contours in consonant-final English loanwords. They also found a significant effect of speaker and word, demonstrating that this epenthesis did not occur in all cases. The authors describe this process as ‘facilitating the production of functionally relevant tunes’ (Grice et al. 2015: 1). In European Portuguese, the northern variety tends to employ truncation, while southern varieties can use schwa-epenthesis (Frota et al. 2015). West Greenlandic has also been shown to add a final mora in interrogatives, such that a final syllable that is monomoraic in a declarative becomes bimoraic in an interrogative (Rischel 1974, Gussenhoven 2000). In Catalan and Spanish, for some speakers, the more complex intonational contour of a rise-fall in narrow contrastive focus in final position caused the final syllable to lengthen (Prieto & Ortega-Llebaria 2009). This only occurred in words with final stress, and where the full contour was realized. There were also cases of truncation for two of the Catalan speakers, and when truncation occurred, no final lengthening was found. The authors conclude that compression and truncation are intonational realization strategies that interact dynamically with timing. The only research examining accommodation strategies in Arabic, in the Semitic language family, is on Tunisian (Bouchhioua, Hellmuth & Almbark 2019).
This study found that some speakers added an epenthetic vowel to the end of a final lexical item in polar questions with a rise-fall contour. However, this pattern differed from the epenthesis found in other languages because it only occurred with one type of contour, and it also occurred in sequences where there was sufficient segmental material to produce the relevant contour. As such, it was not considered to be text-to-tune adjustment (Hellmuth 2017).
1.2 Intonation in Arabic
Lebanese Arabic lies in the subfamily of Levantine Arabic, which also includes Palestinian, Syrian and Jordanian Arabic. Arabic prosody has been the subject of some phonetic and phonological research, with work on the intonation of focus in Egyptian Arabic (Hellmuth 2006a, b) and one variety of Lebanese Arabic (Chahal 2001, Chahal & Hellmuth 2014), and work on lexical stress and focus in Jordanian Arabic (de Jong & Zawaydeh 1999, 2002). H and L tones are known to associate to the lexically stressed syllable in Arabic (Blodgett, Owens & Rockwood 2007, Chahal 2007), and Egyptian Arabic has been found to have a pitch accent on all content words (Hellmuth 2006b), as opposed to only prominent words in the phrase.
An in-depth phonetic examination of Lebanese Arabic intonation was conducted by Chahal (2001) with speakers from the city of Tripoli, in northern Lebanon. This study found that pitch accents (H* and L+H*) are used to mark prominence and are associated with stressed syllables. Chahal (2001) found that the most common phrase-final declarative edge-tone complex in Lebanese Arabic is L-L%, and that for polar questions the contour is L* H-H% (Chahal 2007, Chahal & Hellmuth 2014).
1.3 Aims of the current study
While some work on Beiruti Arabic has been conducted on consonant gemination and voicing (Khattab 2007, Al-Tamimi & Khattab 2018), no work has examined the intonation of this dialect. Although it is likely that phrase-final intonation is similar in Beirut and Tripoli, previous work did not specifically look at adjustments of the intonational contour due to lack of segmental material. The current study examines phrase-final intonation accommodation strategies in the Lebanese Arabic spoken in Beirut, in order to determine whether truncation, compression, a combination of these, or even some type of epenthesis, is employed under time pressure. Since previous work has found that different contours in the same language variety may use different strategies (Grabe 1998, Grabe & Post 2002), the current study examines contours in statements as well as questions, and also examines individual speakers.
2 Method
In order to investigate the above questions, we examined rises and rise-falls by eliciting statements and questions. The amount of segmental material was manipulated by word length, using disyllabic and monosyllabic words in phrase-final position. The monosyllabic condition would induce an increase in time pressure. This approach was favored instead of using words ending in non-sonorants due to the difficulty of finding native Arabic names ending in voiceless fricatives. The measures examined are word duration, f0 maxima and minima (height and timing), f0 excursion and rate of f0 change, which can distinguish between compression and truncation (Grabe 1998, Hanssen et al. 2007). These measures are discussed further in Section 2.4.
Based on work on another variety of Lebanese Arabic (Chahal 2001), the target words were expected to have the following melodies: for statements, a rise-fall, either H* L-L% or L+H* L-L%, and for questions, a rise, L* H-H%. Each sentence was coded for both the contour (rise vs. rise-fall) and the pitch accent and edge tones on the target word (H* L-L%, L+H* L-L%, L* H-H%). These were both included in the statistical analysis to examine the possible differences between the two types of overall contour but also to determine whether the specific pitch accent and edge tones have an effect on tonal timing or height.
2.1 Stimuli
Stimuli were short statements and questions, such as D1/D2 and Q1/Q2 below, ending in either a disyllabic word with initial stress or a monosyllabic word. Pairs were chosen that controlled for vowel and consonant but varied in terms of the segmental material available for the intonational pitch contour. As in Grabe (Reference Grabe1998), the target words chosen were names. The target words consisted of sonorants and formed quasi-minimal pairs, such as Reema/Reem (see Table 1). All target words contained long vowels.
There were two sentence types – statements and questions – and two word lengths – disyllabic and monosyllabic, always in utterance-final position. The sentences were presented as set out just below; 3 = /ʕ/ and 2 = /ʔ/:
Statement
Karim w Zeina e3din bel cafe 3am yetfarajo 3al nes li mer2in. Faj2a, Karim b2ul:
- D1. ‘Hay rfi2itna Reema!’
- D2. ‘Hay rfi2itna Reem!’
Translation:
Karim and Zeina are sitting in a cafe watching people walk by. Suddenly, Karim says:
- D1. ‘There’s our friend Reema!’
- D2. ‘There’s our friend Reem!’
Question
Karim w Zeina e3din bel cafe 3am yetfarajo 3al nes li mer2in. Faj2a, Karim b2ul:
- Q1. ‘Msh hay rfi2itna Reema?’
- Q2. ‘Msh hay rfi2itna Reem?’
Translation:
Karim and Zeina are sitting in a cafe watching people walk by. Suddenly, Karim says:
- Q1. ‘Isn’t that our friend Reema?’
- Q2. ‘Isn’t that our friend Reem?’
This gave a total of 12 tokens (2 word lengths × 2 sentence types × 3 pairs of names). Each speaker read out each sentence once. The Arabic orthography is often associated with Classical Arabic or Modern Standard Arabic, the latter of which is used in more formal situations, and may thus affect how participants speak. In order to elicit the spoken Lebanese dialect, as opposed to the written standard form, sentences were presented in a Latinized version of the Arabic script, which is used by young people for texting. There is no standard form of the transliteration system, but people who use it are accustomed to there being various ways to spell the same word. The system used for the experiment was checked with a small number of students who all agreed that it was clear. Participants were also asked to speak as they would to their friends, and appeared to have no issue producing colloquial Lebanese Arabic for the experiment.
2.2 Participants
Participants were 16 adult native speakers of Lebanese Arabic (eight females, eight males), all from Beirut, aged 18–22 years. They were students at the American University of Beirut and were paid US$10 for their time. They were all fluent in English and many were also proficient in French.
2.3 Procedure
Participants were seated in a DemVox sound booth in the Department of English at AUB. They were recorded with a Zoom H5 recorder at a sampling rate of 44.1 kHz. The sentences were presented on paper, as one block, with all the statements together and all the questions together. This was to encourage participants to keep the same overall intonation pattern for each sentence type. This was one block in a larger experiment examining intonation.
2.4 Measurements
Recordings were manually labelled in Praat (Boersma & Weenink 2018), with target words labelled for segment boundaries and f0 maxima and minima (see Figure 2). F0 measurements were taken in semitones to allow for pooling of speakers. Pitch floor and ceiling were set, respectively, at 50 Hz and 325 Hz for males, and at 100 Hz and 450 Hz for females. The measurements depended on the type of contour. Target words in questions had a rise contour while target words in statements had a rise-fall contour.
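The semitone conversion mentioned above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the source does not state the reference frequency, so `ref_hz=100.0` is an assumption, and the function names are hypothetical. The example values are invented to show why a logarithmic scale helps when pooling speakers with different pitch ranges.

```python
import math

def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert f0 in Hz to semitones relative to ref_hz (assumed reference)."""
    if f0_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f0_hz / ref_hz)

def interval_in_semitones(f0_high_hz, f0_low_hz):
    """Size of the interval between two f0 values, in semitones.

    The result does not depend on the reference frequency.
    """
    if f0_high_hz <= 0 or f0_low_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f0_high_hz / f0_low_hz)

# Invented example: a doubling of f0 is 12 semitones in either register,
# although the difference in Hz is 100 in one case and 200 in the other.
lower_register = interval_in_semitones(200.0, 100.0)
higher_register = interval_in_semitones(400.0, 200.0)
```

Because the interval is a ratio, excursions measured this way are comparable across speakers regardless of which reference frequency is chosen.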
Word duration was examined to ensure that monosyllabic words are indeed shorter than disyllabic words, and also to determine whether a complex contour (i.e. rise-fall) induces lengthening in a monosyllabic word.
F0 maximum was the highest point in the f0 contour of the target word, and f0 minimum, the lowest point. These were examined to determine whether f0 targets are reached or undershot when there is less segmental material. For words with a rise-fall contour, both the initial and final f0 minima of the target word were measured (Figure 2). (When examining the f0 minimum height across word lengths in rise-falls, only the final f0 minimum was used.) The f0 excursion in the target word was calculated for rises as the f0 maximum minus the f0 minimum, while for rise-falls, the f0 excursions for the rise and for the fall were calculated and then added together, as in Hanssen et al. (2007). F0 excursion was then divided by the duration of the contour, as in Grabe (1998) and Hanssen et al. (2007). For rises, the duration was between the initial f0 minimum and the final f0 maximum, and for rise-falls, the duration was from the initial f0 minimum to the final one. This calculation results in a measure of rate of f0 change, which makes it possible to distinguish between compression and truncation. A reduction in f0 excursion would indicate truncation. This could be instantiated by either or both of the f0 maximum and minimum not reaching their targets in the monosyllabic condition, meaning that the f0 maximum would be lower, and/or the f0 minimum would be higher, in the monosyllabic condition. If the rate of f0 change in monosyllabic words remains the same as in disyllabic words, or decreases, this indicates truncation. If the f0 maxima and minima reach the same levels as in the disyllabic condition but the rate of f0 change increases, this indicates compression.
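The excursion and rate calculations described above can be sketched as below. The function names and the example values are hypothetical, and `interpret` only restates the verbal criteria of this section: in the study itself, whether excursion is reduced or rate is increased is decided by the statistical models, not by comparing raw numbers. The "mixed" label for a reduced excursion together with an increased rate is an assumption added for completeness.

```python
def f0_excursion(contour, f0_min_initial, f0_max, f0_min_final=None):
    """Return the f0 excursion in semitones for a rise or a rise-fall."""
    rise = f0_max - f0_min_initial
    if contour == "rise":
        return rise
    if contour == "rise-fall":
        if f0_min_final is None:
            raise ValueError("a rise-fall needs a final f0 minimum")
        # Rise excursion plus fall excursion
        return rise + (f0_max - f0_min_final)
    raise ValueError(f"unknown contour: {contour!r}")

def rate_of_f0_change(excursion_st, t_start, t_end):
    """Excursion divided by contour duration (semitones per second).

    t_start is the time of the initial f0 minimum; t_end is the time of
    the f0 maximum (rises) or of the final f0 minimum (rise-falls).
    """
    duration = t_end - t_start
    if duration <= 0:
        raise ValueError("contour duration must be positive")
    return excursion_st / duration

def interpret(excursion_reduced, rate_increased):
    """Map the two outcomes onto the labels used in the text."""
    if not rate_increased:
        return "truncation"
    if excursion_reduced:
        return "mixed"  # assumption: not a category defined in this section
    return "compression"

# Invented values: the same 8-semitone rise over 0.40 s and over 0.20 s
disyllabic_rate = rate_of_f0_change(f0_excursion("rise", 0.0, 8.0), 0.05, 0.45)
monosyllabic_rate = rate_of_f0_change(f0_excursion("rise", 0.0, 8.0), 0.05, 0.25)
label = interpret(excursion_reduced=False,
                  rate_increased=monosyllabic_rate > disyllabic_rate)
```

In this invented case the full excursion is reached in half the time, so the rate doubles and the pattern would be labelled compression.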
Tonal alignment was measured in order to provide insight into the effects of time pressure on the timing of the intonational tones. The timing of the f0 maxima and the initial f0 minima were measured in relation to the onset of the stressed vowel (for disyllabic words, this is the vowel in the initial syllable). These were then divided by vowel duration to get a relative measure, and multiplied by 100 to obtain a percentage.
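As a small illustration of this relative measure, the sketch below computes the position of an f0 landmark as a percentage of the stressed vowel's duration. The function name and the times are invented for the example.

```python
def relative_alignment(t_landmark, t_vowel_onset, t_vowel_offset):
    """Landmark time relative to vowel onset, as a percentage of vowel duration."""
    vowel_duration = t_vowel_offset - t_vowel_onset
    if vowel_duration <= 0:
        raise ValueError("vowel duration must be positive")
    return 100.0 * (t_landmark - t_vowel_onset) / vowel_duration

# Invented times in seconds: the vowel spans 0.10 to 0.30
at_onset = relative_alignment(0.10, 0.10, 0.30)      # landmark at vowel onset
mid_vowel = relative_alignment(0.20, 0.10, 0.30)     # landmark halfway through
after_vowel = relative_alignment(0.35, 0.10, 0.30)   # landmark after vowel offset
```

Values above 100 therefore correspond to landmarks realized after the end of the stressed vowel, which is possible in disyllabic words where a second syllable follows.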
There was a total of 178 tokens: 12 tokens × 16 speakers = 192, with 14 disregarded due to speech errors (such as using list intonation) or interference (coughing, hitting the recorder, paper noise, etc.). Figure 3 shows the measurements on a disyllabic and monosyllabic word.
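The token count can be checked by enumerating the design. In this sketch the labels `set_1` to `set_3` are placeholders for the three name pairs in Table 1, which are not reproduced here; the counts themselves are taken from the text.

```python
from itertools import product

word_lengths = ["disyllabic", "monosyllabic"]
sentence_types = ["statement", "question"]
name_sets = ["set_1", "set_2", "set_3"]  # placeholders for the Table 1 pairs

n_speakers = 16
n_excluded = 14  # speech errors or interference

# One item per combination of word length, sentence type and name pair
items = list(product(word_lengths, sentence_types, name_sets))
n_recorded = len(items) * n_speakers
n_analysed = n_recorded - n_excluded
```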
Linear mixed effects regression models were fitted in R (R Development Core Team 2008), with the independent variables Contour (rise, rise-fall), Word Length (disyllabic, monosyllabic) and Pitch Accent & Edge Tones (L* H-H%, H* L-L%, L+H* L-L%; shortened to ‘Pitch Accent’ throughout the text) (Footnote 1). The dependent variables were f0 excursion, f0 maxima and minima height, word duration, rate of f0 change and alignment of f0 maxima and minima. For each dependent variable, models were tested with each of the independent variables separately and with interactions, to determine the best model. This was tested using the anova function in R. A random intercept for Speaker was included. A further possible random intercept of Set was explored, and added if it improved the model. Set refers to which of the three pairs of target words from Table 1 above the token was from.
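For orientation, a model with both predictors, their interaction and a random intercept for Speaker can be written as below. This is a sketch in notation that is not the authors' own: $i$ indexes tokens, $j$ indexes speakers, and the two indicator variables assume that disyllabic words and rises are the reference levels. Models with a single predictor, with Pitch Accent in place of Contour, or with an additional random intercept for Set follow the same pattern.

```latex
\begin{align*}
y_{ij} &= \beta_0
        + \beta_1\,\mathrm{Mono}_{ij}
        + \beta_2\,\mathrm{RiseFall}_{ij}
        + \beta_3\,\mathrm{Mono}_{ij}\,\mathrm{RiseFall}_{ij}
        + u_j + \varepsilon_{ij}, \\
u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{align*}
```

Under this coding, $\beta_1$ is the effect of the monosyllabic condition within rises, and $\beta_1 + \beta_3$ is the corresponding effect within rise-falls, which is why a significant interaction is followed up with pairwise comparisons.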
2.5 Hypotheses
- Phonological contours. Questions are expected to have the contour L* H-H%, a rise, while statements are expected to have a rise-fall, either H* L-L% or L+H* L-L%. In terms of acoustic measurements, the difference between the two rise-falls is expected to be in the timing of the f0 maximum, which might be later for L+H* than for H*. The timing of these f0 landmarks may become earlier in the monosyllabic condition, due to time pressure.
- F0 excursion. If Lebanese Arabic behaves similarly to Northern Standard German, having truncation in falls and compression in rises (Grabe 1998), we would expect a reduced f0 excursion in monosyllabic words in rise-falls, and no such reduction in rises.
- F0 maxima and minima. A reduction in f0 excursion in the monosyllabic condition could be caused by the f0 maximum being lower or the f0 minimum being higher, or both. It is predicted that one or both of these will occur for rise-falls. The f0 maximum for rises is the final H%, so this is expected to be higher and later than the f0 maximum for rise-falls, for both word lengths.
- Word duration. Word duration is expected to be shorter for monosyllabic than disyllabic words, due to monosyllabic words having fewer segments.
- Rate of f0 change. If rate of f0 change increases in the monosyllabic condition, but f0 excursion and maximum and minimum are not different, this indicates compression. This is predicted for rises. For rise-falls, since truncation is predicted, there should be no change in this measure for monosyllabic words.
3 Results
3.1 Overall patterns
Target words in statements had a rise-fall contour (as in Figures 2 and 4). Target words in questions had a rising contour (see Figures 5 and 6). As in Chahal’s (2001) study of the Tripoli dialect, and as predicted here, the contours found in statements were H* L-L% and L+H* L-L%, with 71% having an H* pitch accent and 29% having L+H*; these proportions were similar across disyllabic and monosyllabic words. In questions, 100% were L* H-H%.
3.2 F0 excursion
The linear regression for f0 excursion examined Word Length * Contour, as this model was found to be the best one. In the models below, for Word Length, disyllabic was the reference level, and monosyllabic was compared to it. For Contour, rises were the reference level. (For Pitch Accent, H* L-L $\%$ was the reference level.) In this type of test, the other levels are compared against the reference level, and the polarity of the coefficient indicates whether the compared level has a higher or lower value than the reference level. The value of the coefficient shows the magnitude of the difference. For example, in Table 2, the line for Length.Mono has a coefficient of −0.18. This means that the monosyllabic condition has on average a smaller f0 excursion by 0.18 semitones than the reference level, disyllabic words, although the difference is not significant. SE is standard error. An alpha-level of 0.05 was chosen. Since a number of the measures are dependent on one another, this must be corrected for. F0 excursion is based on f0 maximum and minimum height, and rate of change is based on f0 excursion as well as timing of the f0 maximum and minimum, so this gives six measures that are somewhat related. To adjust for this, a Bonferroni correction was used, so 0.05/6 = 0.0083, meaning that the new alpha-level for these measures was 0.0083. When there was a significant interaction, pairwise comparisons were conducted using the lsmeans package and a Tukey adjustment.
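The corrected alpha-level can be reproduced with a few lines. The function names are hypothetical and the p-values in the usage lines are invented purely to show how the threshold is applied; they are not results from the study.

```python
def bonferroni_alpha(alpha, n_tests):
    """Divide the alpha-level by the number of related measures."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests

family_alpha = 0.05
n_related_measures = 6  # the six interdependent f0 measures named in the text
adjusted_alpha = bonferroni_alpha(family_alpha, n_related_measures)

def is_significant(p_value, threshold=adjusted_alpha):
    """True if p_value falls below the corrected threshold."""
    return p_value < threshold

# Invented p-values: 0.02 would pass at 0.05 but not after correction
passes_after_correction = is_significant(0.005)
fails_after_correction = is_significant(0.02)
```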
For f0 excursion, there is no main effect of Word Length or Contour but there is a significant interaction of Word Length * Contour. The pairwise results show that for rises, there is no effect of Word Length, but for rise-falls, as hypothesized, disyllabic words have a larger f0 excursion than monosyllabic words. Furthermore, there is no difference in f0 excursion size for disyllabic words based on Contour, but for monosyllabic words, there is a smaller excursion for a rise-fall than for a rise. These results are shown in Figure 7.
3.3 F0 maxima and minima: Height
In order to determine whether f0 targets are undershot when there is a reduction in segmental material, f0 maxima and minima were examined. For rise-falls, the f0 maximum measured was that on the target word. For questions, the f0 maximum was also on the target word but was the final tone in the sentence, thus the H%. The f0 minimum measured in rises was the initial (and only) f0 minimum in the target word. For rise-falls, the f0 minimum measured for height was the final one in the target word, so this was considered the final L%. (The f0 minimum measured for timing in rise-falls was the initial f0 minimum.)
As above, models were compared using the anova function. For f0 maximum, the best model was found to be the one with just the independent variable of Contour, showing that Word Length did not have a significant effect on the height of the f0 maximum (because including it in the model did not improve it). For f0 maximum (Table 3), there is a significant main effect of Contour. Since Contour has only two levels, a pairwise comparison is not necessary to show that the rise-fall contour has a significantly lower f0 maximum than the rise, as predicted. This can be seen in Figure 8, left panel, where the left two boxes (rises) are higher than the right two boxes (rise-falls).
For f0 minimum, the best model was one with Word Length * Contour. For f0 minimum, there is no main effect of Word Length or Contour but there is an interaction of Word Length * Contour. Pairwise tests show that there is no effect of Word Length for rises, but for rise-falls, the f0 minimum is significantly higher in monosyllabic words. These results are in Table 4 and Figure 8, right panel.
3.4 F0 maxima and minima: Timing
The timing of the f0 maximum and the initial f0 minimum was measured from the onset of the stressed vowel, divided by vowel duration and multiplied by 100. This means that a value of 0 places the f0 landmark at the beginning of the vowel, a value of 100 places it at the end of the vowel, and a value above 100 places it after the end of the vowel. For both of these measures, a model including the random intercept of Speaker as well as Set was the best one.
For f0 minimum, the best model was one with Contour as the independent variable. As shown in Table 5 and Figure 9, the f0 minimum timing is later for rises than for rise-falls.
For f0 maximum, the best model was one with Word Length * Contour. The results in Table 6 show that there is a significant main effect of both independent variables and a significant interaction. Pairwise tests show that all pairs are significantly different from one another, that is, for both rises and rise-falls, relative f0 maximum timing is significantly later in disyllabic than monosyllabic words. Comparing the two contours, both disyllabic and monosyllabic words have a significantly later f0 maximum timing for rises than rise-falls, as predicted.
These can be seen in Figure 10, which has the same axes for both measures for ease of comparison.
3.5 Word duration
The best model included just the independent variable Word Length, meaning that Contour did not significantly affect the data. For word duration (Table 7) there is a significant main effect of Word Length, with disyllabic words being longer than monosyllabic words, as expected. This can be seen in Figure 11.
3.6 Rate of f0 change
For rate of f0 change, f0 excursion was divided by the duration between the f0 minimum and maximum in rises, and between the initial and final f0 minima in rise-falls, as described in Section 2.4.
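A minimal sketch of this calculation is given below. The function name, the units and the example values are assumptions made for illustration; the excursion is taken as an already-computed input, and the two time points stand for whichever pair of landmarks applies to the contour (f0 minimum and maximum in rises, initial and final f0 minima in rise-falls).

```python
def rate_of_f0_change(excursion, t_first_s, t_last_s):
    """Return f0 excursion divided by the time between two f0 landmarks.

    The result is in excursion units per second.
    """
    duration = t_last_s - t_first_s
    if duration <= 0:
        raise ValueError("the second landmark must come after the first")
    return excursion / duration


# Hypothetical values: the same excursion realized over 0.5 s and over 0.25 s.
rate_long = rate_of_f0_change(60.0, 0.0, 0.5)    # 120.0
rate_short = rate_of_f0_change(60.0, 0.0, 0.25)  # 240.0
```

In this invented case the excursion is unchanged but the shorter interval doubles the rate, which is the pattern that a compressed contour would be expected to show.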
The best model was one with Word Length * Pitch Accent. As shown in Table 8, there is a significant main effect of Pitch Accent but not Word Length, and there is an interaction. The pairwise tests show that for both pitch accents that comprise rise-falls (H* L-L $\%$ and L+H* L-L $\%$ ), there is no effect of Word Length, but for words with L* H-H $\%$ , which were target words in rises, monosyllabic words have an increased rate of f0 change compared to disyllabic words. This can be seen in the middle of Figure 12. This result suggests that rises undergo compression.
3.7 Individual speakers
Figures 13 and 14 show pitch tracks for female and male speakers, respectively. These tracks were made from the [liːn] set, so are made of one repetition per speaker,Footnote 2 showing statements and questions in the disyllabic (black) and monosyllabic (grey) conditions. These pitch tracks were created using the Pitch Dynamics script (DiCanio Reference DiCanio2016) and took measurements of f0 (in Hertz) at ten timepoints in the target word. The figures show these timepoints normalized across the x-axis. Work by Prieto & Ortega-Llebaria (2009) and Cangemi et al. (Reference Cangemi, Dina El, Simon, Stefan, Martine, Jon, Alejna, Stephanie and Nanette2016) found that there can be speaker-specific differences in the realization of focus, so it is useful to look at each speaker individually to determine if they show the same patterns. It can be seen that overall there is a similar pattern for all speakers: for statements (solid lines) all speakers have a rise-fall, while for questions (dotted lines) they have a rise. In the monosyllabic statement condition, truncation is clearly visible in these tokens for Speakers 20, 12, 19, 27, 29, where the contour appears to be cut off before it reaches the f0 level seen in the disyllabic condition (bearing in mind that the contours are time-normalized).Footnote 3 Compression is exhibited in the fact that although the contours are time-normalized, monosyllabic rises reach the same final f0 height as disyllabic words (one exception to this is Speaker 18).
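The effect of time normalization can be illustrated with a short sketch that samples an f0 track at ten equally spaced proportions of its duration using linear interpolation. This is not the Pitch Dynamics script and does not claim to reproduce its procedure; the function name and the example tracks are hypothetical.

```python
def time_normalize(times_s, f0_hz, n_points=10):
    """Sample an f0 track at n_points equally spaced proportions of its duration.

    times_s must be strictly increasing; values between measured
    points are obtained by linear interpolation.
    """
    if len(times_s) != len(f0_hz) or len(times_s) < 2 or n_points < 2:
        raise ValueError("need matching inputs with at least two points")
    start, end = times_s[0], times_s[-1]
    samples = []
    for i in range(n_points):
        t = start + (end - start) * i / (n_points - 1)
        # Find the measured segment that brackets the target time.
        j = 1
        while j < len(times_s) - 1 and times_s[j] < t:
            j += 1
        t0, t1 = times_s[j - 1], times_s[j]
        weight = (t - t0) / (t1 - t0)
        samples.append(f0_hz[j - 1] + weight * (f0_hz[j] - f0_hz[j - 1]))
    return samples


# Hypothetical rises with the same f0 targets over 1 s and over 2 s.
short_rise = time_normalize([0.0, 1.0], [100.0, 190.0])
long_rise = time_normalize([0.0, 2.0], [100.0, 190.0])
```

After normalization the two invented rises coincide even though one takes twice as long, so differences in duration are not visible on a normalized x-axis; only differences in the f0 levels reached remain.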
3.7.1 Individual speakers: F0 excursion
In order to examine possible individual variation, each measure was looked at by speaker. (Since the number of tokens per speaker was small, statistical tests were not run to examine individual speakers.) Figures are presented for rises and rise-falls separately due to the high number of speakers.
Figure 15 shows that for rises, there is in fact a lot of variation across speakers, with some having a wider excursion on monosyllabic words and some on disyllabic words. In rise-falls, Figure 16 shows that most speakers follow the overall pattern found, with a wider excursion on disyllabic than monosyllabic words. However, Speakers 11, 2 and 22 show the opposite pattern.
3.7.2 Individual speakers: F0 maxima and minima – height
The results for f0 minimum height show that in rises (Figure 17), for most speakers disyllabic and monosyllabic words overlap, while in rise-falls (Figure 18), most speakers (except Speakers 2 and 10) follow the pattern of having a higher f0 minimum in monosyllabic words.
For f0 maximum height, boxplots are presented for word lengths separately (Figures 19 and 20) so that each figure can show the difference between the two contours (hence the colouring difference from previous plots), since the main finding for this measure was the higher f0 maximum in rises, due to the H $\%$ . In both word lengths this is the case for all speakers except Speakers 12 and 20.
3.7.3 Individual speakers: F0 maxima and minima – timing
For f0 maximum relative timing, the overall pattern of the pooled results appears to hold, whereby it is later for disyllabic words and later for rises (Figures 21 and 22).
For f0 minimum timing, the main finding was based on the contours and not word lengths, so the different word lengths are presented separately (Figures 23 and 24). Generally, the same pattern as for the pooled results can be seen – that it is later for rises than rise-falls.
3.7.4 Individual speakers: Word duration
Figure 25 shows, as expected, that for rises, disyllabic words are longer than monosyllabic words for all speakers. Figure 26 shows almost the same pattern for rise-falls, except that one speaker (Speaker 7) has longer monosyllabic than disyllabic words.
3.7.5 Individual speakers: Rate of f0 change
Although the best model for this was Word Length * Pitch Accent, the main significant result was an increased rate of change in monosyllabic words for words that have L* H-H $\%$ , which are the target words in rises. For this reason, and for ease of reading the boxplots, the results here are split by Contour rather than Pitch Accent.
Figure 27 shows that all speakers have an increased rate of change for monosyllabic words in rises, indicating compression, as found in Section 3.6. This means that even the speakers who were shown to have a wider f0 excursion in monosyllabic rises (Figure 15 above) accommodate by compressing. The picture in Figure 28 is more complex. While most speakers show no change or a decreased rate of change in monosyllabic words for rise-falls, indicating truncation, there are three speakers who show a clear increase in rate of change: Speakers 11, 2 and 22. This suggests that these three speakers in fact use compression in their rise-falls.
The speakers who show an increased rate of change in their rise-falls are the same speakers who show an increased f0 excursion in their rise-falls. These speakers, at least in the tokens examined in the current experiment, have a wider pitch range for the rise-fall contour in monosyllabic words, so their rate of f0 change is higher in these shorter words. Because they have a wider f0 excursion, they have to compress in order to reach the targets in a shorter amount of time.
4 Discussion
4.1 General patterns
Target words in statements had a rise-fall contour, the tonal composition being H* L-L $\%$ and L+H* L-L $\%$ . Target words in questions had a rising contour, L* H-H $\%$ . This is in line with what Chahal (Reference Chahal2001) found for the Tripoli dialect, indicating that at least for focus and boundaries, the pitch accents in the Beirut dialect are the same.
4.2 F0 excursion
Word Length was not found to have any effect on rises, but for rise-falls, the f0 excursion was found to be larger for disyllabic than monosyllabic words. This means that when a complex contour is realized on a shorter word, it is not possible for the full f0 excursion to take place. This suggests truncation, in that perhaps one or both of the f0 targets (maximum or minimum) is undershot when less segmental material is available. For rises, there was no difference in f0 excursion between monosyllabic and disyllabic words. This suggests that compression occurred, because the same f0 contour was realized on less material.
4.3 F0 maxima and minima: Height
Based on the results for f0 excursion in rises, it is expected that since there is no effect of Word Length, f0 maximum and minimum are likely to be the same also.Footnote 4 This is true for both, where no effect of Word Length was found in rises for either f0 maximum or f0 minimum. Once again, this suggests compression, since the scaling of f0 targets does not change under time pressure.
The differences found in f0 excursion for rise-falls could be due to both a higher f0 maximum and lower f0 minimum, or to just one of these. Word Length was not found to have any effect on f0 maximum in rise-falls, but f0 minimum was significantly higher in monosyllabic words than disyllabic words. This explains the difference in f0 excursion as truncation of the contour, where in shorter words, the final f0 minimum in the target word is not quite reached, or in other words, the rise-fall ends lower when the target word is longer.
The f0 maximum in rise-falls was found to be significantly lower than that of rises. This indicates the presence of a H $\%$ in rises. F0 minimum was measured at the beginning of the target word in rises (since this was the only f0 minimum), and at the end of the target word in rise-falls.
4.4 F0 maxima and minima: Timing
For the initial f0 minimum there was no significant effect of Word Length, but its timing was later for rises than rise-falls. For f0 maximum, its alignment was significantly later in disyllabic words and in rises. As predicted, it is later in rises because it is the final H $\%$ . The results here mean that the H* is later in longer words, which suggests that surfacing on a short word makes it move earlier. Time pressure in the monosyllabic condition affected the timing of the f0 maximum but not of the f0 minimum. This result also shows that it is not the case that in monosyllabic words, the tones all get displaced to the left, also known as tonal repulsion (Gordon Reference Gordon and Harry2014). These results also support previous work that finds that low tones have more stable alignment than high tones (e.g. Arvaniti Reference Arvaniti2016), since time pressure in monosyllables did not cause them to move earlier.
The hypothesis that f0 maximum would be later in L+H* L-L $\%$ than H* L-L $\%$ was not confirmed, again because including Pitch Accent in the model did not significantly improve it. However, this warrants further discussion because it calls into question the phonetic difference between H* L-L $\%$ and L+H* L-L $\%$ . The major contour difference in this measure between rises and rise-falls likely had the effect of overriding (in the model) a smaller difference between H* L-L $\%$ and L+H* L-L $\%$ , which meant that adding this to the model did not produce a significant effect. In an exploratory analysis, including only Pitch Accent as an independent variable without Contour created a model that was highly significantly better than the null model, indicating that Pitch Accent does have a significant effect. The mean values for f0 maximum relative timing for H* L-L $\%$ and L+H* L-L $\%$ disyllabic words were 56 $\%$ and 76 $\%$ , and monosyllabic words 34 $\%$ and 51 $\%$ , so there is a clear alignment difference between these two pitch accents, although it did not improve the best model due to the stronger effect of Contour.
4.5 Word duration
Disyllabic words were found to be longer than monosyllabic words, as expected due to their having more segments. This solidifies the arguments throughout, that there is in fact less segmental material available in monosyllabic words.
The lack of any effect of Contour (based on the best regression model not including this) indicates that having a complex contour (rise-fall) did not induce lengthening. This is contrary to what was found by Prieto & Ortega-Llebaria (2009) for Spanish and Catalan. However, they only found this in words with final stress and only among some speakers. Furthermore, they also note that final lengthening only occurred among speakers who did not use truncation, so since in the current study truncation was found in the rise-fall contour, no lengthening is required.
4.6 Rate of f0 change
In rises (where the target words had L* H-H $\%$ ), the finding that monosyllabic words had a greater rate of f0 change than disyllabic words is indicative of compression, as hypothesized, meaning that rises are faster when they occur on shorter words.
The lack of a significant difference in rate of f0 change across rise-fall conditions indicates truncation, which was also suggested by the findings for f0 minimum. This means that monosyllabic words with a rise-fall in final position accommodate by abruptly ending the contour before reaching the final boundary L $\%$ . This finding also further corroborates the assertion by Arvaniti (Reference Arvaniti2016) that low tones tend to be truncated more than high tones.
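The reasoning that links these measures to the two strategies can be summarized in a small decision sketch. It is only a heuristic illustration: the conclusions in this paper rest on statistical models, not thresholds, and the function name, the 10% tolerance and the example values are all hypothetical.

```python
def classify_adjustment(excursion_long, excursion_short,
                        rate_long, rate_short, tolerance=0.10):
    """Label the short-word realization relative to the long-word one.

    compression: excursion is maintained and rate of f0 change increases.
    truncation:  excursion is reduced and rate of f0 change does not increase.
    Any other combination is returned as "unclear".
    """
    excursion_kept = excursion_short / excursion_long >= 1 - tolerance
    rate_increased = rate_short / rate_long > 1 + tolerance
    if excursion_kept and rate_increased:
        return "compression"
    if not excursion_kept and not rate_increased:
        return "truncation"
    return "unclear"


# Invented values, for illustration only.
rise_pattern = classify_adjustment(60.0, 60.0, 120.0, 240.0)      # "compression"
rise_fall_pattern = classify_adjustment(60.0, 40.0, 120.0, 125.0)  # "truncation"
```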
4.7 Individual variation
The findings for word duration, f0 maxima and f0 minima were generally consistent across speakers. For f0 excursion and rate of change in rise-falls there were three speakers who showed a pattern that was different from the other 13 speakers, whereby they in fact had a wider f0 excursion on monosyllabic words with rise-falls. This is evidence that not all speakers produce the same pattern. This different pattern from the majority was found only in rise-falls, so it appears that it is only for the complex contour that people use varying strategies. For rises, all speakers showed the same pattern of compression.
5 Overall findings
This paper presents several findings, including evidence for both compression and truncation, depending on the contour type. For Lebanese Arabic, rise-fall contours (L+H* L-L $\%$ , H* L-L $\%$ ) were found to employ truncation, while rises (L* H-H $\%$ ) showed compression. This is reminiscent of what was shown for Northern Standard German (Grabe Reference Grabe1998, Grabe & Post Reference Grabe and Brechtje2002), in which falls had truncation and rises had compression. The finding that the same language can employ more than one strategy is also supported by findings from Rathcke (Reference Rathcke2016) and Arvaniti & Ladd (Reference Arvaniti and Robert Ladd2009), and the current paper extends this finding by providing an analysis from a non-Indo-European language. The intonational contours found for the Beirut dialect echoed those found in the Tripoli dialect by Chahal (Reference Chahal2001).
No text-to-tune accommodation has been found here, which is consistent with the findings of the only other investigation of a variety of Arabic to date – Tunisian Arabic (Hellmuth Reference Hellmuth, In Marisa, Sónia and Pedro2017, Bouchhioua et al. Reference Bouchhioua, Sam and Rania2019). An examination of individual speakers indicated that while most speakers showed the same pattern, three of 16 patterned differently from the others, but only in rise-falls, highlighting the importance of looking at individual variation across intonation patterns.
The present paper adds to phonetic and phonological research examining a variety of Arabic – Lebanese Arabic – which has not been studied for compression or truncation before. Future research will examine whether the accommodation strategies found here are the same in other dialects within Lebanon.
Acknowledgements
The research used for this investigation was conducted with the support of the American University of Beirut’s University Research Board, Award No. 103367. Pilot results of this study were presented at LabPhon16, Lisbon, June 2018. Special thanks to the editors and reviewers at JIPA for their comments, which greatly improved this work.