Setswana (also known as ‘Tswana’ or, more archaically, ‘Chuana’ or ‘Sechuana’) is a Bantu language (group S.30; ISO code tsn) spoken by an estimated four million people in South Africa. There are a further one million or more speakers in Botswana, where it is the dominant national language, and a smaller number of speakers in Namibia. The recordings accompanying this article were mostly produced with a 21-year-old male speaker from the area of Taung, North-West province, South Africa. Some of the accompanying recordings are of a 23-year-old female speaker from Kuruman (approximately 150 km west of Taung). The observations reported here are based on consultation with both of these speakers, as well as a third speaker, from Kimberley. All three were speakers of South African Setswana varieties. For discussion of some differences between these varieties and more Northern and Eastern Setswana dialects – including those spoken in Botswana – see Doke (1954), Cole (1955), and University of Botswana (2001).
Previous descriptive work on Setswana includes Jones & Plaatje (1916), Jones (1928), Cole (1955), Chebanne, Creissels & Nkhwa (1997), and University of Botswana (2001), as well as a specimen from the 1949 ‘Principles of the IPA’ (IPA 2010: 348–349).
Consonants
The consonants are shown in the table below, followed by example words for each. The transcriptions given here do not indicate effects of predictable vowel length, discussed with suprasegmental features later on. Consonants in parentheses are of marginal status, and are discussed below.
Voiceless unaspirated plosives and affricates may be realized as ejectives. One of the speakers consulted for this Illustration systematically produced these sounds as clear ejectives, but the other two speakers did not realize them as ejectives in most tokens. Variable realization as ejectives is also reported for other Setswana data in previous work; see Coetzee & Pretorius (2010: 406) and IPA (2010). The alveolar lateral-release affricate /tɬ/ tends to sound more clearly ejective than other stops and affricates, and its aspirated counterpart /tɬʰ/ seems to have less aspiration than the other affricates, /tsʰ/ and /tʃʰ/.
There is considerable variation in the labial consonants: /f/ varies with /h/ quite generally in most dialects, e.g. [kʰù tsà fà tsà] ~ [kʰù tsà hà tsà] ‘make (oneself) sad’, and [fá] ~ [há] ‘give’. The voiced stop /b/ may be realized without full closure, as [β]. It also varies phonologically with /ts/ in some words (Cole 1955: 83), e.g. [mà b χ ] ~ [mà ts χ ] ‘arms’. This variation does not seem to obscure phonemic distinctions, however. In post-nasal context (see discussion below), /h/ that varies with /f/ hardens to [pʰ] (e.g. [χ h tsà] ‘to finish’ > [m pʰ tsà] ‘finish me’), while invariant /h/ hardens to [kʰ] ([χ hú mà] ‘be rich’ > [kʰú m ] ‘wealth’).
The glottal stop is non-phonemic, but can be found at the start of underlying vowel-initial morphemes. This may be at the start of a word before a vowel-initial stem (as in [ʔáχà] ‘build’ in the list above), or between a vowel-initial stem and a vowel-final prefix (as in /χ - mà/ ‘to stand’ [χ ʔ mà] ~ [χ mà]). In both contexts, the glottal stop appears to be optional, and our primary consultant normally did not produce glottal stops in the latter context.
The voiced alveolar plosive /d/ occurs only before the phonemic high vowels /i/ and /u/. Previous phonemic analyses generally view it as an allophone of /l/, since [l] does not occur before /i/ and /u/. Before raised mid-high vowels [ ], as well as the glide /w/, we find [l] and not [d], e.g. [m l m ] ‘cultivator’ and [b dʒà lwá] ‘beer’.
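The distribution just described can be restated as a small rule. The Python sketch below is only illustrative: the function name `realize_l` and the encoding of the following segment as a plain, tone-free string are assumptions made here and are not part of the source analysis.

```python
# Illustrative sketch of the [l] ~ [d] distribution described above.
# Assumptions: the following underlying segment is given as a plain string
# with tone marks already stripped, and only the phonemic high vowels
# /i/ and /u/ condition [d]; every other segment (including /w/) takes [l].

HIGH_VOWELS = {"i", "u"}  # phonemic high vowels

def realize_l(next_segment: str) -> str:
    """Return the surface realization of /l/ before the given segment."""
    return "d" if next_segment in HIGH_VOWELS else "l"

print(realize_l("u"))  # d
print(realize_l("w"))  # l
print(realize_l("a"))  # l
```

Because the rule looks at the underlying vowel, a phonetically raised mid vowel would still be passed in as a mid vowel and would therefore select [l].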
Intervocalically, /h/ is often reduced or elided (e.g. /χ -hù dù wà/ ‘to stir’ is produced as [χòù dù wà]). The trill /r/ normally has 2–3 taps (see section on doubled consonants below for further discussion). The velar nasal has a somewhat restricted distribution: it is most commonly found word-finally, before a velar consonant, or before a /w/ (though with some exceptions, such as [ŋàŋà] ‘dispute’, seen in the list of consonants above).
Nasal consonants may occur before other consonants. Nasals in this context are syllabic, and may undergo penultimate syllable lengthening; see discussion under ‘Doubled consonants’ below for further details. Word-final /ŋ/ commonly assimilates to the place of a following consonant in the next word in connected speech.
The status of the uvular consonants
The phoneme given here as /χ/ normally varies in articulation between a uvular fricative [χ] and a voiceless uvular trill [ʀ̥], though some tokens (especially in connected speech) are closer to [x] or are reduced to [h]. An example of the trilled articulation in the word /n χà/ ‘snake’ is illustrated in Figure 1. Four periodic lines (which do not match the period of glottal pulses) are clearly distinguishable in the spectrogram during the word-medial /χ/, marked with dashed lines. These correspond to three distinct peaks in the waveform, indicating three separate contacts with the uvula – i.e. a trilled articulation.
Most previous descriptions (generally based on other varieties of Setswana) describe the phoneme /χ/ as a velar fricative /x/ rather than uvular (Doke 1954, Cole 1955, IPA 2010, Gouskova, Zsiga & Tlale Boyer 2011). It is cognate with aspirated velar stops in related languages (compare /áχà/ ‘build’ with Xhosa /akʰa/). While occasional tokens of /χ/ do sound close to velar [x], the normal realization sounds distinctly post-velar in our data. In post-nasal context, /χ/ hardens to the uvular affricate /qχ/ (e.g. [χá] ‘fetch water’ > [ ] ‘water pot’).
The uvular affricate /qχ/ has been characterized in some previous work as an aspirated uvular plosive /qʰ/ (Chebanne et al. 1997, University of Botswana 2001), or as an aspirated velar affricate /kxʰ/ (Doke 1954, Cole 1955, IPA 2010, Gouskova et al. 2011). In our data, however, the release of this consonant seems fricated rather than merely aspirated. The quality of the interval between the burst and the following vowel is much more like the fricative /χ/ than the aspiration following stops (and it may involve the same trilling as /χ/; the included recording of [ ] ‘water pot’ illustrates this very clearly). This can be seen in Figure 2, showing both /qχ/ and /kʰ/, with the interval from the burst to the start of the following vowel indicated by dashed lines. The release of /qχ/, though clearly voiceless, has periodic noise, which closely resembles the trilling normally seen in the uvular fricative /χ/. This indicates that some uvular constriction is retained after release of the closure, unlike the aspiration seen with /kʰ/. The durations of the release intervals are fairly close, though measurement of three tokens of each suggests that the release of /qχ/ may be slightly shorter than that of /kʰ/, as shown in Table 1.
Doubled consonants
The liquid and nasal consonants /m n ɲ ŋ r l/ may be ‘doubled’; we refer to them in description as ‘doubled’ and represent them as 〈mm〉 rather than 〈mː〉 in order to abstract away from issues of phonological representation. Examples are given in (1) below.
(1)
The distinction between doubled and singleton consonants may be phonologically contrastive for at least the nasals, as in [m ná] ‘man’ vs. [m nà] ‘jealousy’. The typical origin of these doubled consonants appears to be historical loss of a penultimate vowel between two identical sonorants (University of Botswana 2001).
Doubled nasals typically show a noticeable dip or drop in intensity in the middle of the segment. This is illustrated in Figure 3, with the dip marked by the arrow.
The duration of doubled sonorants in penultimate-syllable position, like those in (1), normally seems to be approximately three times that of singletons. Comparing the recordings of [m nà] ‘jealousy’ and [m ńná] ‘man’ included with this article, we find that the singleton [n] has a duration of 99 ms, while the doubled [nn] is 310 ms. Comparing the singleton [n] in [χ nwá] ‘to drink’ to the doubled [nn] in [χ ńná] ‘to live’, we find a similar ratio: 137 ms in the singleton nasal, 437 ms in the doubled one. Additionally, while the singleton trill [r] normally involves 2–3 contacts, the doubled trills involve 6–8 contacts (rather than 4–6). This may be a reflection of penultimate syllable lengthening (discussed later on).
Though previous descriptions imply that doubling of /l/ is quite common (Cole 1955: 29, University of Botswana 2001: 25), it was rare in the data we recorded from our consultants. One token of /m -l l / ‘fire’ as [m l.l ] was collected from a sentence from our primary consultant. The duration of the doubled liquid in this form is 156 ms, substantially longer than singleton [l]s. (When the same word was pronounced in isolation, as [m l l ], the singleton [l]s had durations of approximately 80 ms to 110 ms.) A clear dip in intensity can be seen midway through the doubled [ll], comparable to that observed in the doubled nasals. This is shown in Figure 4.
Some previous work on Setswana describes the doubled consonants as syllabic (Cole 1955: 28, Chebanne et al. 1997, IPA 2010). This description alone is not sufficient, however: the doubled liquids are not observed before another consonant – a restriction that does not follow from simply interpreting them as syllable nuclei. Coetzee (2001) and Gouskova et al. (2011: 2123) analyze the doubled consonants as geminates, comprising the nucleus of a syllable and the onset of the following one, e.g. [ .má] ‘mother’. This interpretation seems consistent with our observations. The dip in intensity in doubled nasals and [ll] suggests that these segments are not simply lengthened, but consist on some level of multiple consonantal units. This also seems to be corroborated by the observation that doubled consonants in penultimate position are three times as long as their singleton counterparts. Penultimate syllables are normally lengthened (discussion below). If the doubled consonants are geminates in which the first half of the geminate is syllabic, then only half of the doubled segment should undergo penultimate lengthening. Thus, we might expect lengthening of a doubled consonant to produce a sequence with three abstract units of length: one from each half of the geminate, plus one more unit of length from further lengthening of just the half of the geminate that is syllabic (i.e. /m.ma/ → [ ]).
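The "approximately three times" figure can be checked directly from the four durations quoted above. The following Python sketch simply redoes that arithmetic; the dictionary labels and the helper name `duration_ratio` are illustrative choices made here, and only the millisecond values come from the text.

```python
# Durations in ms quoted in the paragraph above: (singleton, doubled).
measurements = {
    "jealousy vs. man": (99, 310),
    "to drink vs. to live": (137, 437),
}

def duration_ratio(singleton_ms: float, doubled_ms: float) -> float:
    """Ratio of doubled-consonant duration to singleton duration."""
    return doubled_ms / singleton_ms

ratios = {
    pair: round(duration_ratio(single, double), 2)
    for pair, (single, double) in measurements.items()
}

for pair, ratio in ratios.items():
    print(f"{pair}: {ratio}")
# jealousy vs. man: 3.13
# to drink vs. to live: 3.19
```

Both pairs come out slightly above 3, in line with the description of doubled sonorants as roughly three times as long as singletons in this position.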
Post-nasal consonant hardening
Previous work on Setswana observes a series of changes that apply to root-initial consonants in ‘post-nasal’ context, conditioned by several prefixes. These changes include: devoicing and/or lenition of voiced plosives; fortition of /h/ and /f/ to aspirated plosives [kʰ] and [pʰ], respectively; affrication of the other fricatives /χ/, /s/ and /ʃ/ to [qχ], [tsʰ], and [tʃʰ]; fortition of /r/ to [tʰ] and of /l/ to [t]; and insertion of [k] before vowel-initial roots. Examples are given in (2).
(2)
We did not observe any consistent phonetic differences between the plosives and affricates derived in this way, as compared to their underlying counterparts in root-initial contexts. VOT values for the stops derived from liquids (/r/ → [tʰ] and /l/ → [t]) seem comparable to underlying /tʰ/ and /t/, suggesting that the hardening is a categorical shift (though more rigorous quantitative examination would be needed to confirm this). For further discussion of post-nasal consonant changes, see Hyman (2001), Zsiga, Gouskova & Tlale (2006), Coetzee, Lin & Pretorius (2007), Coetzee & Pretorius (2010), Solé, Hyman & Monaka (2010), Gouskova et al. (2011), and Boyer & Zsiga (2013).
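The hardening changes listed in this section can be summarized as a lookup table. The Python sketch below is a simplification under stated assumptions, not an analysis from the source: the names `HARDENING`, `VOWELS` and `harden` are hypothetical, input roots are tone-free strings, the vowel set is a reduced placeholder inventory, and voiced plosives are omitted because the text does not spell out their outputs. The inputs in the usage lines are schematic root shapes, not attested words.

```python
# Post-nasal hardening restricted to the changes stated explicitly in the text.
# Assumption: /h/ that alternates with /f/ is supplied as "f" by the caller,
# so that it hardens to [pʰ] rather than [kʰ].
HARDENING = {
    "h": "kʰ",
    "f": "pʰ",
    "χ": "qχ",
    "s": "tsʰ",
    "ʃ": "tʃʰ",
    "r": "tʰ",
    "l": "t",
}

# Assumption: a simplified, tone-free vowel inventory for detecting
# vowel-initial roots.
VOWELS = set("ieɛaɔou")

def harden(root: str) -> str:
    """Apply post-nasal hardening to the first segment of a root."""
    if not root:
        return root
    first = root[0]
    if first in VOWELS:
        return "k" + root                  # [k] insertion before a vowel
    if first in HARDENING:
        return HARDENING[first] + root[1:]  # fortition or affrication
    return root                            # other consonants are unchanged here

print(harden("χa"))  # qχa
print(harden("a"))   # ka
print(harden("ma"))  # ma
```

Looking only at the first character is enough here because every input in the table is a single symbol; a fuller treatment would need proper segmentation of aspirated and affricated onsets.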
Vowels
There are seven contrastive vowels, /i e̝ ɛ a ɔ o̝ u/. Previous descriptions of Setswana characterize the high-mid vowels as high or semi-open, e.g. as /ɪ/ and /ʊ/ (Cole 1955, Chebanne et al. 1997; see also Dichabe 1997, le Roux & le Roux 2008, le Roux 2012). In our data, the majority of tokens seem to be lower than this notation suggests, but still higher than [e] and [o], and we therefore transcribe them as raised close-mid [e̝] and [o̝] (a finding similar to that of le Roux & le Roux 2008). In the linguistics literature and some older materials, the orthographic convention is to use e and o for the close-mid vowels [e̝] and [o̝], while ê and ô are used to represent open-mid [ɛ] and [ɔ]. In the modern standard orthography, however, the diacritic is normally left out, and both sets of vowels are represented as e and o (see Chebanne et al. 2003 for more details of the orthographic conventions).
A formant-scaled plot of the vowels is given below. Measurements were taken at the midpoint of each vowel, and averaged across three tokens of each of the example words listed above.
Both pairs of mid vowels (open-mid /ɛ ɔ/ and close-mid /e̝ o̝/) have raised allophones [ ] and [ ], respectively. Previous work reports that these raised allophones are conditioned by a following higher vowel, or by the consonants /s ʃ ts tsʰ tʃ tʃʰ tɬ tɬʰ ɲ/, and by the locative suffix [- ] (Cole 1955). For our primary consultant, this raising was subject to variation, happening clearly in some lexical items, but not others, and to varying degrees. Some previous sources report the raised allophones of the open-mid vowels in contexts where they are not derived, thus leading to an analysis with nine phonemic vowels (Chebanne et al. 1997, Dichabe 1997; see also Khabanyane 1991 on the same issue in the closely related vowel system of Sesotho).
It is not clear whether the raised allophones of the close-mid vowels /e̝/ and /o̝/ are consistently distinct from the underlying high vowels /i/ and /u/ for our speakers. Based on F1 measurements of an assortment of tokens, as well as impressionistic transcriptions, it seems that raising of the close-mid vowels /e̝/ and /o̝/ can result in vowels that sound very close to [i] and [u] respectively. Raising of open-mid /ɛ/ and /ɔ/ to [ ] and [ ] respectively very often produces vowels that are slightly lower than [ ] and [ ], but the F1 and F2 ranges of both vowels overlap to a very high degree. For more detailed consideration of the vowel raising patterns, see Dichabe (1997) and le Roux & le Roux (2008).
While the raised vowels [ ] (from / /) are phonetically high, they do not behave phonologically like underlying high vowels with respect to the distribution of [l] and [d]. They occur with [l], systematically; [d] is found only with underlying /i u/.
Vowels in word-final position are often devoiced, e.g. /χ -hétsà/ → [χ h ts ] ‘to finish’, particularly for words at the end of a phrase. They may also be centralized or otherwise reduced.
Suprasegmental features
Tone
Setswana has two phonemic tones, high and low. The tonal phonology of Setswana is quite complex, and tonal processes can cause high and low tones to manifest on the same syllable, resulting in a surface rising or falling tone (see Jones 1928 and Chebanne et al. 1997 for much more detailed discussion of tone patterns). Some examples of lexical tone distinctions are given in (3).
(3)
In utterance-initial contexts, high and low tones may start from approximately the same pitch. They are still distinguished by contours, however: low tones involve a fall, while high tones maintain the same pitch, or involve a rise. This is illustrated in Figure 6, with pitch tracks of [b ːdú] ‘rotten’ and [b ːnà] ‘see’, both said in isolation. These words have different tones on each syllable. The starting pitch is approximately the same in both words, however: the tonal difference on the first syllable is realized only towards the end of the syllable. (In the second syllable of each word, the tonal distinction does come with a clear difference in absolute pitch over the full duration of the syllable, though.)
Penultimate lengthening
The penultimate syllable of a prosodic phrase is normally lengthened (Cole 1955: 55, University of Botswana 2001: 31, IPA 2010: 349). These lengthened syllables are also sometimes described as stressed (University of Botswana 2001: 32). Suffixation can produce alternations, affecting which syllable is lengthened, as in [ŋàːŋà] ‘argue’ vs. [ŋàŋ ːlà] ‘argue for’ (seen in the list below); the first vowel [a] in these two words shows a difference in duration of approximately 70 ms. Penultimate lengthening can apply to syllabic sonorant consonants as well as vowels, as in [ ːtɬá] ‘point’. The degree of lengthening depends on position in the utterance: words in isolation or at the end of an utterance exhibit more lengthening than phrase-final words in the middle of an utterance. We follow the convention in previous work (Cole 1955: 55) of transcribing the former as full length, and the latter as half length (i.e. [aː] vs. [aˑ]); examples can be seen in the recorded passage at the end of this illustration.
(4)
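The transcription convention for penultimate lengthening can be sketched as a small function. This Python example is an illustration only: the function name `lengthen_penult`, the pre-syllabified list input, and the assumption that each syllable ends in its nucleus (a vowel or a syllabic sonorant) are choices made here, not claims from the source.

```python
# Mark penultimate lengthening on a pre-syllabified word.
# Assumption: every syllable ends in its nucleus, so the length mark can be
# appended to the end of the penultimate syllable.
LENGTH_MARKS = {
    "full": "ː",  # isolation or utterance-final position
    "half": "ˑ",  # phrase-final but utterance-medial position
}

def lengthen_penult(syllables: list[str], degree: str = "full") -> str:
    """Return the word with a length mark on its penultimate syllable."""
    if len(syllables) < 2:
        return "".join(syllables)  # no penultimate syllable to lengthen
    marked = list(syllables)
    marked[-2] = marked[-2] + LENGTH_MARKS[degree]
    return "".join(marked)

print(lengthen_penult(["ŋà", "ŋà"]))          # ŋàːŋà
print(lengthen_penult(["ŋà", "ŋà"], "half"))  # ŋàˑŋà
```

The first call reproduces the transcription [ŋàːŋà] ‘argue’ given above; adding a suffix syllable to the list would shift the mark one syllable to the right, as in the alternation described in the text.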
Syllable structure
Syllables in Setswana are most commonly V or CV in shape. Syllabic sonorants also occur, but such syllables may not have a distinct consonantal onset (i.e. occurs, but not C ). Syllable-internal consonant clusters are limited to a consonant followed by [w], such as [χ ː.nwá] ‘to drink’, [b .dʒàː.lwá] ‘beer’, [ .ŋw ] ‘one’, and [s .tʃʰwá.ntʃʰ ] ‘picture’. Sonorants in such clusters can be identified as non-syllabic, because they do not exhibit penultimate lengthening.
Vowel–vowel sequences are permitted morpheme-internally, as in [tɬó.ù] ‘elephant’, [tà.ú] ‘lion’, and [p . ] ‘bull’. Hiatus between a root and a prefix may optionally show insertion of a glottal stop, though, as noted earlier, and elision of /h/ can also lead to hiatus. In connected speech, these sequences are very often reduced to, and do not show glottal stops. Intervocalic consonants may also be lenited or reduced.
Transcription of a recorded passage
This is a translation of ‘The North Wind and the Sun’, the standard text used for Illustrations of the IPA. The transcription below is based on a recording from our primary consultant; a second recording of the same story from another, female, speaker is also included with the accompanying recordings. The repetitions of [la] and [ʔ n ] in lines 2 and 3 were identified by the speaker as errors made while telling the story, and not deliberate repetitions. The orthographic version follows the modern convention in leaving out diacritics on mid vowels.
Broad phonetic transcription
Orthographic version
Pheho ya bokoni le letsatsi ba ne ba ganetsana gore, ke mang o ne a na le matla a a gaisang. Ba bona moeti mme letsatsi la raya pheho la re yo o tla a apolang moeti jase e ne. . .e ne o. . . ke ene o na leng matla a a gaisang. Letsatsi le ne la itšhuba mo morago ga leru. Ke fa pheho ya bokoni ya butšwela ka matla. Moeti o ne a goga jase a itshereletsa, pheho ya bokoni ya ba ya itlhoboga. Letsatsi le ne la tšhwa la hisa ka mogote mme moeti a be a apolola jase. Letsatsi le pheho ya bokoni ba ne ba dumalana gore letsatsi ke lone le le naleng matla a a gaisang.
Acknowledgements
This work was supported by a grant from the Rhodes University Research Council. Authors’ names are listed in alphabetical order. We owe great thanks to our Setswana consultants for all their help; for other helpful discussion and input, we also thank Iyad Issa, Seunghun Lee, Aaron Braver, and Andries Coetzee, and an anonymous reviewer.
Supplementary material
To view supplementary material for this Illustration, please visit http://dx.doi.org/10.1017/S0025100316000050.