
San Juan Piñas Mixtec

Published online by Cambridge University Press:  27 February 2025

Maxine Van Doren*
Affiliation:
University of California San Diego, Department of Linguistics
Claudia Duarte Borquez
Affiliation:
University of California San Diego, Department of Linguistics
Claudia Juárez Chávez
Affiliation:
University of California San Diego, Department of Linguistics
Gabriela Caballero
Affiliation:
University of California San Diego, Department of Linguistics
*Corresponding author. Email: [email protected]


Type
Illustration of the IPA
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of The International Phonetic Association

San Juan Piñas Mixtec (endonym: Tò’ō Ndá’ví; henceforth SJPM) (ISO 639-3: vmc) is a previously undocumented Oto-Manguean language of the Mixtecan branch spoken in the municipality of Santiago Juxtlahuaca in Oaxaca, Mexico (shown in the map in Figure 1). According to a 2020 census conducted by the Mexican government (INEGI 2020), there are 717 inhabitants in the town of San Juan Piñas, almost all of whom speak SJPM as their native language. Additionally, speakers are found in diaspora communities in the western states of Baja California (Mexico), California, Oregon, Washington, and other places in Mexico and the United States. There are about half a million speakers of all Mixtec varieties in Mexico (INEGI 2020), and between 100,000 and 150,000 speakers of Mixtec in California (Kresge 2007). While elderly speakers in San Juan Piñas tend to be monolingual, younger speakers are bilingual in SJPM and Spanish. In diaspora communities in the United States, younger SJPM speakers shift to English and/or Spanish as their primary language(s) of communication.

Figure 1. (Left) Map of Mexico and (Right) close-up map of region identifying landmarks of San Juan Piñas and Oaxaca de Juárez (capital). Map created with ggmap (Kahle & Wickham 2013).

The Mixtecan branch (which also includes Cuicatec and Triqui) is one of eight branches of the Oto-Manguean language stock. There are approximately 60 varieties of Mixtec within 18 mutually unintelligible dialect clusters (Josserand 1983) originally spoken in the states of Oaxaca, Guerrero, and Puebla. Josserand (1983) classifies Mixtec varieties into 12 dialectal areas. SJPM belongs to the Southern Baja region within this classification. Only one variety belonging to the Central Baja area has been previously illustrated phonetically, namely San Sebastián del Monte Mixtec (Cortés, Mantenuto & Steffman 2023).

There is a long tradition of phonological documentation of Mixtec languages spanning several decades since Pike’s (1944, 1948) seminal work. More recently, there is work that addresses phonological and phonetic phenomena in Mixtec varieties (Gerfen 1999, 2001; Gerfen & Baker 2005; DiCanio, Amith & Castillo García 2014; DiCanio et al. 2015; DiCanio, Benn & Castillo García 2018; DiCanio et al. 2020; Carroll 2015; Penner 2019; Eischens 2022, inter alia). However, Mixtec languages are highly diversified, and many varieties remain undocumented. Additionally, although several language varieties spoken in the municipality of Santiago Juxtlahuaca share the same ISO code listed above, there are significant tonal and lexical differences between them.

Audio recordings in this illustration come from one of the authors, Claudia Juárez Chávez, who was born in San Juan Piñas, Oaxaca and has lived in the United States for the last 20 years. She speaks SJPM and Spanish fluently, using SJPM with community members and as part of her activities developing SJPM language resources for language revitalization and reclamation as well as teaching weekly language lessons. She speaks Spanish (L2) for most activities of daily living. The data analyzed throughout this paper were elicited during weekly elicitation sessions beginning January 2020. The examples include an orthographic representation, still under development, which reflects Claudia Juárez Chávez’s spelling preferences. This representation shares certain aspects of the conventions developed by the Ve’e Tu’un Savi (Mixtec Language Academy), the Summer Institute of Linguistics (SIL) and Mexico’s INALI (Instituto Nacional de Lenguas Indígenas) (Caballero, Juárez Chávez & Yuan 2024).

Recordings were conducted across multiple locations. Prior to March 2020, all recordings were conducted at UC San Diego in a quiet room using a lavalier microphone and a Marantz PMD660 Portable Solid State Recorder at a 44.1 kHz sampling rate and 16-bit quantization. After March 2020, recordings were completed in a quiet room at the speaker’s home using the same microphone and recording device. Additional recordings were collected across two elicitation sessions in a sound-attenuated booth using a standing Oktava MK-319 condenser microphone and a Focusrite Scarlett 2i2 pre-amp and digitizer at a 44.1 kHz sampling rate and 16-bit quantization. All forms used for quantification were produced in isolation (see supplemental materials for word list and audio files used for quantification). [Footnote 1]

Consonants

SJPM has 19 consonant phonemes in native and borrowed vocabulary. The consonant inventory of SJPM is provided above.

SJPM exhibits similar phonemic contrasts to those found in other, related Mixtec varieties (Josserand 1983). Across Mixtec varieties, few voicing contrasts are noted (Gerfen 1999; Marlett & Gittlen 1985; Sicoli 2005). SJPM also makes frequent use of plosive and fricative consonants.

As documented in other Mixtec languages, the minimal phonological word in SJPM is bimoraic (the minimal size of content words in the language). This bimoraic phonological unit, which can be monosyllabic ((C)VV) or disyllabic ((C)VCV), corresponds to the canonical morphological root, referred to as a ‘couplet’ in Mixtecanist studies, following Pike (1948). We describe the properties of this root template below, but allude to the couplet throughout the paper, as the distribution of phonological patterns in the language is sensitive to this unit.

Plosives and Affricates

Plosives contrast at bilabial, alveolar, velar and labialized velar places of articulation. The bilabial plosive /p/ occurs only in words borrowed from Spanish (e.g., /pa5ɲo5/ ‘shawl’, from Spanish paño). Alveolar and velar plosives may be voiceless or prenasalized. [Footnote 2] The prenasalized velar /ᵑɡ/ is a marginal phoneme that has been found in only a few words to date. The voicing of the stop release is variable in prenasalized stops and affricates; for example, /ⁿd/ may be produced phonetically as [ⁿd̥] or [ⁿd], though it is generally produced as voiced (Figure 2). Similarly, affricates contrast between voiceless and prenasalized. As with prenasalized stops, the prenasalized affricate /ⁿd͡ʒ/ may be realized with a voiced or voiceless affricate (Figure 2). In general, the speaker produces /ⁿd͡ʒ/ as voiceless. It is unclear what conditions this variation.

Figure 2. Waveforms and spectrograms illustrating variable voicing during release of /ⁿd/ and /ⁿd͡ʒ/. Top row shows the initial syllable, /ⁿdo/, of /ⁿdoʒo3/ ‘spring’, produced as [ⁿd] on the left and [ⁿd̥] on the right. Stop burst and VOT are segmented in green. Bottom row illustrates the sequence /i5Ɂⁿd͡ʒa35/ in /ko1ʃi5Ɂⁿd͡ʒa35/ ‘not stingy’, with /ⁿd͡ʒ/ produced as [ⁿd͡ʒ] on the left and [ⁿd̥͡ʒ] on the right. Stop release and fricative portion of /ⁿd͡ʒ/ are segmented in orange.

The phonological status of prenasalized obstruents is debated in the Mixtecanist literature. Specifically, there is debate about whether prenasalized obstruents should be analyzed (i) as underlyingly voiced with nasalization enhancing voicing or (ii) as allophones of nasal consonants (i.e., as orally released nasals) (Marlett 1992; Iverson & Salmons 1996; DiCanio et al. 2020). Arguments for the latter approach involve the timing of nasality and the distribution of prenasalized consonants (Marlett 1992; DiCanio et al. 2020). For example, in Yoloxóchitl Mixtec, prenasalized stops are restricted to words with oral vowels and alternate with nasals; when an oral vowel is affixed to a root with a nasal consonant (e.g., /n/ or /m/), the oral counterparts surface (e.g., [ⁿd], [ᵐb]) (DiCanio et al. 2020). In SJPM, prenasalized consonants are restricted in their distribution to roots and enclitics with oral vowels. However, there is no evidence that they alternate with nasals (i.e., /n/ or /ɲ/). Moreover, there is no prenasalized bilabial stop counterpart to /m/. For the purpose of this illustration, we use the commonly adopted system of transcription of prenasalization (/ⁿd/, /ᵑɡ/ and /ⁿd͡ʒ/).

To investigate voice onset time (VOT), 20 tokens each of [t, k, kʷ], 39 tokens of [ⁿd] and 36 tokens of [ⁿd͡ʒ] were analyzed in couplet-initial position. Words used for quantification of VOT of voiceless plosives were gathered from a larger corpus of data collected during the elicitation period with the third author. Prenasalized segments were recorded in a sound-attenuated booth, as described in the introduction; these consist of 13 unique words for [ⁿd] and 12 for [ⁿd͡ʒ], each repeated three times.

All tokens were spoken in isolation or with a noun classifier preceding (see supplemental materials for full word list). Plosives in medial position were not included as /kʷ/ occurs infrequently in medial position. All voiceless plosives were most frequently followed by /i/ and /a/, as these vowels are very frequent in SJPM. Tokens used to calculate VOT for /kʷ/ were never followed by /o/. Prenasalized consonants showed a different pattern: [ⁿd͡ʒ] was most frequently followed by [a], and never by [i], while [ⁿd] was most frequently followed by [i] and [o]. For voiceless plosives, VOT was annotated from the release of the stop burst to the onset of voicing in Praat (Boersma & Weenink 2020). Negative VOT was annotated from the onset of voicing (first positive peak in the waveform) to immediately before but not including the release burst. Mean VOT was found to be longest for [kʷ], with a duration of 38.06 ms (SD = 23.94 ms). Mean VOT for [k] and [t] was 30.98 ms (SD = 21.81 ms) and 9.91 ms (SD = 3.5 ms), respectively. Findings are illustrated in Figure 3. Negative VOT did not differ substantially between [ⁿd] and [ⁿd͡ʒ], with means of –106.7 ms (SD = 24.8 ms) and –105.7 ms (SD = 25.0 ms), respectively.

Figure 3. Positive and Negative VOT of voiceless stops [t, k, kʷ] and pre-nasalized consonants [ⁿd, ⁿd͡ʒ] in couplet-initial position. Large circles represent the mean VOT (in ms) for each stop and error bars represent one standard deviation. Values for individual tokens are represented by smaller circles.
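
The VOT summary described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, segment labels, and timestamps below are hypothetical and are not the authors' script or data, and the sample standard deviation is assumed because the text does not say which estimator was used. Signed VOT is taken as the voicing onset time minus the burst time, so that prevoiced tokens come out negative.

```python
from statistics import mean, stdev


def vot_ms(burst_s, voicing_onset_s):
    """Signed VOT in ms: positive if voicing starts after the burst, negative if before."""
    return (voicing_onset_s - burst_s) * 1000.0


def summarize_vot(tokens):
    """Group (segment, vot_ms) pairs and return {segment: (mean, sd, n)}."""
    by_segment = {}
    for segment, value in tokens:
        by_segment.setdefault(segment, []).append(value)
    # stdev() is the sample standard deviation (an assumption; see lead-in).
    return {
        segment: (mean(values), stdev(values), len(values))
        for segment, values in by_segment.items()
    }


# Hypothetical annotation times in seconds (not data from the study).
tokens = [
    ("t", vot_ms(burst_s=0.500, voicing_onset_s=0.510)),
    ("t", vot_ms(burst_s=0.420, voicing_onset_s=0.428)),
    ("nd", vot_ms(burst_s=0.600, voicing_onset_s=0.495)),
    ("nd", vot_ms(burst_s=0.550, voicing_onset_s=0.450)),
]
summary = summarize_vot(tokens)
```

Each entry of `summary` holds the mean, standard deviation, and token count for one segment, which is the information plotted as large circles and error bars in a figure such as Figure 3.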

To analyze voicing during the closure of prenasalized stops, strength of excitation (SoE) was calculated in VoiceSauce (Shue et al. 2011). SoE is a measure of the strength of voicing at the point of glottal closure. Lower SoE values indicate weaker voicing (Murty & Yegnanarayana 2008; Murty et al. 2009). SoE was found to rise at the onset of voicing, with a peak in SoE approximately halfway through prenasalization, followed by a drop in SoE, particularly during the last 25% of closure (Figure 4). While both consonants show a similar trajectory, it appears that voicing for [ⁿd͡ʒ] begins to weaken earlier than for [ⁿd]. Additionally, just before stop release, SoE is markedly lower for [ⁿd͡ʒ] compared to [ⁿd]. That is, [ⁿd͡ʒ] appears more likely to be released as voiceless compared to [ⁿd] given its substantially weaker voicing as it reaches the stop release.

Figure 4. Log SoE over the duration of prenasalization for [ⁿd] (dark blue) and [ⁿd͡ʒ] (purple). Lighter colors (ribbons) represent 95% confidence intervals.
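
Plotting a measure such as SoE over the duration of a segment requires putting tokens of different lengths on a common proportional time axis. The sketch below shows one simple way to do this with linear interpolation in NumPy. The function names, the number of time points, and the toy tracks are assumptions for illustration; the text does not describe how the authors normalized time.

```python
import numpy as np


def time_normalize(track, n_points=11):
    """Resample one token's frame-by-frame track to n_points equally spaced
    proportions (0.0 to 1.0) of its duration, using linear interpolation."""
    track = np.asarray(track, dtype=float)
    source = np.linspace(0.0, 1.0, num=len(track))
    target = np.linspace(0.0, 1.0, num=n_points)
    return np.interp(target, source, track)


def mean_trajectory(tracks, n_points=11):
    """Average several time-normalized tracks point by point."""
    normalized = np.vstack([time_normalize(t, n_points) for t in tracks])
    return normalized.mean(axis=0)


# Two hypothetical tokens with different numbers of frames (not study data).
tracks = [
    [0.0, 1.0, 2.0, 1.0, 0.0],
    [0.0, 0.5, 1.0, 1.5, 2.0, 1.5, 1.0, 0.5, 0.0],
]
average = mean_trajectory(tracks, n_points=5)
```

Confidence ribbons like those in Figure 4 could then be derived from the per-point spread of the normalized tracks.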

Voiceless plosives are optionally preaspirated when they occur in couplet-medial position (e.g., CVCV) and unaspirated elsewhere, a phenomenon also documented in other Mixtec varieties (e.g., Ayutla Mixtec (Pankratz & Pike 1967)). [Footnote 3] This variation is demonstrated in Figure 5. Figure 5 (left) demonstrates preaspiration of the couplet-medial consonant in [ka3ʰka3] ‘to walk.’ However, in Figure 5 (right) there is no preaspiration of /k/ in [ⁿda3-ko3o3] /ⁿda3-ko3o3/ ‘to leave’ since [k] is in couplet-initial position (i.e., CV-CVV, with CVV being the monosyllabic couplet). The voiceless postalveolar affricate /t͡ʃ/ also surfaces as [ʰt͡ʃ] in couplet-medial position.

Figure 5. Left shows waveform and spectrogram of [ka3ʰka3] ‘to walk.’ Aspiration on couplet-medial [ʰk] is indicated with superscript h. Right shows spectrogram and waveform of [ⁿda3ko3o3] ‘to leave.’ [k] is not preaspirated as it is in couplet-initial position, although it is word-medial. Light noise at the [k] closure onset is not audibly preaspiration but instead is attributed to echo.

Preaspiration was calculated from 20 couplet-medial tokens each of [ʰt, ʰk] and 17 of [ʰt͡ʃ]. Preaspiration was segmented following the vowel of the first mora from the onset of broadband noise in the spectrogram to the end of clear broadband aspiration noise. The right boundary also corresponded to the beginning of a period of silence, consistent with closure for the stop. Bilabial /p/ was excluded from the analysis of preaspiration as it does not occur frequently, and therefore, too few tokens were available for analysis. Likewise, /kʷ/ was excluded as it does not occur frequently in couplet-medial position. Similar to VOT measures, preaspiration was found to be longest preceding [ʰk], with an average duration of 79.99 ms (SD = 22.35 ms), followed by [ʰt] and [ʰt͡ʃ], which had similar preaspiration durations of 60.1 ms (SD = 24.17 ms) and 61.1 ms (SD = 16.51 ms), respectively. Results are illustrated in Figure 6.

Figure 6. Preaspiration measures of voiceless stops and affricate [ʰt, ʰt͡ʃ, ʰk]. Large circles represent the mean duration of preaspiration for each stop, and error bars represent one standard deviation. Values for individual tokens are represented by smaller circles.

In addition to optional preaspiration, velar stops /k/ and /kʷ/ also demonstrate lenition intervocalically, particularly during fast speech. As a result, they may surface as voiced velar stops [ɡ] and [ɡʷ] or voiced velar fricatives [ɣ] and [ɣʷ] as in examples (1) [Footnote 4] and (2) below. Other Mixtec varieties have patterns of variable lenition of voiceless velar stops, including Acatlán Mixtec (Pike & Wistrand 1974), Silacayoapan Mixtec (North & Shields 1977), San Miguel el Grande Mixtec (Pike 1944), and Yoloxóchitl Mixtec (DiCanio et al. 2020), among others. For more recent work examining prosodic factors and quantifying lenition, see DiCanio et al. (2022).

Nasals

Nasals /m, n, ɲ/ may occur in word-medial and word-initial positions; however, /m/ commonly occurs word-initially, while /ɲ/ more frequently occurs word-medially.

Tap

The alveolar tap, /ɾ/, has a very restricted distribution in SJPM as in other varieties of Mixtec (e.g., Ixpantepec Nieves Mixtec (Carroll 2015), Yoloxóchitl Mixtec (DiCanio et al. 2020)). It occurs primarily in function words, including noun classifiers (the 3rd person singular masculine classifier /ɾa/ and the liquid noun classifier /ɾa5/) and the conjunction /ɾa3/. The alveolar tap has an allophone, the alveolar trill [r], which occurs in word-initial position. This allophonic variation is illustrated in example (3), which shows the realization of the third person singular masculine noun class marker /ɾa/: the alveolar tap allophone [ɾ] surfaces in post-vocalic position (e.g., in contexts where the /ɾa/ morpheme is a pronominal enclitic) (3a), while the alveolar trill allophone surfaces word-initially (e.g., when the /ɾa/ morpheme is realized as a classifier preceding a noun in a noun phrase) (3b). [Footnote 5]

Fricatives

Fricatives are also commonly attested in SJPM. Fricatives contrast at the labio-dental, alveolar, and postalveolar places of articulation. Although /f/ and /h/ are listed in the consonant chart, their distribution is very limited. To date, /f/ has only been found to occur in one loanword, /ka5fe5/ ‘coffee, brown’. Likewise, /h/ has only been found in the SJPM word for ‘yes’, /hã3ã1/. Although rarely attested in the language, both are included in the consonant inventory of this illustration to account for all consonants that occur in SJPM.

To illustrate fricative characteristics, mean power spectral slices were calculated from 20 tokens each of [v, s, ʃ, ʒ]. The two infrequent fricatives, /f/ and /h/, were excluded as there were not enough tokens. Mean spectral slices are illustrated in Figure 7. Fricatives were annotated from beginning to end of clear frication noise, and power spectra were calculated in Praat (Boersma & Weenink 2020). The sibilant fricatives ([s, ʃ, ʒ]) demonstrate peak spectral energy patterns that we would expect based on place of articulation. For [s], the peak occurs around 7.5 kHz, higher than both [ʃ] and [ʒ], which demonstrate greater energy at approximately 5 kHz (though this value may be slightly higher for [ʒ] based on visualization). This is consistent with a more anterior place of articulation for [s] compared to [ʃ] and [ʒ]. Spectral energy for [v] is overall low relative to other fricatives, which is expected given that it is a non-sibilant fricative. Other than the low frequency energy, no clear peak is seen; instead, the spectrum is relatively flat as expected for labial consonants. Both [v] and [ʒ] demonstrate relatively high amplitude in lower frequency energy, likely due to voicing.

The voiced fricative, /ʒ/, undergoes lenition during fast speech, often in the word-medial position. This appears to be gradient, where the consonant can be produced as the fricatives [ʒ] or [ʝ] or approximant [j] (4a and 4b).

Figure 7. Mean spectral slices of [v, s, ʃ, ʒ] averaged from 20 tokens each.

Figure 8. (Left) Log SoE over proportion time for [v, ʒ, l], consonants represented in color. (Right) Cepstral peak prominence over proportion time for [v, ʒ, l], consonants represented in color. Lighter colors (ribbons) represent 95% confidence intervals.

Additionally, voiced fricatives /v/ and /ʒ/ are produced with considerable pre-voicing, which is demonstrated in Figure 8 using strength of excitation (SoE; Murty & Yegnanarayana 2008) as a measure of voicing. Higher SoE indicates greater strength of voicing. SoE was measured in VoiceSauce (Shue et al. 2011) over the duration of the fricative, from the onset of voicing to the onset of clear vowel formants. SoE was measured from 20 tokens of [ʒ] and [v] in the word-initial position. An additional 20 tokens of [l] were analyzed for comparison to an approximant, which is known to be heavily voiced. Results indicate that [ʒ] has rising SoE at the onset, followed by a dip in SoE during the middle 50% of production, and finally, a rise in SoE as the speaker transitions to the vowel. This suggests strong voicing at the onset, followed by weaker voicing during the middle portion of the consonant, a pattern consistent with pre-voicing.

Cepstral peak prominence (CPP) was also calculated to quantify frication noise. An increase in frication noise would be expected to increase the noise floor (regression line) in the cepstrum and, therefore, reduce the prominence of the cepstral peak relative to the regression line, resulting in overall lower CPP. Thus, a decrease in CPP indicates more noise in the signal (greater frication noise). Rising CPP was seen at the onset of [ʒ]; CPP peaks during the first half of [ʒ], followed by a drop in CPP that coincides with the drop in SoE. This suggests that frication noise is minimal during the pre-voicing of [ʒ] and increases during the final 50% of the consonant as strength of voicing decreases. Like SoE, this pattern is consistent with pre-voicing.
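
The definition above (a cepstral peak measured against a regression line through the cepstrum) can be sketched for a single analysis frame as follows. This is one common formulation of CPP, not necessarily the one implemented in the software used for this study. The function name, the F0 search range of 60 to 300 Hz, the regression range starting at 1 ms, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np


def cepstral_peak_prominence(frame, sample_rate, f0_min=60.0, f0_max=300.0):
    """Return (cpp_db, f0_estimate_hz) for one frame of audio samples."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    windowed = frame * np.hanning(n)

    # Log-magnitude spectrum in dB (small constant avoids log of zero).
    log_spectrum_db = 20.0 * np.log10(np.abs(np.fft.fft(windowed)) + 1e-12)

    # Real cepstrum of the dB spectrum, then its magnitude in dB.
    cepstrum = np.fft.ifft(log_spectrum_db).real
    cepstrum_db = 20.0 * np.log10(np.abs(cepstrum) + 1e-12)
    quefrency_s = np.arange(n) / sample_rate

    # Look for the cepstral peak where a plausible pitch period would fall.
    lo = int(np.ceil(sample_rate / f0_max))
    hi = int(np.floor(sample_rate / f0_min))
    peak_index = lo + int(np.argmax(cepstrum_db[lo:hi + 1]))

    # Regression line over quefrencies from 1 ms up to half the frame.
    start = int(np.ceil(0.001 * sample_rate))
    stop = n // 2
    slope, intercept = np.polyfit(
        quefrency_s[start:stop], cepstrum_db[start:stop], 1
    )
    baseline_db = slope * quefrency_s[peak_index] + intercept

    # CPP is the height of the peak above the regression line.
    return cepstrum_db[peak_index] - baseline_db, sample_rate / peak_index


# Synthetic, strictly periodic test signal (an impulse every 100 samples,
# i.e. 160 Hz at 16 kHz); hypothetical, not speech from the study.
sample_rate = 16000
frame = np.zeros(1024)
frame[::100] = 1.0
cpp_db, f0_hz = cepstral_peak_prominence(frame, sample_rate)
```

In this formulation, added aperiodic noise lifts the cepstral values away from the peak, and with them the regression line, while weakening the peak itself, so the returned prominence falls. That is the relationship between frication noise and lower CPP described above.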

A different pattern was seen for [v]. Although [v] does show a decrease in SoE during the final 50% of the consonant compared to the first half, it maintains an overall greater strength of voicing than [ʒ]. Additionally, [v] demonstrates a steady rise in CPP over the course of the consonant, rather than a drop, indicating little frication noise throughout the duration of the consonant. Although [v] is produced with little frication noise, its SoE and CPP values fall short of those of an approximant (i.e., [l], shown for comparison). Given that [v] is a non-sibilant fricative, the CPP results are in line with expectations of relatively quiet frication noise during production.

Vowels

SJPM has five oral vowels (/i, e, a, o, u/) and three phonemically nasal vowels (/ĩ, ã, õ/). [Footnote 6] Phonemically nasal vowels can occur in the second syllable of a disyllabic word (the final syllable of the couplet) or on both vowels of CVɁV and CVV words, consistent with what has been documented for other varieties of Mixtec (Gerfen 1999). In other words, phonemically nasal vowels in SJPM do not occur in the first syllable of disyllabic (CVCV) couplets.

Figure 9. Plot of F1 and F2 values (Hz) of oral vowels with 1 standard deviation ellipses. Vowel labels are centered on the mean F1 and F2 values, and points represent individual tokens. Vowels are represented by color.

Roots of open class words in SJPM are canonically bimoraic. These bimoraic roots show a surface contrast between short vowels in disyllabic ((C)VCV) roots and long vowels in monosyllabic ((C)VV) roots, as attested across Mixtec languages (DiCanio & Bennett 2021) (see Phonotactics, below). Long vowels also surface in disyllabic ((C)VCVV) roots. In these root forms, long vowels are restricted to the final syllable of the word, e.g., /tu1kʷa1a5/ ‘orange.’ Monomoraic roots containing a single vowel (i.e., CV forms) do not form minimal pairs with any bimoraic ((C)V(Ɂ)V) forms, and typically correspond to functional morphemes and closed-class words.

Formant values of oral vowels were measured acoustically. Nasal vowels were excluded because nasality interferes with formant tracking. Oral vowel formants were measured from the final syllable of 167 bimoraic tokens, consisting of 44 [i] vowels, 27 [e], 47 [a], 27 [o] and 22 [u]. The number of tokens per vowel category was not balanced, as tokens were pulled from the database; the counts therefore reflect relative frequency in SJPM, with /i/ and /a/ occurring relatively frequently compared to /e/, /o/, and /u/. With the exception of /i/, vowels were typically preceded by alveolar, postalveolar, or velar consonants; /i/ was also frequently preceded by /v/. Vowels were manually segmented in Praat (Boersma & Weenink 2020) from beginning to end of clear formants. Mean F1 and F2 were calculated in VoiceSauce (Shue et al. 2011) using the Snack algorithm (Sjölander 2004). Formant values from the middle one-third of the vowel were used to calculate mean F1 and F2 to reduce the effect of formant transitions in varying phonetic environments. Vowels with F1 and F2 values greater than 2 standard deviations away from the mean were excluded, as these were taken to be outliers. In addition, 8 tokens for /i/ were excluded as they were found to have mistracked F2 values (i.e., below 1000 Hz). Figure 9 demonstrates the acoustic vowel space. There is notably wide intra-speaker variation in vowel production and considerable overlap in F1 and F2 of [o] and [u]. The overlap in the back vowels may be due to less rounding on [u], which would result in overall higher formants compared to a more rounded [u]. To further investigate acoustic differences of [o] and [u], we compared F3 values of the two vowels (Figure 10). F3 was found to be lower for [o] (mean = 2670.51 Hz, SD = 163.93 Hz) compared to [u] (mean = 2831.91 Hz, SD = 137.68 Hz), suggesting less rounding for [u] compared to [o].

Figure 10. F3 values (Hz) of [o] and [u]. Large circles represent mean F3 for each vowel, and error bars represent one standard deviation. Values for individual tokens are represented by smaller circles.
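
The two data-reduction steps described above (averaging formants over the middle third of each vowel, then excluding tokens more than 2 standard deviations from the mean) can be sketched as below. The function names and formant values are hypothetical. The sketch also assumes that exclusion is applied within one vowel category and that a token is dropped when either F1 or F2 exceeds the threshold, since the text does not spell out these details.

```python
import numpy as np


def middle_third_mean(track):
    """Mean of a frame-by-frame formant track over its middle third."""
    track = np.asarray(track, dtype=float)
    n = len(track)
    return track[n // 3 : n - n // 3].mean()


def within_two_sd(f1, f2, n_sd=2.0):
    """Boolean mask of tokens whose F1 and F2 both lie within n_sd sample
    standard deviations of the mean (assumed per vowel category)."""
    f1 = np.asarray(f1, dtype=float)
    f2 = np.asarray(f2, dtype=float)
    ok_f1 = np.abs(f1 - f1.mean()) <= n_sd * f1.std(ddof=1)
    ok_f2 = np.abs(f2 - f2.mean()) <= n_sd * f2.std(ddof=1)
    return ok_f1 & ok_f2


# Hypothetical per-token means in Hz for one vowel category (not study data).
f1 = [300, 310, 290, 305, 295, 300, 600]
f2 = [2200, 2250, 2150, 2230, 2180, 2210, 2190]
keep = within_two_sd(f1, f2)
```

Here only the last token, with an F1 of 600 Hz, falls outside the threshold and is masked out.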

In addition to phonemically nasal vowels, the first person singular enclitic /=e1/ surfaces as allophonically nasalized [ẽ1] when the root-final vowel is nasal (5–6). Additional conditions for allophonically nasal vowels are described below.

Phonemically nasal vowels in SJPM are restricted to the root-final position. In CVCV roots, phonemically nasal vowels can only occur if the second consonant is voiceless, and in (C)VɁV or (C)VV roots, both vowels must be nasal if the final vowel is nasal. However, in oral (C)VɁV or (C)VV forms that are inflected with a nasal enclitic, it is unclear if nasality spreads to the root. A thorough investigation of the distribution of nasality in SJPM remains to be undertaken.

SJPM also has evidence of allophonic nasality, similar to other documented varieties of Mixtec (Gerfen 1999). Vowels demonstrate perseverative nasality when following nasal consonants. To demonstrate this, the difference between the amplitude of the first formant and the first nasal pole (A1-P0) was calculated automatically in Praat (Styler & Scarborough 2017) for vowels in CVCV words where the second vowel was phonemically oral (CVCV), phonemically nasal (CVCṼ), or preceded by a nasal consonant (CVNV). A lower value of A1-P0 indicates increased nasality (Styler 2017). Vowels in this analysis were taken from words produced in isolation found in the database. Vowels were excluded from analysis if they were flagged as likely errors by the script. Oral vowels were the same as those used to calculate vowel formants in Figure 9. In total, 15 allophonic nasal vowels, 24 phonemic nasal, and 96 oral vowels were included in the final analysis. These results are shown in Figure 11.

Figure 11. Mean A1-P0 (dB) values for phonemic nasal (green circles), allophonic nasalized (orange triangles), and oral vowels (purple squares) at the vowel onset, vowel midpoint, and vowel offset. Vowel onset corresponds approximately to the 3% time point of the total vowel duration, vowel midpoint corresponds to the 50% time point, and vowel offset corresponds approximately to 97% timepoint.

Phonemic nasal and allophonic nasalized vowels demonstrated lower A1-P0 values (mean of 0.31 dB and 0.34 dB, respectively) compared to oral vowels, suggesting vowels following nasal consonants are similar in degree of nasality to phonemically nasal vowels. Additionally, the trajectory of both types of nasal vowels is similar: both decrease in A1-P0 from the vowel onset to the midpoint. Oral vowels had relatively high A1-P0 (mean = 6.63 dB), as expected. All vowels (phonemic nasal, allophonic nasal, and oral) demonstrate reduced A1-P0 at vowel offset compared to onset. This is believed to be due to the speaker’s use of breathy voicing at the end of each token. Since words were spoken in isolation, it is unclear if this is phrase-final breathiness (e.g., similar to that used in Spanish; Duarte Borquez et al. 2024) or if the speaker’s use of breathy voice is related to some other phenomenon. Nevertheless, breathy voice results in lower amplitude of A1, thus lowering the overall value of A1-P0. Therefore, we do not take the lower A1-P0 at the end of oral vowels to indicate that oral vowels are becoming more nasal.

Glottalization

SJPM has contrastive glottalization, as shown in (7). Surface glottalization patterns are similar to those documented in other varieties of Mixtec, which have been variously analyzed phonologically as a glottal stop phoneme (Pike 1948; Hunter & Pike 1969; Pike & Cowan 1967; Pankratz & Pike 1967), vowel glottalization (Josserand 1983; Gerfen 1999), or a prosodic property of root morphemes (Marlett 1992; Macaulay & Salmons 1995).

Only root couplets may exhibit glottalization, while function morphemes (affixes, clitics, particles), which are monomoraic, are never glottalized. Furthermore, glottalization in roots is restricted to couplet-medial position. In preconsonantal position, glottalization is exclusively attested in root morphemes that have a voiced medial consonant, a pattern also attested in other varieties of Mixtec (e.g., Ixpantepec Nieves Mixtec (Carroll 2015)). In this illustration, we adopt Macaulay & Salmons’ (1995) and Gerfen’s (1999) analysis that glottalization occurs as a feature of root templates rather than a consonantal segment, and we assume this glottalization feature is associated with the couplet-initial vowel. This analysis is motivated by the restricted distribution of glottalization in the language: if glottalization were to be analyzed as a consonantal segment, it would be the only possible coda and the only segment yielding consonant clusters. Additionally, in monomorphemic monosyllabic root couplets with glottalization, both vowels must have the same vowel quality and nasality, [Footnote 7] as also attested in other documented Mixtec language varieties (e.g., San Sebastián del Monte Mixtec (Cortés et al. 2023); see also Gerfen (1999)). However, when inflected, the second vowel of the couplet may change. [Footnote 8] This is unlike (C)VCV couplets, which may have different vowels in each mora in the uninflected forms.

Phonetically, glottalization may be implemented as a full glottal stop with no voicing and a period of silence (‘tenate’ in Figure 12), creakiness over a portion of the couplet-initial vowel (‘to check by touch’ in Figure 12), or “light” glottalization with full voicing and a drop of F0 and intensity during the period of glottalization (‘lion’ in Figure 12). Regardless of strength of voicing, pitch and intensity decrease during glottalization (Figure 12). This variation is consistent with the variation in the voicing of glottals cross-linguistically as well as in other Mixtec varieties (e.g., Coatzospan Mixtec (Gerfen & Baker 2005), San Sebastián del Monte Mixtec (Cortés et al. 2023); see also Garellek et al. 2023). To our current knowledge, this variation is not predictable. Regardless of the phonetic implementation of the glottalization, it is produced in phase with the first mora, discussed further below.

Figure 12. Waveforms, spectrograms, pitch tracks, and intensity tracks illustrating variation of glottalization. Glottalized portion is indicated by vertical boundaries.

In our analysis, we assume that gestures for vowel articulation and glottalization are overlapping rather than sequential, with glottalization generally phased with the second half of the first mora. To illustrate the phasing of glottalization, we calculated strength of excitation (SoE) in VɁC and VɁV contexts. SoE was calculated over the entire vowel in 35 words in VɁC context and over both vowels in 45 words in VɁV context.

For VɁC words (Figure 13, left), SoE is strongest at the onset of the vowel and lowest during the second half of the vowel, indicating that glottalization is strongest during the second half of the vowel. There is a slight rise at the end of the vowel immediately before the onset of the consonant. There are two possible reasons for this. First, as in many languages, the phasing of glottalization may be variable (Borroff 2007). Although we generally see glottalization phased with the second half of the vowel, it is likely that the speaker’s phonetic implementation of glottalization is variable. Second, the strength of glottalization weakens toward the end of the gesture, and in the VɁC context, it always occurs where C is a voiced consonant. As a result, if glottalization weakens before the onset of the consonant, SoE will increase, and an increase in voicing will be perceived. Additionally, due to both of these factors, listeners may perceive what appears to be a copy vowel but is, in fact, a continuation of the vowel gesture (as in [u1Ɂvi1] “painful”). When the glottalization gesture extends beyond the vowel into the following consonant, no intrusive vowel is perceived.

Figure 13. Log SoE in pre-consonantal, VɁC (left), and intervocalic, VɁV (right), sequences. Gestural timing schema below the x-axis indicates ideal gestural timing between vowel articulation and glottalization. Lighter colors (ribbons) represent 95% confidence intervals.

In VɁV words, SoE is strongest at the onset of the first mora and drops precipitously during the first quarter of the VɁV sequence followed by an increase in voicing strength during the second half, indicating the glottalization is associated with the first mora. These results are illustrated in Figure 13. A gestural timing schema is included in Figure 13 to illustrate the idealized phasing of vowel articulation and glottalization; though, as previously discussed, phonetic implementation and phasing of glottalization may be variable.
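The averaged curves in Figure 13 imply that each token is time-normalized and that log SoE is summarized at matching proportional time points. The following is a minimal sketch of one way to do this, assuming per-token SoE samples are already available as lists of positive numbers. The function names, the ten-point grid, and the normal-approximation 95% interval are illustrative assumptions, not the procedure used to produce Figure 13.

```python
import math
from statistics import mean, stdev


def resample(track, n_points=10):
    """Linearly interpolate a track onto n_points (>= 2) equally spaced
    proportional time points, from 0% to 100% of its duration."""
    if len(track) == 1:
        return [track[0]] * n_points
    out = []
    for i in range(n_points):
        pos = i * (len(track) - 1) / (n_points - 1)  # fractional sample index
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(track) - 1)
        frac = pos - lo
        out.append(track[lo] * (1 - frac) + track[hi] * frac)
    return out


def mean_log_trajectory(tokens, n_points=10):
    """Return (means, intervals) of log-transformed, time-normalized tracks.

    tokens: list of per-token tracks of positive values (hypothetical input).
    intervals: (lower, upper) bounds, mean +/- 1.96 * standard error
    (an assumed normal approximation).
    """
    logs = [resample([math.log(v) for v in t], n_points) for t in tokens]
    means, intervals = [], []
    for i in range(n_points):
        column = [t[i] for t in logs]
        m = mean(column)
        half = 1.96 * stdev(column) / math.sqrt(len(column)) if len(column) > 1 else 0.0
        means.append(m)
        intervals.append((m - half, m + half))
    return means, intervals
```

Under this sketch, a dip in the returned means during the second half of the normalized vowel would correspond to the glottalization phasing described for VɁC words.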

Tone

SJPM has a complex system of lexical tone, with three level tones, H (V5), M (V3), and L (V1). Following Caballero, Duarte Borquez, Juárez Chávez & Yuan (to appear), we analyze these three level tones to be tone feature primitives of SJPM.

To illustrate f0 trajectories of level tones, f0 was calculated over the entire duration of couplet-final vowels in VoiceSauce using the STRAIGHT algorithm (Shue et al. 2011; Kawahara et al. 1998). F0 values include 10 tokens each of high, mid, and low tones from recordings collected during elicitation sessions from 2020–2023. An additional set of 8 low tone, 10 mid tone, and 12 high tone words were recorded in a sound-attenuated booth; each was repeated three times with the exception of “middle” and “navel,” which were repeated twice. As shown in Figure 14, the three level tones of SJPM are distinguished by pitch height (for H and M tones), and by pitch height and trajectory (for L tones), since the L tone has a downward pitch trajectory. We note there is considerable intra-speaker variability in just these few tokens, resulting in overlap of f0 of some tokens of H and M tones.

Figure 14. f0 track (Hz) of High, Mid, and Low lexical tones. Thin lines represent individual tokens, thick lines represent the mean across all tokens.

The three level tones of SJPM may combine to form contours. All logically possible combinations of level tones are attested in bimoraic stems, whether monosyllabic or disyllabic, where each mora has a single tone associated with it. Table 1 shows examples of these bitonal melodies in disyllabic and monosyllabic bimoraic stems.

Table 1. Bitonal melodies in monosyllabic and disyllabic bimoraic stems

We show the acoustic realization of two-tone melodies in monosyllabic bimoraic stems in Figure 15. Data for this figure were taken from three repetitions each of the words in Table 2; tokens were produced in isolation in a sound-attenuated booth.

Crowding of tones on single morae is permitted in SJPM: bimoraic stems sponsor up to four tones, whether monosyllabic (8a) or disyllabic (8b), while monomoraic function morphemes (like the negative proclitic ko15) sponsor up to two tones (8c).
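The combinatorics described above can be restated in a few lines: three level tones yield nine bitonal melodies, and the stated ceilings of four tones on a bimoraic stem and two on a monomoraic morpheme are consistent with a limit of two tones per mora. The sketch below assumes that per-mora limit; it is inferred from the figures given here and is not a constraint stated in this form in the text. All names are hypothetical.

```python
from itertools import product

LEVELS = ("H", "M", "L")  # the three level tones
MAX_TONES_PER_MORA = 2    # assumption inferred from the stated ceilings


def bitonal_melodies():
    """All two-tone melodies over a bimoraic stem, one tone per mora."""
    return [first + "." + second for first, second in product(LEVELS, repeat=2)]


def max_sponsored_tones(n_morae):
    """Upper bound on the tones a morpheme with n_morae morae can sponsor."""
    return MAX_TONES_PER_MORA * n_morae


def within_crowding_limit(tones_per_mora):
    """True if no mora carries more tones than the assumed limit."""
    return all(0 <= n <= MAX_TONES_PER_MORA for n in tones_per_mora)
```

For example, `max_sponsored_tones(2)` gives the four-tone ceiling for bimoraic stems and `max_sponsored_tones(1)` the two-tone ceiling for monomoraic function morphemes.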

Figure 15. f0 trajectories (Hz) for bitonal melodies in monosyllabic bimoraic stems.

SJPM licenses two lexical rising contour tones in single morae, namely LM (/V13/) (9a) and LH (/V15/) (9b–c). Falling contour tones (ML /V31/ and HL /V51/) are attested only in grammatically derived tonal melodies and are also licensed in single morae, as shown in (10a–b).

SJPM also exhibits downstep and upstep, which lead to surface tone levels differing from their phonemic ones. Downstep involves the realization of tones with lower pitch than other tones of the same phonological category (Clements 1979; Gussenhoven 2004; Hyman 2017), while upstep involves an upward shift of the tonal register that may be followed by a downward shift (Snider 1988, 1990). In SJPM, there is downstep of M tone to level [2] and H tone to level [4] as well as upstep of H tones to level [6] (Duarte Borquez 2022; Duarte Borquez, Juárez Chávez & Caballero to appear). These register effects are analyzed in Duarte Borquez (2022) and Duarte Borquez, Juárez Chávez & Caballero (to appear) as resulting from different association patterns of floating L tones sponsored by some roots. Level [6] tone for the upstepped tone is cross-linguistically unusual and not commonly used in IPA-based tonal descriptions, but a sixth ‘super-high’ tonal level is nonetheless attested in the description of other tonal languages (e.g., Quiotepec Chinantec (Castellanos Cruz 2014), White Hmong (Garellek & Esposito 2023); see also discussion in Zhu (2012)). While downstep is widely attested cross-linguistically and has been reported in several Mixtec varieties (see Daly & Hyman 2007 for an overview), to the best of our knowledge, upstep is only documented in three other varieties of Mixtec, namely Acatlán Mixtec (Pike & Wistrand 1974), Peñoles Mixtec (Daly & Hyman 2007), and San Jerónimo de Xayacatlán Mixtec (Rueda Chaves 2019).

Table 2. Bitonal melodies on monosyllabic bimoraic words used for acoustic analysis in Figure 15 across three vowel categories: /i/ (blue/bottom), /o/ (purple/middle), and /a/ (black/top) (color online). Note that N/A indicates no token was found with the vowel and tone pattern pairing.

Downstep is exemplified in (11) with the M-toned enclitic =va3 ‘emph’, which surfaces with a lower pitch when attaching to roots bearing a floating L tone (11a-b). A lowered pitch is not attested when the same enclitic attaches to other roots (11c-d).

To further illustrate the f0 of upstepped and downstepped tones, f0 was calculated in VoiceSauce (Shue et al. 2011) using the vowels in examples 11–12. Compare f0 tracks of “flower!” (Figure 16, top), which has downstep on the final mora, to “rabbit!” (Figure 16, bottom), which does not.

Figure 16. f0 tracks (Hz) of [i3ta3] ‘flower’ (top left), [i3ta3va2] ‘flower!’ (top right), [le3so3] ‘rabbit’ (bottom left) and [le3so3va3] ‘rabbit!’ (bottom right) (f0 between vowels is interpolated for visualization purposes and does not represent real f0). Note that all tones are phonemically mid level tones /3/, but downstep mid [2] is phonetically implemented with a lower f0 in ‘flower!’; compare with ‘rabbit!,’ with phonetically all level [3] tones.

As seen in Figure 16, the f0 patterns of mid-tone roots in isolation in these examples exhibit a difference between the root with a posited floating low tone ([i3ta3] ‘flower’, in the top left), where a downtrend in the second vowel is attested, and the root with no floating low tone ([le3so3] ‘rabbit’, in the bottom left), where no downtrend is attested. This downtrend could be attributed to the presence of the floating low tone or microprosody (or a combined effect of both), but we leave this question for future research. We note, however, that the downtrend attested in [i3ta3] ‘flower’ is not as clear and perceptually salient as the register drop attested in a following mora, if present, as exemplified in the final vowel in [i3ta3va2] ‘flower!’ in the top right panel.

Upstep is attested in /H.Hᴸ/ sequences, where the second H tone is realized with a higher pitch. The H-toned TBU that follows the upstepped tone becomes downstepped. In contrast, no register effects are attested in sequences of H tones in the absence of floating L tones. This is exemplified in (12), with stems attaching an H-toned enclitic, =ɲa5 ‘3sg.f’. Figure 17 (top) illustrates upstepped level [6] tone followed by downstepped level [4] tone (compare to Figure 17 (bottom), with all level [5] tones). As shown in Figure 17, high-toned roots with and without floating low tones are distinguished in isolation by the presence and absence of upstep, respectively, in the second mora.

Figure 17. f0 tracks (Hz) of [ti5ku↑6] ‘needle’ (top left), [ti5ku↑6ɲa↓4] ‘her needle’ (top right), [ⁿdʒ͡u5ma5] ‘little fish’ (bottom left), and [ⁿdʒ͡u5ma5ɲa5] ‘her little fish’ (bottom right) (f0 between vowels is interpolated for visualization purposes and does not represent real f0). Note that all tones are phonemically high level tones /5/.
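The four patterns shown in Figures 16 and 17 can be summarized as a small lookup from phonemic to surface tone levels. The sketch below is only a restatement of those attested examples, assuming a root is given as a list of numeric tone levels plus a flag for a sponsored floating L tone. The function name and data layout are hypothetical, and the code does not implement the floating-tone association analysis cited above. Combinations not described in the text pass through unchanged, which should not be read as a prediction.

```python
def surface_levels(root, floating_l, enclitic=None):
    """Map phonemic tone levels to surface levels for the attested patterns.

    root: list of phonemic levels, e.g. [3, 3] or [5, 5]
    floating_l: True if the root sponsors a floating L tone
    enclitic: phonemic level of a following monomoraic enclitic, or None
    """
    levels = list(root)
    upstepped = False
    # Upstep: in an H.H root with a floating L, the second H surfaces as [6].
    if floating_l and levels[-2:] == [5, 5]:
        levels[-1] = 6
        upstepped = True
    if enclitic is not None:
        if upstepped and enclitic == 5:
            levels.append(4)  # H after an upstepped H is downstepped to [4]
        elif floating_l and levels[-1] == 3 and enclitic == 3:
            levels.append(2)  # M after a floating-L M root is downstepped to [2]
        else:
            levels.append(enclitic)  # no register effect described
    return levels
```

With these assumptions, `surface_levels([3, 3], True, 3)` mirrors [i3ta3va2] ‘flower!’ and `surface_levels([5, 5], True, 5)` mirrors the level sequence [5 6 4] of ‘her needle’.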

Phonotactics

Bimoraic root templates are the unit of analysis for tone patterns and several phonotactic constraints, and the domain for some phonological processes in SJPM, as documented in other varieties of Mixtec (Gerfen 1999; Carroll 2015; Penner 2019). Attested monomorphemic word structures and their syllable structure are provided in Table 3 (a period indicates a syllable boundary).

Table 3. Attested monomorphemic word structures and their syllable structure

The syllable structure of SJPM, as in other Mixtec varieties, is (C)V, an open syllable with an optional onset. Codas are disallowed. As mentioned above, monomorphemic open-class words are minimally bimoraic (analyzed here as root templates), either monosyllabic with a long vowel ((C)VV) or disyllabic with two short vowels ((C)VCV), though trimoraic monomorphemic words are not uncommon (as mentioned above, long vowels are restricted to occur in the final syllable of trimoraic words). In contrast, function words, affixes and clitics are canonically monomoraic. Bimoraic templatic roots may be contrastively glottalized as exemplified above, with glottalization associating with the first vowel of the couplet. The vowels in uninflected monosyllabic ((C)VV or (C)VɁV) roots must be identical.

Consonant cluster onsets may occur in certain contexts involving vowel deletion in synchronic patterns of reduction in fast speech. As a result, both a “long form,” with a full vowel, and a “short form,” with a consonant cluster in fast speech, are often attested. The only consonant clusters attested in SJPM involve a sibilant-plosive sequence, a pattern also attested in other Mixtec varieties (e.g., Chalcatongo Mixtec; Macaulay 1996). The resulting sibilant-plosive clusters in SJPM include [ʃt], [ʃkʷ], [ʃk], [st], [sk] and [sⁿd]. Some of these clusters are exemplified in (13). In addition, SJPM has a few words with underlying clusters where no “long form” alternative exists as in (14).

As shown in these examples, the deleted vowel in each instance bears a L tone, and the tone of the deleted vowel is also deleted in the reduced form. The resulting cluster may be located at the onset of the couplet, as in (13a) and (13c), where vowel deletion targets the vowel preceding the couplet (e.g., /si1(ⁿdo1ko1)/ becomes [(sⁿdo1ko1)]). Vowel deletion may also target the couplet-initial vowel as in (13b), where /ka3(si1ki5)/ becomes [(ka3ski5)]. In this second environment, the resulting cluster is in the couplet-medial position.
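The long-form/short-form alternation just described can be sketched as a function that drops an L-toned vowel, together with its tone, when the surrounding onsets form one of the attested sibilant-plosive clusters. The representation of a word as (onset, vowel, tone) triples and the function name are assumptions made for illustration. The sketch checks only the conditions mentioned in the text and does not claim that every eligible sequence reduces in fast speech.

```python
# Clusters listed in the text as attested outcomes of vowel deletion.
ATTESTED_CLUSTERS = {"ʃt", "ʃkʷ", "ʃk", "st", "sk", "sⁿd"}


def short_form(syllables, i):
    """Return the reduced form after deleting the vowel of syllable i,
    or None if the conditions described in the text are not met.

    syllables: list of (onset, vowel, tone) string triples (hypothetical layout)
    i: index of the syllable whose vowel, and tone, would be deleted
    """
    if i < 0 or i + 1 >= len(syllables):
        return None
    onset, _vowel, tone = syllables[i]
    next_onset, next_vowel, next_tone = syllables[i + 1]
    # The deleted vowel bears a L tone, and the result must be an attested cluster.
    if tone != "1" or onset + next_onset not in ATTESTED_CLUSTERS:
        return None
    merged = (onset + next_onset, next_vowel, next_tone)
    reduced = syllables[:i] + [merged] + syllables[i + 2:]
    return "".join(o + v + t for o, v, t in reduced)
```

Applied to the two forms cited above, `short_form([("s", "i", "1"), ("ⁿd", "o", "1"), ("k", "o", "1")], 0)` yields the couplet-initial cluster form and `short_form([("k", "a", "3"), ("s", "i", "1"), ("k", "i", "5")], 1)` the couplet-medial one.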

Illustrative Passage

The story “The North Wind and the Sun” was introduced in Spanish and translated line by line, as this story does not come naturally to the L1 speaker of SJPM and aspects of the story are not present in traditional narratives in SJPM (namely, inanimate objects speaking). Two lines including repetitions were edited to remove repeated words, /ɾa3 ka5tʃ͡i5 ɲa1 ⁿdʒ͡a5 ku5u3 ɲa3 ⁿda3ku5 ka3 no1o5 ⁿdi3Ɂi3 ⁿda1a1/ ‘… who is the strongest …’ and /ki1ʃa5a5 ɲa1 ti13vi3a1/ ‘… begins to blow …’. The transcription presented here corresponds to the recorded text. The phonemic transcriptions do not reflect downstep, upstep and other tonal register effects. The narrow phonetic transcription encodes upstep and downstep.

Broad phonemic transcription

Narrow phonetic transcription

Orthographic representation

Glossed phonemic transcription

‘El viento le pregunta al sol “¿quién de nosotros es más fuerte?”’

‘The wind asks the sun: “who is the strongest?”’

‘(Estaban) platicando entre ellos y vieron a un hombre con una chamarra puesta.’

‘They were talking among themselves and they saw a man with a jacket.’

‘Y dijo: “hoy vamos a probar quién de nosotros es el más fuerte”’

‘And it said: “today we are going to see who is the strongest.”’

‘Se pusieron de acuerdo y dijo: “¿quién es el que es mucho más fuerte para hacer que el hombre se quite su chamarra?”’

‘They agreed and said: “who is the strongest to make the man take off his jacket?”’

‘Primero empezó el viento a soplar fuerte, muy fuerte, pero el hombre se abrochó su chamarra.’

‘First the wind started to blow hard, really hard, but the man zipped his jacket.’

‘Y luego empezó el sol a estar caliente’

‘And then the sun started to be hot’

‘y el hombre empezó a tener calor, se quitó su chamarra.’

‘and the man began to get hot, he took off his jacket.’

‘Entonces el sol es el más fuerte porque ganó.’

‘And the sun is the strongest because it won.’

Free Spanish translation

El viento le pregunta al sol: “¿quién de nosotros es más fuerte?” Estaban platicando entre ellos y vieron a un hombre con una chamarra puesta. Y entonces uno de ellos dijo: “hoy vamos a probar quién de nosotros es el más fuerte”. Se pusieron de acuerdo que decidirían viendo quién era el más fuerte para hacer que el hombre se quitara su chamarra. Primero empezó el viento a soplar fuerte, muy fuerte, pero el hombre se abrochó su chamarra. Y luego empezó el sol a estar caliente y brillar y el hombre empezó a tener calor y se quitó su chamarra. Entonces el sol es el más fuerte porque ganó.

Free English Translation

The wind asks the sun: “who is the strongest?” They were talking among themselves and they saw a man with a jacket. And one of them says: “today we are going to see who is the strongest.” They agreed they would decide who the strongest was to make the man take off his jacket. First the wind started to blow hard, really hard, but the man zipped his jacket. And then the sun started to be hot and the man began to get hot and he took off his jacket. Therefore the sun is the strongest because it won.

Acknowledgements

We’d like to acknowledge our team of research assistants for their help in segmenting and annotating the sound files used for analysis: Colin Gazaui, Kajsa Goldsmith-Morgan, Joy Iwamoto, Bethany Lacanienta, Tiffany Wu, and Yage G. Xin. We’d also like to thank Michelle Yuan for comments and suggestions, and Marc Garellek and Will Styler for their advising on acoustic analysis and valuable feedback on this manuscript. Finally, we thank JIPA’s editorial team, audio manager André Radtke, and two anonymous reviewers whose thorough feedback strengthened the quality of this work.

Footnotes

1 Supplementary materials can be found at https://osf.io/crsvd/

2 Some Mixtec languages have prenasalized stops at the bilabial and velar place of articulation (e.g., Yoloxóchitl Mixtec (DiCanio et al. 2020)). Prenasalized coronal stops are widespread across the language group (Josserand 1983).

3 In addition to Ayutla Mixtec, another Mixtec variety documented to have consonantal preaspiration in couplet-medial position is Acatlán Mixtec (Pike & Wistrand 1974). In Yoloxóchitl Mixtec, on the other hand, there is lengthening of couplet-medial consonants (DiCanio et al. 2020).

4 Each glossed example provides, from top to bottom: (i) an orthographic representation; (ii) a phonetic transcription in IPA; (iii) a phonemicized transcription (also in IPA) with morpheme breaks; (iv) glosses; and (v) English and Spanish free translations.

5 This morpheme has variable tone realization (L or M) depending on the tonal context when used as an enclitic, but surfaces consistently with a L tone as a classifier (Caballero, Juárez Chávez & Yuan 2024; Duarte Borquez 2022; Duarte Borquez & Juárez Chávez 2022).

6 For typographical ease, /e/ is used throughout the manuscript to represent the mid front unrounded vowel, though it is often perceived as [ɛ]. Likewise, /a/ is used in place of /ɐ/ for the mid-open central vowel.

7 There are only three exceptions to this generalization in the developing SJPM corpus, namely the monomorphemic stems ki1Ɂa5 ‘chili plant,’ ri1Ɂa5 ‘salsa’ (i.e., in /ka1ʃi1 ri1Ɂa5/, ‘to grind the salsa’), and ʃã5Ɂõ1 ‘fifteen’.

8 Specifically, vocalic enclitics may replace or fuse with the final root vowel.

References

Boersma, Paul & Weenink, David. 2020. Praat: Doing phonetics by computer. https://www.fon.hum.uva.nl/praat/ [last accessed 8 November 2024].
Borroff, Marianne L. 2007. A Landmark Underspecification Account of the Patterning of Glottal Stop. Retrieved from https://doi.org/doi:10.7282/T34F1PMJ
Caballero, Gabriela, Juárez Chávez, Claudia & Yuan, Michelle. 2024. The representation of tone in San Juan Piñas Mixtec (Tò’ōn Ndá’ví). In Gabriela de la Cruz Sanchez, Ryan Walter Smith, Luis Irizarry, Tianyi Ni and Heidi Harley (eds.), Proceedings of the 39th West Coast Conference on Formal Linguistics. Cascadilla Press, pp. 294–302. Somerville, MA: Cascadilla Proceedings Project.
Caballero, Gabriela, Duarte Borquez, Claudia, Juárez Chávez, Claudia & Yuan, Michelle. To appear. Lexical and grammatical tone in San Juan Piñas Mixtec (Tò’òn Ndá’ví). Phonological Data & Analysis.
Carroll, Lucien S. 2015. Ixpantepec Nieves Mixtec word prosody. PhD thesis, University of California, San Diego.
Castellanos Cruz, Miguel. 2014. Complejidad fonológica en el chinanteco de Quiotepec: nasalidad, fonación y tono. PhD thesis, Centro de Investigaciones y Estudios Superiores en Antropología Social.
Clements, George N. 1979. The description of terraced-level tone languages. Language 55: 536–558.
Cortés, Félix, Mantenuto, Iara & Steffman, Jeremy. 2023. San Sebastián del Monte Mixtec. Journal of the International Phonetic Association 53(3): 1182–1203.
Daly, John P. & Hyman, Larry. 2007. On the representation of tone in Peñoles Mixtec. International Journal of American Linguistics 73: 165–207.
DiCanio, Christian, Amith, Jonathan & Castillo García, Rey. 2014. The phonetics of moraic alignment in Yoloxóchitl Mixtec. Paper presented at the 4th International Symposium on Tonal Aspects of Languages.
DiCanio, Christian & Bennett, Ryan. 2021. Mesoamerica. In Gussenhoven, Carlos & Chen, Aoju (eds.), The Oxford handbook of language prosody, pp. 408–427. Oxford: Oxford University Press.
DiCanio, Christian, Nam, H., Amith, Jonathan D., Castillo García, Rey & Whalen, Doug H. 2015. Vowel variability in elicited versus spontaneous speech: Evidence from Mixtec. Journal of Phonetics 48: 45–59.
DiCanio, Christian, Benn, Joshua & Castillo García, Rey. 2018. The phonetics of information structure in Yoloxóchitl Mixtec. Journal of Phonetics 68: 50–68.
DiCanio, Christian T., Zhang, Caicai, Whalen, Doug H. & Castillo García, Rey. 2020. Phonetic structure in Yoloxóchitl Mixtec consonants. Journal of the International Phonetic Association 50(3): 333–365.
DiCanio, Christian, Chen, Wei-Rong, Benn, Joshua, Amith, Jonathan D. & Castillo García, Rey. 2022. Extreme stop allophony in Mixtec spontaneous speech: Data, prosody, and modelling. Journal of Phonetics 92: 1–18.
Duarte Borquez, Claudia. 2022. The representation of tone in San Juan Piñas Mixtec. Comprehensive paper, UC San Diego.
Duarte Borquez, Claudia & Juárez Chávez, Claudia. 2022. The representation of tone in San Juan Piñas Mixtec: The role of underspecification. Paper presented at the 2022 Annual Meeting of the Society for the Study of Indigenous Languages of the Americas.
Duarte Borquez, Claudia, Juárez Chávez, Claudia & Caballero, Gabriela. To appear. Tonal upstep and downstep in San Juan Piñas Mixtec (Tò’ōn Ndá’ví). In D. K. E. Reisinger (ed.), Proceedings of WSCLA 26, UBC Working Papers in Linguistics.
Duarte Borquez, C., Van Doren, M. & Garellek, M. 2024. Utterance-final voice quality in American English and Mexican Spanish bilinguals. Languages 9(3): 70.
Eischens, Benjamin. 2022. Tone, phonation, and the phonology-phonetics interface in San Martín Peras Mixtec. Doctoral dissertation, University of California Santa Cruz.
Garellek, M. & Esposito, Christine. 2023. Phonetics of White Hmong vowel and tonal contrasts. Journal of the International Phonetic Association 53: 213–232.
Garellek, Marc, Chai, Yuan, Huang, Yaqian & Van Doren, Maxine. 2023. Voicing of glottal consonants and non-modal vowels. Journal of the International Phonetic Association 52(2): 305–331.
Gerfen, Chip. 1999. Phonology and phonetics in Coatzospan Mixtec. In Studies in natural language and linguistic theory, Vol. 48. Dordrecht: Springer.
Gerfen, Chip. 2001. Nasalized fricatives in Coatzospan Mixtec. International Journal of American Linguistics 67(4): 449–466.
Gerfen, Chip & Baker, Kirk. 2005. The production and perception of laryngealized vowels in Coatzospan Mixtec. Journal of Phonetics 33(3): 311–334.
Gussenhoven, Carlos. 2004. The phonology of tone and intonation. Cambridge: Cambridge University Press.
Hunter, Georgia & Pike, Eunice. 1969. The phonology and tone sandhi of Molinos Mixtec. Linguistics 47: 24–40.
Hyman, Larry M. 2017. Synchronic vs. diachronic naturalness: Hyman and Schuh (1974) revisited. UC Berkeley Phonetics and Phonology Lab Annual Report 13(1): 226–242.
INEGI (Instituto Nacional de Estadística y Geografía). 2020. Censo de Población y Vivienda. Mexico City, Mexico.
Iverson, Gregory K. & Salmons, Joseph C. 1996. Mixtec prenasalization as hypervoicing. International Journal of American Linguistics 62(2): 165–175.
Josserand, Judy Kathryn. 1983. Mixtec Dialect History. PhD dissertation, Tulane University.
Kahle, David & Wickham, Hadley. 2013. ggmap: Spatial Visualization with ggplot2. The R Journal 5(1): 144–161.
Kawahara, Hideki, de Cheveigné, Alain & Patterson, Roy D. 1998. An instantaneous-frequency-based pitch extraction method for high quality speech transformation: Revised TEMPO in the STRAIGHT suite. In Proceedings ICSLP’98, Sydney, Australia, December 1998.
Kresge, Lisa. 2007. Indigenous Oaxacan communities in California: An overview. California Institute for Rural Studies 1107.
Macaulay, Monica. 1996. A grammar of Chalcatongo Mixtec. Berkeley: University of California Press.
Macaulay, Monica & Salmons, Joseph C. 1995. The phonology of glottalization in Mixtec. International Journal of American Linguistics 61(1): 38–61.
Marlett, Stephen. 1992. Nasalization in Mixtec languages. International Journal of American Linguistics 58: 425–435.
Marlett, Stephen A. & Gittlen, Laura. 1985. Ñumí Mixtec syllable structure and morphology. Work Papers of the Summer Institute of Linguistics, University of North Dakota 29: 175–194.
Murty, K. Sri Rama & Yegnanarayana, Bayya. 2008. Epoch extraction from speech signals. IEEE Transactions on Audio, Speech, and Language Processing 16: 1602–1613.
Murty, K. Sri Rama, Yegnanarayana, Bayya & Joseph, M. Anand. 2009. Characterization of glottal activity from speech signals. IEEE Signal Processing Letters 16(6): 469–472.
North, J. & Shields, J. 1977. Silacayoapan Mixtec phonology. In Merrifield, W. (ed.), Otomanguean phonology, pp. 35–68. Summer Institute of Linguistics Academic Publishing.
Pankratz, Leo & Pike, Eunice V. 1967. Phonology and morphotonemics of Ayutla Mixtec. International Journal of American Linguistics 33(4): 287–299.
Penner, Kevin. 2019. Prosodic structure in Ixtayutla Mixtec: Evidence for the foot. Doctoral thesis, University of Alberta.
Pike, Eunice & Cowan, John. 1967. Huajapan Mixtec phonology and morphophonemics. Anthropological Linguistics 9(5): 1–15.
Pike, Eunice V. & Wistrand, Kent. 1974. Step-up terrace tone in Acatlán Mixtec. In Brend, Ruth (ed.), Advances in tagmemics, pp. 83–104. Amsterdam: North-Holland Publishing Company.
Pike, Kenneth L. 1944. Analysis of a Mixteco text. International Journal of American Linguistics 10(4): 113–138.
Pike, Kenneth L. 1948. Tone languages. Ann Arbor, MI: University of Michigan Press.
Rueda Chaves, John Edinson. 2019. La interacción entre el tono y el acento en el mixteco de San Jerónimo de Xayacatlán. Doctoral thesis, El Colegio de México.
Shue, Yen-Liang, Keating, Patricia, Vicenik, Chad & Yu, Kristine. 2011. VoiceSauce: A program for voice analysis. In Proceedings of ICPhS XVII, pp. 1846–1849.
Sicoli, Mark A. 2005. Otomanguean languages. In P. Strazny (ed.), Encyclopedia of linguistics, Vol. 2, pp. 797–800. New York: Fitzroy Dearborn.
Sjölander, Kåre. 2004. The snack sound toolkit [computer program]. https://www.speech.kth.se/snack/.
Snider, Keith L. 1988. Towards the representation of tone: A three-dimensional approach. In van der Hulst, Harry and Smith, Norval (eds.), Features, segmental structure and harmony processes, Vol. 1, pp. 237–269. Dordrecht: Foris Publications.
Snider, Keith L. 1990. Tonal upstep in Krachi: Evidence for a register tier. Language 66(3): 453–474.
Styler, Will. 2017. On the acoustical features of vowel nasality in English and French. The Journal of the Acoustical Society of America 142: 2469–2482.
Styler, Will & Scarborough, Rebecca. 2017. Nasality Automeasure Script Package. Available at https://github.com/stylerw/styler_praat_scripts.
Uchihara, Hiroto & Mendoza, Juana. 2021. Minimality, maximality and perfect prosodic word in Alcozauca Mixtec. Natural Language and Linguistic Theory 40: 599–649.
Zhu, X. 2012. Multiregisters and four levels: A new tonal model. Journal of Chinese Linguistics 40(1): 1–17.

Figure 1. (Left) Map of Mexico and (Right) close-up map of region identifying landmarks of San Juan Piñas and Oaxaca de Juárez (capital). Map created with ggmap (Kahle & Wickham, 2013).

Figure 2. Waveforms and spectrograms illustrating variable voicing during release of /ⁿd/ and /ⁿd͡ʒ/. Top row shows the initial syllable, /ⁿdo/ of /ⁿdoʒo3/ ‘spring’, produced as [ⁿd] on the left and [ⁿd̥] on the right. Stop burst and VOT are segmented in green. Bottom row illustrates the sequence /i5Ɂⁿd͡ʒa35/ in /ko1ʃi5Ɂndʒa35/ ‘not stingy,’ with /ⁿd͡ʒ/ produced as [ⁿd͡ʒ] on left and [ⁿd̥͡ʒ] on the right. Stop release and fricative portion of /ⁿd͡ʒ/ is segmented in orange.

Figure 3. Positive and Negative VOT of voiceless stops [t, k, kʷ] and pre-nasalized consonants [ⁿd, ⁿd͡ʒ] in couplet-initial position. Large circles represent the mean VOT (in ms) for each stop and error bars represent one standard deviation. Values for individual tokens are represented by smaller circles.

Figure 4. Log SoE over the duration of prenasalization for [ⁿd] (dark blue) and [ⁿd͡ʒ] (purple). Lighter colors (ribbons) represent 95% confidence intervals.

Figure 5. Left shows waveform and spectrogram of [ka3ʰka3] ‘to walk.’ Aspiration on couplet-medial [ʰk] is indicated with superscript h. Right shows spectrogram and waveform of [ⁿda3ko3o3] ‘to leave.’ [k] is not preaspirated as it is in couplet-initial position, although it is word-medial. Light noise at the [k] closure onset is not audibly preaspiration but instead is attributed to echo.

Figure 6. Preaspiration measures of voiceless stops and affricate [ʰt, ʰt͡ʃ, ʰk]. Large circles represent the mean duration of preaspiration for each stop, and error bars represent one standard deviation. Values for individual tokens are represented by smaller circles.

Figure 7. Mean spectral slices of [v, s, ʃ, ʒ] averaged from 20 tokens each.

Figure 8. (Left) Log SoE over proportion time for [v, ʒ, l], consonants represented in color. (Right) Cepstral peak prominence over proportion time for [v, ʒ, l], consonants represented in color. Lighter colors (ribbons) represent 95% confidence intervals.

Figure 9. Plot of F1 and F2 values (Hz) of oral vowels with 1 standard deviation ellipses. Vowel labels are centered on the mean F1 and F2 values, and points represent individual tokens. Vowels are represented by color.

Figure 10. F3 values (Hz) of [o] and [u]. Large circles represent mean F3 for each vowel, and error bars represent one standard deviation. Values for individual tokens are represented by smaller circles.


Figure 11. Mean A1-P0 (dB) values for phonemic nasal (green circles), allophonic nasalized (orange triangles), and oral vowels (purple squares) at the vowel onset, vowel midpoint, and vowel offset. Vowel onset corresponds approximately to the 3% time point of the total vowel duration, vowel midpoint corresponds to the 50% time point, and vowel offset corresponds approximately to the 97% time point.


Figure 12. Waveforms, spectrograms, pitch tracks, and intensity tracks illustrating variation in glottalization. The glottalized portion is indicated by vertical boundaries.


Figure 13. Log SoE in pre-consonantal, VɁC (left), and intervocalic, VɁV (right), sequences. Gestural timing schema below the x-axis indicates ideal gestural timing between vowel articulation and glottalization. Lighter colors (ribbons) represent 95% confidence intervals.


Figure 14. f0 tracks (Hz) of High, Mid, and Low lexical tones. Thin lines represent individual tokens; thick lines represent the mean across all tokens.


Table 1. Bitonal melodies in monosyllabic and disyllabic bimoraic stems


Figure 15. f0 trajectories (Hz) for bitonal melodies in monosyllabic bimoraic stems.


Table 2. Bitonal melodies on monosyllabic bimoraic words used for acoustic analysis in Figure 15 across three vowel categories: /i/ (blue/bottom), /o/ (purple/middle), and /a/ (black/top) (color online). Note that N/A indicates no token was found with the vowel and tone pattern pairing.


Figure 16. f0 tracks (Hz) of [i3ta3] ‘flower’ (top left), [i3ta3va2] ‘flower!’ (top right), [le3so3] ‘rabbit’ (bottom left), and [le3so3va3] ‘rabbit!’ (bottom right) (f0 between vowels is interpolated for visualization purposes and does not represent real f0). Note that all tones are phonemically mid level tones /3/, but the downstepped mid [2] is phonetically implemented with a lower f0 in ‘flower!’; compare ‘rabbit!,’ in which all tones are phonetically realized as level [3].


Figure 17. f0 tracks (Hz) of [ti5ku↑6] ‘needle’ (top left), [ti5ku↑6ɲa↓4] ‘her needle’ (top right), [ⁿd͡ʒu5ma5] ‘little fish’ (bottom left), and [ⁿd͡ʒu5ma5ɲa5] ‘her little fish’ (bottom right) (f0 between vowels is interpolated for visualization purposes and does not represent real f0). Note that all tones are phonemically high level tones /5/.


Table 3. Attested monomorphemic word structures and their syllable structure