Analyzing geospatial variation in articulation rate using crowdsourced speech data

Published online by Cambridge University Press: 18 April 2017

Adrian Leemann*
Affiliation: Department of Linguistics and English Language, Lancaster University
*Address for correspondence: Adrian Leemann, Department of Linguistics and English Language, County South, Lancaster University, LA1 4YL, United Kingdom, +44 77 153 999 45, [email protected]

Abstract

Most recent studies on the geographical distribution of acoustic features analyze comparatively few speakers and localities, both of which may be unrepresentative of the diversity found in larger or more spatially fragmented populations. In the present study we introduce a new paradigm that enables the crowdsourcing of acoustic features through smartphones. We used Dialäkt Äpp, a free iOS app that allows users to record themselves, to crowdsource audio data. Nearly 3,000 speakers from 452 localities in German-speaking Switzerland provided recordings; we measured articulation rates for these speakers using a metric based on the durations of the intervals between consecutive vowel onsets. Results revealed distinct regional differences in articulation rate, both between major dialect regions and between individual localities. Sampling 452 localities enabled analyses at an unprecedented spatial resolution. Results further revealed a robust effect of gender, with women articulating significantly more slowly than men. Both the geographical patterns and the effect of gender found in this study corroborate similar findings on Swiss German previously reported for a very limited set of localities, thus validating the crowdsourcing framework. Because this framework is new, much of the discussion is devoted to methodological caveats.
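The abstract describes the articulation rate metric only as being based on the durations of intervals between consecutive vowel onsets. As a rough illustration of that idea, and not the study's actual procedure, the following Python sketch takes a list of vowel onset times in seconds and returns the reciprocal of the mean inter-onset interval, that is, vowel onsets per second. The function names, the example onset times, and the optional `max_interval` cutoff for discarding intervals that likely span a pause are all assumptions introduced here.

```python
from statistics import mean


def inter_onset_intervals(vowel_onsets):
    """Return the durations (in seconds) between consecutive vowel onsets."""
    onsets = sorted(vowel_onsets)
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


def articulation_rate(vowel_onsets, max_interval=None):
    """Estimate articulation rate as vowel onsets per second.

    The rate is the reciprocal of the mean inter-onset interval.
    max_interval is a hypothetical cutoff (in seconds): intervals longer
    than it are treated as containing a pause and are left out.
    """
    intervals = inter_onset_intervals(vowel_onsets)
    if max_interval is not None:
        intervals = [d for d in intervals if d <= max_interval]
    if not intervals:
        raise ValueError("need at least two vowel onsets within the cutoff")
    return 1.0 / mean(intervals)


# Made-up onset times (seconds) for one short recording.
example_onsets = [0.10, 0.30, 0.50, 0.70, 0.90]
example_rate = articulation_rate(example_onsets)  # about 5 onsets per second
```

Working from onset-to-onset intervals means that only vowel onsets need to be located in each recording, rather than full syllable boundaries. Whether and how the study excluded pauses is not stated in the abstract, so the cutoff above is purely illustrative.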

Type: Articles
Copyright: © Cambridge University Press 2017

Leemann supplementary material

Leemann supplementary material 1 (Audio, 62.3 KB)
Leemann supplementary material 2 (Audio, 64.9 KB)
Leemann supplementary material 3 (Audio, 52.9 KB)
Leemann supplementary material 4 (Audio, 75.6 KB)
Leemann supplementary material 5 (Audio, 105.2 KB)
Leemann supplementary material 6 (Audio, 74.4 KB)
Leemann supplementary material 7 (Audio, 63.1 KB)
Leemann supplementary material 8 (Audio, 76.8 KB)
Leemann supplementary material 9 (Audio, 73.2 KB)
Leemann supplementary material 10 (Audio, 55.5 KB)
Leemann supplementary material 11 (Audio, 70.3 KB)
Leemann supplementary material 12 (Audio, 56.7 KB)
Leemann supplementary material 13 (Audio, 93.8 KB)