Analyzing geospatial variation in articulation rate using crowdsourced speech data

Adrian Leemann

doi:10.1017/jlg.2016.11

Published online by Cambridge University Press: 18 April 2017

Affiliation: Department of Linguistics and English Language, Lancaster University
Address for correspondence: Adrian Leemann, Department of Linguistics and English Language, County South, Lancaster University, LA1 4YL, United Kingdom, +44 77 153 999 45, [email protected]

Abstract

Most recent studies on the geographical distribution of acoustic features analyze comparatively few speakers and localities, both of which may be unrepresentative of the diversity found in larger or more spatially fragmented populations. In the present study we introduce a new paradigm that enables the crowdsourcing of acoustic features through smartphone devices. We used Dialäkt Äpp, a free iOS app that allows users to record themselves, to crowdsource audio data. Nearly 3,000 speakers from 452 localities in German-speaking Switzerland provided recordings; we measured articulation rates for these speakers using a metric based on duration intervals between consecutive vowel onsets. Results revealed distinct regional differences in articulation rate between major dialect regions and individual localities. The specification of 452 localities enabled analyses at an unprecedented spatial resolution. Results further revealed a robust effect of gender, with women articulating significantly more slowly than men. Both the geographical patterns and the effect of gender found in this study corroborate similar findings on Swiss German previously reported in a very limited set of localities, thus verifying the validity of the crowdsourcing framework. Given that this framework is new, a large part of the discussion is devoted to methodological caveats.
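The abstract describes the articulation-rate metric only as being based on duration intervals between consecutive vowel onsets. As a rough illustration, and not the study's actual procedure, the following Python sketch assumes the rate is taken as the inverse of the mean inter-onset interval within a pause-free stretch of speech. The function names and the onset times are hypothetical.

```python
from statistics import mean


def inter_onset_intervals(vowel_onsets):
    """Return the durations (in seconds) between consecutive vowel onsets."""
    onsets = sorted(vowel_onsets)
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


def articulation_rate(vowel_onsets):
    """Vowel-onset intervals per second, i.e. the inverse of the mean interval.

    Assumption: the onsets come from one pause-free stretch of speech.
    """
    intervals = inter_onset_intervals(vowel_onsets)
    if not intervals:
        raise ValueError("need at least two vowel onsets")
    return 1.0 / mean(intervals)


# Hypothetical onset times in seconds for one recording (not real data).
example_onsets = [0.10, 0.30, 0.50, 0.70, 0.90]
rate = articulation_rate(example_onsets)
print(f"{rate:.2f} intervals per second")
```

Under this assumption, shorter intervals between vowel onsets yield a higher rate, so a per-speaker value of this kind could then be aggregated by locality or compared across genders.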

Type: Articles
Information: Journal of Linguistic Geography, Volume 4, Issue 2, September 2016, pp. 76-96

DOI: https://doi.org/10.1017/jlg.2016.11
Copyright: Copyright © Cambridge University Press 2017

Leemann supplementary material

Leemann supplementary material 1: Audio, 62.3 KB
Leemann supplementary material 2: Audio, 64.9 KB
Leemann supplementary material 3: Audio, 52.9 KB
Leemann supplementary material 4: Audio, 75.6 KB
Leemann supplementary material 5: Audio, 105.2 KB
Leemann supplementary material 6: Audio, 74.4 KB
Leemann supplementary material 7: Audio, 63.1 KB
Leemann supplementary material 8: Audio, 76.8 KB
Leemann supplementary material 9: Audio, 73.2 KB
Leemann supplementary material 10: Audio, 55.5 KB
Leemann supplementary material 11: Audio, 70.3 KB
Leemann supplementary material 12: Audio, 56.7 KB
Leemann supplementary material 13: Audio, 93.8 KB
