Search results for Computing: general interest

Challenges in Natural Language Processing

Edited by Madeleine Bates, Ralph M. Weischedel
Published online:

05 March 2010

Print publication:

24 September 1993
- Book
Although natural language processing has come far, the technology has not achieved a major impact on society. Is this because of some fundamental limitation that cannot be overcome? Or because there has not been enough time to refine and apply theoretical work already done? Editors Madeleine Bates and Ralph Weischedel believe it is neither; they feel that several critical issues have never been adequately addressed in either theoretical or applied work, and they have invited capable researchers in the field to do that in Challenges in Natural Language Processing. This volume will be of interest to researchers of computational linguistics in academic and non-academic settings and to graduate students in computational linguistics, artificial intelligence and linguistics.

Natural Language Parsing

Psychological, Computational, and Theoretical Perspectives
Edited by David R. Dowty, Lauri Karttunen, Arnold M. Zwicky
Published online:

25 January 2010

Print publication:

31 May 1985
- Book
This is a collection of new papers by leading researchers on natural language parsing. In the past, the problem of how people parse the sentences they hear - determine the identity of the words in these sentences and group these words into larger units - has been addressed in very different ways by experimental psychologists, by theoretical linguists, and by researchers in artificial intelligence, with little apparent relationship among the solutions proposed by each group. However, because of important advances in all these disciplines, research on parsing in each of these fields now seems to have something significant to contribute to the others, as this volume demonstrates. The volume includes some papers applying the results of experimental psychological studies of parsing to linguistic theory, others which present computational models of parsing, and a mathematical linguistics paper on tree-adjoining grammars and parsing.

Efficient Algorithms for Listing Combinatorial Structures

Leslie Ann Goldberg
Published online:

14 January 2010

Print publication:

22 April 1993
- Book
First published in 1993, this thesis is concerned with the design of efficient algorithms for listing combinatorial structures. The research described here gives some answers to the following questions: which families of combinatorial structures have fast computer algorithms for listing their members? What general methods are useful for listing combinatorial structures? How can these be applied to those families which are of interest to theoretical computer scientists and combinatorialists? Amongst those families considered are unlabelled graphs, first order one properties, Hamiltonian graphs, graphs with cliques of specified order, and k-colourable graphs. Some related work is also included, which compares the listing problem with the difficulty of solving the existence problem, the construction problem, the random sampling problem, and the counting problem. In particular, the difficulty of evaluating Pólya's cycle polynomial is demonstrated.

Semantic Interpretation and the Resolution of Ambiguity

Graeme Hirst
Published online:

18 December 2009

Print publication:

13 August 1987
- Book
Semantic Interpretation and the Resolution of Ambiguity presents an important advance in computer understanding of natural language. While parsing techniques have been greatly improved in recent years, the approach to semantics has generally been ad hoc and had little theoretical basis. Graeme Hirst offers a new, theoretically motivated foundation for conceptual analysis by computer, and shows how this framework facilitates the resolution of lexical and syntactic ambiguities. His approach is interdisciplinary, drawing on research in computational linguistics, artificial intelligence, Montague semantics, and cognitive psychology.

Affine Analysis of Image Sequences

Larry S. Shapiro
Published online:

18 December 2009

Print publication:

13 July 1995
- Book
Computer vision is a rapidly growing field which aims to make computers 'see' as effectively as humans. In this book Dr Shapiro presents a new computer vision framework for interpreting time-varying imagery. This is an important task, since movement reveals valuable information about the environment. The fully-automated system operates on long, monocular image sequences containing multiple, independently-moving objects, and demonstrates the practical feasibility of recovering scene structure and motion in a bottom-up fashion. Real and synthetic examples are given throughout, with particular emphasis on image coding applications. Novel theory is derived in the context of the affine camera, a generalisation of the familiar scaled orthographic model. Analysis proceeds by tracking 'corner features' through successive frames and grouping the resulting trajectories into rigid objects using new clustering and outlier rejection techniques. The three-dimensional motion parameters are then computed via 'affine epipolar geometry', and 'affine structure' is used to generate alternative views of the object and fill in partial views. The use of all available features (over multiple frames) and the incorporation of statistical noise properties substantially improve existing algorithms, giving greater reliability and reduced noise sensitivity.

Uncertain Inference

Henry E. Kyburg, Jr, Choh Man Teng
Published online:

07 December 2009

Print publication:

06 August 2001
- Book
Coping with uncertainty is a necessary part of ordinary life and is crucial to an understanding of how the mind works. For example, it is a vital element in developing artificial intelligence that will not be undermined by its own rigidities. There have been many approaches to the problem of uncertain inference, ranging from probability to inductive logic to nonmonotonic logic. This book seeks to provide a clear exposition of these approaches within a unified framework. The principal market for the book will be students and professionals in philosophy, computer science, and AI. Among the special features of the book are a chapter on evidential probability, which has not received a basic exposition before; chapters on nonmonotonic reasoning and theory replacement, matters rarely addressed in standard philosophical texts; and chapters on Mill's methods and statistical inference that cover material sorely lacking in the usual treatments of AI and computer science.

9 - Recognising Groups among Dialects
- By Jelena Prokić, University of Groningen, John Nerbonne, University of Groningen
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp 153-172
- Chapter
Summary

Abstract In this paper we apply various clustering algorithms to dialect pronunciation data. At the same time we propose several evaluation techniques that should be used in order to deal with the instability of the clustering techniques. The results have shown that three hierarchical clustering algorithms are not suitable for the data we are working with. The rest of the tested algorithms have successfully detected a two-way split of the data into the Eastern and Western dialects. At the aggregate level that we used in this research, no further division of sites can be asserted with high confidence.
INTRODUCTION
Dialectometry is a multidisciplinary field that uses various quantitative methods in the analysis of dialect data. Very often those techniques include classification algorithms such as hierarchical clustering algorithms used to detect groups within a certain dialect area. Although known for their instability (Jain and Dubes, 1988), clustering algorithms are often applied without evaluation (Goebl, 2007; Nerbonne and Siedle, 2005) or with only partial evaluation (Moisl and Jones, 2005). Very small differences in the input data can produce substantially different groupings of dialects (Nerbonne et al., 2008). Without proper evaluation, it is very hard to determine if the results of the applied clustering technique are an artifact of the algorithm or the detection of real groups in the data.
The aim of this paper is to evaluate algorithms used to detect groups among language dialect varieties measured at the aggregate level. The data used in this research is dialect pronunciation data that consists of various pronunciations of 156 words collected all over Bulgaria.

6 - Mutual Intelligibility of Standard and Regional Dutch Language Varieties
- By Leen Impe, University of Leuven, Dirk Geeraerts, University of Leuven, Dirk Speelman, University of Leuven
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp 101-118
- Chapter
Summary

Abstract In this experimental study, we aim to arrive at a global picture of the mutual intelligibility of various Dutch language varieties by carrying out a computer-controlled lexical decision task in which ten target varieties are evaluated – the Belgian and Netherlandic Dutch standard language as well as four regional varieties of both countries. We auditorily presented real as well as pseudo-words in various varieties of Dutch to Netherlandic and Belgian test subjects, who were asked to decide as quickly as possible whether the items were existing Dutch words or not. The experiment's working assumption is that the faster the subjects react, the better the intelligibility of (the language variety of) the word concerned.
INTRODUCTION
Research framework
When speakers of different languages or language varieties communicate with each other, one group (generally the economically and culturally weaker one) often switches to the language or language variety of the other, or both groups of speakers adopt a third, common lingua franca. However, if the languages or language varieties are so much alike that the degree of mutual comprehension is sufficiently high, both groups of speakers might opt for communicating in their own language variety.
This type of interaction between closely related language varieties, which Haugen (1966) coins semicommunication and Braunmüller and Zeevaert (2001) refer to as receptive multilingualism, has been investigated between speakers of native Indian languages in the United States (Pierce 1952), between Spaniards and Portuguese (Jensen, 1989), between speakers of Scandinavian languages (Zeevaert, 2004; Gooskens, 2006; Lars-Olof Delsing, 2007) and between Slovaks and Czechs (Budovičová, 1987).

7 - The Dutch-German Border: Relating Linguistic, Geographic and Social Distances
- By Folkert de Vriend, Radboud University, Charlotte Giesbers, University in Nijmegen, Roeland van Hout, Louis ten Bosch, Radboud University
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp 119-134
- Chapter
Summary

Abstract In this paper we relate linguistic, geographic and social distances to each other in order to get a better understanding of the impact the Dutch-German state border has had on the linguistic characteristics of a sub-area of the Kleverlandish dialect area. This area used to be a perfect dialect continuum. We test three models for explaining today's pattern of linguistic variation in the area. In each model another variable is used as the determinant of linguistic variation: geographic distance (continuum model), the state border (gap model) and social distance (social model). For the social model we use perceptual data for friends, relatives and shopping locations. Testing the three models reveals that nowadays the dialect variation in the research area is closely related to the existence of the state border and to the social structure of the area. The geographic spatial configuration hardly plays a role anymore.
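One simple way to picture the comparison of the three models is to correlate the linguistic distances between pairs of locations with each candidate predictor in turn. The sketch below is only an illustration under stated assumptions: the excerpt does not say which statistical procedure the authors used, the function name `pearson` is introduced here, and all numbers are invented toy values, not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Invented toy values for four location pairs (NOT data from the chapter).
linguistic = [0.1, 0.2, 0.6, 0.7]   # linguistic distance per pair
geographic = [5, 20, 10, 25]        # distance in km (continuum model)
border = [0, 0, 1, 1]               # 1 if the pair straddles the border (gap model)

r_continuum = pearson(linguistic, geographic)
r_gap = pearson(linguistic, border)
```

In this toy data the border indicator correlates more strongly with linguistic distance than geographic distance does, which is the kind of pattern the abstract describes. Pairwise distances are not independent observations, so a real analysis would need a significance test suited to distance matrices.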
INTRODUCTION
The Dutch-German state border south of the river Rhine was established in 1830. Before that time, the administrative borders in this region frequently changed. The Kleverlandish dialect area, which extends from Duisburg in Germany to Nijmegen in The Netherlands, crosses the state border south of the Rhine. The area is demarcated by the Uerdingen line in the south, the diphthongisation line of the West Germanic ‘i’ in the West, and the border with the Low Saxon dialects of the Achterhoek area in the North-East. The geographic details of the area can be found in Figure 1 (the state border is depicted with a dashed-dotted line).

13 - The Role of Concept Characteristics in Lexical Dialectometry
- By Dirk Speelman, University of Leuven, Dirk Geeraerts, University of Leuven
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp 221-242
- Chapter
Summary

Abstract In this paper the role of concept characteristics in lexical dialectometric research is examined in three consecutive logical steps. First, a regression analysis of data taken from a large lexical database of Limburgish dialects in Belgium and The Netherlands is conducted to illustrate that concept characteristics such as concept salience, concept vagueness and negative affect contribute to the lexical heterogeneity in the dialect data. Next, it is shown that the relationship between concept characteristics and lexical heterogeneity influences the results of conventional lexical dialectometric measurements. Finally, a dialectometric procedure is proposed which downplays this undesired influence, thus making it possible to obtain a clearer picture of the ‘truly’ regional variation. More specifically, a lexical dialectometric method is proposed in which concept characteristics form the basis of a weighting schema that determines to which extent concept specific dissimilarities can contribute to the aggregate dissimilarities between locations.
BACKGROUND AND RESEARCH QUESTIONS
An important assumption underlying most if not all methods of dialectometry is that the automated analysis of the differences in language use between different locations, as they are recorded by dialectologists in large scale surveys, can reveal patterns which directly reflect regional variation. In this paper, in which we focus on lexical variation, we want to address one factor, viz. concept characteristics, which we will claim complicates this picture.
The argumentation which underlies our claim consists of three consecutive logical steps. As a first step, we analyse data taken from a large lexical database of Limburgish dialects in Belgium and The Netherlands, in which we more particularly zoom in on the names for concepts in the field of ‘the human body’.

2 - Panel Discussion on Computing and the Humanities
- By John Nerbonne, University of Groningen, Paul Heggarty, Cambridge, Roeland van Hout, David Robey, Oxford e-Research Centre
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp 19-38
- Chapter
Summary

This is the report of a panel discussion held in connection with the special session on computational methods in dialectology at Methods XIII: Methods in Dialectology on 5 August, 2008 at the University of Leeds. We scheduled this panel discussion in order to reflect on what the introduction of computational methods has meant to our subfield of linguistics, dialectology (in alternative divisions of linguistic subfields also known as variationist linguistics), and whether the dialectologists' experience is typical of such introductions in other humanities studies. Let's emphasise that we approach the question as working scientists and scholars in the humanities rather than as methodology experts or as historians or philosophers of science, i.e. we wished to reflect on how the introduction of computational methods has gone in our own field in order to conduct our own future research more effectively, or alternatively, to suggest to colleagues in neighbouring disciplines which aspects of computational studies have been successful, which have not been, and which might have been introduced more effectively. Since we explicitly wished to reflect not only on how things have gone in dialectology, but also to compare our experiences to others, we invited panellists with broad experience in linguistics and other fields.
We introduce the chair and panellists briefly.
John Nerbonne chaired the panel discussion. He works on dialectology, but also on grammar, and on applications such as language learning and information extraction and information access. He works in Groningen, and is past president of the Association for Computational Linguistics (2002).

11 - Factor Analysis of Vowel Pronunciation in Swedish Dialects
- By Therese Leinonen, University of Helsinki
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp 189-204
- Chapter
Summary

Abstract In this study 91 local Swedish dialects were analysed based on vowel pronunciation. Acoustic measurements of vowel quality were made for 18 vowels of 1,014 speakers by means of principal component analysis of vowel spectra. Two principal components were extracted explaining more than ¾ of the total variance in the vowel spectra. Plotting vowels in the PC1-PC2 plane showed a solution with strong resemblance to vowels in a formant plane. Per-location averages of all speakers were calculated and factor analysis was run with the 91 locations as data cases and the two acoustic components of the 18 words as variables. Nine factors were extracted corresponding to distinct geographic distribution patterns. The factor scores of the analysis revealed co-occurrence of a number of linguistic features.
INTRODUCTION
The traditional method of identifying dialect areas has been the so-called isogloss method, where researchers choose some linguistic features that they find representative for the dialect areas and draw lines on maps based on different realisations of these features. One problem with the isogloss method is that isoglosses rarely coincide, and a second is that the choice of linguistic features is subjective and depends on what the researcher chooses to emphasise. Dialectometric research has been trying to avoid these problems by aggregating over large data sets and using more objective data-driven methods when determining dialect areas (Séguy, 1973; Goebl, 1982; Heeringa, 2004; Nerbonne, 2009).

12 - Representing Tone in Levenshtein Distance
- By Cathryn Yang, La Trobe University, Andy Castro
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp 205-220
- Chapter
Summary

Abstract Levenshtein distance, also known as string edit distance, has been shown to correlate strongly with both perceived distance and intelligibility in various Indo-European languages (Gooskens and Heeringa, 2004; Gooskens, 2006). We apply Levenshtein distance to dialect data from Bai (Allen, 2004), a Sino-Tibetan language, and Hongshuihe (HSH) Zhuang (Castro and Hansen, accepted), a Tai language. In applying Levenshtein distance to languages with contour tone systems, we ask the following questions: 1) How much variation in intelligibility can tone alone explain? and 2) Which representation of tone results in the Levenshtein distance that shows the strongest correlation with intelligibility test results? This research evaluates six representations of tone: onset, contour and offset; onset and contour only; contour and offset only; target approximation (Xu & Wang, 2001); autosegments of H and L; and Chao's (1930) pitch numbers. For both languages, the more fully explicit onset-contour-offset and onset-contour representations showed significantly stronger inverse correlations with intelligibility. This suggests that, for cross-dialectal listeners, the optimal representation of tone in Levenshtein distance should be at a phonetically explicit level and include information on both onset and contour.
INTRODUCTION
The Levenshtein distance algorithm measures the phonetic distance between closely related language varieties by counting the cost of transforming the phonetic segment string of one cognate into another by means of insertions, deletions and substitutions. After Kessler (1995) first applied the algorithm to dialect data in Irish Gaelic, Heeringa (2004) showed that cluster analysis based on Levenshtein distances agreed remarkably with expert consensus on Dutch dialect groupings.
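The edit-cost idea described above can be sketched with the textbook dynamic-programming formulation. This is a minimal illustration, not the chapter's implementation: the function name `levenshtein`, the uniform default costs, and the segment strings in the usage lines are all assumptions made here, and the sketch does not model tone.

```python
def levenshtein(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Minimum total cost of insertions, deletions and substitutions
    needed to turn the sequence `source` into the sequence `target`."""
    # previous[j] = cost of turning the source prefix processed so far
    # into the first j symbols of target.
    previous = [j * ins_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        current = [i * del_cost]  # delete the first i source symbols
        for j, t in enumerate(target, start=1):
            current.append(min(
                previous[j] + del_cost,       # delete s
                current[j - 1] + ins_cost,    # insert t
                previous[j - 1] + (0 if s == t else sub_cost),  # match or substitute
            ))
        previous = current
    return previous[-1]


# Hypothetical segment strings for one cognate in two varieties (invented).
variety_a = ["m", "ɛ", "l", "k"]
variety_b = ["m", "e", "l", "ə", "k"]
distance = levenshtein(variety_a, variety_b)  # one substitution + one insertion
```

Working on lists of segments rather than raw characters keeps multi-character phonetic symbols intact. Dialectometric applications often replace the uniform costs with graded segment distances, which this sketch would accommodate by swapping `sub_cost` for a lookup.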

From the Editors
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp v-vi
- Chapter
Summary

We are pleased to launch the first of several special issues designed to highlight cutting-edge research, methods, applications, literature, and websites in key fields of humanities and arts computing. The current double issue on variationist linguistics and computational humanities is an exemplar of what we hope to accomplish, especially in shortening the time it takes for important papers to move from initial presentation to publication. It appears under the guest editorship of John Nerbonne, Professor of Humanities Computing, and Charlotte Gooskens, Associate Professor of Scandinavian Languages and Literature, both at the University of Groningen, The Netherlands; Sebastian Kürschner, Tenure track position (‘Juniorprofessur’) in variationist linguistics and language contact at the University of Erlangen-Nürnberg, Germany; and Renée van Bezooijen, Researcher at the University of Groningen, The Netherlands. This issue also introduces a roundtable discussion that we intend to become a regular feature of these special editions. The aim of the forum is to assess contributions to the field and link them to the broader interests of humanities and arts computing, as well as to highlight opportunities for connection and research within and among disciplines.
Over the next year, we will publish two additional thematic issues. Volume 3.1 will focus on humanities GIS. The past decade has witnessed an explosion of interest in the application of geo-spatial technologies to history, literature, and other arts and humanities disciplines. The special issue will highlight leading presentations from an August 2008 conference at the University of Essex and will include two new features – book reviews and website/tool reviews.

Notes on Contributors
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp vii-xii
- Chapter

3 - Making Sense of Strange Sounds: (Mutual) Intelligibility of Related Language Varieties. A Review
- By Vincent J. van Heuven, Leiden University
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp 39-62
- Chapter
Summary

INTRODUCTION
Two basic questions
In this paper we ask two questions, which superficially seem to ask the same thing but in actual fact do not. First, we ask to what degree two languages (or language varieties) A and B resemble each other. The second question is how well a listener of variety B understands a speaker of variety A.
When we ask to what degree two language varieties resemble one another, or how different they are (which is basically the same question), it should be clear that the answer cannot be expressed in a single number. Languages differ from each other not in just one dimension but in a great many respects. They may differ in their sound inventories, in the details of the sounds in the inventory, in their stress, tone and intonation systems, in their vocabularies, and in the way they build words from morphemes and sentences from words. Last, but not least, they may differ in the meanings they attach to the forms in the language, in so far as the forms in two languages may be related to each other. In order to express the distance between two languages, we need a weighted average of the component distances along each of the dimensions identified (and probably many more). So, linguistic distance is a multidimensional phenomenon and we have no a priori way of weighing the dimensions.
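The weighted average mentioned above can be made concrete with a small sketch. Everything in it is illustrative: the function name `weighted_distance`, the dimension labels, and the numbers are assumptions introduced here, not values from the chapter.

```python
def weighted_distance(component_distances, weights):
    """Weighted average of per-dimension distances between two varieties.

    Both arguments are dicts keyed by dimension name; every dimension in
    `component_distances` must have a weight.
    """
    total_weight = sum(weights[dim] for dim in component_distances)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(weights[dim] * dist
                       for dim, dist in component_distances.items())
    return weighted_sum / total_weight


# Invented component distances on a 0-1 scale (hypothetical values).
distances = {"phonetic": 0.4, "lexical": 0.2, "syntactic": 0.1}

equal = weighted_distance(distances, {"phonetic": 1, "lexical": 1, "syntactic": 1})
phonetic_heavy = weighted_distance(distances, {"phonetic": 2, "lexical": 1, "syntactic": 1})
```

The two calls return different overall distances for the same pair of varieties, which illustrates the problem the text raises: without a principled choice of weights, the single number is not well defined.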
The answer to the question how well listener B understands speaker A can be expressed as a single number. If listener B does not understand speaker A at all, the number would be zero. If listener B gets every detail of speaker A's intentions, the score would be maximal.

Frontmatter
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp i-ii
- Chapter

14 - What Role does Dialect Knowledge Play in the Perception of Linguistic Distances?
- By Wilbert Heeringa, Meertens Institute, Charlotte Gooskens, University of Groningen, Koenraad De Smedt, University of Bergen
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp 243-260
- Chapter
Summary

Abstract The present paper investigates to what extent subjects base their judgments of linguistic distances on actual dialect data presented in a listening experiment and to what extent they make use of previous knowledge of the dialects when making their judgments. The point of departure for our investigation was the distances between 15 Norwegian dialects as perceived by Norwegian listeners. We correlated these perceptual distances with objective phonetic distances measured on the basis of the transcriptions of the recordings used in the perception experiment. In addition, we correlated the perceptual distances with objective distances based on other datasets. On the basis of the correlation results and multiple regression analyses we conclude that the listeners did not base their judgments solely on information that they heard during the experiments but also on their general knowledge of the dialects. This conclusion is confirmed by the fact that the effect is stronger for the group of listeners who recognised the dialects than for listeners who did not recognise the dialects on the tape.
INTRODUCTION
To what extent do subjects base their judgment of linguistic distances between dialects on what they really hear, i.e. on the linguistic phenomena available in the speech signal, and to what degree do they generalise from the knowledge that they have from previous confrontations with the dialects? This is the central question of the investigation described in this paper. The answer to this question is important to scholars who want to understand how dialect speakers perceive dialect pronunciation differences and may give more insight into the mechanisms behind the way in which linguistic variation is experienced.

5 - Linguistic Determinants of the Intelligibility of Swedish Words among Danes
- By Sebastian Kürschner, University of Erlangen-Nürnberg, Charlotte Gooskens, University of Groningen, Renée van Bezooijen, University of Nijmegen
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp 83-100
- Chapter
Summary

Abstract In the present investigation we aim to determine to what degree various linguistic factors contribute to the intelligibility of Swedish words among Danes. We correlated the results of an experiment on word intelligibility with eleven linguistic factors and carried out logistic regression analyses. In the experiment, the intelligibility of 384 frequent Swedish words was tested among Danish listeners via the Internet. The choice of eleven linguistic factors was motivated by their contribution to intelligibility in earlier studies. The strongest correlation was the negative correlation between word intelligibility and phonetic distances. Word length, different syllable numbers, foreign sounds, neighbourhood density, word frequency, orthography, and the absence of the prosodic phenomenon of ‘stød’ in Swedish also contribute significantly to intelligibility. Although the results thus show that linguistic factors contribute to the intelligibility of single words, the amount of explained variance was not very large (R² (Cox and Snell) = .16, R² (Nagelkerke) = .21) when compared with earlier studies which were based on aggregate intelligibility. Partly, the lower scores result from the logistic regression model used. It was necessary to use logistic regression in our study because the intelligibility scores were coded in a binary variable. Additionally, we attribute the lower correlation to the higher number of idiosyncrasies of single words compared with the aggregate intelligibility and linguistic distance used in earlier studies. Based on observations in the actual data from the intelligibility experiment, we suggest further steps to be taken to improve the predictability of word intelligibility.
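The two pseudo-R² values quoted in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of the Cox and Snell and Nagelkerke measures. The function names and the log-likelihood values in the usage lines are hypothetical and are not taken from the chapter.

```python
import math


def cox_snell_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Cox and Snell pseudo-R^2 from the log-likelihoods of the
    intercept-only model (ll_null) and the fitted model (ll_model),
    for n observations."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke pseudo-R^2: Cox and Snell rescaled by its maximum
    attainable value, so that the measure can reach 1."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2


# Hypothetical log-likelihoods, chosen only to show the calling convention.
cs = cox_snell_r2(ll_null=-650.0, ll_model=-600.0, n=1000)
nk = nagelkerke_r2(ll_null=-650.0, ll_model=-600.0, n=1000)
print(f"Cox and Snell: {cs:.3f}, Nagelkerke: {nk:.3f}")
```

Because the Cox and Snell measure has an upper bound below 1 for a binary outcome, the Nagelkerke value is never smaller than it, which is consistent with the .16 and .21 reported above.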

15 - Quantifying Dialect Similarity by Comparison of the Lexical Distribution of Phonemes
- By Warren Maguire, University of Edinburgh
Edited by John Nerbonne, University of Groningen, Charlotte Gooskens, University of Groningen, Sebastian Kürschner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Renée van Bezooijen, University of Groningen
Book:

Computing and Language Variation

Published by:

Edinburgh University Press

Published online:

12 September 2012

Print publication:

04 December 2009, pp 261-278
- Chapter
Summary

Abstract This paper describes a new method for quantifying the similarity of the lexical distribution of phonemes in different varieties of a language (in this case English). In addition to introducing the method, it discusses phonological problems which must be addressed if any comparison of this sort is to be attempted, and applies the method to a limited data set of varieties of English. Since the method assesses their structural similarity, it will be useful for analysing the historical development of varieties of English and the relationships (either as a result of common origin or of contact) that hold between them.
INTRODUCTION
In recent years considerable progress has been made in assessing the relationships between linguistic varieties by measuring the similarity between strictly comparable sets of phonetic data. In particular, measurement of Levenshtein Distance (see, for example, Nerbonne, Heeringa, and Kleiweg, 1999; Nerbonne and Heeringa, 2001; Heeringa, 2004) has proved useful for determining the relationships between closely related varieties, and the ‘Sound Comparisons’ method for assessing the distance between varieties provides a very promising alternative technique for looking into the changing relationships between closely-related and not so closely-related varieties (Heggarty, McMahon and McMahon, 2005; McMahon, Heggarty, McMahon and Maguire, 2007).
Phonetic comparison algorithms of this sort are not, however, without their problems. Firstly, they often depend upon auditory phonetic transcriptions of one degree of fineness or another, with all the associated issues of transcriber isoglosses, inaccuracies and realism that this method brings (see Milroy and Gordon, 2003: 144–152 for a discussion of the issues).
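As a rough illustration of the Levenshtein Distance mentioned in the introduction above, the following Python sketch computes the plain unit-cost edit distance between two sequences of segments. It is an assumption-laden simplification: the cited dialectometric work may use refinements such as length normalisation or graded segment distances, none of which is shown here, and the function name `levenshtein` is hypothetical.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Unit-cost edit distance between two segment sequences:
    the minimum number of insertions, deletions and substitutions
    needed to turn a into b."""
    # previous[j] holds the distance between the first i-1 segments of a
    # and the first j segments of b.
    previous = list(range(len(b) + 1))
    for i, seg_a in enumerate(a, start=1):
        current = [i]  # turning i segments of a into an empty sequence
        for j, seg_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if seg_a == seg_b else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


# Plain strings are used as stand-ins for phonetic transcriptions.
print(levenshtein("kitten", "sitting"))  # 3
```

Working on sequences rather than raw strings allows each element to be a whole transcribed segment, for example a vowel together with its diacritics, instead of a single character.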


1645 results in Computing: general interest

