Annotating an oral corpus using the Text Encoding Initiative. Methodology, problems, solutions
Published online by Cambridge University Press: 01 March 2008
Abstract
The objective of this paper is to describe and evaluate the application of the Text Encoding Initiative (TEI) Guidelines to a corpus of oral French, this being the first corpus of oral French where the TEI has been used. The paper explains the purpose of the corpus, both in creating a specialist corpus of néo-contage that will broaden the range of oral corpora available, and, more importantly, in creating a dataset to explore a variety of oral French that has a particularly interesting status in terms of factors such as conception orale/écrite, réalisation médiale and comportement communicatif (Koch and Oesterreicher 2001). The linguistic phenomena to be encoded are both stylistic (speech and thought presentation) and syntactic (negation, detachment, inversion), and all represent areas where previous research has highlighted the significance of factors such as medium, register and discourse type, as well as a host of linguistic factors (syntactic, phonetic, lexical). After a discussion of how a tagset can be designed and applied within the TEI to encode speech and thought presentation, negation, detachment and inversion, the final section of the paper evaluates the benefits and possible drawbacks of the methodology offered by the TEI when applied to a syntactic and stylistic markup of an oral corpus.
- Type: Articles
- Information: Journal of French Language Studies, Volume 18, Issue 1: Le français à la lumière des corpus, March 2008, pp. 103-119
- Copyright: © Cambridge University Press 2008