Published online by Cambridge University Press: 24 February 2014
This study reports the results of classroom research investigating the effects of corpus use in the process of revising compositions in English as a foreign language. Our primary aim was to investigate the relationship between the information extracted from corpus data and how that information actually helped in revising different types of errors in the essays. Previous research on ‘data-driven learning’ has often failed to provide rigorous criteria for choosing the words or phrases suitable for correction with corpus data. By investigating this relationship, the present study aims to clarify which errors should be corrected by consulting corpus data. Ninety-three undergraduate students from two universities in Tokyo each wrote a short essay in 20 minutes without a dictionary, and the instructors gave coded error feedback on two lexical or grammatical errors per essay. They deliberately selected one error considered appropriate for checking against corpus data and one that was more likely to be corrected without using any reference resource. Three weeks later, a short hands-on tutorial on the corpus query tool was given, followed by revision activities in which the participants revised their first drafts, with or without the tool depending on the code given to each error. A total of 188 errors were automatically classified into three categories (omission, addition and misformation) using natural language processing techniques, and all words and phrases tagged for errors were further annotated with part-of-speech (POS) information. The results show a significant difference in accuracy rates among the three error types when the students consulted the corpus: omission and addition errors were easily identified and corrected, whereas misformation errors were corrected with low accuracy. This indicates that certain error types are more suitable for checking against corpus data than others.
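The abstract does not specify how the automatic error classification or the POS annotation was carried out. As a purely illustrative sketch, not the authors' pipeline, the Python fragment below shows one way an error span and its correction might be compared to yield the omission/addition/misformation labels, with NLTK used for POS tagging; the comparison heuristic and the function names are assumptions introduced here for illustration.

```python
# A minimal, hypothetical sketch of the kind of pipeline the abstract
# describes: labelling an annotated error as omission, addition, or
# misformation by comparing the learner's span with its target form,
# then attaching POS tags. The paper does not name its tools; NLTK and
# the comparison heuristic below are illustrative assumptions.
import nltk

# Newer NLTK releases may need "averaged_perceptron_tagger_eng" instead.
nltk.download("averaged_perceptron_tagger", quiet=True)

def classify_error(erroneous: str, corrected: str) -> str:
    """Label an error by comparing the learner's span with the target form."""
    err_tokens = erroneous.split()
    cor_tokens = corrected.split()
    if not err_tokens and cor_tokens:
        return "omission"      # a required word is missing from the draft
    if err_tokens and not cor_tokens:
        return "addition"      # a superfluous word must be deleted
    return "misformation"      # a wrong form was used and must be replaced

def pos_annotate(span: str) -> list[tuple[str, str]]:
    """Attach part-of-speech tags to each token in an error-tagged span."""
    return nltk.pos_tag(span.split())

# Example: a misformation error ("informations" for "information").
print(classify_error("informations", "information"))  # -> misformation
print(pos_annotate("informations"))                   # -> [('informations', 'NNS')]
```

In this sketch an omission is an empty learner span paired with a non-empty correction, an addition is the reverse, and anything else defaults to misformation; a real implementation would need token-level alignment rather than this whole-span comparison.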