Generating example contexts to help children learn word meaning†

LIU LIU; JACK MOSTOW; GREGORY S. AIST

doi:10.1017/S1351324911000374

Published online by Cambridge University Press: 12 January 2012

LIU LIU: Google Pittsburgh, 6425 Penn Ave., Suite 700, Pittsburgh, PA 15206, USA
JACK MOSTOW: Project LISTEN, School of Computer Science, Carnegie Mellon University, RI-NSH 4103, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
GREGORY S. AIST: Applied Linguistics and Communication Studies, Iowa State University, 206 Ross Hall, Ames, IA 50011, USA

Abstract

This article addresses the problem of generating good example contexts to help children learn vocabulary. We describe VEGEMATIC, a system that constructs such contexts by concatenating overlapping five-grams from Google's N-gram corpus. We propose and operationalize a set of constraints to identify good contexts. VEGEMATIC uses these constraints to filter, cluster, score, and select example contexts. An evaluation experiment compared the resulting contexts against human-authored example contexts (e.g., from children's dictionaries and children's stories). Based on ratings by an expert blind to source, the average quality of the generated contexts was comparable to that of story sentences, though not as good as that of dictionary examples. A second experiment measured the percentage of generated contexts that lay judges rated as acceptable, and how long it took to rate them. The judges accepted only 28% of the examples, but took only 27 seconds on average to find the first acceptable example for each target word. This result suggests that hand-vetting VEGEMATIC's output may supply example contexts faster than creating them manually.
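
The idea of concatenating overlapping five-grams can be illustrated with a minimal sketch. It is not VEGEMATIC's implementation: the function name `extend_contexts`, the four-word overlap, the rightward-only extension, the `max_words` limit, and the toy five-grams are all assumptions made here for illustration, and none of the filtering, clustering, or scoring constraints are modeled.

```python
from collections import defaultdict


def extend_contexts(seed, five_grams, max_words=10):
    """Grow contexts rightward from `seed` by chaining five-grams.

    Assumption (not from the article): a five-gram may be appended when
    its first four words equal the last four words of the context so far.
    """
    # Index each five-gram by its four-word prefix -> possible next words.
    by_prefix = defaultdict(list)
    for gram in five_grams:
        by_prefix[tuple(gram[:4])].append(gram[4])

    results = []
    stack = [tuple(seed)]
    while stack:
        context = stack.pop()
        next_words = by_prefix.get(context[-4:], [])
        # Stop when the context is long enough or nothing overlaps.
        if len(context) >= max_words or not next_words:
            results.append(" ".join(context))
            continue
        for word in next_words:
            stack.append(context + (word,))
    return results


# Toy five-grams invented for this example (not taken from the corpus).
toy_five_grams = [
    ("the", "weary", "traveler", "sat", "down"),
    ("weary", "traveler", "sat", "down", "to"),
    ("traveler", "sat", "down", "to", "rest"),
]

contexts = extend_contexts(toy_five_grams[0], toy_five_grams)
print(contexts)
```

A real system would also extend contexts to the left of the target word and would need the constraints described in the abstract to discard the many ill-formed concatenations that this kind of chaining produces.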

Type: Articles
Information: Natural Language Engineering, Volume 19, Issue 2, April 2013, pp. 187–212

DOI: https://doi.org/10.1017/S1351324911000374
Copyright: Copyright © Cambridge University Press 2012

Footnotes

† This work, performed while the first author was a Master's student in the Language Technologies Institute at Carnegie Mellon University, was supported by the Institute of Education Sciences, US Department of Education, through Grant R305A080157 to Carnegie Mellon University. The opinions expressed are those of the authors and do not necessarily represent the views of the Institute or the US Department of Education. We thank Dr. Margaret McKeown for her expertise and assistance, and our lay judges for their participation.

