Automatic generation of short answer questions for reading comprehension assessment

Published online by Cambridge University Press: 13 January 2016

YAN HUANG
Affiliation:
Language Technology Lab, Department of Theoretical and Applied Linguistics, University of Cambridge, 9 West Road, Cambridge CB3 9DA, UK
LIANZHEN HE
Affiliation:
School of International Studies, Zhejiang University, No. 866 Yuhangtang Road, Hangzhou, 310058, P.R. China

Abstract

Writing items for reading comprehension assessment is time-consuming. Automating part of the process can help test designers develop assessments more efficiently and consistently. This paper presents an approach to automatically generating short answer questions for reading comprehension assessment. Our major contribution is to introduce Lexical Functional Grammar (LFG) as the linguistic framework for question generation, which enables systematic use of semantic and syntactic information. The approach can efficiently generate questions of better quality than those produced by previous high-performing question generation systems, and it uses paraphrasing and sentence selection to improve the cognitive complexity and effectiveness of the questions.

Type
Articles
Copyright
Copyright © Cambridge University Press 2016 
