Automatic generation of short answer questions for reading comprehension assessment

YAN HUANG; LIANZHEN HE

doi:10.1017/S1351324915000455

Published online by Cambridge University Press: 13 January 2016

YAN HUANG: Language Technology Lab, Department of Theoretical and Applied Linguistics, University of Cambridge, 9 West Road, Cambridge CB3 9DA, UK

LIANZHEN HE: School of International Studies, Zhejiang University, No. 866 Yuhangtang Road, Hangzhou, 310058, P.R. China

Abstract

Writing items for reading comprehension assessment is time-consuming. Automating part of the process can help test designers develop assessments more efficiently and consistently. This paper presents an approach to automatically generating short answer questions for reading comprehension assessment. Our major contribution is to introduce Lexical Functional Grammar (LFG) as the linguistic framework for question generation, which enables systematic utilization of semantic and syntactic information. The approach can efficiently generate questions of better quality than those produced by previous high-performing question generation systems, and it uses paraphrasing and sentence selection to improve the cognitive complexity and effectiveness of the questions.

Type: Articles
Information: Natural Language Engineering, Volume 22, Issue 3, May 2016, pp. 457–489

DOI: https://doi.org/10.1017/S1351324915000455
Copyright: Copyright © Cambridge University Press 2016
