Computer-assisted assessment of free-text answers

Diana Pérez-Marín; Ismael Pascual-Nieto; Pilar Rodríguez

doi:10.1017/S026988890999018X


Published online by Cambridge University Press: 01 December 2009


Diana Pérez-Marín*: Affiliation:
Language and Computer Systems I Department, Computer Science Faculty, Office 2025, Ampliación del Rectorado Building, Tulipán Street, 28933 Móstoles, Universidad Rey Juan Carlos, Madrid, Spain; e-mail: [email protected]
Ismael Pascual-Nieto*: Affiliation:
Computer Science Department of the Universidad Autónoma of Madrid, Calle Francisco Tomás y Valiente, 11, Cantoblanco 28049, Madrid, Spain
Pilar Rodríguez*: Affiliation:
Computer Science Department of the Universidad Autónoma of Madrid, Calle Francisco Tomás y Valiente, 11, Cantoblanco 28049, Madrid, Spain
*: e-mail: [email protected], [email protected]


Abstract

The automatic assessment of students’ free-text answers has recently received much attention, owing to the need to explore and exploit new and more complex computer-based assessment methods. This paper reviews the state of the art in the field, focusing on the techniques that underpin these systems and on their evaluation metrics. Although the ideal system is still a long way off, existing systems are already used commercially and as a second opinion in exams such as the GMAT, which demonstrates the uptake of the field.

Type: Articles
Information: The Knowledge Engineering Review, Volume 24, Issue 4, December 2009, pp. 353–374

DOI: https://doi.org/10.1017/S026988890999018X
Copyright: Copyright © Cambridge University Press 2009


