Generalization Inference for a Computer-Mediated Graphic-Prompt Writing Test for ESL Placement

doi:10.1017/9781108669849.009

6 - Generalization Inference for a Computer-Mediated Graphic-Prompt Writing Test for ESL Placement

from Part II - Investigating Score Interpretations

Published online by Cambridge University Press: 14 January 2021

YunDeok Choi

Edited by Carol A. Chapelle (Iowa State University) and Erik Voss (Teachers College, Columbia University)


Summary

This argument-based validation study investigates the validity of score interpretations for a computer-based, graphic-prompt writing test, focusing on the generalization inference. The graphic-prompt writing test assesses examinees’ ability to incorporate visual graphic information into their writing. Analytic ratings on Graph Description, Content Development, Organization, and Grammar/Vocabulary (n = 2,424) and composite ratings (n = 606) of written test responses from 101 ESL students were analyzed using Generalizability (G) Theory and Multi-Faceted Rasch Measurement (MFRM). Findings indicated that three of the four analytic scales and the composites yielded dependable scores. The G-studies and MFRM analysis also revealed that the relative effect of raters on total score variance was not trivial for either composite or analytic scores, and that the three raters were not quite equivalent in rating severity. Nevertheless, the findings support the generalization inference to a large extent. Thus, it can be claimed that the graphic-prompt writing task scores were dependable enough to be used for the intended purposes, particularly with the two-rater and three-task test administration design.
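The dependability claim in the summary rests on G-theory coefficients. As a minimal sketch, and assuming a fully crossed person × task × rater design (the summary does not state the chapter's actual design), the index of dependability Φ for a chosen number of tasks and raters could be computed as follows. The function name, dictionary keys, and variance components are hypothetical and chosen only for illustration; they are not results from the study.

```python
def dependability(var, n_tasks, n_raters):
    """Index of dependability (Phi) for a crossed p x t x r D-study.

    `var` maps each effect to an estimated variance component.
    The keys used here are illustrative names, not taken from the chapter.
    """
    n_tr = n_tasks * n_raters
    # Absolute error variance: every component except the person effect,
    # each divided by the number of conditions it is averaged over.
    abs_error = (
        var["t"] / n_tasks
        + var["r"] / n_raters
        + var["pt"] / n_tasks
        + var["pr"] / n_raters
        + var["tr"] / n_tr
        + var["ptr_e"] / n_tr
    )
    return var["p"] / (var["p"] + abs_error)


# Hypothetical variance components (NOT the study's estimates).
components = {
    "p": 0.50,      # persons (the object of measurement)
    "t": 0.02,      # tasks
    "r": 0.05,      # raters
    "pt": 0.10,     # person-by-task interaction
    "pr": 0.03,     # person-by-rater interaction
    "tr": 0.01,     # task-by-rater interaction
    "ptr_e": 0.20,  # three-way interaction confounded with residual error
}

phi_2r_3t = dependability(components, n_tasks=3, n_raters=2)
phi_1r_1t = dependability(components, n_tasks=1, n_raters=1)
```

Comparing `phi_2r_3t` with `phi_1r_1t` shows how averaging over more tasks and raters shrinks absolute error variance, which is the logic behind recommending a particular rater-by-task administration design.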

Keywords

graphic prompt writing task; academic writing; generalization inference; placement decisions; source-based writing; computer-based test

Type: Chapter
Information: Validity Argument in Language Testing: Case Studies of Validation Research, pp. 120–153

DOI: https://doi.org/10.1017/9781108669849.009

Publisher: Cambridge University Press

Print publication year: 2021

