
Optimal Number of Strata for the Stratified Methods in Computerized Adaptive Testing

Published online by Cambridge University Press:  08 July 2014

Juan Ramón Barrada*
Affiliation:
Universidad de Zaragoza (Spain)
Francisco José Abad
Affiliation:
Universidad Autónoma de Madrid (Spain)
Julio Olea
Affiliation:
Universidad Autónoma de Madrid (Spain)
*Correspondence concerning this article should be addressed to Juan Ramón Barrada. Facultad de Ciencias Sociales y Humanas. Universidad de Zaragoza. 44003. Teruel (Spain). E-mail: [email protected]

Abstract

Test security can be a major problem in computerized adaptive testing, as examinees can share information about the items they receive. Of the item selection rules proposed to alleviate this risk, stratified methods are among those that have received the most attention. In these methods, only items with low discrimination can be presented at the beginning of the test, and the mean information of the items increases as the test progresses. To achieve this, the item bank must be divided into several strata according to the information of the items. To date, there is no clear guidance on the optimal number of strata into which the item bank should be split. In this study, we simulate conditions with different numbers of strata, from 1 (no stratification) to a number of strata equal to the test length (maximum level of stratification), while manipulating the maximum exposure rate that no item should surpass (r^max) across its whole domain. In this way, we can plot the relation between test security and accuracy, making it possible to determine the number of strata that leads to better security while holding measurement accuracy constant. Our data indicate that the best option is to stratify into as many strata as possible.
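The stratification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation code. It assumes that items are ranked by their discrimination parameter (one common choice for stratified methods; the abstract only says strata are formed "according to the information of the items"), that strata are as equal in size as possible, and that test positions are spread evenly over the strata. The function names `stratify_bank` and `stratum_for_position` are hypothetical, and item selection within a stratum and the r^max exposure restriction are left out.

```python
def stratify_bank(a_params, n_strata):
    """Split item indices into n_strata groups of ascending discrimination.

    Assumption: the discrimination parameter is the stratifying variable.
    Strata differ in size by at most one item.
    """
    if not 1 <= n_strata <= len(a_params):
        raise ValueError("n_strata must be between 1 and the bank size")
    # Item indices ordered from least to most discriminating.
    order = sorted(range(len(a_params)), key=lambda i: a_params[i])
    base, extra = divmod(len(order), n_strata)
    strata, start = [], 0
    for k in range(n_strata):
        size = base + (1 if k < extra else 0)
        strata.append(order[start:start + size])
        start += size
    return strata


def stratum_for_position(position, test_length, n_strata):
    """Map a 0-based test position to a 0-based stratum index.

    Assumes 1 <= n_strata <= test_length, matching the range studied
    (from no stratification up to one stratum per test position).
    """
    return position * n_strata // test_length


# Hypothetical six-item bank, three strata, six-item test.
a = [1.2, 0.5, 2.0, 0.8, 1.6, 0.3]
strata = stratify_bank(a, 3)
schedule = [stratum_for_position(p, 6, 3) for p in range(6)]
```

With `n_strata` equal to 1 every position draws from the whole bank (no stratification), and with `n_strata` equal to the test length each position has its own stratum, which corresponds to the two extremes compared in the study.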

Type
Research Article
Copyright
Copyright © Universidad Complutense de Madrid and Colegio Oficial de Psicólogos de Madrid 2014 

