Asymptotic Standard Errors of IRT Observed-Score Equating Methods

Published online by Cambridge University Press: 01 January 2025

Haruhiko Ogasawara*
Affiliation:
Otaru University of Commerce
* Requests for reprints should be sent to Haruhiko Ogasawara, Department of Information and Management Science, Otaru University of Commerce, 3-5-21, Midori, Otaru 047-8501, Japan. E-Mail: [email protected]

Abstract

An IRT observed-score equating method that uses chain equating through a third test, without equating coefficients, is presented under the assumption of the three-parameter logistic model. The asymptotic standard errors of the equated scores given by this method are obtained using the results of M. Liou and P.E. Cheng. The asymptotic standard errors of the currently used IRT observed-score equating method, which relies on a synthetic examinee group with equating coefficients, are also provided. Numerical examples show that the standard errors of these observed-score equating methods are similar to those of the corresponding true-score equating methods, except in the range of low scores.

Type
Articles
Copyright
Copyright © 2003 The Psychometric Society

Footnotes

The author is indebted to Michael J. Kolen for access to the real data used in this article and anonymous reviewers for their corrections and suggestions on this work.

References

Angoff, W.H. (1971). Scales, norms, and equivalent scores. In Thorndike, R.L. (Ed.), Educational measurement (2nd ed., pp. 508-600). Washington, DC: American Council on Education.
Bahadur, R.R. (1966). A note on quantiles in large samples. Annals of Mathematical Statistics, 37, 577-580.
Bentler, P.M., Dudgeon, P. (1996). Covariance structure analysis: Statistical practice, theory, and directions. Annual Review of Psychology, 47, 563-592.
Bock, R.D., Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443-459.
Bock, R.D., Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179-197.
Braun, H.I., Holland, P.W. (1982). Observed-score test equating: A mathematical analysis of some ETS equating procedures. In Holland, P.W., Rubin, D.B. (Eds.), Test equating (pp. 9-49). New York, NY: Academic Press.
Cox, D.R. (1961). Tests of separate families of hypotheses. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1, 105-123.
Ghosh, J.K. (1971). A new proof of the Bahadur representation of quantiles and an application. Annals of Mathematical Statistics, 42, 1957-1961.
Han, T., Kolen, M.J., Pohlmann, J. (1997). A comparison among IRT true- and observed score equatings and traditional equipercentile equating. Applied Measurement in Education, 10, 105-121.
Kolen, M.J. (1981). Comparison of traditional and item response theory methods for equating tests. Journal of Educational Measurement, 18, 1-11.
Kolen, M.J., Brennan, R.L. (1995). Test equating: Methods and practices. New York, NY: Springer.
Liou, M., Cheng, P.E. (1995). Asymptotic standard error of equipercentile equating. Journal of Educational and Behavioral Statistics, 20, 259-286.
Liou, M., Cheng, P.E., Johnson, E. (1997). Standard errors of the kernel equating methods under the common-item design. Applied Psychological Measurement, 21, 349-369.
Lord, F.M. (1977). Practical applications of item characteristic curve theory. Journal of Educational Measurement, 14, 117-138.
Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
Lord, F.M. (1982). Item response theory and equating: A technical summary. In Holland, P.W., Rubin, D.B. (Eds.), Test equating (pp. 141-148). New York, NY: Academic Press.
Lord, F.M. (1982). Standard errors of an equating by item response theory. Applied Psychological Measurement, 6, 463-472.
Lord, F.M. (1982). The standard error of equipercentile equating. Journal of Educational Statistics, 7, 165-174.
Lord, F.M., Wingersky, M.S. (1984). Comparison of IRT true-score and equipercentile observed-score “equatings”. Applied Psychological Measurement, 8, 453-461.
Loyd, B.H., Hoover, H.D. (1980). Vertical equating using the Rasch model. Journal of Educational Measurement, 17, 179-193.
Ogasawara, H. (2000). Asymptotic standard errors of IRT equating coefficients using moments. Economic Review (Otaru University of Commerce), 51(1), 1-23.
Ogasawara, H. (2001). Standard errors of item response theory equating/linking by response function methods. Applied Psychological Measurement, 25, 53-67.
Ogasawara, H. (2001). Item response theory true score equatings and their standard errors. Journal of Educational and Behavioral Statistics, 26, 31-50.
Rubin, D.B. (1982). Discussion of “Observed-score test equating: A mathematical analysis of some ETS equating procedures”. In Holland, P.W., Rubin, D.B. (Eds.), Test equating (pp. 51-54). New York, NY: Academic Press.
Stocking, M.L., Lord, F.M. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7, 201-210.
Tsai, T.-H., Hanson, B.A., Kolen, M.J., Forsyth, R.A. (2001). A comparison of bootstrap standard errors of IRT equating methods for the common item nonequivalent groups design. Applied Measurement in Education, 14, 17-30.
van der Linden, W.J. (2000). A test-theoretic approach to observed-score equating. Psychometrika, 65, 437-456.
van der Linden, W.J., Luecht, R.M. (1998). Observed-score equating as a test assembly problem. Psychometrika, 63, 401-418.
Zeng, L., Kolen, M.J. (1995). An alternative approach for IRT observed-score equating of number-correct scores. Applied Psychological Measurement, 19, 231-241.