
A Note on the Reliability Coefficients for Item Response Model-Based Ability Estimates

Published online by Cambridge University Press:  01 January 2025

Seonghoon Kim*
Affiliation:
Keimyung University
* Requests for reprints should be sent to Seonghoon Kim, Department of Education, Keimyung University, 1095 Dalgubeoldaero, Dalseo-Gu, Daegu 704-701, South Korea. E-mail: [email protected]; [email protected]

Abstract

Assuming item parameters on a test are known constants, the reliability coefficient for item response theory (IRT) ability estimates is defined for a population of examinees in two different ways: as (a) the product-moment correlation between ability estimates on two parallel forms of a test and (b) the squared correlation between the true abilities and estimates. Due to the bias of IRT ability estimates, the parallel-forms reliability coefficient is not generally equal to the squared-correlation reliability coefficient. It is shown algebraically that the parallel-forms reliability coefficient is expected to be greater than the squared-correlation reliability coefficient, but the difference would be negligible in a practical sense.
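
The two definitions can be illustrated numerically. The following is a minimal sketch, not the article's derivation: it assumes a two-parameter logistic model, an EAP estimator, a discretized standard-normal ability distribution, and a hypothetical eight-item test (the values in `ITEMS` are invented for illustration). Because the test is short, every response pattern can be enumerated, so both coefficients are computed exactly for this discrete population rather than estimated from simulated examinees.

```python
import itertools
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def reliability_coefficients(items, n_grid=41, lo=-4.0, hi=4.0):
    """Return (parallel-forms, squared-correlation) reliability of EAP estimates."""
    # Discrete approximation to a standard-normal ability distribution.
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + k * step for k in range(n_grid)]
    dens = [math.exp(-0.5 * t * t) for t in grid]
    total = sum(dens)
    w = [d / total for d in dens]

    # Likelihood of every response pattern at every grid point.
    patterns = list(itertools.product((0, 1), repeat=len(items)))
    like = []
    for x in patterns:
        row = []
        for t in grid:
            value = 1.0
            for u, (a, b) in zip(x, items):
                p = p_correct(t, a, b)
                value *= p if u else 1.0 - p
            row.append(value)
        like.append(row)

    # EAP estimate (posterior mean of ability) for each pattern.
    eap = []
    for row in like:
        num = sum(wk * lk * tk for wk, lk, tk in zip(w, row, grid))
        den = sum(wk * lk for wk, lk in zip(w, row))
        eap.append(num / den)

    # Conditional mean and second moment of the estimate at each ability.
    n_pat = len(patterns)
    m = [sum(like[i][k] * eap[i] for i in range(n_pat)) for k in range(n_grid)]
    s = [sum(like[i][k] * eap[i] ** 2 for i in range(n_pat)) for k in range(n_grid)]

    mean_theta = sum(wk * tk for wk, tk in zip(w, grid))
    var_theta = sum(wk * tk * tk for wk, tk in zip(w, grid)) - mean_theta ** 2
    mean_est = sum(wk * mk for wk, mk in zip(w, m))
    var_est = sum(wk * sk for wk, sk in zip(w, s)) - mean_est ** 2
    var_cond_mean = sum(wk * mk * mk for wk, mk in zip(w, m)) - mean_est ** 2
    cov_theta_est = sum(wk * tk * mk for wk, tk, mk in zip(w, grid, m)) - mean_theta * mean_est

    # (a) Estimates on two parallel forms are independent given ability, so
    #     their covariance equals the variance of the conditional mean.
    parallel = var_cond_mean / var_est
    # (b) Squared correlation between true abilities and estimates.
    squared = cov_theta_est ** 2 / (var_theta * var_est)
    return parallel, squared


# Hypothetical (discrimination, difficulty) pairs, treated as known constants.
ITEMS = [(1.0, -1.5), (1.2, -1.0), (0.8, -0.5), (1.5, 0.0),
         (1.1, 0.3), (0.9, 0.8), (1.3, 1.2), (1.0, 1.8)]

parallel_forms, squared_corr = reliability_coefficients(ITEMS)
print(f"parallel-forms reliability:      {parallel_forms:.4f}")
print(f"squared-correlation reliability: {squared_corr:.4f}")
```

In this setup the ordering follows from the Cauchy-Schwarz inequality applied to the true ability and the conditional mean of the estimate, with equality only when that conditional mean is linear in ability. The size of the gap depends on the assumed items and estimator.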

Type
Article
Copyright
Copyright © 2011 The Psychometric Society

