A Note on Statistical Hypothesis Testing Based on Log Transformation of the Mantel–Haenszel Common Odds Ratio for Differential Item Functioning Classification

Published online by Cambridge University Press:  01 January 2025

Insu Paek*
Affiliation: Florida State University
Paul Holland
Affiliation: St. Petersburg, Florida
*Requests for reprints should be sent to Insu Paek, Educational Psychology & Learning Systems, Florida State University, 3204D Stone Building, 1114 W. Call St., Tallahassee, FL 32306-4453, USA. E-mail: [email protected]

Abstract

In practice, when differential item functioning (DIF) is investigated, items are classified using both statistical test results and estimated DIF sizes. A well-known classification scheme is the Educational Testing Service (ETS) set of rules: A (negligible DIF), B (medium DIF), and C (large DIF). This article provides a clarifying note on two points: (a) a sketch of the proof of the asymptotic normality of the statistic known as the Mantel–Haenszel (MH) delta, which is the basis of both point and interval null hypothesis tests on the MH delta, and (b) how to conduct an interval null hypothesis test using the MH delta, which is necessary for the C DIF classification.
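
To make the quantities named in the abstract concrete, the following is a minimal Python sketch, not the article's own procedure. It assumes the conventional definition of the MH delta as -2.35 times the natural log of the MH common odds ratio, and the commonly cited form of the ETS rules: A if |delta| < 1 or delta is not significantly different from 0; C if |delta| >= 1.5 and |delta| is significantly greater than 1; B otherwise. The function names `mh_delta` and `ets_class`, the table layout, the externally supplied standard error `se`, and the simple z-test used for the interval null hypothesis |delta| <= 1 are all illustrative assumptions; the article's recommended interval test may differ.

```python
import math


def mh_delta(tables):
    """Return the MH delta from stratified 2x2 tables.

    Each table is (a, b, c, d) for one matching-score stratum (assumed layout):
      a = reference group correct,  b = reference group incorrect,
      c = focal group correct,      d = focal group incorrect.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty strata
        num += a * d / n
        den += b * c / n
    alpha_mh = num / den  # MH common odds ratio estimate
    return -2.35 * math.log(alpha_mh)


def ets_class(delta, se, z_two_sided=1.96, z_one_sided=1.645):
    """Classify an item as 'A', 'B', or 'C' (illustrative rule of thumb).

    se is an estimate of the standard error of delta, supplied by the caller.
    The critical values are assumptions (nominal 0.05 level) and can be changed.
    """
    abs_d = abs(delta)
    # Point null test: H0: delta = 0
    sig_vs_zero = abs_d / se > z_two_sided
    # Simplified interval null test: H0: |delta| <= 1, evaluated at the boundary
    sig_vs_one = (abs_d - 1.0) / se > z_one_sided
    if abs_d < 1.0 or not sig_vs_zero:
        return "A"
    if abs_d >= 1.5 and sig_vs_one:
        return "C"
    return "B"


if __name__ == "__main__":
    # Hypothetical counts for two score strata, for illustration only.
    example = [(40, 10, 30, 20), (30, 20, 20, 30)]
    d = mh_delta(example)
    print(round(d, 3), ets_class(d, se=0.4))
```

Under this sign convention, a negative delta indicates an item that is relatively harder for the focal group. The standard error is passed in rather than computed so that any variance estimator for the MH log odds ratio can be used.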

Type
Original Paper
Copyright
Copyright © 2013 The Psychometric Society

Footnotes

P. Holland is retired. He formerly held the Frederic M. Lord Chair in Measurement and Statistics at Educational Testing Service.
