Statistical Inference for Multiple Choice Tests

John S. J. Hsu; Tom Leonard; Kam-Wah Tsui

doi:10.1007/BF02294466

Published online by Cambridge University Press: 01 January 2025

John S. J. Hsu*: Department of Statistics and Applied Probability, The University of California, Santa Barbara
Tom Leonard: Department of Statistics, The University of Wisconsin, Madison
Kam-Wah Tsui: Department of Statistics, The University of Wisconsin, Madison
* Requests for reprints should be sent to John S. J. Hsu, Department of Statistics and Applied Probability, University of California-Santa Barbara, Santa Barbara, CA 93106.

Abstract

Finite sample inference procedures are considered for analyzing the observed scores on a multiple choice test with several items, where, for example, the items are dissimilar, or the item responses are correlated. A discrete p-parameter exponential family model leads to a generalized linear model framework and, in a special case, a convenient regression of true score upon observed score. Techniques based upon the likelihood function, Akaike's information criterion (AIC), an approximate Bayesian marginalization procedure based on conditional maximization (BCM), and simulations for exact posterior densities (importance sampling) are used to facilitate finite sample investigations of the average true score, individual true scores, and various probabilities of interest. A simulation study suggests that, when the examinees come from two different populations, the exponential family can adequately generalize Duncan's beta-binomial model. Extensions to regression models, the classical test theory model, and empirical Bayes estimation problems are mentioned. The Duncan, Keats, and Matsumura data sets are used to illustrate potential advantages and flexibility of the exponential family model, and the BCM technique.
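
As a rough illustration of the likelihood and AIC steps named in the abstract, the sketch below fits a discrete p-parameter exponential family to observed score frequencies and compares orders p by AIC. It is not the authors' implementation. The polynomial form log π_j = γ_1 t_j + ... + γ_p t_j^p − log Σ_h exp(...), with t_j = j/n, is an assumption made here, since the abstract does not state the model's parameterization. The function names `log_probs` and `fit_score_model` are hypothetical, and the frequencies are made up for illustration.

```python
import numpy as np


def log_probs(X, gamma):
    """Log score probabilities: log pi_j = x_j' gamma - log sum_h exp(x_h' gamma)."""
    eta = X @ gamma
    eta = eta - eta.max()                  # guard against overflow
    return eta - np.log(np.exp(eta).sum())


def fit_score_model(freq, p, max_iter=100, tol=1e-8):
    """Maximum likelihood fit of an ASSUMED p-parameter polynomial family."""
    freq = np.asarray(freq, dtype=float)
    n = len(freq) - 1                      # number of items; scores are 0..n
    N = freq.sum()                         # number of examinees
    t = np.arange(n + 1) / n               # scores rescaled to [0, 1]
    X = np.column_stack([t ** k for k in range(1, p + 1)])
    gamma = np.zeros(p)
    for _ in range(max_iter):
        lp = log_probs(X, gamma)
        pi = np.exp(lp)
        grad = X.T @ (freq - N * pi)       # score vector of the log-likelihood
        if np.max(np.abs(grad)) < tol:
            break
        mean = X.T @ pi
        cov = X.T @ (X * pi[:, None]) - np.outer(mean, mean)
        step = np.linalg.solve(N * cov, grad)   # Newton direction
        ll, scale = freq @ lp, 1.0
        # Step halving keeps the log-likelihood from decreasing.
        while (freq @ log_probs(X, gamma + scale * step) < ll - 1e-10
               and scale > 1e-8):
            scale /= 2.0
        gamma = gamma + scale * step
    lp = log_probs(X, gamma)
    loglik = float(freq @ lp)
    return {"gamma": gamma, "probs": np.exp(lp), "loglik": loglik,
            "aic": -2.0 * loglik + 2.0 * p}


# Made-up score frequencies for a hypothetical 10-item test (scores 0..10).
freq = np.array([1, 2, 5, 9, 14, 18, 20, 15, 9, 5, 2])
fits = {p: fit_score_model(freq, p) for p in (1, 2, 3)}
best_p = min(fits, key=lambda p: fits[p]["aic"])
for p, fit in fits.items():
    print(p, round(fit["aic"], 2))
print("order with smallest AIC:", best_p)
```

Because the family is exponential in the powers of the score, the fitted distribution reproduces the first p sample moments of the rescaled scores, which is a convenient check on convergence. AIC is computed here as −2 log L + 2p, the usual definition. The BCM and importance sampling procedures mentioned in the abstract are not covered by this sketch.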

Keywords

multiple choice test; exponential family; likelihood; Akaike's information criterion; generalized linear model; Bayesian marginalization; importance sampling; regression of true score upon observed score; classical test theory model

Type: Article
Information: Psychometrika, Volume 56, Issue 2, June 1991, pp. 327–348

DOI: https://doi.org/10.1007/BF02294466
Copyright: Copyright © 1991 The Psychometric Society

Footnotes

The authors wish to thank Ella Mae Matsumura for her data set and helpful comments, Frank Baker for his advice on item response theory, Hirotugu Akaike and Taskin Atilgan for helpful discussions regarding AIC, Graham Wood for his advice concerning the class of all binomial mixture models, Yiu Ming Chiu for providing useful references and information on tetrachoric models, and the Editor and two referees for suggesting several references and alternative approaches.

References

Akaike, H. (1978). A Bayesian analysis of the minimum AIC procedure. Annals of the Institute of Statistical Mathematics, 30(A), 9–14.

Altham, P. M. E. (1978). Two generalizations of the binomial distribution. Applied Statistics, 27, 162–167.

Anderson, D. A., & Aitkin, M. (1985). Marginal maximum likelihood estimation of item parameters: Application of an algorithm. Journal of the Royal Statistical Society, Series B, 26, 203–210.

Atilgan, T. (1983). Parameter parsimony, model selection, and smooth density estimation. Unpublished doctoral dissertation, University of Wisconsin-Madison.

Atilgan, T., Leonard, T., & Gupta, A. K. (1988). On the application of AIC to bivariate density estimation, non-parametric regression, and discrimination. In Bozdogan, H. (Eds.), Multivariate statistical modeling and data analysis (pp. 1–16). Dordrecht, Holland: Reidel.

Bell, S. S. (1990). Empirical Bayes alternatives to the beta-binomial model. Unpublished doctoral dissertation, Columbia University.

Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29–51.

Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443–454.

Carter, M. C., & Williford, W. O. (1975). Estimation in a modified binomial distribution. Applied Statistics, 24, 319–328.

Consul, P. C. (1974). A simple urn model dependent upon predetermined strategy. Sankhya, Series B, 36, 391–399.

Consul, P. C. (1975). On a characterization of Lagrangian Poisson and quasi-binomial distributions. Communications in Statistics, 4, 555–563.

Dalal, S. R., & Hall, W. J. (1983). Approximating priors by mixtures of natural conjugate priors. Journal of the Royal Statistical Society, Series B, 45, 278–286.

Duncan, G. T. (1974). An empirical Bayes approach to scoring multiple-choice tests in the misinformation model. Journal of the American Statistical Association, 69, 50–57.

Gelfand, A. E., & Smith, A. F. M. (1990). Sampling based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398–409.

Geweke, J. (1988). Antithetic acceleration of Monte-Carlo integration in Bayesian inference. Journal of Econometrics, 38, 73–89.

Geweke, J. (1989). Exact predictive density for linear models with ARCH distributions. Journal of Econometrics, 40, 63–86.

Hsu, J. S. J. (1990). Bayesian inference and marginalization. Unpublished doctoral dissertation, University of Wisconsin-Madison.

Keats, J. A. (1964). Some generalizations of a theoretical distribution of mental test scores. Psychometrika, 29, 215–231.

Lehmann, E. L. (1983). Theory of point estimation. New York: John Wiley & Sons.

Leonard, T. (1972). Bayesian methods for binomial data. Biometrika, 59, 581–589.

Leonard, T. (1973). A Bayesian method for histograms. Biometrika, 60, 297–308.

Leonard, T. (1982). Comment on the paper by Lejeune and Faulkenberry. Journal of the American Statistical Association, 77, 657–658.

Leonard, T. (1984). Some data-analytic modifications to Bayes-Stein estimation. Annals of the Institute of Statistical Mathematics, 36, 11–21.

Leonard, T., Hsu, J. S. J., & Tsui, K. (1989). Bayesian marginal inference. Journal of the American Statistical Association, 84, 1051–1058.

Leonard, T., & Novick, J. B. (1986). Bayesian full rank marginalization for two-way contingency tables. Journal of Educational Statistics, 11, 33–56.

Lord, F. M. (1965). A strong true-score theory, with applications. Psychometrika, 30, 239–270.

Lord, F. M. (1969). Estimating true-score distributions in psychological testing: An empirical Bayes estimation problem. Psychometrika, 34, 259–299.

Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores (with contributions by Allen Birnbaum). Reading, MA: Addison-Wesley.

Lord, F. M., & Stocking, M. L. (1976). An interval estimate for making statistical inference about true scores. Psychometrika, 41, 79–87.

McCullagh, P., & Nelder, J. A. (1985). Generalized linear models. New York: Chapman and Hall.

Mislevy, R. J. (1986). Bayes modal estimation in item response. Psychometrika, 51, 177–195.

Morrison, D. G., & Brockway, G. (1979). A modified beta-binomial model with applications to multiple choice and taste tests. Psychometrika, 44, 427–442.

Prentice, R. L., & Barlow, W. E. (1988). Correlated binary regression with covariates specific to each binary observation. Biometrics, 44, 1033–1048.

Rubinstein, R. Y. (1981). Simulation and the Monte Carlo method. New York: John Wiley and Sons.

Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.

Takane, Y., Bozdogan, H., & Shibayama, T. (1987). Ideal point discriminant analysis. Psychometrika, 52, 371–392.

Wilcox, R. R. (1981). A review of the beta-binomial model and its extensions. Journal of Educational Statistics, 6, 3–32.

Wilcox, R. R. (1981). A cautionary note on estimating the reliability of a mastery test with the beta-binomial model. Applied Psychological Measurement, 5, 531–537.

Young, A. S. (1977). A Bayesian approach to prediction using polynomials. Biometrika, 64, 309–318.
