An Application of Confidence Intervals and of Maximum Likelihood to the Estimation of an Examinee's Ability

Published online by Cambridge University Press:  01 January 2025

Frederic M. Lord*
Affiliation:
Educational Testing Service

Abstract

A mathematical definition of the theoretical relation between the examinee's actual responses to the test items and his “true ability” is selected. A maximum-likelihood solution is obtained for estimating the examinee's “true ability” from his responses to the items. The standard error of the maximum-likelihood estimate is obtained, its relation to the discriminating power of the test is pointed out, and some generalizations are drawn as to the optimum level of item difficulty. The Neyman-Pearson power function is applied to determine which of two psychological tests is the most powerful for the selection of “successful” examinees.
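As a rough illustration of the estimation step summarized above, the following sketch computes a maximum-likelihood ability estimate and its asymptotic standard error. It assumes, purely for illustration, a normal-ogive response curve with known item parameters; the abstract itself does not specify the model, and the names `ml_ability`, `standard_error`, `a` (item discrimination), and `b` (item difficulty) are hypothetical.

```python
import math

def phi_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def log_likelihood(theta, responses, a, b):
    """Log-likelihood of 0/1 responses at ability theta.

    Assumed model (not taken from the source): P_i(theta) = Phi(a_i * (theta - b_i)).
    """
    total = 0.0
    for u, ai, bi in zip(responses, a, b):
        p = phi_cdf(ai * (theta - bi))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += math.log(p) if u else math.log(1.0 - p)
    return total

def ml_ability(responses, a, b, lo=-6.0, hi=6.0, tol=1e-8):
    """Maximize the log-likelihood over [lo, hi] by ternary search.

    The log-likelihood is concave in theta under this model, so the search
    converges to the maximum. For all-correct or all-wrong response patterns
    no finite maximum exists and the result sits at a search bound.
    """
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1, responses, a, b) < log_likelihood(m2, responses, a, b):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def standard_error(theta, a, b):
    """Asymptotic standard error: 1 / sqrt(Fisher information at theta)."""
    info = 0.0
    for ai, bi in zip(a, b):
        z = ai * (theta - bi)
        p = phi_cdf(z)
        info += (ai * phi_pdf(z)) ** 2 / (p * (1.0 - p))
    return 1.0 / math.sqrt(info)

# Hypothetical four-item test: the two easier items are answered correctly.
a = [1.0, 1.0, 1.0, 1.0]
b = [-1.0, -0.5, 0.5, 1.0]
responses = [1, 1, 0, 0]

theta_hat = ml_ability(responses, a, b)
se = standard_error(theta_hat, a, b)
print(f"estimated ability: {theta_hat:.4f}, standard error: {se:.4f}")
```

In this sketch each item contributes most to the information sum when its difficulty `b` lies near the examinee's ability and its discrimination `a` is large, which is one way to read the abstract's remarks on discriminating power and optimum item difficulty.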

Type
Original Paper
Copyright
Copyright © 1953 The Psychometric Society

Footnotes

*

The author is indebted to Dr. John W. Tukey for helpful comments on a draft of the present manuscript.
