Generalizability of Stratified-Parallel Tests

Published online by Cambridge University Press:  01 January 2025

Nageswari Rajaratnam
Affiliation:
University of Illinois
Lee J. Cronbach
Affiliation:
University of Illinois
Goldine C. Gleser
Affiliation:
University of Illinois

Extract

One of the major concerns of reliability theory has been the estimation of the reliability of a composite measure from the degree of agreement among its component parts. In the classical theory, formulas were developed under the assumption that the parts are strictly equivalent. It was later shown that the same formulas follow from various sets of weaker assumptions which require the composites to be strictly equivalent and require the parts to have a certain homogeneity of statistical properties, but not necessarily to be equivalent. An alternative model which has received increasing attention in recent years regards a given measure as a random sample from a universe of measures whose homogeneity or equivalence is not specified a priori, and a composite test as a random sample of items from a universe of not-necessarily-equivalent items. This too permits an internal-consistency estimate of reliability. Both the equivalent-composites model and the random-sampling model appear to be unduly restrictive and unrealistic; we propose here to develop the implications of a third model in which a test is considered to have been formed by stratified sampling of items.

Type
Original Paper
Copyright
Copyright © 1965 Psychometric Society

Footnotes

* This manuscript was completed prior to Dr. Rajaratnam's death in December, 1963, at which time she was on the staff of the University of Minnesota. These investigations were conducted at the University of Illinois, under grant M-1839 from the National Institute of Mental Health. The present addresses of the junior authors are: Cronbach, School of Education, Stanford University; Gleser, School of Medicine, University of Cincinnati.
