Missing Data Mechanisms and Homogeneity of Means and Variances–Covariances

Published online by Cambridge University Press: 01 January 2025

Ke-Hai Yuan*
Affiliation:
Nanjing University of Posts and Telecommunications, University of Notre Dame
Mortaza Jamshidian
Affiliation:
California State University, Fullerton
Yutaka Kano
Affiliation:
Osaka University
*Correspondence should be made to Ke-Hai Yuan, University of Notre Dame, Notre Dame, USA. Email: [email protected]

Abstract

Unless data are missing completely at random (MCAR), proper methodology is crucial for the analysis of incomplete data. Consequently, methods for effectively testing the MCAR mechanism become important, and procedures were developed via testing the homogeneity of means and variances–covariances across the observed patterns (e.g., Kim & Bentler in Psychometrika 67:609–624, 2002; Little in J Am Stat Assoc 83:1198–1202, 1988). The current article shows that the population counterparts of the sample means and covariances of a given pattern of the observed data depend on the underlying structure that generates the data, and the normal-distribution-based maximum likelihood estimates for different patterns of the observed sample can converge to the same values even when data are missing at random or missing not at random, although the values may not equal those of the underlying population distribution. The results imply that statistics developed for testing the homogeneity of means and covariances cannot be safely used for testing the MCAR mechanism even when the population distribution is multivariate normal.
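The idea behind homogeneity-based MCAR tests can be sketched with a small simulation. The sketch below is an illustration under stated assumptions, not the construction used in the article: a standard normal variable x1 is always observed, a second variable x2 (not simulated here) is either observed or missing, and the three mechanisms and the function names `split_by_pattern` and `pattern_gaps` are hypothetical. It compares the sample mean and variance of x1 between the two missing-data patterns.

```python
import random
import statistics


def split_by_pattern(mechanism, n=20000, seed=1):
    """Simulate x1 ~ N(0, 1) and split it by whether x2 would be missing.

    Returns (x1 values where x2 is observed, x1 values where x2 is missing).
    The mechanisms are illustrative assumptions, not taken from the article.
    """
    rng = random.Random(seed)
    observed, missing = [], []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        if mechanism == "mcar":
            drop = rng.random() < 0.5      # missingness ignores the data
        elif mechanism == "mar_one_sided":
            drop = x1 > 0.0                # depends on the observed x1
        elif mechanism == "mar_symmetric":
            drop = abs(x1) > 1.0           # depends on x1, symmetric about 0
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        (missing if drop else observed).append(x1)
    return observed, missing


def pattern_gaps(mechanism):
    """Return (mean gap, variance gap) of x1 between the two patterns."""
    observed, missing = split_by_pattern(mechanism)
    mean_gap = statistics.fmean(missing) - statistics.fmean(observed)
    var_gap = statistics.variance(missing) - statistics.variance(observed)
    return mean_gap, var_gap


if __name__ == "__main__":
    for name in ("mcar", "mar_one_sided", "mar_symmetric"):
        mean_gap, var_gap = pattern_gaps(name)
        print(f"{name}: mean gap = {mean_gap:.3f}, variance gap = {var_gap:.3f}")
```

In this toy setting the one-sided mechanism separates the pattern means, while the symmetric mechanism leaves them nearly equal even though missingness depends on x1, so equal means alone do not establish MCAR. The variances still differ in the symmetric case, which makes it a weaker example than the article's result concerning both means and covariances.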

Type
Original Paper
Copyright
Copyright © 2018 The Psychometric Society

Footnotes

The research was supported by the National Science Foundation under Grant No. SES-1461355.

References

Anderson, T.W., (1957). Maximum likelihood estimates for the multivariate normal distribution when some observations are missing, Journal of the American Statistical Association, 52, 200–203.
Bentler, P.M., (2006). EQS 6 structural equations program manual. Encino, CA: Multivariate Software.
Blanca, M.J., Arnau, J., López-Montiel, D., Bono, R., Bendayan, R., (2015). Skewness and kurtosis in real data samples, Methodology, 9, 78–84.
Bradley, J.V., (1978). Robustness?, British Journal of Mathematical and Statistical Psychology, 31, 144–152.
Chen, H.Y., Little, R., (1999). A test of missing completely at random for generalised estimating equations with missing data, Biometrika, 86, 1–13.
Enders, C.K., (2010). Applied missing data analysis. New York: Guilford.
Galati, J.C., Seaton, K.A., (2016). MCAR is not necessary for the complete cases to constitute a simple random subsample of the target sample, Statistical Methods in Medical Research, 25, 1527–1534.
Hawkins, D.M., (1981). A new test for multivariate normality and homoscedasticity, Technometrics, 23, 105–110.
Jamshidian, M., Jalal, S., (2010). Tests of homoscedasticity, normality and missing completely at random for incomplete multivariate data, Psychometrika, 75, 649–674.
Jamshidian, M., Jalal, S., Jansen, C., (2014). MissMech: An R Package for testing homoscedasticity, multivariate normality, and missing completely at random (MCAR), Journal of Statistical Software, 56, 1–31.
Jöreskog, K.G., (1971). Simultaneous factor analysis in several populations, Psychometrika, 36, 409–426.
Kano, Y., Takai, K., (2011). Analysis of NMAR missing data without specifying missing-data mechanisms in a linear latent variate model, Journal of Multivariate Analysis, 102, 1241–1255.
Kim, K.H., Bentler, P.M., (2002). Tests of homogeneity of means and covariance matrices for multivariate incomplete data, Psychometrika, 67, 609–624.
Li, J., Yu, Y., (2015). A nonparametric test of missing completely at random for incomplete multivariate data, Psychometrika, 80(3), 707–726.
Little, R.J.A., (1988). A test of missing completely at random for multivariate data with missing values, Journal of the American Statistical Association, 83, 1198–1202.
Little, R.J.A., Rubin, D.B., (2002). Statistical analysis with missing data (2nd ed.). New York: Wiley.
Micceri, T., (1989). The unicorn, the normal curve, and other improbable creatures, Psychological Bulletin, 105, 156–166.
Park, T., Davis, C.S., (1993). A test of the missing data mechanism for repeated categorical data, Biometrics, 49, 631–638.
Park, T., Lee, S-Y, (1997). A test of missing completely at random for longitudinal data with missing observations, Statistics in Medicine, 16, 1859–1871.
Qu, A., Song, P.X.K., (2002). Testing ignorable missingness in estimating equation approaches for longitudinal data, Biometrika, 89, 841–850.
Rubin, D.B., (1976). Inference and missing data (with discussions), Biometrika, 63, 581–592.
Sörbom, D., (1974). A general method for studying differences in factor means and factor structures between groups, British Journal of Mathematical and Statistical Psychology, 27, 229–239.
Tang, M., Bentler, P.M., (1998). Theory and method for constrained estimation in structural equation models with incomplete data, Computational Statistics and Data Analysis, 27, 257–270.
Thoemmes, F., Enders, C.K., (2007). A structural equation model for testing whether data are missing completely at random. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL.
Yuan, K-H, (2009). Normal distribution based pseudo ML for missing data: With applications to mean and covariance structure analysis, Journal of Multivariate Analysis, 100, 1900–1918.
Yuan, K-H, Chan, W., Tian, Y., (2016). Expectation-robust algorithm and estimating equations for means and dispersion matrix with missing data, Annals of the Institute of Statistical Mathematics, 68, 329–351.