Tests of Homoscedasticity, Normality, and Missing Completely at Random for Incomplete Multivariate Data

Mortaza Jamshidian; Siavash Jalal

doi:10.1007/s11336-010-9175-3

Published online by Cambridge University Press: 01 January 2025

Mortaza Jamshidian*: California State University, Fullerton
Siavash Jalal: University of California, Los Angeles
*: Requests for reprints should be sent to Mortaza Jamshidian, Department of Mathematics, California State University, Fullerton, CA 92834, USA. E-mail: [email protected]

Abstract

Testing homogeneity of covariances (homoscedasticity) among several groups has many applications in statistical analysis. In the context of incomplete data analysis, tests of homoscedasticity among groups of cases with identical missing data patterns have been proposed to test whether data are missing completely at random (MCAR). These tests of MCAR require large sample sizes n and/or large group sample sizes n_i, and they usually fail when applied to nonnormal data. Hawkins (Technometrics 23:105–110, 1981) proposed a test of multivariate normality and homoscedasticity that is an exact test for complete data when the n_i are small. This paper proposes a modification of this test for complete data to improve its performance, and extends its application to tests of homoscedasticity and MCAR when data are multivariate normal and incomplete. Moreover, it is shown that the statistic used in the Hawkins test, in conjunction with a nonparametric k-sample test, can be used to obtain a nonparametric test of homoscedasticity that works well for both normal and nonnormal data. It is explained how a combination of the proposed normal-theory Hawkins test and the nonparametric test can be employed to test for homoscedasticity, MCAR, and multivariate normality. Simulation studies show that the newly proposed tests generally outperform their existing competitors in terms of Type I error rates. Also, a power study of the proposed tests indicates good power. The proposed methods rely on appropriate imputation of the missing data. Methods of multiple imputation are described, and one of the methods is employed to confirm the results of the single imputation methods. Examples are provided where multiple imputation enables one to identify a group or groups whose covariance matrices differ from the majority of other groups.
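
As an illustration of the grouping step the abstract refers to ("groups of cases with identical missing data patterns"), the following minimal Python sketch partitions cases by which variables are observed. The function name `group_by_missing_pattern`, the use of `None` to mark a missing value, and the toy data are assumptions made for this illustration only; they are not taken from the paper, and the tests themselves are not implemented here.

```python
from collections import defaultdict


def group_by_missing_pattern(rows):
    """Group case indices by their missing-data pattern.

    `rows` is a list of equal-length sequences in which None marks a
    missing value (an assumption of this sketch, not the paper's notation).
    Returns a dict mapping each pattern to the list of case indices that
    share it.
    """
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        # A pattern is a tuple of booleans: True where the value is observed.
        pattern = tuple(value is not None for value in row)
        groups[pattern].append(i)
    return dict(groups)


# Hypothetical toy data: five cases measured on three variables.
data = [
    (1.2, 0.7, 3.1),
    (0.9, None, 2.8),
    (1.5, 0.4, 2.9),
    (1.1, None, 3.3),
    (None, 0.6, 3.0),
]

groups = group_by_missing_pattern(data)
for pattern, cases in groups.items():
    print(pattern, cases)
```

Each resulting group would then play the role of one of the groups whose covariance matrices are compared; the group sizes here correspond to the n_i mentioned in the abstract.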

Keywords

covariance structures; k-sample test; missing data; multiple imputation; nonparametric test; structural equations; test of homogeneity of covariances

Type: Original Paper
Information: Psychometrika, Volume 75, Issue 4, December 2010, pp. 649–674

DOI: https://doi.org/10.1007/s11336-010-9175-3
Copyright: Copyright © 2010 The Psychometric Society

Footnotes

This research has been supported in part by the National Science Foundation Grant DMS-0437258 and the National Institute on Drug Abuse Grant 5P01DA001070-36. Siavash Jalal’s work was partly conducted while he was a graduate student at California State University, Fullerton. We would like to thank the Associate Editor, anonymous referees, and Ke-Hai Yuan for providing valuable comments that resulted in a much improved version of this paper.

References

Anderson, T.W., Darling, D.A. (1954). A test of goodness of fit. Journal of the American Statistical Association, 49, 765–769.

Bentler, P.M., Kim, K.H., Yuan, K.H. (2004). Testing homogeneity of covariances with infrequent missing data patterns. Unpublished manuscript.

David, F.N. (1939). On Neyman’s “smooth” test for goodness of fit: I. Distribution of the criterion Ψ² when the hypothesis tested is true. Biometrika, 31, 191–199.

Fisher, R.A. (1932). Statistical methods for research workers (4th ed.). London: Oliver & Boyd.

Hawkins, D.M. (1981). A new test for multivariate normality and homoscedasticity. Technometrics, 23, 105–110.

Hou, C.D. (2005). A simple approximation for the distribution of the weighted combination of non-independent or independent probabilities. Statistics and Probability Letters, 73, 179–187.

Jamshidian, M., Bentler, P.M. (1999). ML estimation of mean and covariance structures with missing data using complete data routines. Journal of Educational and Behavioral Statistics, 24, 21–41.

Jamshidian, M., Schott, J. (2007). Testing equality of covariance matrices when data are incomplete. Computational Statistics and Data Analysis, 51, 4227–4239.

Kim, K.H., Bentler, P.M. (2002). Tests of homogeneity of means and covariance matrices for multivariate incomplete data. Psychometrika, 67, 609–624.

Kruskal, W., Wallis, W. (1952). Use of ranks in one-criterion variance analysis. Journal of the American Statistical Association, 47, 583–621.

Ledwina, T. (1994). Data-driven version of Neyman’s smooth test of fit. Journal of the American Statistical Association, 89, 1000–1005.

Little, R.J.A. (1988). A test of missing completely at random for multivariate data with missing values. Journal of the American Statistical Association, 83, 1198–1202.

Little, R.J.A., Rubin, D.B. (1987). Statistical analysis with missing data (1st ed.). New York: Wiley.

Little, R.J.A., Rubin, D.B. (2002). Statistical analysis with missing data (2nd ed.). New York: Wiley.

Marhuenda, Y., Morales, D., Pardo, M.C. (2005). A comparison of uniformity tests. Statistics, 39, 315–328.

Neyman, J. (1937). Smooth test for goodness of fit. Skandinavisk Aktuarietidskrift, 20, 150–199.

Rayner, J.C.W., Best, D.J. (1990). Smooth tests of goodness of fit: An overview. International Statistical Review, 58, 9–17.

Rubin, D.B. (1976). Inference and missing data. Biometrika, 63, 581–592.

Rubin, D.B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.

Scholz, F.W., Stephens, M.A. (1987). K-sample Anderson-Darling tests. Journal of the American Statistical Association, 82, 918–924.

Srivastava, M.S. (2002). Methods of multivariate statistics. New York: Wiley.

Srivastava, M.S., Dolatabadi, M. (2009). Multiple imputation and other resampling scheme for imputing missing observations. Journal of Multivariate Analysis, 100, 1919–1937.

Thas, O., Ottoy, J.-P. (2004). An extension of the Anderson-Darling k-sample test to arbitrary sample space partition sizes. Journal of Statistical Computation and Simulation, 74, 651–665.

Yuan, K.H. (2009). Normal distribution pseudo ML for missing data: With applications to mean and covariance structure analysis. Journal of Multivariate Analysis, 100, 1900–1918.

Yuan, K.H., Bentler, P.M., Zhang, W. (2005). The effect of skewness and kurtosis on mean and covariance structure analysis: The univariate case and its multivariate implication. Sociological Methods & Research, 34, 240–258.
