Finite Mixtures in Confirmatory Factor-Analysis Models

Published online by Cambridge University Press:  01 January 2025

Yiu-Fai Yung*
Affiliation:
L. L. Thurstone Psychometric Laboratory, University of North Carolina, Chapel Hill
*Requests for reprints should be sent to Yiu-Fai Yung, CB#3270 Davie Hall, L. L. Thurstone Psychometric Laboratory, UNC-CH, Chapel Hill NC 27599.

Abstract

In this paper, various types of finite mixtures of confirmatory factor-analysis models are proposed for handling data heterogeneity. Under the proposed mixture approach, observations are assumed to be drawn from mixtures of distinct confirmatory factor-analysis models, but each observation does not need to be assigned to a particular model prior to model fitting. Several classes of mixture models are proposed. These models differ in how they represent data heterogeneity. Three different sampling schemes for these mixture models are distinguished. A mixed type of these three sampling schemes is considered throughout this article. The proposed mixture approach reduces to regular multiple-group confirmatory factor analysis under a restrictive sampling scheme, in which the structural equation model for each observation is assumed to be known. By assuming a mixture of multivariate normals for the data, maximum likelihood estimation procedures using the EM (Expectation-Maximization) algorithm and the AS (Approximate-Scoring) method are developed. Some mixture models were fitted to a real data set to illustrate the application of the theory. Although the EM algorithm and the AS method gave similar sets of parameter estimates, the AS method was found to be computationally more efficient than the EM algorithm. Some comments on applying the mixture approach to structural equation modeling are made.

Type
Article
Copyright
Copyright © 1997 The Psychometric Society

Footnotes

Note: This paper is one of the Psychometric Society's 1995 Dissertation Award papers.—Editor

This article is based on the dissertation of the author. The author would like to thank Peter Bentler, who was the dissertation chair, for guidance and encouragement of this work. Eric Holman, Robert Jennrich, Bengt Muthén, and Thomas Wickens, who served as the committee members for the dissertation, were very supportive and helpful. Michael Browne is thanked for discussing some important points about the use of the approximate information in the dissertation. Thanks also go to an anonymous associate editor, whose comments were very useful for the revision of an earlier version of this article.
