Stepwise Variable Selection in Factor Analysis

Published online by Cambridge University Press:  02 January 2025

Yutaka Kano*
Affiliation:
Osaka University
Akira Harada
Affiliation:
Osaka University
* Requests for reprints should be sent to Yutaka Kano, Osaka University, Faculty of Human Sciences, Suita, Osaka 565-0781, JAPAN, or to the email address: [email protected], or [email protected].

Abstract

When there are many observed variables, such as those in a questionnaire, choosing appropriate variables to analyze is very important in multivariate analysis. What is actually done in scale construction with factor analysis is nothing but variable selection.

In this paper, we take several goodness-of-fit statistics as measures of variable selection and develop backward elimination and forward selection procedures in exploratory factor analysis. Once factor analysis is done for a certain number p of observed variables (the p-variable model is labeled the current model), simple formulas for predicted fit measures such as chi-square, GFI, CFI, IFI and RMSEA, developed in the field of structural equation modeling, are provided for all models obtained by adding an external variable (so that the number of variables is p + 1) and for those obtained by deleting an internal variable (so that the number is p − 1), provided that the number of factors is held constant.
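As a rough illustration of the quantities involved, the sketch below computes the degrees of freedom of an exploratory factor model and three of the fit indices named above from chi-square values. It uses the standard textbook definitions of RMSEA, CFI and IFI, not the predicted-fit formulas of this paper, which are not reproduced on this page. The function names are hypothetical, and the use of N − 1 in RMSEA is an assumption (some programs use N). GFI is omitted because it needs the sample and fitted covariance matrices rather than chi-square alone.

```python
import math


def efa_degrees_of_freedom(p: int, k: int) -> int:
    """Degrees of freedom of an exploratory factor model
    with p observed variables and k factors."""
    return ((p - k) ** 2 - (p + k)) // 2


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA from the model chi-square, its df and sample size n.
    Assumption: n - 1 is used in the denominator."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_m: float, df_m: int, chi2_b: float, df_b: int) -> float:
    """Comparative fit index of model m against a baseline model b."""
    d_m = max(chi2_m - df_m, 0.0)
    d_b = max(chi2_b - df_b, d_m)
    return 1.0 if d_b == 0 else 1.0 - d_m / d_b


def ifi(chi2_m: float, df_m: int, chi2_b: float) -> float:
    """Incremental fit index of model m against a baseline model b."""
    return (chi2_b - chi2_m) / (chi2_b - df_m)


# With k = 2 factors held constant, moving from p = 6 variables
# to p - 1 = 5 or p + 1 = 7 changes the degrees of freedom:
for p in (5, 6, 7):
    print(p, efa_degrees_of_freedom(p, 2))
```

Holding the number of factors fixed, every model with p − 1 variables has the same degrees of freedom, and likewise every model with p + 1 variables, which is what makes a side-by-side list of fit measures for the candidate models easy to read.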

A program, SEFA (Stepwise variable selection in Exploratory Factor Analysis), is developed to obtain a list of the fit measures for all such models. The list is very useful in determining which variable should be dropped from the current model to improve its fit. It is also useful in finding a suitable variable that may be added to the current model. In general, a model with more appropriate variables yields more stable inference.
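One way to read such a list is sketched below, assuming it is available as a mapping from variable name to the predicted chi-square of the model obtained by deleting (or adding) that variable. This is not the SEFA program or its interface: the function name, the variable names and the numbers are all hypothetical and serve only to show the selection step.

```python
from typing import Dict


def best_candidate(predicted_chi2: Dict[str, float]) -> str:
    """Return the variable whose deletion (or addition) leads to the
    model with the smallest predicted chi-square."""
    if not predicted_chi2:
        raise ValueError("no candidate variables")
    return min(predicted_chi2, key=predicted_chi2.get)


# Made-up predicted chi-square values for the three (p - 1)-variable
# models obtained by deleting x1, x2 or x3 from the current model.
deletion_list = {"x1": 30.2, "x2": 12.5, "x3": 18.0}
print(best_candidate(deletion_list))  # x2
```

Because the candidate models in one list share the same number of variables and factors, they share the same degrees of freedom, so their chi-square values can be compared directly. Repeating this step gives a backward elimination (or forward selection) procedure; when to stop is left to the analyst in this sketch.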

The criterion traditionally used for variable selection is the magnitude of communalities. This criterion gives a different choice of variables and, in most cases, does not improve the fit of the model.

Type
Original Paper
Copyright
Copyright © 2000 The Psychometric Society

Footnotes

The URL of the program SEFA is http://koko15.hus.osaka-u.ac.jp/~harada/factor/stepwise/.
