
Regularized Generalized Canonical Correlation Analysis: A Framework for Sequential Multiblock Component Methods

Published online by Cambridge University Press:  01 January 2025

Michel Tenenhaus
Affiliation:
HEC Paris
Arthur Tenenhaus*
Affiliation:
CentraleSupelec-L2S-Université Paris-Sud Brain and Spine Institute
Patrick J. F. Groenen
Affiliation:
Erasmus University
*Correspondence should be made to Arthur Tenenhaus, Laboratoire des Signaux et Systèmes (L2S, UMR CNRS 8506), CentraleSupelec-L2S-Université Paris-Sud, 3 rue Joliot-Curie, Plateau du Moulon, 91192 Gif-sur-Yvette Cedex, France. Email: [email protected]

Abstract

A new framework for sequential multiblock component methods is presented. This framework relies on a new version of regularized generalized canonical correlation analysis (RGCCA) where various scheme functions and shrinkage constants are considered. Two types of between-block connections are considered: blocks are either fully connected or connected to the superblock (concatenation of all blocks). The proposed iterative algorithm is monotone convergent and guarantees obtaining at convergence a stationary point of RGCCA. In some cases, the solution of RGCCA is the first eigenvalue/eigenvector of a certain matrix. For the scheme functions $x$, $|x|$, $x^{2}$ or $x^{4}$ and shrinkage constants 0 or 1, many multiblock component methods are recovered.
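
To make the abstract's description concrete, the following is a minimal sketch of a block-wise iterative update of the kind described, and not the authors' implementation. It assumes fully connected blocks, all shrinkage constants equal to 1 (unit-norm weight vectors), column-centred data, covariances scaled by 1/n, and only the scheme functions $x$ and $x^{2}$. The function name `rgcca_tau1` and the scheme labels `"horst"` and `"factorial"` are illustrative choices, not taken from the text above.

```python
import numpy as np

def rgcca_tau1(blocks, scheme="horst", n_iter=200, tol=1e-10, seed=0):
    """Sketch: fully connected blocks, all shrinkage constants equal to 1.

    scheme="horst" uses g(x) = x; scheme="factorial" uses g(x) = x**2.
    Returns unit-norm weight vectors and the objective value after each sweep.
    """
    funcs = {"horst": lambda x: x, "factorial": lambda x: x ** 2}
    derivs = {"horst": lambda x: 1.0, "factorial": lambda x: 2.0 * x}
    g, dg = funcs[scheme], derivs[scheme]

    rng = np.random.default_rng(seed)
    X = [b - b.mean(axis=0) for b in blocks]  # column-centre each block
    n = X[0].shape[0]
    a = [rng.standard_normal(b.shape[1]) for b in X]
    a = [v / np.linalg.norm(v) for v in a]    # random unit-norm start

    def objective():
        # Sum of g(cov(X_j a_j, X_k a_k)) over all ordered pairs j != k.
        y = [Xj @ aj for Xj, aj in zip(X, a)]
        return sum(g(y[j] @ y[k] / n)
                   for j in range(len(X)) for k in range(len(X)) if j != k)

    history = [objective()]
    for _ in range(n_iter):
        for j in range(len(X)):
            # Gradient of the objective with respect to a_j (up to a
            # positive factor), with the other weight vectors held fixed.
            yj = X[j] @ a[j]
            z = np.zeros(n)
            for k in range(len(X)):
                if k != j:
                    yk = X[k] @ a[k]
                    z += dg(yj @ yk / n) * yk
            grad = X[j].T @ z / n
            norm = np.linalg.norm(grad)
            if norm > 0:
                a[j] = grad / norm            # project back to the unit sphere
        history.append(objective())
        if history[-1] - history[-2] < tol:   # stop once the gain is negligible
            break
    return a, history
```

Under these assumptions each block update maximizes a linearization of a convex criterion over the unit sphere, so the recorded objective values should be non-decreasing from sweep to sweep. With two blocks and $g(x) = x$, the updates reduce to a power iteration on the cross-product matrix of the centred blocks, so the weight vectors approach its leading singular vectors, which is one instance of the eigenvalue/eigenvector case mentioned above.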

Type
Original paper
Copyright
Copyright © 2017 The Psychometric Society
