A generalization of partial least squares regression and correspondence analysis for categorical and mixed data: An application with the ADNI data
AbstractCurrent large scale studies of brain and behavior typically involve multiple populations, diverse types of data (e.g., genetics, brain structure, behavior, demographics, or “mutli-omics,” and “deep-phenotyping”) measured on various scales of measurement. To analyze these heterogeneous data sets we need simple but flexible methods able to integrate the inherent properties of these complex data sets. Here we introduce partial least squares-correspondence analysis-regression (PLS-CA-R) a method designed to address these constraints. PLS-CA-R generalizes PLS regression to most data types (e.g., continuous, ordinal, categorical, non-negative values). We also show that PLS-CA-R generalizes many “two-table” multivariate techniques and their respective algorithms, such as various PLS approaches, canonical correlation analysis, and redundancy analysis (a.k.a. reduced rank regression).