Exploring the relationship between two compositions using canonical correlation analysis

Glòria Mateu-Figueras; Josep Daunis-i-Estadella; Germà Coenders; Berta Ferrer-Rosell; Ricard Serlavós; Joan Batista-Foguet

doi:10.51936/epet8264

Advances in Methodology and Statistics ◽

10.51936/epet8264 ◽

2016 ◽

Vol 13 (2) ◽

Author(s):

Glòria Mateu-Figueras ◽

Josep Daunis-i-Estadella ◽

Germà Coenders ◽

Berta Ferrer-Rosell ◽

Ricard Serlavós ◽

...

Keyword(s):

Correlation Analysis ◽

Learning Style ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Compositional Data ◽

The Other ◽

Compositional Data Analysis ◽

Maximum Correlation ◽

Canonical Variates ◽

Log Ratio

The aim of this article is to describe a method for relating two compositions which combines compositional data analysis and canonical correlation analysis (CCA), and to examine its main statistical properties. We use additive log-ratio (alr) transformation on both compositions and apply standard CCA to the transformed data. We show that canonical variates are themselves log-ratios and log-contrasts. The first pair of canonical variates can be interpreted as the log-contrast of a composition that has the maximum correlation with a log-contrast of the other composition. The second pair can be interpreted as the log-contrast of a composition that has the maximum correlation with a log-contrast of the other composition, under the restriction that they are uncorrelated with the first pair, and so on. Using properties from changes of basis, we prove that both canonical correlations and canonical variates are invariant to the choice of divisors in alr transformation. We show how to implement the analysis and interpret the results by means of an illustration from the social sciences field using data from Kolb's Learning Style Inventory and Boyatzis' Philosophical Orientation Questionnaire, which distribute a fixed total score among several learning modes and philosophical orientations.
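The invariance claim in this abstract can be checked numerically. The following is a minimal sketch, not the authors' implementation: it uses synthetic compositions (the data, the helper names `alr` and `canonical_correlations`, and the QR-based computation are all assumptions made for illustration) and compares the canonical correlations obtained with two different choices of alr divisor.

```python
import numpy as np

def alr(comp, divisor):
    """Additive log-ratio: log of each remaining part over the divisor part."""
    logs = np.log(comp)
    others = [j for j in range(comp.shape[1]) if j != divisor]
    return logs[:, others] - logs[:, [divisor]]

def canonical_correlations(x, y):
    """Canonical correlations as singular values of Qx' Qy, where Qx and Qy
    are orthonormal bases of the centred data (assumes full column rank)."""
    qx, _ = np.linalg.qr(x - x.mean(axis=0))
    qy, _ = np.linalg.qr(y - y.mean(axis=0))
    return np.linalg.svd(qx.T @ qy, compute_uv=False)

# Synthetic data (hypothetical): two compositions sharing one latent factor.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 1))
raw_x = np.exp(rng.normal(size=(n, 4)) + latent * np.array([1.0, -0.5, 0.0, 0.3]))
raw_y = np.exp(rng.normal(size=(n, 3)) + latent * np.array([0.8, 0.0, -0.8]))
comp_x = raw_x / raw_x.sum(axis=1, keepdims=True)  # close each row to sum 1
comp_y = raw_y / raw_y.sum(axis=1, keepdims=True)

# Same analysis with two different divisor choices in each composition.
rho_a = canonical_correlations(alr(comp_x, 3), alr(comp_y, 2))
rho_b = canonical_correlations(alr(comp_x, 0), alr(comp_y, 1))
print(np.round(rho_a, 4))
print(np.round(rho_b, 4))
```

The two printed vectors should agree up to floating-point error, because changing the divisor is an invertible linear change of basis of the alr coordinates and CCA depends only on the spaces spanned by each set of variables.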

Linking of pyrolysis-chemical ionisation mass spectrometric and monomer compositional data of O-(2-hydroxyethyl) celluloses by canonical correlation analysis

Journal of Analytical and Applied Pyrolysis ◽

10.1016/0165-2370(94)00866-y ◽

1995 ◽

Vol 33 ◽

pp. 21-38 ◽

Cited By ~ 7

Author(s):

Peter W. Arisz ◽

Gert B. Eijkel ◽

Jaap J. Boon

Keyword(s):

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Compositional Data ◽

Mass Spectrometric ◽

Chemical Ionisation ◽

Ionisation Mass

Compositional Canonical Correlation Analysis

10.1101/144584 ◽

2017 ◽

Author(s):

Jan Graffelman ◽

Vera Pawlowsky-Glahn ◽

Juan José Egozcue ◽

Antonella Buccianti

Keyword(s):

Trace Elements ◽

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Covariance Matrices ◽

Generalized Inverses ◽

Geological Data ◽

Data Set ◽

X Ray ◽

Log Ratio

The study of the relationships between two compositions by means of canonical correlation analysis is addressed. A compositional version of canonical correlation analysis is developed and called CODA-CCO. We consider two approaches, using the centred log-ratio transformation and the calculation of all possible pairwise log-ratios within sets. The relationships between both approaches are pointed out, and their merits are discussed. The related covariance matrices are structurally singular, and this is efficiently dealt with by using generalized inverses. We develop compositional canonical biplots and detail their properties. The canonical biplots are shown to be powerful tools for discovering the most salient relationships between two compositions. Some guidelines for compositional canonical biplots construction are discussed. A geological data set with X-ray fluorescence spectrometry measurements on major oxides and trace elements is used to illustrate the proposed method. The relationships between an analysis based on centred log-ratios and on isometric log-ratios are also shown.
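To illustrate why generalized inverses are needed with the centred log-ratio (clr) transformation, here is a hedged sketch on synthetic data. It is not the CODA-CCO code: the helper names `clr`, `sym_pinv` and `cca_pinv`, the tolerance, and the data are assumptions. Each clr row sums to zero, so the clr covariance matrices are singular and an ordinary inverse does not exist; a Moore-Penrose pseudo-inverse is used instead.

```python
import numpy as np

def clr(comp):
    """Centred log-ratio: log parts minus the row mean of the logs."""
    logs = np.log(comp)
    return logs - logs.mean(axis=1, keepdims=True)

def sym_pinv(s, tol=1e-10):
    """Moore-Penrose inverse of a symmetric PSD matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(s)
    keep = w > tol * w.max()          # drop the structurally zero eigenvalue(s)
    return (v[:, keep] / w[keep]) @ v[:, keep].T

def cca_pinv(x, y):
    """Canonical correlations from possibly singular covariance matrices."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    n = x.shape[0]
    sxx = xc.T @ xc / (n - 1)
    syy = yc.T @ yc / (n - 1)
    sxy = xc.T @ yc / (n - 1)
    m = sym_pinv(sxx) @ sxy @ sym_pinv(syy) @ sxy.T
    eig = np.sort(np.linalg.eigvals(m).real)[::-1]
    return np.sqrt(np.clip(eig, 0.0, 1.0))

# Synthetic compositions (hypothetical): 4 parts and 3 parts, rows closed to 1.
rng = np.random.default_rng(1)
n = 300
latent = rng.normal(size=(n, 1))
raw_x = np.exp(rng.normal(size=(n, 4)) + latent * np.array([1.0, -1.0, 0.5, 0.0]))
raw_y = np.exp(rng.normal(size=(n, 3)) + latent * np.array([0.7, -0.7, 0.0]))
comp_x = raw_x / raw_x.sum(axis=1, keepdims=True)
comp_y = raw_y / raw_y.sum(axis=1, keepdims=True)

rho = cca_pinv(clr(comp_x), clr(comp_y))
print(np.round(rho, 4))  # only the first min(4, 3) - 1 = 2 values are non-trivial
```

Only the leading two values are meaningful here; the remaining ones are numerically zero and reflect the rank deficiency of the clr covariance matrices.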

Distribution related to temperature and salinity of the shrimps Acetes americanus and Peisos petrunkevitchi (Crustacea: Sergestoidea) in the south-eastern Brazilian littoral zone

Journal of the Marine Biological Association of the United Kingdom ◽

10.1017/s0025315412000902 ◽

2012 ◽

Vol 93 (3) ◽

pp. 753-759 ◽

Cited By ~ 5

Author(s):

Sabrina Morilhas Simões ◽

Antonio Leão Castilho ◽

Adilson Fransozo ◽

Maria Lúcia Negreiros-Fransozo ◽

Rogerio Caetano da Costa

Keyword(s):

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Littoral Zone ◽

The Other ◽

Strongly Correlated ◽

Ecological Distribution ◽

The South ◽

South Eastern

The abundance and ecological distribution of Acetes americanus and Peisos petrunkevitchi were investigated from July 2006 to June 2007, in Ubatuba, Brazil. Eight transects were identified and sampled monthly: six of these transects were located in Ubatuba bay, with depths reaching 21 m, and the other two transects were in estuarine environments. A total of 33,888 A. americanus shrimp were captured, with the majority coming from the shallower transects (up to 10 m). Conversely, 6,173 of the P. petrunkevitchi shrimps were captured in deeper areas (from 9 to 21 m). No individuals from either species were found in the estuary. The highest abundances obtained for both species were sampled during the summer. Canonical correlation analysis resulted in a coefficient value of 0.68 (P = 0.00). The abundance of both species was strongly correlated with depth. Variations in temperature and salinity values were also informative in predicting the seasonal presence of P. petrunkevitchi in deeper areas and A. americanus in the shallower areas of the bay. It is conceivable that the shrimp adjust their ecological distribution according to their intrinsic physiological limitations.

Structure of the Coolidge Axis II Inventory Personality Disorder Scales from the Five-Factor Model Perspective

Psychological Reports ◽

10.2466/pr0.1998.83.3.947 ◽

1998 ◽

Vol 83 (3) ◽

pp. 947-952 ◽

Cited By ~ 5

Author(s):

Nerella V. Ramanaiah ◽

J. Patrick Sharpe

Keyword(s):

Correlation Analysis ◽

Personality Disorders ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Simple Structure ◽

Factor Model ◽

Five Factor Model ◽

Canonical Variate ◽

Axis Ii ◽

Canonical Variates

Coolidge, et al. in 1994 tested the generality and comprehensiveness of the five-factor model of personality as applied to personality disorders by performing a canonical correlation analysis for the scales from the Coolidge Axis II Inventory and the NEO Personality Inventory testing 178 undergraduates (106 men and 72 women). Their results did not support the generality and comprehensiveness of the five-factor model for interpreting the structure of personality disorders. A major problem with this study was that the data did not show good simple structure and meaningfulness because no rotation was performed for the canonical variates. The present study tested the hypothesis that the results of Coolidge, et al. might be attributed to the failure to rotate canonical variates to obtain good simple structure. For 220 students in introductory psychology (104 men and 116 women), canonical correlation analysis with varimax rotation was performed for scores on the Coolidge Axis II Inventory scales and the NEO Five-Factor Inventory scales. The analysis indicated five canonical variate pairs which were interpreted as Neuroticism, Extraversion, Openness, Disagreeableness, and Conscientiousness, supporting the tested hypothesis as well as the generality and comprehensiveness of this model for describing the structure of personality disorders.

Encoding Prior Knowledge with Eigenword Embeddings

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00108 ◽

2016 ◽

Vol 4 ◽

pp. 417-430 ◽

Cited By ~ 2

Author(s):

Dominique Osborne ◽

Shashi Narayan ◽

Shay B. Cohen

Keyword(s):

Correlation Analysis ◽

Prior Knowledge ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

The Other ◽

Word Embeddings ◽

Theoretical Justification

Canonical correlation analysis (CCA) is a method for reducing the dimension of data represented using two views. It has been previously used to derive word embeddings, where one view indicates a word, and the other view indicates its context. We describe a way to incorporate prior knowledge into CCA, give a theoretical justification for it, and test it by deriving word embeddings and evaluating them on a myriad of datasets.
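The two-view dimension reduction mentioned in this abstract can be sketched with plain CCA. This is only standard regularized CCA on synthetic views, not the prior-knowledge variant the paper proposes; the function name `cca_embed`, the ridge term `reg`, and the data are assumptions for illustration.

```python
import numpy as np

def cca_embed(view_a, view_b, k, reg=1e-6):
    """Project two views onto their top-k canonical directions.
    A small ridge term `reg` keeps the covariance matrices invertible."""
    a = view_a - view_a.mean(axis=0)
    b = view_b - view_b.mean(axis=0)
    n = a.shape[0]
    caa = a.T @ a / n + reg * np.eye(a.shape[1])
    cbb = b.T @ b / n + reg * np.eye(b.shape[1])
    cab = a.T @ b / n
    la = np.linalg.cholesky(caa)              # caa = la @ la.T
    lb = np.linalg.cholesky(cbb)              # cbb = lb @ lb.T
    # Whitened cross-covariance: la^{-1} cab lb^{-T}
    t = np.linalg.solve(la, cab) @ np.linalg.inv(lb).T
    u, s, vt = np.linalg.svd(t, full_matrices=False)
    wa = np.linalg.solve(la.T, u[:, :k])      # canonical weights, view A
    wb = np.linalg.solve(lb.T, vt.T[:, :k])   # canonical weights, view B
    return a @ wa, b @ wb, s[:k]

# Synthetic two-view data (hypothetical): both views observe a shared 2-d signal.
rng = np.random.default_rng(2)
n = 500
shared = rng.normal(size=(n, 2))
view_a = shared @ rng.normal(size=(2, 6)) + 0.5 * rng.normal(size=(n, 6))
view_b = shared @ rng.normal(size=(2, 5)) + 0.5 * rng.normal(size=(n, 5))

za, zb, rho = cca_embed(view_a, view_b, k=2)
print(za.shape, zb.shape, np.round(rho, 3))
```

Each column pair of `za` and `zb` is a pair of canonical variates, and their sample correlation approximately equals the matching entry of `rho`.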

A Multidimensional Approach to Competitiveness, Innovation and Well-Being in the EU Using Canonical Correlation Analysis

Journal of Competitiveness ◽

10.7441/joc.2020.04.01 ◽

2020 ◽

Vol 12 (4) ◽

pp. 5-21

Author(s):

Ane-Mari Androniceanu ◽

Jani Kinnunen ◽

Irina Georgescu ◽

Armenia Androniceanu

Keyword(s):

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Well Being ◽

Eu Member States ◽

Canonical Variates ◽

Main Factors ◽

The Uk ◽

National Governments ◽

The Eu

Achieving a competitive economy and a competitive market generally proceeds from the desire to meet economic and social objectives and it ensures a growing level of social welfare. The objectives of our research are to determine and highlight the bidirectional linear correlations among competitiveness, well-being and innovation and to analyze the main factors that influence these relations. Our research includes the EU member states and the UK using these countries’ specific indicators from the databases of EUROSTAT, the World Economic Forum and the United Nations from 2016-2018. We used Canonical Correlation Analysis to determine a set of canonical variates which represent linear combinations of the variables from each set. The contributions of our research show a direct and strong link among the three pillars of competitiveness, innovation and well-being. This analysis allowed us to identify and analyze the influence of innovation on the economic development and competitiveness of each EU country and on the well-being of its population. Governments and organizations that invest more in research in terms of innovation to increase the competitiveness of their products and services have shown a growing GDP and a higher level of population well-being. This research is representative at the European level and may influence the decisions of national governments and other institutions to encourage innovation through drivers such as R&D expenditures and human resources as the main factors generating economic growth and competitiveness, thus with a direct effect on GDP and on well-being.

Succession des communautés de gastéropodes dans deux milieux différant par leur degré d'eutrophisation

Canadian Journal of Zoology ◽

10.1139/z84-339 ◽

1984 ◽

Vol 62 (11) ◽

pp. 2317-2327 ◽

Cited By ~ 7

Author(s):

P. Legendre ◽

D. Planas ◽

M.-J. Auclair

Keyword(s):

Principal Component Analysis ◽

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Plant Cover ◽

Principal Component ◽

The Other ◽

Nutrient Concentrations ◽

Benthic Species ◽

Gastropod Species

This paper compares the succession of gastropods in two environments that are adjacent in space but differ as to their eutrophic level. One is hypereutrophic (du Sud River), the other is mesotrophic (Richelieu River). Canonical correlation analysis brings out the main differences between these two stations, while principal component analysis is used to describe the succession of species within each community. These analyses indicate that the occurrence of gastropod species, as well as their development cycles, may be adapted to the particular synecological evolution of each environment. Thus, the species would not react directly to nutrient concentrations but indirectly, through the effects of these concentrations on oxygen content, plant cover, and predators. In these two environments, some benthic species seem to be good indicators of the eutrophic level of the ecosystem.

ROBUST CO-TRAINING

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001411008981 ◽

2011 ◽

Vol 25 (07) ◽

pp. 1113-1126 ◽

Cited By ~ 47

Author(s):

SHILIANG SUN ◽

FENG JIN

Keyword(s):

Correlation Analysis ◽

Supervised Learning ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Learning Algorithm ◽

Experimental Results ◽

The Other ◽

Classification Problems ◽

Training Examples ◽

Low Dimensional

Co-training is a multiview semi-supervised learning algorithm to learn from both labeled and unlabeled data, which iteratively adopts a classifier trained on one view to teach the other view using some confident predictions given on unlabeled examples. However, as it does not examine the reliability of the labels provided by classifiers on either view, co-training might be problematic. Even very few inaccurately labeled examples can deteriorate the performance of learned classifiers to a large extent. In this paper, a new method named robust co-training is proposed, which integrates canonical correlation analysis (CCA) to inspect the predictions of co-training on those unlabeled training examples. CCA is applied to obtain a low-dimensional and closely correlated representation of the original multiview data. Based on this representation the similarities between an unlabeled example and the original labeled examples are determined. Only those examples whose predicted labels are consistent with the outcome of CCA examination are eligible to augment the original labeled data. The performance of robust co-training is evaluated on several different classification problems where encouraging experimental results are observed.

Analysis of Maximum Expiratory Flow Volume Curves Using Canonical Correlation Analysis

Methods of Information in Medicine ◽

10.1055/s-0038-1635359 ◽

1985 ◽

Vol 24 (02) ◽

pp. 91-100 ◽

Cited By ~ 3

Author(s):

W. van Pelt ◽

Ph. H. Quanjer ◽

M. E. Wise ◽

E. van der Burg ◽

R. van der Lende

Keyword(s):

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Flow Volume ◽

Non Linear ◽

Maximum Expiratory Flow ◽

Expiratory Flow ◽

Relationship Of ◽

The Relationship ◽

Age And Sex

As part of a population study on chronic lung disease in the Netherlands, an investigation is made of the relationship of both age and sex with indices describing the maximum expiratory flow-volume (MEFV) curve. To determine the relationship, non-linear canonical correlation was used as realized in the computer program CANALS, a combination of ordinary canonical correlation analysis (CCA) and non-linear transformations of the variables. This method enhances the generality of the relationship to be found and has the advantage of showing the relative importance of categories or ranges within a variable with respect to that relationship. The above is exemplified by describing the relationship of age and sex with variables concerning respiratory symptoms and smoking habits. The analysis of age and sex with MEFV curve indices shows that non-linear canonical correlation analysis is an efficient tool in analysing size and shape of the MEFV curve and can be used to derive parameters concerning the whole curve.

Microphone Classification Using Canonical Correlation Analysis

IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences ◽

10.1587/transfun.e97.a.1024 ◽

2014 ◽

Vol E97.A (4) ◽

pp. 1024-1026

Author(s):

Jongwon SEOK ◽

Keunsung BAE

Keyword(s):

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation
