Exploring the relationship between two compositions using canonical correlation analysis

2016 ◽  
Vol 13 (2) ◽  
Author(s):  
Glòria Mateu-Figueras ◽  
Josep Daunis-i-Estadella ◽  
Germà Coenders ◽  
Berta Ferrer-Rosell ◽  
Ricard Serlavós ◽  
...  

The aim of this article is to describe a method for relating two compositions which combines compositional data analysis and canonical correlation analysis (CCA), and to examine its main statistical properties. We use additive log-ratio (alr) transformation on both compositions and apply standard CCA to the transformed data. We show that canonical variates are themselves log-ratios and log-contrasts. The first pair of canonical variates can be interpreted as the log-contrast of a composition that has the maximum correlation with a log-contrast of the other composition. The second pair can be interpreted as the log-contrast of a composition that has the maximum correlation with a log-contrast of the other composition, under the restriction that they are uncorrelated with the first pair, and so on. Using properties from changes of basis, we prove that both canonical correlations and canonical variates are invariant to the choice of divisors in alr transformation. We show how to implement the analysis and interpret the results by means of an illustration from the social sciences field using data from Kolb's Learning Style Inventory and Boyatzis' Philosophical Orientation Questionnaire, which distribute a fixed total score among several learning modes and philosophical orientations.
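
The invariance claim above can be checked numerically. What follows is a minimal sketch, not the authors' implementation: it assumes synthetic compositions, and the helper names `closure`, `alr` and `canonical_correlations` are hypothetical. Canonical correlations are computed as the singular values of the product of orthonormal bases of the two centred data blocks, a standard way to obtain them.

```python
import numpy as np

def closure(parts):
    """Rescale each row of positive parts so that it sums to 1."""
    parts = np.asarray(parts, dtype=float)
    return parts / parts.sum(axis=1, keepdims=True)

def alr(comp, divisor=-1):
    """Additive log-ratio: log of every other part over the divisor part."""
    logs = np.log(np.asarray(comp, dtype=float))
    others = np.delete(logs, divisor, axis=1)
    return others - logs[:, [divisor]]

def canonical_correlations(x, y):
    """Canonical correlations between two full-column-rank data blocks."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    # Orthonormal bases of the two column spaces; the singular values of
    # their cross-product are the canonical correlations.
    qx, _ = np.linalg.qr(x)
    qy, _ = np.linalg.qr(y)
    return np.linalg.svd(qx.T @ qy, compute_uv=False)

# Synthetic data (an assumption for illustration): a 4-part and a 3-part
# composition that share one latent factor.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 1))
comp_x = closure(np.exp(rng.normal(size=(n, 4)) + latent * np.array([1.0, -0.5, 0.2, 0.0])))
comp_y = closure(np.exp(rng.normal(size=(n, 3)) + latent * np.array([0.8, 0.0, -0.6])))

# Same analysis with two different choices of alr divisors.
rho_a = canonical_correlations(alr(comp_x, divisor=3), alr(comp_y, divisor=2))
rho_b = canonical_correlations(alr(comp_x, divisor=0), alr(comp_y, divisor=1))
```

Under these assumptions `rho_a` and `rho_b` agree up to rounding error, because changing the divisor only applies an invertible linear map to the alr coordinates.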

2017 ◽  
Author(s):  
Jan Graffelman ◽  
Vera Pawlowsky-Glahn ◽  
Juan José Egozcue ◽  
Antonella Buccianti

Abstract: The study of the relationships between two compositions by means of canonical correlation analysis is addressed. A compositional version of canonical correlation analysis is developed and called CODA-CCO. We consider two approaches, using the centred log-ratio transformation and the calculation of all possible pairwise log-ratios within sets. The relationships between both approaches are pointed out, and their merits are discussed. The related covariance matrices are structurally singular, and this is efficiently dealt with by using generalized inverses. We develop compositional canonical biplots and detail their properties. The canonical biplots are shown to be powerful tools for discovering the most salient relationships between two compositions. Some guidelines for constructing compositional canonical biplots are discussed. A geological data set with X-ray fluorescence spectrometry measurements on major oxides and trace elements is used to illustrate the proposed method. The relationships between an analysis based on centred log-ratios and on isometric log-ratios are also shown.
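
As a rough illustration of the centred log-ratio route with generalized inverses, the sketch below computes canonical correlations from singular clr covariance matrices using the Moore-Penrose pseudo-inverse. It is a sketch under stated assumptions rather than the CODA-CCO implementation: the data are synthetic, the function names `clr` and `cca_pinv` are hypothetical, and the cut-off `rcond=1e-10` is an arbitrary choice.

```python
import numpy as np

def clr(comp):
    """Centred log-ratio: log parts minus the row mean of the log parts."""
    logs = np.log(np.asarray(comp, dtype=float))
    return logs - logs.mean(axis=1, keepdims=True)

def cca_pinv(x, y, rcond=1e-10):
    """Canonical correlations when the covariance matrices are singular."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    sxx = x.T @ x / (n - 1)
    syy = y.T @ y / (n - 1)
    sxy = x.T @ y / (n - 1)
    # Pseudo-inverses replace the ordinary inverses of the usual CCA formula.
    m = np.linalg.pinv(sxx, rcond=rcond) @ sxy @ np.linalg.pinv(syy, rcond=rcond) @ sxy.T
    eigenvalues = np.sort(np.clip(np.linalg.eigvals(m).real, 0.0, 1.0))[::-1]
    # clr coordinates lose one dimension per composition.
    k = min(x.shape[1], y.shape[1]) - 1
    return np.sqrt(eigenvalues[:k])

# Synthetic compositions (an assumption for illustration only).
rng = np.random.default_rng(1)
n = 150
latent = rng.normal(size=(n, 1))
raw_x = np.exp(rng.normal(size=(n, 4)) + latent * np.array([1.0, -1.0, 0.5, 0.0]))
raw_y = np.exp(rng.normal(size=(n, 3)) + latent * np.array([0.7, -0.7, 0.0]))
comp_x = raw_x / raw_x.sum(axis=1, keepdims=True)
comp_y = raw_y / raw_y.sum(axis=1, keepdims=True)

clr_x, clr_y = clr(comp_x), clr(comp_y)
rho = cca_pinv(clr_x, clr_y)

# Structural singularity: every clr row sums to zero, so the clr
# covariance matrix has a zero eigenvalue.
smallest_eigenvalue = np.linalg.eigvalsh(np.cov(clr_x, rowvar=False))[0]
```

The pseudo-inverse restricts the computation to the subspace in which the clr coordinates actually vary, which is why the zero eigenvalue does not cause a problem here.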


Author(s):  
Sabrina Morilhas Simões ◽  
Antonio Leão Castilho ◽  
Adilson Fransozo ◽  
Maria Lúcia Negreiros-Fransozo ◽  
Rogerio Caetano da Costa

The abundance and ecological distribution of Acetes americanus and Peisos petrunkevitchi were investigated from July 2006 to June 2007, in Ubatuba, Brazil. Eight transects were identified and sampled monthly: six of these transects were located in Ubatuba bay, with depths reaching 21 m, and the other two transects were in estuarine environments. A total of 33,888 A. americanus shrimp were captured, with the majority coming from the shallower transects (up to 10 m). Conversely, 6,173 of the P. petrunkevitchi shrimps were captured in deeper areas (from 9 to 21 m). No individuals from either species were found in the estuary. The highest abundances obtained for both species were sampled during the summer. Canonical correlation analysis resulted in a coefficient value of 0.68 (P = 0.00). The abundance of both species was strongly correlated with depth. Variations in temperature and salinity values were also informative in predicting the seasonal presence of P. petrunkevitchi in deeper areas and A. americanus in the shallower areas of the bay. It is conceivable that the shrimp adjust their ecological distribution according to their intrinsic physiological limitations.


1998 ◽  
Vol 83 (3) ◽  
pp. 947-952 ◽  
Author(s):  
Nerella V. Ramanaiah ◽  
J. Patrick Sharpe

Coolidge, et al. in 1994 tested the generality and comprehensiveness of the five-factor model of personality as applied to personality disorders by performing a canonical correlation analysis for the scales from the Coolidge Axis II Inventory and the NEO Personality Inventory testing 178 undergraduates (106 men and 72 women). Their results did not support the generality and comprehensiveness of the five-factor model for interpreting the structure of personality disorders. A major problem with this study was that the data did not show good simple structure and meaningfulness because no rotation was performed for the canonical variates. The present study tested the hypothesis that the results of Coolidge, et al. might be attributed to the failure to rotate canonical variates to obtain good simple structure. For 220 students in introductory psychology (104 men and 116 women), canonical correlation analysis with varimax rotation was performed for scores on the Coolidge Axis II Inventory scales and the NEO Five-Factor Inventory scales. The analysis indicated five canonical variate pairs which were interpreted as Neuroticism, Extraversion, Openness, Disagreeableness, and Conscientiousness, supporting the tested hypothesis as well as the generality and comprehensiveness of this model for describing the structure of personality disorders.
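
To make the rotation step concrete, here is a minimal sketch of an orthogonal varimax rotation applied to a loading matrix. It is not the authors' procedure or data: the loading matrix is a made-up example, and the function names `varimax` and `simplicity` are hypothetical. The iteration is the commonly used SVD-based scheme for the varimax criterion.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation using an SVD-based iteration."""
    p, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.sum(rotated**2, axis=0)  # column sums of squares
        gradient = loadings.T @ (rotated**3 - rotated * col_ss / p)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        if s.sum() < objective * (1.0 + tol):
            break
        objective = s.sum()
    return loadings @ rotation, rotation

def simplicity(loadings):
    """Varimax criterion: summed column variances of squared loadings."""
    return float(np.sum(np.var(loadings**2, axis=0)))

# Hypothetical loadings: a clean two-factor structure blurred by a
# rotation of 0.6 radians, standing in for unrotated canonical loadings.
simple = np.array([[0.90, 0.0], [0.80, 0.0], [0.85, 0.0],
                   [0.0, 0.70], [0.0, 0.75], [0.0, 0.80]])
angle = 0.6
mix = np.array([[np.cos(angle), -np.sin(angle)],
                [np.sin(angle), np.cos(angle)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)
```

Because the rotation is orthogonal, each variable keeps the same sum of squared loadings across the rotated columns; only the way that sum is distributed over the columns changes.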


2016 ◽  
Vol 4 ◽  
pp. 417-430 ◽  
Author(s):  
Dominique Osborne ◽  
Shashi Narayan ◽  
Shay B. Cohen

Canonical correlation analysis (CCA) is a method for reducing the dimension of data represented using two views. It has been previously used to derive word embeddings, where one view indicates a word, and the other view indicates its context. We describe a way to incorporate prior knowledge into CCA, give a theoretical justification for it, and test it by deriving word embeddings and evaluating them on a myriad of datasets.
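
The two-view setting can be sketched as follows. This is a simplified illustration and not the method of the paper: it omits the prior-knowledge component entirely, uses a made-up toy corpus, takes the right-hand neighbour as the only context, and skips mean-centring. Under those assumptions both views are one-hot indicators, their covariance matrices are diagonal, and CCA reduces to an SVD of the co-occurrence matrix scaled by the square roots of the word and context counts.

```python
import numpy as np

# Hypothetical toy corpus, used only for illustration.
corpus = "the cat sat on the mat the dog sat on the rug the cat".split()
vocab = sorted(set(corpus))
index = {word: i for i, word in enumerate(vocab)}

# View 1 is the word itself, view 2 is its right-hand neighbour.
counts = np.zeros((len(vocab), len(vocab)))
for word, context in zip(corpus[:-1], corpus[1:]):
    counts[index[word], index[context]] += 1.0

word_totals = counts.sum(axis=1)     # diagonal of the view-1 covariance
context_totals = counts.sum(axis=0)  # diagonal of the view-2 covariance

# Whitening both one-hot views amounts to dividing each count by
# sqrt(word count * context count).
scaled = counts / np.sqrt(np.outer(word_totals, context_totals))

u, s, vt = np.linalg.svd(scaled)

# Keep the leading k left singular vectors as k-dimensional embeddings.
# Without centring, the first singular value equals 1.
k = 2
embeddings = u[:, :k]
```

Each row of `embeddings` corresponds to the word at the same position in `vocab`.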


2020 ◽  
Vol 12 (4) ◽  
pp. 5-21
Author(s):  
Ane-Mari Androniceanu ◽  
Jani Kinnunen ◽  
Irina Georgescu ◽  
Armenia Androniceanu

Achieving a competitive economy and a competitive market generally proceeds from the desire to meet economic and social objectives and it ensures a growing level of social welfare. The objectives of our research are to determine and highlight the bidirectional linear correlations among competitiveness, well-being and innovation and to analyze the main factors that influence these relations. Our research includes the EU member states and the UK using these countries’ specific indicators from the databases of EUROSTAT, the World Economic Forum and the United Nations from 2016-2018. We used Canonical Correlation Analysis to determine a set of canonical variates which represent linear combinations of the variables from each set. The contributions of our research show a direct and strong link among the three pillars of competitiveness, innovation and well-being. This analysis allowed us to identify and analyze the influence of innovation on the economic development and competitiveness of each EU country and on the well-being of its population. Governments and organizations that invest more in research in terms of innovation to increase the competitiveness of their products and services have shown a growing GDP and a higher level of population well-being. This research is representative at the European level and may influence the decisions of national governments and other institutions to encourage innovation through drivers such as R&D expenditures and human resources as the main factors generating economic growth and competitiveness, thus with a direct effect on GDP and on well-being.


1984 ◽  
Vol 62 (11) ◽  
pp. 2317-2327 ◽  
Author(s):  
P. Legendre ◽  
D. Planas ◽  
M.-J. Auclair

This paper compares the succession of gastropods in two environments that are adjacent in space but differ as to their eutrophic level. One is hypereutrophic (du Sud River), the other is mesotrophic (Richelieu River). Canonical correlation analysis brings out the main differences between these two stations, while principal component analysis is used to describe the succession of species within each community. These analyses indicate that the occurrence of gastropod species, as well as their development cycles, may be adapted to the particular synecological evolution of each environment. Thus, the species would not react directly to nutrient concentrations but indirectly, through the effects of these concentrations on oxygen content, plant cover, and predators. In these two environments, some benthic species seem to be good indicators of the eutrophic level of the ecosystem.


Author(s):  
SHILIANG SUN ◽  
FENG JIN

Co-training is a multiview semi-supervised learning algorithm to learn from both labeled and unlabeled data, which iteratively adopts a classifier trained on one view to teach the other view using some confident predictions given on unlabeled examples. However, as it does not examine the reliability of the labels provided by classifiers on either view, co-training might be problematic. Even very few inaccurately labeled examples can deteriorate the performance of learned classifiers to a large extent. In this paper, a new method named robust co-training is proposed, which integrates canonical correlation analysis (CCA) to inspect the predictions of co-training on those unlabeled training examples. CCA is applied to obtain a low-dimensional and closely correlated representation of the original multiview data. Based on this representation the similarities between an unlabeled example and the original labeled examples are determined. Only those examples whose predicted labels are consistent with the outcome of CCA examination are eligible to augment the original labeled data. The performance of robust co-training is evaluated on several different classification problems where encouraging experimental results are observed.


1985 ◽  
Vol 24 (02) ◽  
pp. 91-100 ◽  
Author(s):  
W. van Pelt ◽  
Ph. H. Quanjer ◽  
M. E. Wise ◽  
E. van der Burg ◽  
R. van der Lende

Summary: As part of a population study on chronic lung disease in the Netherlands, an investigation is made of the relationship of both age and sex with indices describing the maximum expiratory flow-volume (MEFV) curve. To determine the relationship, non-linear canonical correlation was used as realized in the computer program CANALS, a combination of ordinary canonical correlation analysis (CCA) and non-linear transformations of the variables. This method enhances the generality of the relationship to be found and has the advantage of showing the relative importance of categories or ranges within a variable with respect to that relationship. The above is exemplified by describing the relationship of age and sex with variables concerning respiratory symptoms and smoking habits. The analysis of age and sex with MEFV curve indices shows that non-linear canonical correlation analysis is an efficient tool in analysing size and shape of the MEFV curve and can be used to derive parameters concerning the whole curve.

