Encoding Prior Knowledge with Eigenword Embeddings

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00108 ◽

2016 ◽

Vol 4 ◽

pp. 417-430 ◽

Cited By ~ 2

Author(s):

Dominique Osborne ◽

Shashi Narayan ◽

Shay B. Cohen

Keyword(s):

Correlation Analysis ◽

Prior Knowledge ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

The Other ◽

Word Embeddings ◽

Theoretical Justification

Canonical correlation analysis (CCA) is a method for reducing the dimension of data represented using two views. It has been previously used to derive word embeddings, where one view indicates a word, and the other view indicates its context. We describe a way to incorporate prior knowledge into CCA, give a theoretical justification for it, and test it by deriving word embeddings and evaluating them on a myriad of datasets.
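
As a minimal sketch of the two-view setup this abstract describes, the NumPy code below computes plain CCA by whitening each view and taking an SVD of the whitened cross-covariance. It does not implement the paper's prior-knowledge extension. The function name `cca`, the regularizer `reg`, and the dense toy data standing in for the word and context views are assumptions made for illustration.

```python
import numpy as np

def cca(X, Y, k, reg=1e-8):
    """Project two views onto their top-k canonical directions.

    Returns (X projections, Y projections, canonical correlations).
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Covariances of each view (lightly regularized) and the cross-covariance.
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    # Whiten with Cholesky factors: M = Lx^{-1} Cxy Ly^{-T}.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy) @ np.linalg.inv(Ly).T
    # Singular values of M are the canonical correlations.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    A = np.linalg.solve(Lx.T, U[:, :k])      # directions for view 1
    B = np.linalg.solve(Ly.T, Vt.T[:, :k])   # directions for view 2
    return X @ A, Y @ B, s[:k]

# Toy data (hypothetical): both views are noisy functions of one shared signal.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
view1 = z @ np.array([[1.0, -0.5, 0.3, 0.8]]) + 0.1 * rng.normal(size=(500, 4))
view2 = z @ np.array([[0.7, 0.2, -1.0]]) + 0.1 * rng.normal(size=(500, 3))

Zx, Zy, corrs = cca(view1, view2, k=2)
```

In the word-embedding setting the rows of the first view would indicate words and the rows of the second their contexts, and the low-dimensional projection of the word view would serve as the embedding.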

Cross-domain sentiment classification with word embeddings and canonical correlation analysis

Proceedings of the Seventh Symposium on Information and Communication Technology - SoICT '16 ◽

10.1145/3011077.3011104 ◽

2016 ◽

Cited By ~ 8

Author(s):

Ngo Xuan Bach ◽

Vu Thanh Hai ◽

Tu Minh Phuong

Keyword(s):

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Sentiment Classification ◽

Word Embeddings ◽

Cross Domain

Distribution related to temperature and salinity of the shrimps Acetes americanus and Peisos petrunkevitchi (Crustacea: Sergestoidea) in the south-eastern Brazilian littoral zone

Journal of the Marine Biological Association of the United Kingdom ◽

10.1017/s0025315412000902 ◽

2012 ◽

Vol 93 (3) ◽

pp. 753-759 ◽

Cited By ~ 5

Author(s):

Sabrina Morilhas Simões ◽

Antonio Leão Castilho ◽

Adilson Fransozo ◽

Maria Lúcia Negreiros-Fransozo ◽

Rogerio Caetano da Costa

Keyword(s):

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Littoral Zone ◽

The Other ◽

Strongly Correlated ◽

Ecological Distribution ◽

The South ◽

South Eastern

The abundance and ecological distribution of Acetes americanus and Peisos petrunkevitchi were investigated from July 2006 to June 2007, in Ubatuba, Brazil. Eight transects were identified and sampled monthly: six of these transects were located in Ubatuba bay, with depths reaching 21 m, and the other two transects were in estuarine environments. A total of 33,888 A. americanus shrimp were captured, with the majority coming from the shallower transects (up to 10 m). Conversely, 6,173 of the P. petrunkevitchi shrimps were captured in deeper areas (from 9 to 21 m). No individuals from either species were found in the estuary. The highest abundances obtained for both species were sampled during the summer. Canonical correlation analysis resulted in a coefficient value of 0.68 (P = 0.00). The abundance of both species was strongly correlated with depth. Variations in temperature and salinity values were also informative in predicting the seasonal presence of P. petrunkevitchi in deeper areas and A. americanus in the shallower areas of the bay. It is conceivable that the shrimp adjust their ecological distribution according to their intrinsic physiological limitations.

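Succession des communautés de gastéropodes dans deux milieux différant par leur degré d'eutrophisation [Succession of gastropod communities in two environments differing in their degree of eutrophication]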

Canadian Journal of Zoology ◽

10.1139/z84-339 ◽

1984 ◽

Vol 62 (11) ◽

pp. 2317-2327 ◽

Cited By ~ 7

Author(s):

P. Legendre ◽

D. Planas ◽

M.-J. Auclair

Keyword(s):

Principal Component Analysis ◽

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Plant Cover ◽

Principal Component ◽

The Other ◽

Nutrient Concentrations ◽

Benthic Species ◽

Gastropod Species

This paper compares the succession of gastropods in two environments that are adjacent in space but differ as to their eutrophic level. One is hypereutrophic (du Sud River), the other is mesotrophic (Richelieu River). Canonical correlation analysis brings out the main differences between these two stations, while principal component analysis is used to describe the succession of species within each community. These analyses indicate that the occurrence of gastropod species, as well as their development cycles, may be adapted to the particular synecological evolution of each environment. Thus, the species would not react directly to nutrient concentrations but indirectly, through the effects of these concentrations on oxygen content, plant cover, and predators. In these two environments, some benthic species seem to be good indicators of the eutrophic level of the ecosystem.

Exploring the relationship between two compositions using canonical correlation analysis

Advances in Methodology and Statistics ◽

10.51936/epet8264 ◽

2016 ◽

Vol 13 (2) ◽

Author(s):

Glòria Mateu-Figueras ◽

Josep Daunis-i-Estadella ◽

Germà Coenders ◽

Berta Ferrer-Rosell ◽

Ricard Serlavós ◽

...

Keyword(s):

Correlation Analysis ◽

Learning Style ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Compositional Data ◽

The Other ◽

Compositional Data Analysis ◽

Maximum Correlation ◽

Canonical Variates ◽

Log Ratio

The aim of this article is to describe a method for relating two compositions which combines compositional data analysis and canonical correlation analysis (CCA), and to examine its main statistical properties. We use additive log-ratio (alr) transformation on both compositions and apply standard CCA to the transformed data. We show that canonical variates are themselves log-ratios and log-contrasts. The first pair of canonical variates can be interpreted as the log-contrast of a composition that has the maximum correlation with a log-contrast of the other composition. The second pair can be interpreted as the log-contrast of a composition that has the maximum correlation with a log-contrast of the other composition, under the restriction that they are uncorrelated with the first pair, and so on. Using properties from changes of basis, we prove that both canonical correlations and canonical variates are invariant to the choice of divisors in alr transformation. We show how to implement the analysis and interpret the results by means of an illustration from the social sciences field using data from Kolb's Learning Style Inventory and Boyatzis' Philosophical Orientation Questionnaire, which distribute a fixed total score among several learning modes and philosophical orientations.
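
The invariance claim in this abstract can be checked numerically. The sketch below applies an alr transformation to two synthetic compositions and computes canonical correlations from the QR factors of the centered data, a standard way to obtain them. The helper names `alr` and `canonical_correlations` and the Dirichlet-sampled data are assumptions for illustration and are not taken from the article.

```python
import numpy as np

def alr(P, divisor):
    """Additive log-ratio: log of every other part over the divisor part."""
    logs = np.log(P)
    others = [j for j in range(P.shape[1]) if j != divisor]
    return logs[:, others] - logs[:, [divisor]]

def canonical_correlations(X, Y):
    """Canonical correlations as singular values of Qx^T Qy."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(X)   # orthonormal basis for the centered X columns
    Qy, _ = np.linalg.qr(Y)   # orthonormal basis for the centered Y columns
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)

# Synthetic compositions (hypothetical): rows are positive and sum to 1.
rng = np.random.default_rng(1)
comp1 = rng.dirichlet(np.ones(4), size=200)   # 4-part composition
comp2 = rng.dirichlet(np.ones(3), size=200)   # 3-part composition

# Same analysis under two different choices of alr divisor.
r_a = canonical_correlations(alr(comp1, 0), alr(comp2, 0))
r_b = canonical_correlations(alr(comp1, 2), alr(comp2, 1))
```

Changing the divisor amounts to an invertible linear change of basis in the log-ratio coordinates, so `r_a` and `r_b` should agree up to floating-point error.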

ROBUST CO-TRAINING

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001411008981 ◽

2011 ◽

Vol 25 (07) ◽

pp. 1113-1126 ◽

Cited By ~ 47

Author(s):

SHILIANG SUN ◽

FENG JIN

Keyword(s):

Correlation Analysis ◽

Supervised Learning ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Learning Algorithm ◽

Experimental Results ◽

The Other ◽

Classification Problems ◽

Training Examples ◽

Low Dimensional

Co-training is a multiview semi-supervised learning algorithm to learn from both labeled and unlabeled data, which iteratively adopts a classifier trained on one view to teach the other view using some confident predictions given on unlabeled examples. However, as it does not examine the reliability of the labels provided by classifiers on either view, co-training might be problematic. Even very few inaccurately labeled examples can deteriorate the performance of learned classifiers to a large extent. In this paper, a new method named robust co-training is proposed, which integrates canonical correlation analysis (CCA) to inspect the predictions of co-training on those unlabeled training examples. CCA is applied to obtain a low-dimensional and closely correlated representation of the original multiview data. Based on this representation the similarities between an unlabeled example and the original labeled examples are determined. Only those examples whose predicted labels are consistent with the outcome of CCA examination are eligible to augment the original labeled data. The performance of robust co-training is evaluated on several different classification problems where encouraging experimental results are observed.

Analysis of Maximum Expiratory Flow Volume Curves Using Canonical Correlation Analysis

Methods of Information in Medicine ◽

10.1055/s-0038-1635359 ◽

1985 ◽

Vol 24 (02) ◽

pp. 91-100 ◽

Cited By ~ 3

Author(s):

W. van Pelt ◽

Ph. H. Quanjer ◽

M. E. Wise ◽

E. van der Burg ◽

R. van der Lende

Keyword(s):

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Flow Volume ◽

Non Linear ◽

Maximum Expiratory Flow ◽

Expiratory Flow ◽

Relationship Of ◽

The Relationship ◽

Age And Sex

Summary: As part of a population study on chronic lung disease in the Netherlands, an investigation is made of the relationship of both age and sex with indices describing the maximum expiratory flow-volume (MEFV) curve. To determine the relationship, non-linear canonical correlation was used as realized in the computer program CANALS, a combination of ordinary canonical correlation analysis (CCA) and non-linear transformations of the variables. This method enhances the generality of the relationship to be found and has the advantage of showing the relative importance of categories or ranges within a variable with respect to that relationship. The above is exemplified by describing the relationship of age and sex with variables concerning respiratory symptoms and smoking habits. The analysis of age and sex with MEFV curve indices shows that non-linear canonical correlation analysis is an efficient tool in analysing size and shape of the MEFV curve and can be used to derive parameters concerning the whole curve.
