Improved Detection of Correlated Signals in Low-Rank-Plus-Noise Type Data Sets Using Informative Canonical Correlation Analysis (ICCA)

2017 ◽  
Vol 63 (6) ◽  
pp. 3451-3467 ◽  
Author(s):  
Nicholas Asendorf ◽  
Raj Rao Nadakuditi
2017 ◽  
Vol 29 (10) ◽  
pp. 2825-2859 ◽  
Author(s):  
Jia Cai ◽  
Hongwei Sun

Canonical correlation analysis (CCA) is a useful tool for detecting the latent relationship between two sets of variables. Theoretical analyses of CCA typically use a regularization technique to investigate its consistency. This letter addresses the consistency of CCA from a least squares view. We construct a constrained empirical risk minimization framework for CCA and apply a two-stage randomized Kaczmarz method to solve it. In the first stage, we remove the noise, and in the second stage, we compute the canonical weight vectors. We establish consistency rigorously and extend the statistical consistency result to the kernel version of the method. Moreover, experiments on both synthetic and real-world data sets demonstrate the effectiveness and efficiency of the proposed algorithms.
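For readers unfamiliar with the Kaczmarz iteration mentioned in this abstract, the following is a minimal sketch of the basic randomized Kaczmarz method for a consistent linear system A x = b. It is not the two-stage algorithm of the letter; the function name `randomized_kaczmarz`, the toy matrix, and the iteration count are assumptions chosen for illustration.

```python
import random


def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Approximately solve a consistent system A x = b by random row projections.

    Illustrative sketch only: A is a list of rows, b a list of right-hand sides.
    """
    rng = random.Random(seed)
    row_norms_sq = [sum(v * v for v in row) for row in A]
    x = [0.0] * len(A[0])
    indices = range(len(A))
    for _ in range(iters):
        # Sample a row with probability proportional to its squared norm.
        i = rng.choices(indices, weights=row_norms_sq, k=1)[0]
        residual = b[i] - sum(a * xj for a, xj in zip(A[i], x))
        step = residual / row_norms_sq[i]
        # Project x onto the hyperplane {z : <A[i], z> = b[i]}.
        x = [xj + step * a for a, xj in zip(A[i], x)]
    return x


# Toy consistent system (hypothetical data): b is generated from a known x_true.
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
x_true = [1.0, -2.0]
b = [sum(a * t for a, t in zip(row, x_true)) for row in A]
x_hat = randomized_kaczmarz(A, b)
```

Each step touches a single row of A, which is why the iteration is attractive for large least squares problems; on noisy, inconsistent systems the plain iteration only converges to a neighborhood of the least squares solution.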


2011 ◽  
Vol 18 (3) ◽  
pp. 399-436
Author(s):  
SAMI VIRPIOJA ◽  
MARI-SANNA PAUKKERI ◽  
ABHISHEK TRIPATHI ◽  
TIINA LINDH-KNUUTILA ◽  
KRISTA LAGUS

Abstract: Vector space models are used in language processing applications for calculating semantic similarities of words or documents. The vector spaces are generated with feature extraction methods for text data. However, evaluation of the feature extraction methods may be difficult. Indirect evaluation in an application is often time-consuming and the results may not generalize to other applications, whereas direct evaluations that measure the amount of captured semantic information usually require human evaluators or annotated data sets. We propose a novel direct evaluation method based on canonical correlation analysis (CCA), the classical method for finding linear relationships between two data sets. In our setting, the two sets are parallel text documents in two languages. A good feature extraction method should provide representations that reflect the semantic content of the documents. Assuming that the underlying semantic content is independent of the language, we can study which feature extraction methods capture the content best by measuring dependence between the representations of a document and its translation. In the case of CCA, the applied measure of dependence is correlation. The evaluation method is based on unsupervised learning, is language- and domain-independent, and requires no additional resources besides a parallel corpus. In this paper, we demonstrate the evaluation method on a sentence-aligned parallel corpus. The method is validated by showing that the results obtained with bag-of-words representations are intuitive and agree well with previous findings. Moreover, we compare the proposed evaluation method with indirect evaluation methods in simple sentence matching tasks and with a quantitative manual evaluation of word translations. The results of the proposed method correlate well with the results of the indirect and manual evaluations.
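The dependence measure described in this abstract can be illustrated with a minimal NumPy sketch of classical CCA. It computes the canonical correlations between two aligned feature matrices by whitening each view and taking the singular values of the whitened cross-covariance. The function name `canonical_correlations`, the small ridge term `reg`, and the synthetic data standing in for the two language views are assumptions, not the authors' pipeline.

```python
import numpy as np


def canonical_correlations(X, Y, reg=1e-8):
    """Canonical correlations between two views; rows are aligned samples."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariances; a tiny ridge (assumption) keeps them invertible.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten each view with the inverse Cholesky factor of its covariance.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    T = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T
    # Singular values of the whitened cross-covariance are the correlations.
    return np.linalg.svd(T, compute_uv=False)


# Synthetic stand-in for two "languages": one shared latent factor z per
# sentence, plus one view-specific noise feature in each view.
rng = np.random.default_rng(0)
z = rng.standard_normal((500, 1))
X = np.hstack([z + 0.1 * rng.standard_normal((500, 1)),
               rng.standard_normal((500, 1))])
Y = np.hstack([z + 0.1 * rng.standard_normal((500, 1)),
               rng.standard_normal((500, 1))])
rho = canonical_correlations(X, Y)
```

In this toy setup the first canonical correlation is high because both views share z, while the second stays low because the remaining features are independent noise. A representation that preserves more shared content across the two views would be expected to score higher under this measure.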


2019 ◽  
Vol 79 (45-46) ◽  
pp. 33771-33792 ◽  
Author(s):  
Qi Zhu ◽  
Nuoya Xu ◽  
Zheng Zhang ◽  
Donghai Guan ◽  
Ran Wang ◽  
...  

2013 ◽  
Vol 2013 ◽  
pp. 1-11 ◽  
Author(s):  
Xun Chen ◽  
Aiping Liu ◽  
Z. Jane Wang ◽  
Hu Peng

Corticomuscular activity modeling based on multiple data sets such as electroencephalography (EEG) and electromyography (EMG) signals provides a useful tool for understanding human motor control systems. In this paper, we propose modeling corticomuscular activity by combining partial least squares (PLS) and canonical correlation analysis (CCA). The proposed method takes advantage of both PLS and CCA to ensure that the extracted components are maximally correlated across the two data sets while also explaining well the information within each data set. This complementary combination generalizes the statistical assumptions beyond both PLS and CCA methods. Simulations were performed to illustrate the performance of the proposed method. We also applied the proposed method to concurrent EEG and EMG data collected in a Parkinson’s disease (PD) study. The results reveal several highly correlated temporal patterns between EEG and EMG signals and indicate meaningful corresponding spatial activation patterns. In PD subjects, enhanced connections between the occipital region and other regions are noted, which is consistent with previous medical knowledge. The proposed framework is a promising technique for performing multisubject and bimodal data analysis.


2020 ◽  
Vol 57 (1) ◽  
pp. 1-12
Author(s):  
Tomasz Górecki ◽  
Mirosław Krzyśko ◽  
Waldemar Wołyński

Summary: There is a growing need to analyze data sets characterized by several sets of variables observed on the same set of individuals. Such complex data structures are known as multiblock (or multiple-set) data sets. Multiblock data sets are encountered in diverse fields including bioinformatics, chemometrics, and food analysis. Generalized Canonical Correlation Analysis (GCCA) is a very powerful method for studying relationships of this kind between blocks. It can also be viewed as a method for the integration of information from K > 2 distinct sources (Takane and Oshima-Takane 2002). In this paper, GCCA is considered in the context of multivariate functional data. Such data are treated as realizations of multivariate random processes. GCCA is a technique that allows the joint analysis of several sets of data through dimensionality reduction. The central problem of GCCA is to construct a series of components aiming to maximize the association among the multiple variable sets. This method will be presented for multivariate functional data. Finally, a practical example will be discussed.
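One common way to construct such shared components for ordinary (non-functional) data matrices is the MAXVAR formulation of GCCA, sketched below. The shared components are the leading eigenvectors of the sum of the orthogonal projectors onto each centered block. This is an illustration under that assumption, not the functional-data procedure of the paper; the function name `gcca_maxvar`, the ridge term `reg`, and the synthetic blocks are hypothetical.

```python
import numpy as np


def gcca_maxvar(blocks, n_components=1, reg=1e-8):
    """MAXVAR-style GCCA sketch: shared components for K blocks with equal rows."""
    n = blocks[0].shape[0]
    M = np.zeros((n, n))
    for X in blocks:
        Xc = X - X.mean(axis=0)
        # (Ridge-stabilised) projector onto the column space of the block.
        gram = Xc.T @ Xc + reg * np.eye(Xc.shape[1])
        M += Xc @ np.linalg.solve(gram, Xc.T)
    # Leading eigenvectors of the summed projectors are the shared components.
    eigvals, eigvecs = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]


# Three synthetic blocks (K = 3) that share one latent factor z.
rng = np.random.default_rng(1)
z = rng.standard_normal((200, 1))
blocks = [
    np.hstack([z + 0.1 * rng.standard_normal((200, 1)),
               rng.standard_normal((200, 1))])
    for _ in range(3)
]
vals, G = gcca_maxvar(blocks, n_components=1)
```

Each eigenvalue lies between 0 and K, and a value close to K indicates a direction that every block can reproduce almost exactly. For functional data, one would first have to represent each process in a finite basis before applying a step of this kind.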


IEEE Access ◽  
2018 ◽  
Vol 6 ◽  
pp. 49967-49978 ◽  
Author(s):  
Xiao Jin ◽  
Yuting Su ◽  
Liang Zou ◽  
Yongwei Wang ◽  
Peiguang Jing ◽  
...  
