Stabilized Independent Component Analysis outperforms other methods in finding reproducible signals in tumoral transcriptomes

2018 ◽  
Author(s):  
Laura Cantini ◽  
Ulykbek Kairov ◽  
Aurélien de Reyniès ◽  
Emmanuel Barillot ◽  
François Radvanyi ◽  
...  

Abstract
Motivation: Matrix factorization methods are widely used to reduce the dimensionality of transcriptomic datasets to the action of a few hidden factors (metagenes). Applying such methods to similar independent datasets should yield reproducible inter-series outputs, though this has never been demonstrated.
Results: We systematically test state-of-the-art matrix factorization methods on several transcriptomic datasets of the same cancer type. Inspired by concepts of evolutionary bioinformatics, we design a new framework based on Reciprocally Best Hit (RBH) graphs in order to benchmark the methods' reproducibility. We show that a particular protocol of application of Independent Component Analysis (ICA), accompanied by a stabilisation procedure, leads to a significant increase in the inter-series output reproducibility. Moreover, we show that the signals detected through this method are systematically more interpretable than those of other state-of-the-art methods. We developed a user-friendly tool, BIODICA, for performing the Stabilized ICA-based RBH meta-analysis. We apply this methodology to the study of colorectal cancer (CRC), for which 14 independent publicly available transcriptomic datasets can be collected. The resulting RBH graph maps the landscape of interconnected factors that can be associated with biological processes or with technological artefacts. These factors can be used as clinical biomarkers or as robust and tumor-type specific transcriptomic signatures of tumoral cells or the tumoral microenvironment. Their intensities in different samples shed light on the mechanistic basis of CRC molecular subtyping.
Availability: The BIODICA tool is available from https://github.com/LabBandSB/[email protected] and [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.

2019 ◽  
Vol 35 (21) ◽  
pp. 4307-4313 ◽  
Author(s):  
Laura Cantini ◽  
Ulykbek Kairov ◽  
Aurélien de Reyniès ◽  
Emmanuel Barillot ◽  
François Radvanyi ◽  
...  

Abstract
Motivation: Matrix factorization (MF) methods are widely used to reduce the dimensionality of transcriptomic datasets to the action of a few hidden factors (metagenes). MF algorithms have never been compared based on the between-datasets reproducibility of their outputs in similar independent datasets. Lack of this knowledge might have a crucial impact when generalizing the predictions made in one study to others.
Results: We systematically test widely used MF methods on several transcriptomic datasets collected from the same cancer type (14 colorectal, 8 breast and 4 ovarian cancer transcriptomic datasets). Inspired by concepts of evolutionary bioinformatics, we design a novel framework based on Reciprocally Best Hit (RBH) graphs in order to benchmark the MF methods for their ability to produce generalizable components. We show that a particular protocol of application of independent component analysis (ICA), accompanied by a stabilization procedure, leads to a significant increase in the between-datasets reproducibility. Moreover, we show that the signals detected through this method are systematically more interpretable than those of other standard methods. We developed a user-friendly tool for performing the Stabilized ICA-based RBH meta-analysis. We apply this methodology to the study of colorectal cancer (CRC), for which 14 independent transcriptomic datasets can be collected. The resulting RBH graph maps the landscape of interconnected factors associated with biological processes or with technological artifacts. These factors can be used as clinical biomarkers or as robust and tumor-type specific transcriptomic signatures of tumoral cells or the tumoral microenvironment. Their intensities in different samples shed light on the mechanistic basis of CRC molecular subtyping.
Availability and implementation: The RBH construction tool is available from http://goo.gl/DzpwYp
Supplementary information: Supplementary data are available at Bioinformatics online.
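
The reciprocal-best-hit idea behind the RBH graph can be sketched for two datasets as follows. This is a minimal illustration and not the authors' tool: it assumes components are compared by absolute Pearson correlation over a shared gene list, and the function names `pearson` and `reciprocal_best_hits` are hypothetical. A component pair is kept only when each member is the other's best match.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def reciprocal_best_hits(comps_a, comps_b):
    """Return (i, j, score) for components that are each other's best match.

    comps_a and comps_b are lists of metagene vectors defined over the same
    ordered gene list. Similarity is |Pearson r| (an assumption of this
    sketch), so a component and its sign-flipped copy count as a match.
    """
    sim = [[abs(pearson(a, b)) for b in comps_b] for a in comps_a]
    # Best partner in B for every component of A, and vice versa.
    best_for_a = [max(range(len(comps_b)), key=lambda j: sim[i][j])
                  for i in range(len(comps_a))]
    best_for_b = [max(range(len(comps_a)), key=lambda i: sim[i][j])
                  for j in range(len(comps_b))]
    return [(i, j, sim[i][j])
            for i, j in enumerate(best_for_a) if best_for_b[j] == i]


# Toy metagenes over five genes (illustrative values only).
dataset_a = [[1, 2, 3, 4, 5], [5, 1, 4, 2, 3]]
dataset_b = [[-1, -2, -3, -4, -6], [5, 1, 4, 2, 2.5], [1, 3, 1, 3, 1]]
edges = reciprocal_best_hits(dataset_a, dataset_b)
print(edges)
```

Repeating this for every pair of datasets and collecting the resulting edges would give a graph over components; how edges are weighted or filtered is left open here.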


Sensors ◽  
2020 ◽  
Vol 20 (11) ◽  
pp. 3238
Author(s):  
Ruisheng Lei ◽  
Bingo Wing-Kuen Ling ◽  
Peihua Feng ◽  
Jinrong Chen

This paper proposes a framework that combines complementary ensemble empirical mode decomposition with independent component analysis and non-negative matrix factorization to estimate both the heart rate and the respiratory rate from the photoplethysmography (PPG) signal. Applying complementary ensemble empirical mode decomposition to the PPG signal yields a finite number of intrinsic mode functions. These intrinsic mode functions are then divided into two groups, which are further analyzed via independent component analysis and non-negative matrix factorization. A surrogate cardiac signal related to heart activity and a surrogate respiratory signal related to respiratory activity are reconstructed to estimate the heart rate and the respiratory rate, respectively. Finally, records acquired from the Medical Information Mart for Intensive Care database, downloaded from the PhysioNet Automated Teller Machine (ATM) data bank, are used to evaluate the proposed method. The results show that the proposed method outperforms both the digital filtering approach and conventional empirical mode decomposition based methods in reconstructing the surrogate cardiac and respiratory signals from the PPG signal, and that it achieves higher accuracy and reliability in estimating both the heart rate and the respiratory rate.
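
The final step, turning a reconstructed surrogate signal into a rate, can be illustrated with a small sketch. It does not implement the decomposition stages (complementary ensemble empirical mode decomposition, ICA, NMF), and it assumes the rate is read off as the dominant spectral peak within a physiological band, which the abstract does not state. The function name `dominant_rate_per_min`, the band limits and the synthetic signal are all hypothetical.

```python
import math


def dominant_rate_per_min(signal, fs, f_lo, f_hi, step=0.01):
    """Return the strongest frequency in [f_lo, f_hi] Hz, in cycles per minute.

    Evaluates the squared magnitude of a direct Fourier sum on a grid of
    candidate frequencies (spacing `step` Hz) and keeps the maximum.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]  # remove the DC offset
    best_freq, best_power = f_lo, -1.0
    k = 0
    while f_lo + k * step <= f_hi + 1e-12:
        f = f_lo + k * step
        w = 2.0 * math.pi * f / fs
        re = sum(v * math.cos(w * t) for t, v in enumerate(centered))
        im = sum(v * math.sin(w * t) for t, v in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = f, power
        k += 1
    return 60.0 * best_freq


# Synthetic stand-in for a PPG-like signal: 1.2 Hz "cardiac" component plus
# 0.25 Hz "respiratory" component, 20 s sampled at 25 Hz (assumed values).
fs = 25.0
samples = [math.sin(2 * math.pi * 1.2 * t / fs)
           + 0.5 * math.sin(2 * math.pi * 0.25 * t / fs)
           for t in range(500)]
heart_rate = dominant_rate_per_min(samples, fs, 0.8, 3.0)
resp_rate = dominant_rate_per_min(samples, fs, 0.1, 0.5)
print(heart_rate, resp_rate)
```

On real recordings the two components overlap with noise and motion artefacts, which is the situation the decomposition stages described above are meant to address.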


2019 ◽  
Vol 20 (18) ◽  
pp. 4414 ◽  
Author(s):  
Nicolas Sompairac ◽  
Petr V. Nazarov ◽  
Urszula Czerwinska ◽  
Laura Cantini ◽  
Anne Biton ◽  
...  

Independent component analysis (ICA) is a matrix factorization approach in which the signals captured by the individual matrix factors are optimized to be as mutually independent as possible. Initially suggested for solving blind source separation problems in various fields, ICA was shown to be successful in analyzing functional magnetic resonance imaging (fMRI) and other types of biomedical data. In the last twenty years, ICA has become part of the standard machine learning toolbox, together with other matrix factorization methods such as principal component analysis (PCA) and non-negative matrix factorization (NMF). Here, we review a number of recent works where ICA was shown to be a useful tool for unraveling the complexity of cancer biology from the analysis of different types of omics data, mainly collected for tumoral samples. These works highlight the use of ICA in dimensionality reduction, deconvolution, data pre-processing, meta-analysis and other tasks, applied to different data types (transcriptome, methylome, proteome, single-cell data). We particularly focus on the technical aspects of applying ICA in omics studies, such as using different protocols, determining the optimal number of components, assessing and improving the reproducibility of ICA results, and comparing ICA with other popular matrix factorization techniques. We discuss the emerging ICA applications to the integrative analysis of multi-level omics datasets and introduce a conceptual view of ICA as a tool for defining functional subsystems of a complex biological system and their interactions under various conditions. Our review is accompanied by a Jupyter notebook which illustrates the discussed concepts and provides a practical tool for applying ICA to the analysis of cancer omics datasets.
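
For orientation, the factorization that ICA performs can be written in a generic form. The symbols and the matrix orientation below are assumptions of this sketch, since the abstract fixes no notation: X is the data matrix, the k rows of S are the components optimized for mutual statistical independence, and A holds the mixing weights.

```latex
X \approx A S, \qquad
X \in \mathbb{R}^{m \times n},\quad
A \in \mathbb{R}^{m \times k},\quad
S \in \mathbb{R}^{k \times n}
```

The components are identifiable only up to their order, sign and scale, which is one reason matching components between runs or datasets requires a similarity-based comparison.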

