Integrative analysis of single cell genomics data by coupled nonnegative matrix factorizations

2018 ◽  
Author(s):  
Zhana Duren ◽  
Xi Chen ◽  
Mahdi Zamanighomi ◽  
Wanwen Zeng ◽  
Ansuman T Satpathy ◽  
...  

Abstract When different types of functional genomics data are generated on single cells from different samples of cells from the same heterogeneous population, the clustering of cells in the different samples should be coupled. We formulate this “coupled clustering” problem as an optimization problem, and propose the method of coupled nonnegative matrix factorizations (coupled NMF) for its solution. The method is illustrated by the integrative analysis of single cell RNA-seq and single cell ATAC-seq data.

Significance Statement: Biological samples are often heterogeneous mixtures of different types of cells. Suppose we have two single cell data sets, each providing information on a different cellular feature and generated on a different sample from this mixture. Then the clusterings of cells in the two samples should be coupled, as both reflect the underlying cell types in the same mixture. This “coupled clustering” problem is a new problem not covered by existing clustering methods. In this paper we develop an approach for its solution based on the coupling of two nonnegative matrix factorizations. The method should be useful for integrative single cell genomics analysis tasks such as the joint analysis of single cell RNA-seq and single cell ATAC-seq data.
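The idea of coupling two nonnegative matrix factorizations can be illustrated with a small sketch. It is not the authors' exact formulation: the objective, the coupling matrix `A` (a hypothetical nonnegative matrix linking the features of the two data sets), the weights `lam` and `mu`, and the multiplicative updates are all assumptions chosen for illustration. Each data matrix is factorized as `X ≈ W H`, and a term `-lam * trace(W1ᵀ A W2)` rewards feature loadings in the two factorizations that agree through `A`.

```python
import numpy as np

def coupled_nmf(X1, X2, A, k, lam=0.05, mu=0.1, n_iter=200, seed=0, eps=1e-9):
    """Illustrative coupled NMF (assumed objective, not the paper's):

        0.5*||X1 - W1 H1||^2 + 0.5*||X2 - W2 H2||^2
        - lam * trace(W1^T A W2) + 0.5*mu*(||W1||^2 + ||W2||^2)

    X1: (features1, cells1), X2: (features2, cells2), A: (features1, features2).
    All inputs must be nonnegative. Returns W1, H1, W2, H2 and the
    reconstruction loss recorded before each iteration.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.random((X1.shape[0], k))
    H1 = rng.random((k, X1.shape[1]))
    W2 = rng.random((X2.shape[0], k))
    H2 = rng.random((k, X2.shape[1]))
    losses = []
    for _ in range(n_iter):
        losses.append(np.linalg.norm(X1 - W1 @ H1) ** 2
                      + np.linalg.norm(X2 - W2 @ H2) ** 2)
        # Standard multiplicative updates for the cell loadings.
        H1 *= (W1.T @ X1) / (W1.T @ W1 @ H1 + eps)
        H2 *= (W2.T @ X2) / (W2.T @ W2 @ H2 + eps)
        # Feature loadings: the coupling term enters the numerator,
        # the ridge penalty enters the denominator.
        W1 *= (X1 @ H1.T + lam * (A @ W2)) / (W1 @ (H1 @ H1.T) + mu * W1 + eps)
        W2 *= (X2 @ H2.T + lam * (A.T @ W1)) / (W2 @ (H2 @ H2.T) + mu * W2 + eps)
    return W1, H1, W2, H2, losses

def cluster_labels(H):
    """Assign each cell (column of H) to the factor with the largest loading."""
    return H.argmax(axis=0)
```

Because both factorizations share the same factor index through the coupling term, cluster `j` in one sample is intended to correspond to cluster `j` in the other. In this sketch, keeping `lam` times the spectral norm of `A` no larger than `mu` keeps the assumed objective bounded below.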

2018 ◽  
Vol 115 (30) ◽  
pp. 7723-7728 ◽  
Author(s):  
Zhana Duren ◽  
Xi Chen ◽  
Mahdi Zamanighomi ◽  
Wanwen Zeng ◽  
Ansuman T. Satpathy ◽  
...  

When different types of functional genomics data are generated on single cells from different samples of cells from the same heterogeneous population, the clustering of cells in the different samples should be coupled. We formulate this “coupled clustering” problem as an optimization problem and propose the method of coupled nonnegative matrix factorizations (coupled NMF) for its solution. The method is illustrated by the integrative analysis of single-cell RNA-sequencing (RNA-seq) and single-cell ATAC-sequencing (ATAC-seq) data.


2020 ◽  
Vol 11 ◽  
Author(s):  
Noudjoud Attaf ◽  
Iñaki Cervera-Marzal ◽  
Chuang Dong ◽  
Laurine Gil ◽  
Amédée Renand ◽  
...  

2019 ◽  
Author(s):  
Yuanhua Huang ◽  
Davis J McCarthy ◽  
Oliver Stegle

Abstract The joint analysis of multiple samples using single-cell RNA-seq is a promising experimental design, offering increased throughput while allowing batch variation to be accounted for. To achieve multi-sample designs, genetic variants that segregate between the samples in the pool have been proposed as natural barcodes for cell demultiplexing. Existing demultiplexing strategies rely on access to complete genotype data from the pooled samples, which greatly limits the applicability of such methods, in particular when genetic variation is not the primary object of study. To address this, we present Vireo, a computationally efficient Bayesian model to demultiplex single-cell data from pooled experimental designs. Uniquely, our model can be applied in settings where only partial or no genotype information is available. Using simulations based on synthetic mixtures and results on real data, we demonstrate the robustness of our model and illustrate the utility of multi-sample experimental designs for common expression analyses.
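To make the notion of variants as natural barcodes concrete, here is a minimal sketch of the simpler setting the abstract contrasts with, where complete donor genotypes are known. It is not Vireo's model: the function name `assign_donors`, the binomial read-count likelihood, the uniform prior over donors, and the error floor `eps` are assumptions made for illustration.

```python
import numpy as np

def assign_donors(alt, depth, donor_af, eps=0.01):
    """Assign cells to donors from allele counts and known genotypes.

    alt, depth: (cells, variants) arrays of alternative-allele and total reads.
    donor_af:   (donors, variants) expected alternative-allele fraction per
                donor, e.g. 0, 0.5 or 1 for the three diploid genotypes.
    eps:        assumed floor/ceiling on allele fractions to absorb read errors.
    Returns the most probable donor per cell and the posterior matrix.
    """
    p = np.clip(donor_af, eps, 1 - eps)                 # (donors, variants)
    ref = depth - alt
    # Binomial log-likelihood per cell and donor; the binomial coefficient
    # is identical across donors and therefore omitted.
    loglik = alt @ np.log(p).T + ref @ np.log1p(-p).T   # (cells, donors)
    loglik = loglik - loglik.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(loglik)
    post /= post.sum(axis=1, keepdims=True)             # uniform donor prior
    return post.argmax(axis=1), post
```

Variants with zero depth in a cell contribute nothing to that cell's likelihood, so sparse coverage is handled without special cases. Handling partial or missing genotypes, as the abstract describes, would additionally require inferring `donor_af` from the data, which this sketch does not attempt.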


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Boying Gong ◽  
Yun Zhou ◽  
Elizabeth Purdom

Abstract A growing number of single-cell sequencing platforms enable joint profiling of multiple omics from the same cells. We present , a novel method that not only allows for analyzing the data from joint-modality platforms, but provides a coherent framework for the integration of multiple datasets measured on different modalities. We demonstrate its performance on multi-modality data of gene expression and chromatin accessibility and illustrate the integration abilities of by jointly analyzing this multi-modality data with single-cell RNA-seq and ATAC-seq datasets.


2020 ◽  
Author(s):  
Xiaolu Zhang ◽  
Nianlai Huang ◽  
Rongfu Huang ◽  
Liangming Wang ◽  
Qingfeng Ke ◽  
...  

Abstract Background: Single-cell RNA sequencing (scRNA-seq) has recently been adopted to explore the molecular programmes and lineage progression patterns underlying the pathogenesis of important diseases. In this study, scRNA-seq was used to identify potential markers for chondrocytes in osteoarthritis (OA) and to explore the function of different types of chondrocytes in OA. Methods: We aimed to identify chondrocyte biomarkers and differentiation patterns by scRNA-seq analysis. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used to identify the functions of candidate marker genes in chondrocytes. A protein–protein interaction (PPI) network was constructed to find the hub genes in each of the 3 chondrocyte types. We also used qRT-PCR to measure the expression levels of the candidate marker genes in the different chondrocyte types. Results: We characterized the single-cell expression profiles of 480 chondrocyte samples and identified hypertrophic chondrocytes (HTC), homeostatic chondrocytes (HomC) and fibrocartilage chondrocytes (FC). GO and KEGG analyses showed that the candidate marker genes have specific functions in each of these chondrocyte types in regulating the development of OA. We further revealed the differential expression of the top 10 marker genes in the 3 chondrocyte types. The marker genes of HTC and FC were mainly expressed in their respective cell subsets, whereas the marker genes of HomC were not clearly differentially expressed among the chondrocyte types. Finally, we predicted the key genes in each cell subset. CD44, JUN and FN1 were predicted to be tightly related to the proliferation and differentiation of chondrocytes in OA and could serve as biomarkers for estimating the development of OA. Conclusion: Our results provide new insights into the roles of different types of chondrocytes in OA. The chondrocyte biomarkers are also valuable for estimating OA progression.


2017 ◽  
Author(s):  
Jesse M. Zhang ◽  
Jue Fan ◽  
H. Christina Fan ◽  
David Rosenfeld ◽  
David N. Tse

Abstract Background: With the recent proliferation of single-cell RNA-Seq experiments, several methods have been developed for unsupervised analysis of the resulting datasets. These methods often rely on unintuitive hyperparameters and do not explicitly address the subjectivity associated with clustering. Results: In this work, we present DendroSplit, an interpretable framework for analyzing single-cell RNA-Seq datasets that addresses both the clustering interpretability and clustering subjectivity issues. DendroSplit offers a novel perspective on the single-cell RNA-Seq clustering problem motivated by the definition of “cell type,” allowing us to cluster using feature selection to uncover multiple levels of biologically meaningful populations in the data. We analyze several landmark single-cell datasets, demonstrating both the method’s efficacy and computational efficiency. Conclusion: DendroSplit offers a clustering framework that is comparable to existing methods in terms of accuracy and speed but is novel in its emphasis on interpretability. We provide the full DendroSplit software package at https://github.com/jessemzhang/dendrosplit.


2016 ◽  
Vol 6 (1) ◽  
Author(s):  
Ganlu Hu ◽  
Kevin Huang ◽  
Youjin Hu ◽  
Guizhen Du ◽  
Zhigang Xue ◽  
...  

Author(s):  
Mattia Forcato ◽  
Oriana Romano ◽  
Silvio Bicciato

Abstract Recent advances in single-cell technologies are providing exciting opportunities for dissecting tissue heterogeneity and investigating cell identity, fate and function. This is a pristine, exploding field that is flooding biologists with a new wave of data, each with its own specificities in terms of complexity and information content. The integrative analysis of genomic data, collected at different molecular layers from diverse cell populations, holds promise to address the full-scale complexity of biological systems. However, the combination of different single-cell genomic signals is computationally challenging, as these data are intrinsically heterogeneous for experimental, technical and biological reasons. Here, we describe the computational methods for the integrative analysis of single-cell genomic data, with a focus on the integration of single-cell RNA sequencing datasets and on the joint analysis of multimodal signals from individual cells.

