Tree-ensemble analysis tests for presence of multifurcations in single cell data

10.1101/200923 ◽

2017 ◽

Author(s):

Will Macnair ◽

Laura De Vargas Roditi ◽

Stefan Ganscha ◽

Manfred Claassen

Keyword(s):

Single Cell ◽

Statistical Significance ◽

Experimental Studies ◽

Cell Maturation ◽

Biological Processes ◽

B Cell Differentiation ◽

Branch Points ◽

Multi Level ◽

T Cell Maturation ◽

Cell Data

Abstract: We introduce TreeTop, an algorithm for single-cell data analysis to identify and assess statistical significance of branch points in biological processes with possibly multi-level branching hierarchies. We demonstrate branch point identification for processes with varying topologies, including T cell maturation, B cell differentiation and hematopoiesis. Our analyses are consistent with recent experimental studies suggesting a shallow hierarchy of differentiation events in hematopoiesis, rather than the classical multi-level hierarchy.

Poincaré Maps for Analyzing Complex Hierarchies in Single-Cell Data

10.1101/689547 ◽

2019 ◽

Cited By ~ 2

Author(s):

Anna Klimovskaia ◽

David Lopez-Paz ◽

Léon Bottou ◽

Maximilian Nickel

Keyword(s):

Data Analysis ◽

Single Cell ◽

Hyperbolic Geometry ◽

Continuous Extension ◽

Two Dimensions ◽

Biological Processes ◽

Poincaré Maps ◽

Cell Trajectories ◽

Cell Data

Abstract: The need to understand cell developmental processes spawned a plethora of computational methods for discovering hierarchies from scRNAseq data. However, existing techniques are based on Euclidean geometry, a suboptimal choice for modeling complex cell trajectories with multiple branches. To overcome this fundamental representation issue we propose Poincaré maps, a method that harnesses the power of hyperbolic geometry for single-cell data analysis. Often understood as a continuous extension of trees, hyperbolic geometry enables the embedding of complex hierarchical data in only two dimensions while preserving the pairwise distances between points in the hierarchy. This enables direct exploratory analysis and the use of our embeddings in a wide variety of downstream data analysis tasks, such as visualization, clustering, lineage detection and pseudo-time inference. When compared to existing methods, which are unable to address all these important tasks using a single embedding, Poincaré maps produce state-of-the-art two-dimensional representations of cell trajectories on multiple scRNAseq datasets. More specifically, we demonstrate that Poincaré maps make it straightforward to formulate new hypotheses about biological processes unbeknown to prior methods.

Significance statement: The discovery of hierarchies in biological processes is central to developmental biology. We propose Poincaré maps, a new method based on hyperbolic geometry to discover continuous hierarchies from pairwise similarities. We demonstrate the efficacy of our method on multiple single-cell datasets on tasks such as visualization, clustering, lineage identification, and pseudo-time inference.
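As a rough illustration of why hyperbolic geometry suits tree-like data, the sketch below computes the standard distance in the Poincaré disk model. The abstract does not state which model of hyperbolic space is used; the disk model is assumed here from the method's name, and `poincare_distance` and the sample points are hypothetical names chosen for this example, not part of the authors' software.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit disk."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit disk")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


# Illustrative points only: a root-like point at the centre and two points
# placed progressively closer to the boundary of the disk.
origin = (0.0, 0.0)
near = (0.5, 0.0)
far = (0.9, 0.0)

d_near = poincare_distance(origin, near)
d_far = poincare_distance(origin, far)
```

Distances grow rapidly towards the boundary of the disk, so a two-dimensional embedding has room for many well-separated branches far from the centre, which is the property the abstract relies on.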

scCODA is a Bayesian model for compositional single-cell data analysis

Nature Communications ◽

10.1038/s41467-021-27150-6 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

M. Büttner ◽

J. Ostner ◽

C. L. Müller ◽

F. J. Theis ◽

B. Schubert

Keyword(s):

Data Analysis ◽

Single Cell ◽

Bayesian Model ◽

Cell Types ◽

Biological Processes ◽

Complex Cell ◽

Cell Type ◽

Compositional Changes ◽

False Discoveries ◽

Cell Data

Abstract: Compositional changes of cell types are main drivers of biological processes. Their detection through single-cell experiments is difficult due to the compositionality of the data and low sample sizes. We introduce scCODA (https://github.com/theislab/scCODA), a Bayesian model addressing these issues enabling the study of complex cell type effects in disease, and other stimuli. scCODA demonstrated excellent detection performance, while reliably controlling for false discoveries, and identified experimentally verified cell type changes that were missed in original analyses.
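The compositionality problem mentioned in the abstract can be illustrated with a few lines of arithmetic. The sketch below is not scCODA's model or API; the cell-type names and counts are invented for illustration, and `proportions` is a hypothetical helper.

```python
def proportions(counts):
    """Convert absolute cell counts per type into fractions that sum to 1."""
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


# Hypothetical counts: only the T cell population changes between conditions.
control = {"T cell": 500, "B cell": 300, "Monocyte": 200}
treated = {"T cell": 1500, "B cell": 300, "Monocyte": 200}

p_control = proportions(control)
p_treated = proportions(treated)

for cell_type in control:
    print(f"{cell_type}: {p_control[cell_type]:.2f} -> {p_treated[cell_type]:.2f}")
```

Although the B cell and monocyte counts are unchanged, their proportions fall because the fractions must sum to one. Testing each cell type's proportion independently would therefore risk flagging changes that are only a side effect of the T cell expansion.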

Mechanistic hierarchical population model identifies latent causes of cell-to-cell variability

10.1101/171561 ◽

2017 ◽

Cited By ~ 1

Author(s):

Carolin Loos ◽

Katharina Moeller ◽

Fabian Fröhlich ◽

Tim Hucho ◽

Jan Hasenauer

Keyword(s):

Single Cell ◽

Simulated Data ◽

Biological Processes ◽

Modeling Framework ◽

Functional Implications ◽

Multiple Levels ◽

Underlying Mechanisms ◽

Data Driven Modeling ◽

Cell To Cell Variability ◽

Cell Data

All biological systems exhibit cell-to-cell variability, and this variability often has functional implications. To gain a thorough understanding of biological processes, the latent causes and underlying mechanisms of this variability must be elucidated. Cell populations comprising multiple distinct subpopulations are commonplace in biology, yet no current methods allow the sources of variability between and within individual subpopulations to be identified. This limits the analysis of single-cell data, for example provided by flow cytometry and microscopy. In this study, we present a data-driven modeling framework for the analysis of populations comprising heterogeneous subpopulations. Our approach combines mixture modeling with frameworks for distribution approximation, facilitating the integration of multiple single-cell datasets and the detection of causal differences between and within subpopulations. The computational efficiency of our framework allows hundreds of competing hypotheses to be compared, giving a study unprecedented depth. We demonstrated the ability of our method to capture multiple levels of heterogeneity in the analyses of simulated data and data from highly heterogeneous sensory neurons involved in pain initiation. Our approach identified the sources of cell-to-cell variability and revealed mechanisms that underlie the modulation of nerve growth factor-induced Erk1/2 signaling by extracellular scaffolds.

Laplacian eigenmaps and principal curves for high resolution pseudotemporal ordering of single-cell RNA-seq profiles

10.1101/027219 ◽

2015 ◽

Cited By ~ 18

Author(s):

Kieran Campbell ◽

Chris P Ponting ◽

Caleb Webber

Keyword(s):

Gene Expression ◽

Single Cell ◽

Biological Processes ◽

Rna Seq ◽

Principal Curves ◽

Cell Level ◽

Laplacian Eigenmaps ◽

Low Dimensional ◽

Cell Data ◽

Insight Into

Advances in RNA-seq technologies provide unprecedented insight into the variability and heterogeneity of gene expression at the single-cell level. However, such data offer only a snapshot of the transcriptome, whereas it is often the progression of cells through dynamic biological processes that is of interest. As a result, one outstanding challenge is to infer such progressions by ordering gene expression from single-cell data alone, known as the cell ordering problem. Here, we introduce a new method that constructs a low-dimensional non-linear embedding of the data using Laplacian eigenmaps before assigning each cell a pseudotime using principal curves. We explain on a theoretical level why our method is more robust to the high levels of noise typical of single-cell RNA-seq data, and then demonstrate its utility on two existing datasets of differentiating cells.
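The embedding step described above can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the authors' implementation: the k-nearest-neighbour graph, the parameter defaults, the synthetic data, and the function name `laplacian_eigenmap` are all choices made for this example. The principal-curve step is not reproduced; as a simpler stand-in, cells are ordered along the first non-trivial embedding coordinate.

```python
import numpy as np


def laplacian_eigenmap(X, n_neighbors=10, n_components=2):
    """Embed the rows of X using eigenvectors of a kNN-graph Laplacian."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between cells.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)  # a cell is not its own neighbour
    # Symmetric, unweighted k-nearest-neighbour adjacency matrix.
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)
    # Symmetric normalised Laplacian: I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # Skip the trivial first eigenvector and rescale by D^{-1/2}.
    return eigvecs[:, 1:n_components + 1] * d_inv_sqrt[:, None]


# Synthetic "cells" along a one-dimensional trajectory with a little noise.
rng = np.random.default_rng(0)
true_time = np.linspace(0.0, 1.0, 200)
X = np.column_stack([true_time, true_time ** 2, np.sin(3.0 * true_time)])
X = X + rng.normal(scale=0.005, size=X.shape)

embedding = laplacian_eigenmap(X)
pseudotime = embedding[:, 0]  # stand-in for the principal-curve projection
correlation = np.corrcoef(pseudotime, true_time)[0, 1]
```

The sign of an eigenvector is arbitrary, so the recovered ordering may run in either direction; only the absolute value of the correlation with the true ordering is meaningful here.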

scCODA: A Bayesian model for compositional single-cell data analysis

10.1101/2020.12.14.422688 ◽

2020 ◽

Author(s):

M. Büttner ◽

J. Ostner ◽

C. L. Müller ◽

F. J. Theis ◽

B. Schubert

Keyword(s):

Data Analysis ◽

Single Cell ◽

Bayesian Model ◽

Cell Types ◽

Detection Performance ◽

Biological Processes ◽

Complex Cell ◽

Cell Type ◽

Compositional Changes ◽

Cell Data

Abstract: Compositional changes of cell types are main drivers of biological processes. Their detection through single-cell experiments is difficult due to the compositionality of the data and low sample sizes. We introduce scCODA (https://github.com/theislab/scCODA), a Bayesian model addressing these issues enabling the study of complex cell type effects in disease, and other stimuli. scCODA demonstrated excellent detection performance and identified experimentally verified cell type changes that were missed in original analyses.

Faculty Opinions recommendation of CD8 binding to MHC class I molecules is influenced by T cell maturation and glycosylation.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1007744.97407 ◽

2002 ◽

Author(s):

Nick Gascoigne

Keyword(s):

T Cell ◽

Mhc Class I ◽

Class I ◽

Cell Maturation ◽

T Cell Maturation ◽

Mhc Class I Molecules

Faculty Opinions recommendation of Systems biology. Conditional density-based analysis of T cell signaling in single-cell data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.723891088.793520867 ◽

2016 ◽

Author(s):

Anuj Kumar

Keyword(s):

Systems Biology ◽

T Cell ◽

Cell Signaling ◽

Single Cell ◽

Conditional Density ◽

T Cell Signaling ◽

Cell Data

Prioritization of cell types responsive to biological perturbations in single-cell data with Augur

Nature Protocols ◽

10.1038/s41596-021-00561-x ◽

2021 ◽

Author(s):

Jordan W. Squair ◽

Michael A. Skinnider ◽

Matthieu Gautier ◽

Leonard J. Foster ◽

Grégoire Courtine

Keyword(s):

Single Cell ◽

Cell Types ◽

Cell Data

Identifying cell types from single-cell data based on similarities and dissimilarities between cells

BMC Bioinformatics ◽

10.1186/s12859-020-03873-z ◽

2021 ◽

Vol 22 (S3) ◽

Author(s):

Yuanyuan Li ◽

Ping Luo ◽

Yi Lu ◽

Fang-Xiang Wu

Keyword(s):

Gene Expression ◽

Single Cell ◽

Spectral Clustering ◽

Incidence Matrix ◽

Expression Patterns ◽

Cell Types ◽

Clustering Method ◽

Different Types ◽

Cell Data ◽

Spectral Clustering Method

Abstract

Background: With the development of single-cell sequencing technology, revealing homogeneity and heterogeneity between cells has become a new area of computational systems biology research. However, the clustering of cell types becomes more complex with the mutual penetration between different types of cells and the instability of gene expression. One way of overcoming this problem is to group similar, related single cells together by means of various clustering analysis methods. Although some methods such as spectral clustering can do well in the identification of cell types, they only consider the similarities between cells and ignore the influence of dissimilarities on clustering results. This may limit the performance of most conventional clustering algorithms in identifying clusters, so special methods need to be developed for high-dimensional sparse categorical data.

Results: Inspired by the phenomenon that cells of the same type have similar gene expression patterns, whereas cells of different types show dissimilar gene expression patterns, we improve the existing spectral clustering method so that it clusters single-cell data based on both similarities and dissimilarities between cells. The method first measures the similarity/dissimilarity among cells, then constructs the incidence matrix by fusing the similarity matrix with the dissimilarity matrix, and, finally, uses the eigenvalues of the incidence matrix to perform dimensionality reduction and employs the K-means algorithm in the low-dimensional space to achieve clustering. The proposed improved spectral clustering method is compared with the conventional spectral clustering method in recognizing cell types on several real single-cell RNA-seq datasets.

Conclusions: In summary, we show that adding intercellular dissimilarity can effectively improve accuracy and achieve robustness, and that the improved spectral clustering method outperforms the traditional spectral clustering method in grouping cells.
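The three steps named in the Results (measure similarity and dissimilarity, fuse them into one matrix, then reduce dimension and run K-means) can be sketched as follows. The abstract does not give the similarity measure, the dissimilarity measure, or the fusion rule, so everything specific here is an assumption made for illustration: a Gaussian kernel for similarity, scaled Euclidean distance for dissimilarity, the fusion `similarity - alpha * dissimilarity`, the use of the leading eigenvectors of the fused matrix, a simple farthest-point K-means initialisation, and the name `fused_spectral_clustering`.

```python
import numpy as np


def fused_spectral_clustering(X, n_clusters, alpha=1.0, n_iter=50):
    """Cluster the rows of X using a fused similarity/dissimilarity matrix."""
    # Pairwise Euclidean distances between cells.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Assumed similarity: Gaussian kernel with a median-distance bandwidth.
    sigma = np.median(dist[dist > 0])
    similarity = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    # Assumed dissimilarity: distances rescaled to the range [0, 1].
    dissimilarity = dist / dist.max()
    # Assumed fusion rule (hypothetical; not taken from the paper).
    fused = similarity - alpha * dissimilarity
    # Dimensionality reduction: eigenvectors of the largest eigenvalues.
    _, eigvecs = np.linalg.eigh(fused)
    embedding = eigvecs[:, -n_clusters:]
    # K-means with a deterministic farthest-point initialisation.
    centers = [embedding[0]]
    for _ in range(1, n_clusters):
        gaps = np.min([((embedding - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(embedding[np.argmax(gaps)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq_dist = ((embedding[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dist.argmin(axis=1)
        updated = np.array([
            embedding[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_clusters)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels


# Two synthetic, well-separated groups of "cells" with five features each.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(30, 5))
group_b = rng.normal(loc=5.0, scale=0.5, size=(30, 5))
labels = fused_spectral_clustering(np.vstack([group_a, group_b]), n_clusters=2)
```

With `alpha = 0` the sketch reduces to clustering on the similarity matrix alone, which gives a simple way to compare the two variants on the same data.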

Integrated analysis of multimodal single-cell data

Cell ◽

10.1016/j.cell.2021.04.048 ◽

2021 ◽

Author(s):

Yuhan Hao ◽

Stephanie Hao ◽

Erica Andersen-Nissen ◽

William M. Mauck ◽

Shiwei Zheng ◽

...

Keyword(s):

Single Cell ◽

Integrated Analysis ◽

Cell Data
