Unsupervised topological alignment for single-cell multi-omics integration

Kai Cao; Xiangqi Bai; Yiguang Hong; Lin Wan

doi:10.1093/bioinformatics/btaa443

Bioinformatics ◽

10.1093/bioinformatics/btaa443 ◽

2020 ◽

Vol 36 (Supplement_1) ◽

pp. i48-i56 ◽

Cited By ~ 2

Author(s):

Kai Cao ◽

Xiangqi Bai ◽

Yiguang Hong ◽

Lin Wan

Keyword(s):

Single Cell ◽

Distance Matrix ◽

Cell Types ◽

Optimization Method ◽

Supplementary Information ◽

Dimensional Structure ◽

Specific Cell ◽

Omics Integration ◽

Distance Matrices ◽

Low Dimensional

Abstract

Motivation: Single-cell multi-omics data provide a comprehensive molecular view of cells. However, single-cell multi-omics datasets consist of unpaired cells measured with distinct unmatched features across modalities, making data integration challenging.

Results: In this study, we present a novel algorithm, termed UnionCom, for the unsupervised topological alignment of single-cell multi-omics integration. UnionCom does not require any correspondence information, either among cells or among features. It first embeds the intrinsic low-dimensional structure of each single-cell dataset into a distance matrix of cells within the same dataset and then aligns the cells across single-cell multi-omics datasets by matching the distance matrices via a matrix optimization method. Finally, it projects the distinct unmatched features across single-cell datasets into a common embedding space for feature comparability of the aligned cells. To match the complex non-linear, geometrically distorted low-dimensional structures across datasets, UnionCom introduces and adjusts a global scaling parameter on distance matrices for aligning similar topological structures. It does not require one-to-one correspondence among cells across datasets, and it can accommodate samples with dataset-specific cell types. UnionCom outperforms state-of-the-art methods on both simulated and real single-cell multi-omics datasets. UnionCom is robust to parameter choices, as well as subsampling of features.

Availability and implementation: UnionCom software is available at https://github.com/caokai1073/UnionCom.

Supplementary information: Supplementary data are available at Bioinformatics online.
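The idea of matching two within-dataset distance matrices up to a global scale can be illustrated with a small sketch. This is not the UnionCom implementation: the objective form ||D1 - alpha * F D2 F^T||_F^2, the function names, and the use of plain Euclidean distances (rather than distances that capture intrinsic low-dimensional structure) are assumptions made here for illustration. The sketch only evaluates a given cell-matching matrix F and solves for the best scale in closed form; it does not perform the matrix optimization over F.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix between the rows (cells) of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def best_scale_and_cost(D1, D2, F):
    """For a fixed matching F, return the least-squares scale alpha and the
    residual ||D1 - alpha * F D2 F^T||_F^2 (assumed objective, see lead-in)."""
    M = F @ D2 @ F.T
    denom = np.sum(M * M)
    alpha = np.sum(D1 * M) / denom if denom > 0 else 0.0
    cost = np.sum((D1 - alpha * M) ** 2)
    return alpha, cost

rng = np.random.default_rng(0)
n = 6
X1 = rng.normal(size=(n, 3))                 # dataset 1: 6 cells, 3 features

# Dataset 2 (hypothetical): the same cells in a shuffled order, measured in
# 5 unrelated features, with all distances stretched by a factor of 2.5.
perm = np.array([2, 0, 3, 5, 1, 4])
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # orthogonal matrix
X2 = 2.5 * X1[perm] @ Q[:3, :]                # distance-preserving map, then scaling

D1 = pairwise_distances(X1)
D2 = pairwise_distances(X2)

# Ground-truth matching: cell i of dataset 2 is cell perm[i] of dataset 1.
F_true = np.zeros((n, n))
F_true[perm, np.arange(n)] = 1.0
F_identity = np.eye(n)                        # a wrong matching, for comparison

alpha_true, cost_true = best_scale_and_cost(D1, D2, F_true)
alpha_id, cost_id = best_scale_and_cost(D1, D2, F_identity)
print(f"true matching:  alpha={alpha_true:.3f}, cost={cost_true:.3e}")
print(f"wrong matching: alpha={alpha_id:.3f}, cost={cost_id:.3e}")
```

With the correct matching the recovered scale is the reciprocal of the stretch factor (1 / 2.5) and the residual is numerically zero, even though the two datasets share no features. A wrong matching leaves a nonzero residual, which is what an optimizer over F would try to reduce.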

Unsupervised Topological Alignment for Single-Cell Multi-Omics Integration

10.1101/2020.02.02.931394 ◽

2020 ◽

Cited By ~ 6

Author(s):

Kai Cao ◽

Xiangqi Bai ◽

Yiguang Hong ◽

Lin Wan

Keyword(s):

Single Cell ◽

Distance Matrix ◽

Cell Types ◽

Optimization Method ◽

Dimensional Structure ◽

Specific Cell ◽

Matrix Optimization ◽

Omics Integration ◽

Distance Matrices ◽

Low Dimensional

Abstract Single-cell multi-omics data provide a comprehensive molecular view of cells. However, single-cell multi-omics datasets consist of unpaired cells measured with distinct unmatched features across modalities, making data integration challenging. In this study, we present a novel algorithm, termed UnionCom, for the unsupervised topological alignment of single-cell multi-omics integration. UnionCom does not require any correspondence information, either among cells or among features. It first embeds the intrinsic low-dimensional structure of each single-cell dataset into a distance matrix of cells within the same dataset and then aligns the cells across single-cell multi-omics datasets by matching the distance matrices via a matrix optimization method. Finally, it projects the distinct unmatched features across single-cell datasets into a common embedding space for feature comparability of the aligned cells. To match the complex nonlinear, geometrically distorted low-dimensional structures across datasets, UnionCom introduces and adjusts a global scaling parameter on distance matrices for aligning similar topological structures. It does not require one-to-one correspondence among cells across datasets, and it can accommodate samples with dataset-specific cell types. UnionCom outperforms state-of-the-art methods on both simulated and real single-cell multi-omics datasets. UnionCom is robust to parameter choices, as well as subsampling of features. UnionCom software is available at https://github.com/caokai1073/UnionCom.

Discovering a sparse set of pairwise discriminating features in high-dimensional data

Bioinformatics ◽

10.1093/bioinformatics/btaa690 ◽

2020 ◽

Author(s):

Samuel Melton ◽

Sharad Ramanathan

Keyword(s):

Single Cell ◽

Dimensional Space ◽

Cell Types ◽

Dimensional Subspace ◽

Supplementary Information ◽

High Dimensional ◽

Technological Advances ◽

Data Points ◽

Low Dimensional ◽

Sparse Set

Abstract

Motivation: Recent technological advances produce a wealth of high-dimensional descriptions of biological processes, yet extracting meaningful insight and mechanistic understanding from these data remains challenging. For example, in developmental biology, the dynamics of differentiation can now be mapped quantitatively using single-cell RNA sequencing, yet it is difficult to infer molecular regulators of developmental transitions. Here, we show that discovering informative features in the data is crucial for statistical analysis as well as making experimental predictions.

Results: We identify features based on their ability to discriminate between clusters of the data points. We define a class of problems in which linear separability of clusters is hidden in a low-dimensional space. We propose an unsupervised method to identify the subset of features that define a low-dimensional subspace in which clustering can be conducted. This is achieved by averaging over discriminators trained on an ensemble of proposed cluster configurations. We then apply our method to single-cell RNA-seq data from mouse gastrulation, and identify 27 key transcription factors (out of 409 total), 18 of which are known to define cell states through their expression levels. In this inferred subspace, we find clear signatures of known cell types that eluded classification prior to discovery of the correct low-dimensional subspace.

Availability and implementation: https://github.com/smelton/SMD.

Supplementary information: Supplementary data are available at Bioinformatics online.

Adult tissue-resident stem cells—fact or fiction?

Stem Cell Research & Therapy ◽

10.1186/s13287-021-02142-x ◽

2021 ◽

Vol 12 (1) ◽

Cited By ~ 1

Author(s):

Deepa Bhartiya

Keyword(s):

Stem Cells ◽

Single Cell ◽

Adult Stem Cells ◽

Cell Types ◽

Specific Cell ◽

Tissue Specific ◽

Cardiac Tissues ◽

Adult Tissues ◽

Resident Stem Cells ◽

Abundant Cytoplasm

Abstract Life-long tissue homeostasis of adult tissues is supposedly maintained by resident stem cells. These stem cells are quiescent in nature and rarely divide to self-renew and give rise to tissue-specific “progenitors” (lineage-restricted and tissue-committed), which divide rapidly and differentiate into tissue-specific cell types. However, it has proved difficult to isolate these quiescent stem cells as a physical entity. Recent single-cell RNAseq studies on several adult tissues, including ovary, prostate, and cardiac tissues, have not been able to detect stem cells. Thus, it has been postulated that adult cells dedifferentiate to a stem-like state to ensure regeneration and can be defined as cells capable of replacing lost cells through mitosis. This idea challenges the basic paradigm of developmental biology regarding plasticity, namely that a cell reaches a point of no return once it initiates differentiation. The underlying reason for this dilemma is that stem cells and somatic cells are being processed together in various studies. Stem cells and mature adult cell types are distinct entities: stem cells are quiescent, small in size, and have minimal organelles, whereas mature cells are metabolically active and have multiple organelles lying in abundant cytoplasm. As a result, they do not pellet down together when centrifuged at 100–350g. At this speed, mature cells are collected, but stem cells remain buoyant and can be pelleted by centrifuging at 1000g. Thus, the inability to detect stem cells in recently published single-cell RNAseq studies arises because the stem cells were unknowingly discarded during processing and were never subjected to RNAseq. This needs to be kept in mind before proposing to redefine adult stem cells.

A practical solution to pseudoreplication bias in single-cell studies

Nature Communications ◽

10.1038/s41467-021-21038-1 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Kip D. Zimmerman ◽

Mark A. Espeland ◽

Carl D. Langefeld

Keyword(s):

Single Cell ◽

Mixed Models ◽

Random Effect ◽

Cell Types ◽

Error Rates ◽

Specific Cell ◽

Experimental Conditions ◽

Type 1 Error ◽

Sample Correlation ◽

Inflated Type

Abstract Cells from the same individual share common genetic and environmental backgrounds and are not statistically independent; therefore, they are subsamples or pseudoreplicates. Thus, single-cell data have a hierarchical structure that many current single-cell methods do not address, leading to biased inference, highly inflated type 1 error rates, and reduced robustness and reproducibility. This includes methods that use a batch effect correction for individual as a means of accounting for within-sample correlation. Here, we document this dependence across a range of cell types and show that pseudo-bulk aggregation methods are conservative and underpowered relative to mixed models. To compute differential expression within a specific cell type across treatment groups, we propose applying generalized linear mixed models with a random effect for individual, to properly account for both zero inflation and the correlation structure among measures from cells within an individual. Finally, we provide power estimates across a range of experimental conditions to assist researchers in designing appropriately powered studies.
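A short worked calculation can make the cost of ignoring within-individual correlation concrete. The sketch below uses the standard design-effect formula for equally sized clusters, deff = 1 + (m - 1) * ICC; this formula, the function names, and the example numbers are not taken from the abstract and are used here only as an illustrative assumption about how correlated cells reduce the effective sample size.

```python
def design_effect(cells_per_individual, icc):
    """Variance inflation from treating correlated cells as independent,
    assuming equal numbers of cells per individual."""
    return 1.0 + (cells_per_individual - 1) * icc

def effective_sample_size(n_individuals, cells_per_individual, icc):
    """Number of independent observations the clustered data are worth."""
    total_cells = n_individuals * cells_per_individual
    return total_cells / design_effect(cells_per_individual, icc)

# Hypothetical study: 10 individuals, 500 cells each, modest correlation.
deff = design_effect(500, 0.05)
n_eff = effective_sample_size(10, 500, 0.05)
print(f"design effect: {deff:.2f}")
print(f"effective sample size: {n_eff:.1f} of 5000 cells")
```

Under these assumed numbers, 5000 cells carry roughly the information of a couple of hundred independent observations, so a test that treats every cell as independent overstates its precision. With an intraclass correlation of 0 the effective sample size equals the cell count, and with a correlation of 1 it collapses to the number of individuals.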

Single-cell analysis of human adipose tissue identifies depot- and disease-specific cell types

Nature Metabolism ◽

10.1038/s42255-019-0152-6 ◽

2019 ◽

Vol 2 (1) ◽

pp. 97-109 ◽

Cited By ~ 17

Author(s):

Jinchu Vijay ◽

Marie-Frédérique Gauthier ◽

Rebecca L. Biswell ◽

Daniel A. Louiselle ◽

Jeffrey J. Johnston ◽

...

Keyword(s):

Adipose Tissue ◽

Single Cell ◽

Single Cell Analysis ◽

Human Adipose Tissue ◽

Cell Types ◽

Specific Cell ◽

Cell Analysis ◽

Disease Specific

JIND: Joint Integration and Discrimination for Automated Single-Cell Annotation

10.1101/2020.10.06.327601 ◽

2020 ◽

Author(s):

Mohit Goyal ◽

Guillermo Serrano ◽

Ilan Shomorony ◽

Mikel Hernaez ◽

Idoia Ochoa

Keyword(s):

Single Cell ◽

Cell Types ◽

Marker Genes ◽

Specific Marker ◽

RNA Seq ◽

Batch Effects ◽

Cell Type ◽

Latent Space ◽

Cell Type Specific ◽

Low Dimensional

Abstract Single-cell RNA-seq is a powerful tool in the study of the cellular composition of different tissues and organisms. A key step in the analysis pipeline is the annotation of cell-types based on the expression of specific marker genes. Since manual annotation is labor-intensive and does not scale to large datasets, several methods for automated cell-type annotation have been proposed based on supervised learning. However, these methods generally require feature extraction and batch alignment prior to classification, and their performance may become unreliable in the presence of cell-types with very similar transcriptomic profiles, such as differentiating cells. We propose JIND, a framework for automated cell-type identification based on neural networks that directly learns a low-dimensional representation (latent code) in which cell-types can be reliably determined. To account for batch effects, JIND performs a novel asymmetric alignment in which the transcriptomic profile of unseen cells is mapped onto the previously learned latent space, hence avoiding the need to retrain the model whenever a new dataset becomes available. JIND also learns cell-type-specific confidence thresholds to identify and reject cells that cannot be reliably classified. We show on datasets with and without batch effects that JIND classifies cells more accurately than previously proposed methods while rejecting only a small proportion of cells. Moreover, JIND batch alignment is parallelizable, being more than five or six times faster than Seurat integration. Availability: https://github.com/mohit1997/JIND.
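Classification with cell-type-specific confidence thresholds can be sketched in a few lines. This is not JIND's code: the abstract does not say how the thresholds are learned, so the threshold values, the function name `predict_with_rejection`, and the example probabilities below are hypothetical. The sketch only shows the rejection rule itself, where a cell keeps its predicted type if the predicted probability clears the threshold for that type.

```python
import numpy as np

def predict_with_rejection(probs, thresholds, reject_label=-1):
    """Assign each cell its most probable type, or reject_label when the
    winning probability is below the threshold for that type."""
    probs = np.asarray(probs, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    pred = probs.argmax(axis=1)
    confidence = probs[np.arange(len(probs)), pred]
    return np.where(confidence >= thresholds[pred], pred, reject_label)

# Hypothetical classifier outputs for 4 cells over 3 cell types.
probs = np.array([
    [0.90, 0.05, 0.05],   # confident type 0
    [0.40, 0.35, 0.25],   # ambiguous, best guess type 0
    [0.10, 0.15, 0.75],   # fairly confident type 2
    [0.20, 0.65, 0.15],   # fairly confident type 1
])
# Hypothetical per-type thresholds: types 0 and 2 are held to a stricter bar.
thresholds = np.array([0.8, 0.6, 0.8])

labels = predict_with_rejection(probs, thresholds)
print(labels)  # [ 0 -1 -1  1]
```

Using a separate threshold per type lets easily confused types demand higher confidence than well-separated ones: here the third cell is rejected at 0.75 because type 2 requires 0.8, while the fourth is accepted at 0.65 because type 1 requires only 0.6.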

Single cell analysis reveals immune cell–adipocyte crosstalk regulating the transcription of thermogenic adipocytes

eLife ◽

10.7554/elife.49501 ◽

2019 ◽

Vol 8 ◽

Cited By ~ 17

Author(s):

Prashant Rajbhandari ◽

Douglas Arneson ◽

Sydney K Hart ◽

In Sook Ahn ◽

Graciel Diamante ◽

...

Keyword(s):

Single Cell ◽

Immune Cells ◽

Immune Cell ◽

Single Cell Analysis ◽

Metabolic Response ◽

Cell Types ◽

Specific Cell ◽

Beneficial Effects ◽

Thermogenic Adipocytes ◽

And Function

Immune cells are vital constituents of the adipose microenvironment that influence both local and systemic lipid metabolism. Mice lacking IL10 have enhanced thermogenesis, but the roles of specific cell types in the metabolic response to IL10 remain to be defined. We demonstrate here that selective loss of IL10 receptor α in adipocytes recapitulates the beneficial effects of global IL10 deletion, and that local crosstalk between IL10-producing immune cells and adipocytes is a determinant of thermogenesis and systemic energy balance. Single Nuclei Adipocyte RNA-sequencing (SNAP-seq) of subcutaneous adipose tissue defined a metabolically-active mature adipocyte subtype characterized by robust expression of genes involved in thermogenesis whose transcriptome was selectively responsive to IL10Rα deletion. Furthermore, single-cell transcriptomic analysis of adipose stromal populations identified lymphocytes as a key source of IL10 production in response to thermogenic stimuli. These findings implicate adaptive immune cell-adipocyte communication in the maintenance of adipose subtype identity and function.

Single-cell transcriptomes of pancreatic preinvasive lesions and cancer reveal acinar metaplastic cells’ heterogeneity

Nature Communications ◽

10.1038/s41467-020-18207-z ◽

2020 ◽

Vol 11 (1) ◽

Cited By ~ 1

Author(s):

Yehuda Schlesinger ◽

Oshri Yosefov-Levi ◽

Dror Kolodkin-Gal ◽

Roy Zvi Granit ◽

Luriano Peters ◽

...

Keyword(s):

Single Cell ◽

Malignant Lesion ◽

Tumor Development ◽

Initial Step ◽

Cell Types ◽

Tumor Formation ◽

Specific Cell ◽

Preinvasive Lesions ◽

Cell Transcriptome ◽

Single Cell Transcriptome

Abstract Acinar metaplasia is an initial step in a series of events that can lead to pancreatic cancer. Here we perform single-cell RNA-sequencing of mouse pancreas during the progression from preinvasive stages to tumor formation. Using a reporter gene, we identify metaplastic cells that originated from acinar cells and express two transcription factors, Onecut2 and Foxq1. Further analyses of metaplastic acinar cell heterogeneity define six acinar metaplastic cell types and states, including stomach-specific cell types. Localization of metaplastic cell types and mixture of different metaplastic cell types in the same pre-malignant lesion is shown. Finally, single-cell transcriptome analyses of tumor-associated stromal, immune, endothelial and fibroblast cells identify signals that may support tumor development, as well as the recruitment and education of immune cells. Our findings are consistent with the early, premalignant formation of an immunosuppressive environment mediated by interactions between acinar metaplastic cells and other cells in the microenvironment.

Ensemble learning for classifying single-cell data and projection across reference atlases

Bioinformatics ◽

10.1093/bioinformatics/btaa137 ◽

2020 ◽

Vol 36 (11) ◽

pp. 3585-3587

Author(s):

Lin Wang ◽

Francisca Catalan ◽

Karin Shamardani ◽

Husam Babikir ◽

Aaron Diaz

Keyword(s):

Single Cell ◽

Cell Types ◽

Status Quo ◽

Supplementary Information ◽

Published Data ◽

Supplementary Data ◽

Cell Type ◽

Low Sensitivity ◽

Project Data ◽

Cell Data

Abstract

Summary: Single-cell data are being generated at an accelerating pace. How best to project data across single-cell atlases is an open problem. We developed a boosted learner that overcomes the greatest challenge with status quo classifiers: low sensitivity, especially when dealing with rare cell types. By comparing novel and published data from distinct scRNA-seq modalities that were acquired from the same tissues, we show that this approach preserves cell-type labels when mapping across diverse platforms.

Availability and implementation: https://github.com/diazlab/ELSA

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

Single-Cell Transcriptomic Map of the Human and Mouse Bladders

Journal of the American Society of Nephrology ◽

10.1681/asn.2019040335 ◽

2019 ◽

Vol 30 (11) ◽

pp. 2159-2176 ◽

Cited By ~ 13

Author(s):

Zhenyuan Yu ◽

Jinling Liao ◽

Yang Chen ◽

Chunlin Zou ◽

Haiying Zhang ◽

...

Keyword(s):

Bladder Cancer ◽

Epithelial Cells ◽

Single Cell ◽

Cell Types ◽

Specific Cell ◽

Human Bladder ◽

Healthy Human ◽

Bladder Cell ◽

Bladder Cells ◽

Human And Mouse

Background: Having a comprehensive map of the cellular anatomy of the normal human bladder is vital to understanding the cellular origins of benign bladder disease and bladder cancer.

Methods: We used single-cell RNA sequencing (scRNA-seq) of 12,423 cells from healthy human bladder tissue samples taken from patients with bladder cancer and 12,884 cells from mouse bladders to classify bladder cell types and their underlying functions.

Results: We created a single-cell transcriptomic map of human and mouse bladders, including 16 clusters of human bladder cells and 15 clusters of mouse bladder cells. The homology and heterogeneity of human and mouse bladder cell types were compared and both conservative and heterogeneous aspects of human and mouse bladder evolution were identified. We also discovered two novel types of human bladder cells. One type is ADRA2A+ and HRH2+ interstitial cells which may be associated with nerve conduction and allergic reactions. The other type is TNNT1+ epithelial cells that may be involved with bladder emptying. We verify these TNNT1+ epithelial cells also occur in rat and mouse bladders.

Conclusions: This transcriptomic map provides a resource for studying bladder cell types, specific cell markers, signaling receptors, and genes that will help us to learn more about the relationship between bladder cell types and diseases.
