Unsupervised topological alignment for single-cell multi-omics integration

2020 ◽  
Vol 36 (Supplement_1) ◽  
pp. i48-i56 ◽  
Author(s):  
Kai Cao ◽  
Xiangqi Bai ◽  
Yiguang Hong ◽  
Lin Wan

Abstract Motivation: Single-cell multi-omics data provide a comprehensive molecular view of cells. However, single-cell multi-omics datasets consist of unpaired cells measured with distinct unmatched features across modalities, making data integration challenging. Results: In this study, we present a novel algorithm, termed UnionCom, for the unsupervised topological alignment of single-cell multi-omics datasets. UnionCom does not require any correspondence information, either among cells or among features. It first embeds the intrinsic low-dimensional structure of each single-cell dataset into a distance matrix of cells within the same dataset and then aligns the cells across single-cell multi-omics datasets by matching the distance matrices via a matrix optimization method. Finally, it projects the distinct unmatched features across single-cell datasets into a common embedding space for feature comparability of the aligned cells. To match the complex, non-linear, geometrically distorted low-dimensional structures across datasets, UnionCom proposes and adjusts a global scaling parameter on distance matrices for aligning similar topological structures. It does not require one-to-one correspondence among cells across datasets, and it can accommodate samples with dataset-specific cell types. UnionCom outperforms state-of-the-art methods on both simulated and real single-cell multi-omics datasets. UnionCom is robust to parameter choices, as well as subsampling of features. Availability and implementation: UnionCom software is available at https://github.com/caokai1073/UnionCom. Supplementary information: Supplementary data are available at Bioinformatics online.
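The distance-matrix matching idea described in this abstract can be illustrated with a toy sketch. This is not UnionCom's implementation: it assumes Euclidean distances rather than an embedding of intrinsic low-dimensional structure, uses division by the largest distance as a crude stand-in for the global scaling parameter, and replaces the matrix optimization with a brute-force search over hard one-to-one matchings, which is only feasible for a handful of cells. All function names and the data are hypothetical.

```python
import itertools
import math


def distance_matrix(cells):
    """Pairwise Euclidean distances between the cells of one dataset."""
    return [[math.dist(a, b) for b in cells] for a in cells]


def rescale(matrix):
    """Divide by the largest entry so both datasets share one global scale."""
    peak = max(max(row) for row in matrix)
    return [[value / peak for value in row] for row in matrix]


def mismatch(dx, dy, perm):
    """Squared Frobenius gap between dx and dy after reordering dy by perm."""
    n = len(dx)
    return sum(
        (dx[i][j] - dy[perm[i]][perm[j]]) ** 2
        for i in range(n)
        for j in range(n)
    )


def align(cells_x, cells_y):
    """Return perm such that cell i of X is matched to cell perm[i] of Y."""
    dx = rescale(distance_matrix(cells_x))
    dy = rescale(distance_matrix(cells_y))
    candidates = itertools.permutations(range(len(cells_y)))
    return min(candidates, key=lambda perm: mismatch(dx, dy, perm))


# Hypothetical toy data: the same four cells measured in two "modalities"
# with different feature counts, a different scale, and a shuffled order.
cells_x = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
cells_y = [(0.0, 0.0, 6.0), (0.0, 0.0, 0.0), (0.0, 0.0, 14.0), (0.0, 0.0, 2.0)]

perm = align(cells_x, cells_y)
print(perm)  # (1, 3, 0, 2)
```

No feature of X is compared with a feature of Y at any point; only within-dataset distances are used, which is why no correspondence information is needed in this sketch.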


Author(s):  
Samuel Melton ◽  
Sharad Ramanathan

Abstract Motivation: Recent technological advances produce a wealth of high-dimensional descriptions of biological processes, yet extracting meaningful insight and mechanistic understanding from these data remains challenging. For example, in developmental biology, the dynamics of differentiation can now be mapped quantitatively using single-cell RNA sequencing, yet it is difficult to infer molecular regulators of developmental transitions. Here, we show that discovering informative features in the data is crucial for statistical analysis as well as making experimental predictions. Results: We identify features based on their ability to discriminate between clusters of the data points. We define a class of problems in which linear separability of clusters is hidden in a low-dimensional space. We propose an unsupervised method to identify the subset of features that define a low-dimensional subspace in which clustering can be conducted. This is achieved by averaging over discriminators trained on an ensemble of proposed cluster configurations. We then apply our method to single-cell RNA-seq data from mouse gastrulation, and identify 27 key transcription factors (out of 409 total), 18 of which are known to define cell states through their expression levels. In this inferred subspace, we find clear signatures of known cell types that eluded classification prior to discovery of the correct low-dimensional subspace. Availability and implementation: https://github.com/smelton/SMD. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Deepa Bhartiya

Abstract Life-long homeostasis of adult tissues is supposedly maintained by resident stem cells. These stem cells are quiescent in nature and rarely divide to self-renew and give rise to tissue-specific “progenitors” (lineage-restricted and tissue-committed), which divide rapidly and differentiate into tissue-specific cell types. However, it has proved difficult to isolate these quiescent stem cells as a physical entity. Recent single-cell RNAseq studies on several adult tissues, including ovary, prostate, and cardiac tissues, have not been able to detect stem cells. Thus, it has been postulated that adult cells dedifferentiate to a stem-like state to ensure regeneration and can be defined as cells capable of replacing lost cells through mitosis. This idea challenges the basic paradigm of developmental biology regarding plasticity, namely that a cell enters a point of no return once it initiates differentiation. The underlying reason for this dilemma is that stem cells and somatic cells are put together during processing for various studies. Stem cells and adult mature cell types are distinct entities; stem cells are quiescent, small in size, and have minimal organelles, whereas mature cells are metabolically active and have multiple organelles lying in abundant cytoplasm. As a result, they do not pellet down together when centrifuged at 100–350g. At this speed, mature cells get collected but stem cells remain buoyant and can be pelleted by centrifuging at 1000g. Thus, the inability to detect stem cells in recently published single-cell RNAseq studies arises because the stem cells were unknowingly discarded during processing and were never subjected to RNAseq. This needs to be kept in mind before proposing to redefine adult stem cells.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Kip D. Zimmerman ◽  
Mark A. Espeland ◽  
Carl D. Langefeld

Abstract Cells from the same individual share common genetic and environmental backgrounds and are not statistically independent; therefore, they are subsamples or pseudoreplicates. Thus, single-cell data have a hierarchical structure that many current single-cell methods do not address, leading to biased inference, highly inflated type 1 error rates, and reduced robustness and reproducibility. This includes methods that use a batch effect correction for individual as a means of accounting for within-sample correlation. Here, we document this dependence across a range of cell types and show that pseudo-bulk aggregation methods are conservative and underpowered relative to mixed models. To compute differential expression within a specific cell type across treatment groups, we propose applying generalized linear mixed models with a random effect for individual, to properly account for both zero inflation and the correlation structure among measures from cells within an individual. Finally, we provide power estimates across a range of experimental conditions to assist researchers in designing appropriately powered studies.
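The pseudoreplication problem named in this abstract can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, not the authors' analysis: it draws Gaussian expression values with a per-individual random offset and no treatment effect, then compares a test that treats every cell as independent with a test on per-individual means (pseudo-bulk). It does not fit the generalized linear mixed model the authors propose. The function names, variance settings, and the fixed critical values (1.96 for the cell-level test, 2.447 as an approximate two-sided 5% cutoff at 6 degrees of freedom) are assumptions chosen for illustration.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch t statistic for the difference in means of two samples."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def simulate_group(rng, n_individuals, n_cells):
    """Cells of one individual share that individual's random offset."""
    group = []
    for _ in range(n_individuals):
        offset = rng.gauss(0.0, 1.0)  # between-individual variation
        group.append([offset + rng.gauss(0.0, 1.0) for _ in range(n_cells)])
    return group


def false_positive_rates(n_sims=200, n_individuals=4, n_cells=50, seed=0):
    """Fraction of null simulations each approach calls significant."""
    rng = random.Random(seed)
    cell_hits = bulk_hits = 0
    for _ in range(n_sims):
        group_1 = simulate_group(rng, n_individuals, n_cells)
        group_2 = simulate_group(rng, n_individuals, n_cells)
        # Naive approach: pool all cells and treat them as independent.
        cells_1 = [x for individual in group_1 for x in individual]
        cells_2 = [x for individual in group_2 for x in individual]
        if abs(welch_t(cells_1, cells_2)) > 1.96:
            cell_hits += 1
        # Pseudo-bulk approach: one mean per individual.
        means_1 = [statistics.mean(individual) for individual in group_1]
        means_2 = [statistics.mean(individual) for individual in group_2]
        if abs(welch_t(means_1, means_2)) > 2.447:
            bulk_hits += 1
    return cell_hits / n_sims, bulk_hits / n_sims


cell_rate, bulk_rate = false_positive_rates()
print(f"cell-level false positive rate: {cell_rate:.2f}")
print(f"pseudo-bulk false positive rate: {bulk_rate:.2f}")
```

Because there is no true group difference in the simulated data, every significant call is a false positive; the cell-level rate ends up far above the nominal 5%, while the pseudo-bulk rate stays close to it.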


2019 ◽  
Vol 2 (1) ◽  
pp. 97-109 ◽  
Author(s):  
Jinchu Vijay ◽  
Marie-Frédérique Gauthier ◽  
Rebecca L. Biswell ◽  
Daniel A. Louiselle ◽  
Jeffrey J. Johnston ◽  
...  

2020 ◽  
Author(s):  
Mohit Goyal ◽  
Guillermo Serrano ◽  
Ilan Shomorony ◽  
Mikel Hernaez ◽  
Idoia Ochoa

Abstract Single-cell RNA-seq is a powerful tool in the study of the cellular composition of different tissues and organisms. A key step in the analysis pipeline is the annotation of cell-types based on the expression of specific marker genes. Since manual annotation is labor-intensive and does not scale to large datasets, several methods for automated cell-type annotation have been proposed based on supervised learning. However, these methods generally require feature extraction and batch alignment prior to classification, and their performance may become unreliable in the presence of cell-types with very similar transcriptomic profiles, such as differentiating cells. We propose JIND, a framework for automated cell-type identification based on neural networks that directly learns a low-dimensional representation (latent code) in which cell-types can be reliably determined. To account for batch effects, JIND performs a novel asymmetric alignment in which the transcriptomic profile of unseen cells is mapped onto the previously learned latent space, hence avoiding the need to retrain the model whenever a new dataset becomes available. JIND also learns cell-type-specific confidence thresholds to identify and reject cells that cannot be reliably classified. We show on datasets with and without batch effects that JIND classifies cells more accurately than previously proposed methods while rejecting only a small proportion of cells. Moreover, JIND batch alignment is parallelizable and more than five or six times faster than Seurat integration. Availability: https://github.com/mohit1997/JIND.
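The rejection step mentioned in this abstract, cell-type-specific confidence thresholds, can be sketched as follows. This is a minimal illustration and not JIND's code: it assumes the classifier has already produced a probability per cell-type for each cell and that the thresholds are given, since the abstract does not say how they are learned. The function name, the "Unassigned" label, and the numbers are hypothetical.

```python
def classify_with_rejection(probabilities, thresholds, rejected_label="Unassigned"):
    """Assign the most probable cell-type, or reject the cell.

    probabilities: dict mapping cell-type -> predicted probability for one cell.
    thresholds: dict mapping cell-type -> minimum confidence for that type.
    """
    best_type = max(probabilities, key=probabilities.get)
    if probabilities[best_type] >= thresholds[best_type]:
        return best_type
    return rejected_label


# Hypothetical thresholds: a stricter cutoff for a cell-type that is easily
# confused with its neighbours, a looser one for a well-separated type.
thresholds = {"T cell": 0.60, "B cell": 0.60, "Progenitor": 0.90}

confident_cell = {"T cell": 0.85, "B cell": 0.10, "Progenitor": 0.05}
ambiguous_cell = {"T cell": 0.15, "B cell": 0.05, "Progenitor": 0.80}

print(classify_with_rejection(confident_cell, thresholds))  # T cell
print(classify_with_rejection(ambiguous_cell, thresholds))  # Unassigned
```

Using one threshold per cell-type rather than a single global cutoff lets hard-to-distinguish types, such as differentiating cells, be held to a stricter standard without rejecting cells from well-separated types.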


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Prashant Rajbhandari ◽  
Douglas Arneson ◽  
Sydney K Hart ◽  
In Sook Ahn ◽  
Graciel Diamante ◽  
...  

Immune cells are vital constituents of the adipose microenvironment that influence both local and systemic lipid metabolism. Mice lacking IL10 have enhanced thermogenesis, but the roles of specific cell types in the metabolic response to IL10 remain to be defined. We demonstrate here that selective loss of IL10 receptor α in adipocytes recapitulates the beneficial effects of global IL10 deletion, and that local crosstalk between IL10-producing immune cells and adipocytes is a determinant of thermogenesis and systemic energy balance. Single Nuclei Adipocyte RNA-sequencing (SNAP-seq) of subcutaneous adipose tissue defined a metabolically-active mature adipocyte subtype characterized by robust expression of genes involved in thermogenesis whose transcriptome was selectively responsive to IL10Rα deletion. Furthermore, single-cell transcriptomic analysis of adipose stromal populations identified lymphocytes as a key source of IL10 production in response to thermogenic stimuli. These findings implicate adaptive immune cell-adipocyte communication in the maintenance of adipose subtype identity and function.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Yehuda Schlesinger ◽  
Oshri Yosefov-Levi ◽  
Dror Kolodkin-Gal ◽  
Roy Zvi Granit ◽  
Luriano Peters ◽  
...  

Abstract Acinar metaplasia is an initial step in a series of events that can lead to pancreatic cancer. Here we perform single-cell RNA-sequencing of mouse pancreas during the progression from preinvasive stages to tumor formation. Using a reporter gene, we identify metaplastic cells that originated from acinar cells and express two transcription factors, Onecut2 and Foxq1. Further analyses of metaplastic acinar cell heterogeneity define six acinar metaplastic cell types and states, including stomach-specific cell types. Localization of metaplastic cell types and mixture of different metaplastic cell types in the same pre-malignant lesion is shown. Finally, single-cell transcriptome analyses of tumor-associated stromal, immune, endothelial and fibroblast cells identify signals that may support tumor development, as well as the recruitment and education of immune cells. Our findings are consistent with the early, premalignant formation of an immunosuppressive environment mediated by interactions between acinar metaplastic cells and other cells in the microenvironment.


2020 ◽  
Vol 36 (11) ◽  
pp. 3585-3587
Author(s):  
Lin Wang ◽  
Francisca Catalan ◽  
Karin Shamardani ◽  
Husam Babikir ◽  
Aaron Diaz

Abstract Summary: Single-cell data are being generated at an accelerating pace. How best to project data across single-cell atlases is an open problem. We developed a boosted learner that overcomes the greatest challenge with status quo classifiers: low sensitivity, especially when dealing with rare cell types. By comparing novel and published data from distinct scRNA-seq modalities that were acquired from the same tissues, we show that this approach preserves cell-type labels when mapping across diverse platforms. Availability and implementation: https://github.com/diazlab/ELSA. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 30 (11) ◽  
pp. 2159-2176 ◽  
Author(s):  
Zhenyuan Yu ◽  
Jinling Liao ◽  
Yang Chen ◽  
Chunlin Zou ◽  
Haiying Zhang ◽  
...  

Background: Having a comprehensive map of the cellular anatomy of the normal human bladder is vital to understanding the cellular origins of benign bladder disease and bladder cancer. Methods: We used single-cell RNA sequencing (scRNA-seq) of 12,423 cells from healthy human bladder tissue samples taken from patients with bladder cancer and 12,884 cells from mouse bladders to classify bladder cell types and their underlying functions. Results: We created a single-cell transcriptomic map of human and mouse bladders, including 16 clusters of human bladder cells and 15 clusters of mouse bladder cells. The homology and heterogeneity of human and mouse bladder cell types were compared and both conservative and heterogeneous aspects of human and mouse bladder evolution were identified. We also discovered two novel types of human bladder cells. One type is ADRA2A+ and HRH2+ interstitial cells which may be associated with nerve conduction and allergic reactions. The other type is TNNT1+ epithelial cells that may be involved with bladder emptying. We verify these TNNT1+ epithelial cells also occur in rat and mouse bladders. Conclusions: This transcriptomic map provides a resource for studying bladder cell types, specific cell markers, signaling receptors, and genes that will help us to learn more about the relationship between bladder cell types and diseases.

