VPAC: Variational projection for accurate clustering of single-cell transcriptomic data


10.1101/523993 ◽

2019 ◽

Author(s):

Shengquan Chen ◽

Kui Hua ◽

Hongfei Cui ◽

Rui Jiang

Keyword(s):

Single Cell ◽

Scale Up ◽

Mixture Distribution ◽

Cell Types ◽

Gaussian Mixture ◽

Specific Cell ◽

Transcriptomic Data ◽

Cell Clustering ◽

Dataset Size ◽

User Friendly

Abstract Background Single-cell RNA-sequencing (scRNA-seq) technologies have advanced rapidly in recent years and enabled quantitative characterization at a microscopic resolution. With the exponential growth of the number of cells profiled in individual scRNA-seq experiments, identifying putative cell types from the data has become a great challenge that calls for novel computational methods. Although a variety of algorithms have recently been proposed for single-cell clustering, limitations such as low accuracy, inferior robustness, and inadequate stability greatly restrict the scope of application of these methods. Results We propose a novel model-based algorithm, named VPAC, for accurate clustering of single-cell transcriptomic data through variational projection, which assumes that single-cell samples follow a Gaussian mixture distribution in a latent space. Through comprehensive validation experiments, we demonstrate that VPAC can not only be applied to datasets of discrete counts and normalized continuous data, but also scale up well to various data dimensionalities, dataset sizes, and levels of data sparsity. We further illustrate the ability of VPAC to detect genes with strong unique signatures of a specific cell type, which may shed light on studies in systems biology. We have released a user-friendly Python package of VPAC on GitHub (https://github.com/ShengquanChen/VPAC). Users can directly import our VPAC class and conduct clustering without tedious installation of dependency packages. Conclusions VPAC enables highly accurate clustering of single-cell transcriptomic data via a statistical model. We expect to see wide applications of our method not only to transcriptome studies for fully understanding cell identity and functionality, but also to the clustering of more general data.
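The modelling assumption named in this abstract, that cells follow a Gaussian mixture distribution in a latent space, can be illustrated on toy data. The following is a minimal sketch and not VPAC's implementation or API: it assumes a one-dimensional latent coordinate has already been obtained for each cell and fits a two-component mixture by plain expectation-maximization. The function `fit_gmm_1d` and the simulated data are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, k=2, n_iter=100):
    """Fit a k-component 1-D Gaussian mixture by EM (illustrative only)."""
    ordered = sorted(xs)
    # Initialise means at evenly spaced quantiles, with a shared variance
    # and equal mixing weights.
    means = [ordered[(2 * j + 1) * len(xs) // (2 * k)] for j in range(k)]
    overall = sum(xs) / len(xs)
    var0 = sum((x - overall) ** 2 for x in xs) / len(xs)
    variances = [var0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)  # guard against collapse
    # Hard cluster labels from the responsibilities of the final E-step.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, variances, weights, labels


# Hypothetical latent coordinates for two well-separated cell populations.
rng = random.Random(0)
latent = [rng.gauss(0.0, 0.5) for _ in range(100)] + \
         [rng.gauss(5.0, 0.5) for _ in range(100)]
means, variances, weights, labels = fit_gmm_1d(latent, k=2)
print("estimated component means:", [round(m, 2) for m in means])
```

In this toy setting the cluster assignment is simply the component with the highest responsibility for each cell; how VPAC obtains its latent space and fits its mixture is described in the paper and its repository, not here.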


EnClaSC: A novel ensemble approach for accurate and robust cell-type classification of single-cell transcriptomes

10.1101/754085 ◽

2019 ◽

Author(s):

Xiaoyang Chen ◽

Shengquan Chen ◽

Rui Jiang

Keyword(s):

Single Cell ◽

Rapid Development ◽

Scale Up ◽

Cell Types ◽

Cell Type ◽

Species Classification ◽

Transcriptomic Data ◽

General Data ◽

Type Classification

Abstract Background In recent years, the rapid development of single-cell RNA-sequencing (scRNA-seq) techniques enables the quantitative characterization of cell types at a single-cell resolution. With the explosive growth of the number of cells profiled in individual scRNA-seq experiments, there is a demand for novel computational methods for classifying newly-generated scRNA-seq data onto annotated labels. Although several methods have recently been proposed for the cell-type classification of single-cell transcriptomic data, such limitations as inadequate accuracy, inferior robustness, and low stability greatly limit their wide applications. Results We propose a novel ensemble approach, named EnClaSC, for accurate and robust cell-type classification of single-cell transcriptomic data. Through comprehensive validation experiments, we demonstrate that EnClaSC can not only be applied to the self-projection within a specific dataset and the cell-type classification across different datasets, but also scale up well to various data dimensionality and different data sparsity. We further illustrate the ability of EnClaSC to effectively make cross-species classification, which may shed light on the studies in correlation of different species. EnClaSC is freely available at https://github.com/xy-chen16/EnClaSC. Conclusions EnClaSC enables highly accurate and robust cell-type classification of single-cell transcriptomic data via an ensemble learning method. We expect to see wide applications of our method to not only transcriptome studies, but also the classification of more general data.


EnClaSC: a novel ensemble approach for accurate and robust cell-type classification of single-cell transcriptomes

BMC Bioinformatics ◽

10.1186/s12859-020-03679-z ◽

2020 ◽

Vol 21 (S13) ◽

Author(s):

Xiaoyang Chen ◽

Shengquan Chen ◽

Rui Jiang

Keyword(s):

Single Cell ◽

Rapid Development ◽

Scale Up ◽

Cell Types ◽

Cell Type ◽

Species Classification ◽

Transcriptomic Data ◽

General Data ◽

Type Classification

Abstract Background In recent years, the rapid development of single-cell RNA-sequencing (scRNA-seq) techniques enables the quantitative characterization of cell types at a single-cell resolution. With the explosive growth of the number of cells profiled in individual scRNA-seq experiments, there is a demand for novel computational methods for classifying newly-generated scRNA-seq data onto annotated labels. Although several methods have recently been proposed for the cell-type classification of single-cell transcriptomic data, such limitations as inadequate accuracy, inferior robustness, and low stability greatly limit their wide applications. Results We propose a novel ensemble approach, named EnClaSC, for accurate and robust cell-type classification of single-cell transcriptomic data. Through comprehensive validation experiments, we demonstrate that EnClaSC can not only be applied to the self-projection within a specific dataset and the cell-type classification across different datasets, but also scale up well to various data dimensionality and different data sparsity. We further illustrate the ability of EnClaSC to effectively make cross-species classification, which may shed light on the studies in correlation of different species. EnClaSC is freely available at https://github.com/xy-chen16/EnClaSC. Conclusions EnClaSC enables highly accurate and robust cell-type classification of single-cell transcriptomic data via an ensemble learning method. We expect to see wide applications of our method to not only transcriptome studies, but also the classification of more general data.


Adult tissue-resident stem cells—fact or fiction?

Stem Cell Research & Therapy ◽

10.1186/s13287-021-02142-x ◽

2021 ◽

Vol 12 (1) ◽

Cited By ~ 1

Author(s):

Deepa Bhartiya

Keyword(s):

Stem Cells ◽

Single Cell ◽

Adult Stem Cells ◽

Cell Types ◽

Specific Cell ◽

Tissue Specific ◽

Cardiac Tissues ◽

Adult Tissues ◽

Resident Stem Cells ◽

Abundant Cytoplasm

Abstract Life-long tissue homeostasis of adult tissues is supposedly maintained by the resident stem cells. These stem cells are quiescent in nature and rarely divide to self-renew and give rise to tissue-specific “progenitors” (lineage-restricted and tissue-committed), which divide rapidly and differentiate into tissue-specific cell types. However, it has proved difficult to isolate these quiescent stem cells as a physical entity. Recent single-cell RNAseq studies on several adult tissues including ovary, prostate, and cardiac tissues have not been able to detect stem cells. Thus, it has been postulated that adult cells dedifferentiate to a stem-like state to ensure regeneration and can be defined as cells capable of replacing lost cells through mitosis. This idea challenges the basic paradigm of developmental biology regarding plasticity, namely that a cell reaches a point of no return once it initiates differentiation. The underlying reason for this dilemma is that we are putting stem cells and somatic cells together while processing for various studies. Stem cells and adult mature cell types are distinct entities; stem cells are quiescent, small in size, and have minimal organelles, whereas the mature cells are metabolically active and have multiple organelles lying in abundant cytoplasm. As a result, they do not pellet down together when centrifuged at 100–350g. At this speed, mature cells get collected but stem cells remain buoyant and can be pelleted by centrifuging at 1000g. Thus, the inability to detect stem cells in recently published single-cell RNAseq studies arises because the stem cells were unknowingly discarded during processing and were never subjected to RNAseq. This needs to be kept in mind before proposing to redefine adult stem cells.


A practical solution to pseudoreplication bias in single-cell studies

Nature Communications ◽

10.1038/s41467-021-21038-1 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Kip D. Zimmerman ◽

Mark A. Espeland ◽

Carl D. Langefeld

Keyword(s):

Single Cell ◽

Mixed Models ◽

Random Effect ◽

Cell Types ◽

Error Rates ◽

Specific Cell ◽

Experimental Conditions ◽

Type 1 Error ◽

Sample Correlation ◽

Inflated Type

Abstract Cells from the same individual share common genetic and environmental backgrounds and are not statistically independent; therefore, they are subsamples or pseudoreplicates. Thus, single-cell data have a hierarchical structure that many current single-cell methods do not address, leading to biased inference, highly inflated type 1 error rates, and reduced robustness and reproducibility. This includes methods that use a batch effect correction for individual as a means of accounting for within-sample correlation. Here, we document this dependence across a range of cell types and show that pseudo-bulk aggregation methods are conservative and underpowered relative to mixed models. To compute differential expression within a specific cell type across treatment groups, we propose applying generalized linear mixed models with a random effect for individual, to properly account for both zero inflation and the correlation structure among measures from cells within an individual. Finally, we provide power estimates across a range of experimental conditions to assist researchers in designing appropriately powered studies.
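The type 1 error inflation described in this abstract can be seen in a small simulation. The sketch below is an illustration under stated assumptions, not the authors' analysis: it simulates two groups with no true difference, gives every individual a shared random offset, and compares a naive test that treats all cells as independent with a test on per-individual means (pseudo-bulk). The design (5 individuals per group, 50 cells each, unit variances) and all function names are hypothetical, and the generalized linear mixed model that the paper recommends is not implemented here.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * sample_variance(a) + (nb - 1) * sample_variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def simulate_group(rng, n_individuals=5, n_cells=50, sd_individual=1.0, sd_cell=1.0):
    """Cells of one individual share a random offset, so they are correlated."""
    group = []
    for _ in range(n_individuals):
        offset = rng.gauss(0.0, sd_individual)
        group.append([offset + rng.gauss(0.0, sd_cell) for _ in range(n_cells)])
    return group


rng = random.Random(1)
n_sim = 400
cell_level_hits = 0
pseudobulk_hits = 0
for _ in range(n_sim):
    # Both groups come from the same distribution: any "hit" is a false positive.
    g1, g2 = simulate_group(rng), simulate_group(rng)

    # Naive analysis: every cell is treated as an independent observation
    # (498 degrees of freedom, two-sided 5% critical value of about 1.96).
    cells1 = [v for individual in g1 for v in individual]
    cells2 = [v for individual in g2 for v in individual]
    if abs(t_statistic(cells1, cells2)) > 1.96:
        cell_level_hits += 1

    # Pseudo-bulk analysis: one mean per individual
    # (8 degrees of freedom, two-sided 5% critical value of 2.306).
    means1 = [mean(individual) for individual in g1]
    means2 = [mean(individual) for individual in g2]
    if abs(t_statistic(means1, means2)) > 2.306:
        pseudobulk_hits += 1

cell_level_rate = cell_level_hits / n_sim
pseudobulk_rate = pseudobulk_hits / n_sim
print(f"cell-level false-positive rate:  {cell_level_rate:.2f}")
print(f"pseudo-bulk false-positive rate: {pseudobulk_rate:.2f}")
```

The naive test underestimates the standard error because it ignores the between-individual variance, so its false-positive rate lands far above the nominal 5%, while the per-individual test stays near it in this balanced toy design.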


Selecting single cell clustering parameter values using subsampling-based robustness metrics

BMC Bioinformatics ◽

10.1186/s12859-021-03957-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Ryan B. Patterson-Cross ◽

Ariel J. Levine ◽

Vilas Menon

Keyword(s):

Single Cell ◽

Optimal Parameter ◽

Clustering Algorithms ◽

Cell Types ◽

Parameter Selection ◽

Data Set ◽

Biologically Relevant ◽

Cell Clustering ◽

Parameter Values ◽

Robustness Metrics

Abstract Background Generating and analysing single-cell data has become a widespread approach to examine tissue heterogeneity, and numerous algorithms exist for clustering these datasets to identify putative cell types with shared transcriptomic signatures. However, many of these clustering workflows rely on user-tuned parameter values, tailored to each dataset, to identify a set of biologically relevant clusters. Whereas users often develop their own intuition as to the optimal range of parameters for clustering on each data set, the lack of systematic approaches to identify this range can be daunting to new users of any given workflow. In addition, an optimal parameter set does not guarantee that all clusters are equally well-resolved, given the heterogeneity in transcriptomic signatures in most biological systems. Results Here, we illustrate a subsampling-based approach (chooseR) that simultaneously guides parameter selection and characterizes cluster robustness. Through bootstrapped iterative clustering across a range of parameters, chooseR was used to select parameter values for two distinct clustering workflows (Seurat and scVI). In each case, chooseR identified parameters that produced biologically relevant clusters from both well-characterized (human PBMC) and complex (mouse spinal cord) datasets. Moreover, it provided a simple “robustness score” for each of these clusters, facilitating the assessment of cluster quality. Conclusion chooseR is a simple, conceptually understandable tool that can be used flexibly across clustering algorithms, workflows, and datasets to guide clustering parameter selection and characterize cluster robustness.
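The general idea of subsampling-based robustness can be sketched on toy data. The code below is not chooseR (which the abstract describes as working with Seurat and scVI workflows) and does not reproduce its scoring; it is a minimal illustration that assumes a 1-D k-means stand-in for the clustering workflow, repeatedly clusters random 80% subsamples, and scores each full-data cluster by how often its member pairs co-cluster. All function names, the subsample fraction, and the data are hypothetical.

```python
import random
from itertools import combinations


def kmeans_1d(values, k, n_iter=20):
    """Tiny 1-D k-means; a stand-in for a real clustering workflow."""
    ordered = sorted(values)
    # Deterministic initialisation at evenly spaced quantiles.
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels


def coclustering_frequency(values, k, n_rounds=50, fraction=0.8, seed=0):
    """For each pair of points, the share of subsamples in which they co-cluster."""
    rng = random.Random(seed)
    n = len(values)
    together, sampled = {}, {}
    for _ in range(n_rounds):
        idx = sorted(rng.sample(range(n), int(fraction * n)))
        labels = kmeans_1d([values[i] for i in idx], k)
        for (a, lab_a), (b, lab_b) in combinations(zip(idx, labels), 2):
            sampled[(a, b)] = sampled.get((a, b), 0) + 1
            if lab_a == lab_b:
                together[(a, b)] = together.get((a, b), 0) + 1
    return {pair: together.get(pair, 0) / count for pair, count in sampled.items()}


def robustness_scores(values, k, **kwargs):
    """Mean within-cluster co-clustering frequency for each full-data cluster."""
    freq = coclustering_frequency(values, k, **kwargs)
    full_labels = kmeans_1d(values, k)
    scores = {}
    for cluster in range(k):
        members = [i for i, lab in enumerate(full_labels) if lab == cluster]
        pair_freqs = [freq[p] for p in combinations(members, 2) if p in freq]
        scores[cluster] = sum(pair_freqs) / len(pair_freqs) if pair_freqs else float("nan")
    return scores


# Hypothetical data with two true groups; k is the parameter being tuned.
rng = random.Random(42)
data = [rng.gauss(0.0, 0.3) for _ in range(30)] + [rng.gauss(10.0, 0.3) for _ in range(30)]
scores_k2 = robustness_scores(data, k=2)
scores_k3 = robustness_scores(data, k=3)
print("k=2:", {c: round(s, 2) for c, s in scores_k2.items()})
print("k=3:", {c: round(s, 2) for c, s in scores_k3.items()})
```

With the parameter matched to the true structure, every pair within a cluster stays together across subsamples; over-splitting forces an arbitrary cut through one group, and the cut moves between subsamples, which lowers the score of the affected clusters.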


Single-cell analysis of human adipose tissue identifies depot- and disease-specific cell types

Nature Metabolism ◽

10.1038/s42255-019-0152-6 ◽

2019 ◽

Vol 2 (1) ◽

pp. 97-109 ◽

Cited By ~ 17

Author(s):

Jinchu Vijay ◽

Marie-Frédérique Gauthier ◽

Rebecca L. Biswell ◽

Daniel A. Louiselle ◽

Jeffrey J. Johnston ◽

...

Keyword(s):

Adipose Tissue ◽

Single Cell ◽

Single Cell Analysis ◽

Human Adipose Tissue ◽

Cell Types ◽

Specific Cell ◽

Cell Analysis ◽

Disease Specific


Single cell analysis reveals immune cell–adipocyte crosstalk regulating the transcription of thermogenic adipocytes

eLife ◽

10.7554/elife.49501 ◽

2019 ◽

Vol 8 ◽

Cited By ~ 17

Author(s):

Prashant Rajbhandari ◽

Douglas Arneson ◽

Sydney K Hart ◽

In Sook Ahn ◽

Graciel Diamante ◽

...

Keyword(s):

Single Cell ◽

Immune Cells ◽

Immune Cell ◽

Single Cell Analysis ◽

Metabolic Response ◽

Cell Types ◽

Specific Cell ◽

Beneficial Effects ◽

Thermogenic Adipocytes ◽

And Function

Immune cells are vital constituents of the adipose microenvironment that influence both local and systemic lipid metabolism. Mice lacking IL10 have enhanced thermogenesis, but the roles of specific cell types in the metabolic response to IL10 remain to be defined. We demonstrate here that selective loss of IL10 receptor α in adipocytes recapitulates the beneficial effects of global IL10 deletion, and that local crosstalk between IL10-producing immune cells and adipocytes is a determinant of thermogenesis and systemic energy balance. Single Nuclei Adipocyte RNA-sequencing (SNAP-seq) of subcutaneous adipose tissue defined a metabolically-active mature adipocyte subtype characterized by robust expression of genes involved in thermogenesis whose transcriptome was selectively responsive to IL10Rα deletion. Furthermore, single-cell transcriptomic analysis of adipose stromal populations identified lymphocytes as a key source of IL10 production in response to thermogenic stimuli. These findings implicate adaptive immune cell-adipocyte communication in the maintenance of adipose subtype identity and function.


Impact of data preprocessing on cell-type clustering based on single-cell RNA-seq data

BMC Bioinformatics ◽

10.1186/s12859-020-03797-8 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Chunxiang Wang ◽

Xin Gao ◽

Juntao Liu

Keyword(s):

Single Cell ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Data Preprocessing ◽

Cell Types ◽

Rna Seq ◽

Cell Type ◽

Preprocessing Method ◽

Cell Clustering ◽

Cell Gene Expression

Abstract Background Advances in single-cell RNA-seq technology have led to great opportunities for the quantitative characterization of cell types, and many clustering algorithms have been developed based on single-cell gene expression. However, we found that different data preprocessing methods show quite different effects on clustering algorithms. Moreover, there is no specific preprocessing method that is applicable to all clustering algorithms, and even for the same clustering algorithm, the best preprocessing method depends on the input data. Results We designed a graph-based algorithm, SC3-e, specifically for discriminating the best data preprocessing method for SC3, which is currently the most widely used clustering algorithm for single cell clustering. When tested on eight frequently used single-cell RNA-seq data sets, SC3-e always accurately selects the best data preprocessing method for SC3 and therefore greatly enhances the clustering performance of SC3. Conclusion The SC3-e algorithm is practically powerful for discriminating the best data preprocessing method, and therefore largely enhances the performance of cell-type clustering of SC3. It is expected to play a crucial role in the related studies of single-cell clustering, such as the studies of human complex diseases and discoveries of new cell types.


Single-cell transcriptomes of pancreatic preinvasive lesions and cancer reveal acinar metaplastic cells’ heterogeneity

Nature Communications ◽

10.1038/s41467-020-18207-z ◽

2020 ◽

Vol 11 (1) ◽

Cited By ~ 1

Author(s):

Yehuda Schlesinger ◽

Oshri Yosefov-Levi ◽

Dror Kolodkin-Gal ◽

Roy Zvi Granit ◽

Luriano Peters ◽

...

Keyword(s):

Single Cell ◽

Malignant Lesion ◽

Tumor Development ◽

Initial Step ◽

Cell Types ◽

Tumor Formation ◽

Specific Cell ◽

Preinvasive Lesions ◽

Cell Transcriptome ◽

Single Cell Transcriptome

Abstract Acinar metaplasia is an initial step in a series of events that can lead to pancreatic cancer. Here we perform single-cell RNA-sequencing of mouse pancreas during the progression from preinvasive stages to tumor formation. Using a reporter gene, we identify metaplastic cells that originated from acinar cells and express two transcription factors, Onecut2 and Foxq1. Further analyses of metaplastic acinar cell heterogeneity define six acinar metaplastic cell types and states, including stomach-specific cell types. Localization of metaplastic cell types and mixture of different metaplastic cell types in the same pre-malignant lesion is shown. Finally, single-cell transcriptome analyses of tumor-associated stromal, immune, endothelial and fibroblast cells identify signals that may support tumor development, as well as the recruitment and education of immune cells. Our findings are consistent with the early, premalignant formation of an immunosuppressive environment mediated by interactions between acinar metaplastic cells and other cells in the microenvironment.


Single-Cell Transcriptomic Map of the Human and Mouse Bladders

Journal of the American Society of Nephrology ◽

10.1681/asn.2019040335 ◽

2019 ◽

Vol 30 (11) ◽

pp. 2159-2176 ◽

Cited By ~ 13

Author(s):

Zhenyuan Yu ◽

Jinling Liao ◽

Yang Chen ◽

Chunlin Zou ◽

Haiying Zhang ◽

...

Keyword(s):

Bladder Cancer ◽

Epithelial Cells ◽

Single Cell ◽

Cell Types ◽

Specific Cell ◽

Human Bladder ◽

Healthy Human ◽

Bladder Cell ◽

Bladder Cells ◽

Human And Mouse

Background Having a comprehensive map of the cellular anatomy of the normal human bladder is vital to understanding the cellular origins of benign bladder disease and bladder cancer. Methods We used single-cell RNA sequencing (scRNA-seq) of 12,423 cells from healthy human bladder tissue samples taken from patients with bladder cancer and 12,884 cells from mouse bladders to classify bladder cell types and their underlying functions. Results We created a single-cell transcriptomic map of human and mouse bladders, including 16 clusters of human bladder cells and 15 clusters of mouse bladder cells. The homology and heterogeneity of human and mouse bladder cell types were compared and both conservative and heterogeneous aspects of human and mouse bladder evolution were identified. We also discovered two novel types of human bladder cells. One type is ADRA2A+ and HRH2+ interstitial cells which may be associated with nerve conduction and allergic reactions. The other type is TNNT1+ epithelial cells that may be involved with bladder emptying. We verify these TNNT1+ epithelial cells also occur in rat and mouse bladders. Conclusions This transcriptomic map provides a resource for studying bladder cell types, specific cell markers, signaling receptors, and genes that will help us to learn more about the relationship between bladder cell types and diseases.
