VASC: dimension reduction and visualization of single cell RNA sequencing data by deep variational autoencoder

2017 ◽  
Author(s):  
Dongfang Wang ◽  
Jin Gu

Abstract Single-cell RNA sequencing (scRNA-seq) is a powerful technique for analyzing transcriptomic heterogeneity at the single-cell level. Finding an effective low-dimensional representation and visualization of the original data is an important step in studying cell sub-populations and lineages based on scRNA-seq data. scRNA-seq data are much noisier than traditional bulk RNA-seq data: at the single-cell level, transcriptional fluctuations are much larger than the average of a cell population, and the low amount of RNA transcripts increases the rate of technical dropout events. In this study, we propose VASC (deep Variational Autoencoder for scRNA-seq data), a deep multi-layer generative model for unsupervised dimension reduction and visualization of scRNA-seq data. It explicitly models dropout events and finds nonlinear hierarchical feature representations of the original data. Tested on twenty datasets, VASC shows superior performance in most cases and broader dataset compatibility compared with four state-of-the-art dimension reduction methods. In a case study of pre-implantation embryos, VASC successfully re-establishes the cell dynamics and identifies several candidate marker genes associated with early embryo development.

Author(s):  
Zilong Zhang ◽  
Feifei Cui ◽  
Chen Lin ◽  
Lingling Zhao ◽  
Chunyu Wang ◽  
...  

Abstract Single-cell RNA sequencing (scRNA-seq) has enabled us to study biological questions at the single-cell level. Currently, many analysis tools are available to better utilize these relatively noisy data. In this review, we summarize the most widely used methods for critical downstream analysis steps (i.e. clustering, trajectory inference, cell-type annotation and integrating datasets). The advantages and limitations are comprehensively discussed, and we provide suggestions for choosing proper methods in different situations. We hope this paper will be useful for scRNA-seq data analysts and bioinformatics tool developers.


2019 ◽  
Vol 21 (5) ◽  
pp. 1581-1595 ◽  
Author(s):  
Xinlei Zhao ◽  
Shuang Wu ◽  
Nan Fang ◽  
Xiao Sun ◽  
Jue Fan

Abstract Single-cell RNA sequencing (scRNA-seq) has been rapidly developing and widely applied in biological and medical research. Identification of cell types in scRNA-seq data sets is an essential step before in-depth investigations of their functional and pathological roles. However, the conventional workflow based on clustering and marker genes is not scalable for an increasingly large number of scRNA-seq data sets due to complicated procedures and manual annotation. Therefore, a number of tools have been developed recently to predict cell types in new data sets using reference data sets. These methods have not been generally adopted due to a lack of tool benchmarking and user guidance. In this article, we performed a comprehensive and impartial evaluation of nine classification software tools specifically designed for scRNA-seq data sets. Results showed that Seurat based on random forest, SingleR based on correlation analysis and CaSTLe based on XGBoost performed better than others. A simple ensemble voting of all tools can improve the predictive accuracy. Under nonideal situations, such as small-sized and class-imbalanced reference data sets, tools based on cluster-level similarities have superior performance. However, even with the function of assigning ‘unassigned’ labels, it is still challenging to catch novel cell types by solely using any of the single-cell classifiers. This article provides a guideline for researchers to select and apply suitable classification tools in their analysis workflows and sheds some light on potential directions for future improvement of classification tools.
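The ensemble voting mentioned in this abstract can be illustrated with a minimal majority-vote sketch. It is not the authors' implementation: the function name `ensemble_vote`, the `min_agreement` threshold and the example labels are all hypothetical, and the fallback to an "unassigned" label when no label wins a clear majority is an assumption made for illustration.

```python
from collections import Counter


def ensemble_vote(predictions, min_agreement=0.5, unassigned="unassigned"):
    """Combine per-cell labels from several classifiers by majority vote.

    predictions: list of label lists, one list per tool, all the same length.
    min_agreement: hypothetical threshold; the winning label must be chosen
        by more than this fraction of tools, otherwise the cell is unassigned.
    """
    n_tools = len(predictions)
    n_cells = len(predictions[0])
    consensus = []
    for i in range(n_cells):
        # Count how many tools voted for each label on cell i.
        votes = Counter(tool[i] for tool in predictions)
        label, count = votes.most_common(1)[0]
        consensus.append(label if count / n_tools > min_agreement else unassigned)
    return consensus


# Toy labels from three hypothetical tools for three cells.
tool_a = ["T cell", "B cell", "NK cell"]
tool_b = ["T cell", "B cell", "T cell"]
tool_c = ["T cell", "monocyte", "B cell"]
print(ensemble_vote([tool_a, tool_b, tool_c]))
# → ['T cell', 'B cell', 'unassigned']
```

The third cell is left unassigned because the three tools disagree completely, so no label exceeds the agreement threshold.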


Author(s):  
Abha S Bais ◽  
Dennis Kostka

Abstract Motivation Single-cell RNA sequencing (scRNA-seq) technologies enable the study of transcriptional heterogeneity at the resolution of individual cells and have an increasing impact on biomedical research. However, it is known that these methods sometimes wrongly consider two or more cells as single cells, and that a number of so-called doublets is present in the output of such experiments. Treating doublets as single cells in downstream analyses can severely bias a study’s conclusions, and therefore computational strategies for the identification of doublets are needed. Results With scds, we propose two new approaches for in silico doublet identification: Co-expression based doublet scoring (cxds) and binary classification based doublet scoring (bcds). The co-expression based approach, cxds, utilizes binarized (absence/presence) gene expression data and, employing a binomial model for the co-expression of pairs of genes, yields interpretable doublet annotations. bcds, on the other hand, uses a binary classification approach to discriminate artificial doublets from original data. We apply our methods and existing computational doublet identification approaches to four datasets with experimental doublet annotations and find that our methods perform at least as well as the state of the art, at comparably little computational cost. We observe appreciable differences between methods and across datasets and that no approach dominates all others. In summary, scds presents a scalable, competitive approach that allows for doublet annotation of datasets with thousands of cells in a matter of seconds. Availability and implementation scds is implemented as a Bioconductor R package (doi: 10.18129/B9.bioc.scds). Supplementary information Supplementary data are available at Bioinformatics online.
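The idea behind bcds, training a binary classifier to separate artificial doublets from the original data, can be sketched as follows. This is only an illustration under stated assumptions, not the scds code: the helper names `simulate_doublets` and `make_training_set` are hypothetical, and forming an artificial doublet by summing the counts of two randomly chosen cells is an assumed simulation scheme. The classifier itself is omitted.

```python
import random


def simulate_doublets(counts, n_doublets, seed=0):
    """Create artificial doublets by summing the counts of two distinct cells.

    counts: list of per-cell count vectors (one list of gene counts per cell).
    Returns the simulated doublets and the index pairs they were built from.
    """
    rng = random.Random(seed)
    doublets, pairs = [], []
    for _ in range(n_doublets):
        i, j = rng.sample(range(len(counts)), 2)  # two different cells
        doublets.append([a + b for a, b in zip(counts[i], counts[j])])
        pairs.append((i, j))
    return doublets, pairs


def make_training_set(counts, n_doublets, seed=0):
    """Label original cells 0 and artificial doublets 1 for a binary classifier."""
    doublets, _ = simulate_doublets(counts, n_doublets, seed)
    features = counts + doublets
    labels = [0] * len(counts) + [1] * len(doublets)
    return features, labels


# Toy count matrix: three cells, three genes.
toy_counts = [[1, 0, 2], [0, 3, 1], [4, 0, 0]]
features, labels = make_training_set(toy_counts, n_doublets=2)
print(len(features), labels)
```

Any binary classifier could then be fitted on `features` and `labels`; its predicted probability of class 1 for each original cell would serve as a doublet score.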


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Fen Ma ◽  
Siwei Zhang ◽  
Lianhao Song ◽  
Bozhi Wang ◽  
Lanlan Wei ◽  
...  

Abstract Background Cellular communication is an essential feature of multicellular organisms. Binding of ligands to their homologous receptors, which activates specific cell signaling pathways, is a basic type of cellular communication and is intimately linked to many degeneration processes leading to diseases. Main body This study reviews the history of ligand-receptor research and presents the databases that store ligand-receptor pairs. Recent applications of, and research tools for, ligand-receptor interactions in cell communication at the single-cell level using single-cell RNA sequencing are summarized. Conclusion The summary of the advantages and disadvantages of the analysis tools will greatly help researchers analyze cell communication at the single-cell level. Learning cell communication based on ligand-receptor interactions by single-cell RNA sequencing paves the way for developing new targeted drugs and personalizing treatment.


2021 ◽  
Author(s):  
Michael E Nelson ◽  
Simone G Riva ◽  
Ann Cvejic

Spatial transcriptomics is revolutionising the study of single-cell RNA and tissue-wide cell heterogeneity, but few robust methods exist for connecting spatially resolved cells to the so-called marker genes from single-cell RNA sequencing, which underpin much of the insight gleaned from spatial methods. Here we present SMaSH, a general computational framework for extracting key marker genes from single-cell RNA sequencing data for spatial transcriptomics approaches. SMaSH extracts robust and biologically well-motivated marker genes, which characterise the given data set better than existing, more limited computational approaches for global marker gene calculation.


2019 ◽  
Vol 21 (Supplement_6) ◽  
pp. vi248-vi248
Author(s):  
Aaron Mochizuki ◽  
Alexander Lee ◽  
Joey Orpilla ◽  
Jenny Kienzler ◽  
Mildred Galvez ◽  
...  

Abstract INTRODUCTION Glioblastoma (GBM) is the most common malignant brain tumor in adults and is associated with a dismal prognosis. Neoadjuvant anti-PD-1 blockade has demonstrated efficacy in melanoma, non-small cell lung cancer and recurrent GBM; however, responses vary. While T cells have garnered considerable attention in the context of immunotherapy, the role of myeloid cells in the GBM microenvironment remains controversial. METHODS We isolated CD45+ immune populations from patients who underwent brain tumor resection at UCLA. We hypothesized that myeloid cells in glioblastoma contribute to T cell dysfunction; however, this immune suppression can be mitigated by neoadjuvant PD-1 inhibition. To test this, we utilized mass cytometry and single-cell RNA sequencing to characterize these immune populations. RESULTS Mass cytometry profiling of tumor infiltrating lymphocytes from patients with GBM demonstrated a preponderance of CD11b+ myeloid populations (75% versus 25% CD3+). At the transcriptomic level, myeloid cells in newly diagnosed GBMs exhibited decreased expression of CCL4 (log_e fold change -1.18, Bonferroni-adjusted P = 1.62 × 10^-254) and its ligands compared to anaplastic astrocytoma. In ranked gene set enrichment analysis, patients who received neoadjuvant pembrolizumab demonstrated enrichment in TNFα-, NFκB- and lipid metabolism-related gene sets by bootstrapped Kolmogorov-Smirnov test (Benjamini-Hochberg adjusted P = 4.74 × 10^-3, 1.45 × 10^-2 and 2.48 × 10^-3, respectively) in tumor-associated myeloid populations. Additionally, single-cell trajectory analysis demonstrated increased CCL4 and decreased ISG15 with neoadjuvant checkpoint inhibition. CONCLUSIONS Here, we utilize mass cytometry and single-cell RNA sequencing to demonstrate the predominance and transcriptomic features of myeloid populations in GBM. Myeloid cells in patients who receive neoadjuvant PD-1 blockade re-express increased levels of NFκB, TNFα and CCL4, a cytokine crucial for the recruitment of dendritic cells to the tumor for antigen-specific T cell activation. By delving into the GBM microenvironment at the single-cell level, we hope to better delineate the role of myeloid populations in this uniformly fatal tumor.


2021 ◽  
Author(s):  
Mohammad Lotfollahi ◽  
Leander Dony ◽  
Harshita Agarwala ◽  
Fabian J Theis

Learning robust representations can help uncover underlying biological variation in scRNA-seq data. Disentangled representation learning is one approach to obtaining such informative as well as interpretable representations. Here, we learn disentangled representations of scRNA-seq data using a β-variational autoencoder (β-VAE) and apply the model to out-of-distribution (OOD) prediction. We demonstrate accurate gene expression predictions for cell types absent from training in a perturbation and a developmental dataset. We further show that β-VAE outperforms a state-of-the-art disentanglement method for scRNA-seq in OOD prediction while achieving better disentanglement performance.
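For readers unfamiliar with the β-VAE, its objective is a reconstruction term plus a KL divergence term weighted by β. The following is a minimal sketch for a single cell, assuming a diagonal Gaussian encoder, a standard normal prior and a squared-error reconstruction loss; the function names and the default `beta=4.0` are illustrative choices, not values taken from the paper.

```python
import math


def kl_to_standard_normal(mu, logvar):
    """KL divergence between N(mu, diag(exp(logvar))) and N(0, I)."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))


def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Squared-error reconstruction loss plus beta-weighted KL term.

    With beta = 1 this reduces to the standard VAE objective under the same
    assumptions; beta > 1 puts more pressure on the latent code to match
    the prior, which is the mechanism meant to encourage disentanglement.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + beta * kl_to_standard_normal(mu, logvar)


# Toy example: one gene, one latent dimension.
print(beta_vae_loss([1.0], [0.0], mu=[1.0], logvar=[0.0], beta=2.0))
# → 2.0
```

In the toy call the reconstruction error is 1.0 and the KL term is 0.5, so the loss is 1.0 + 2.0 × 0.5 = 2.0.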


2017 ◽  
Author(s):  
Wenfa Ng

Single cell studies increasingly reveal myriad cellular subtypes beyond those postulated or observed through optical and fluorescence microscopy as well as DNA sequencing studies. While gene sequencing at the single cell level offers a path towards illuminating, in totality, the different subtypes of cells present, the technique nevertheless does not offer answers concerning the functional repertoire of the cell, which is defined by the collection of RNA transcribed from the genome. Known as the transcriptome, transcribed RNA defines the function of the cell as proteins or effector RNA molecules, while the genome is the collection of all information endowed in the cell type, expressed or not. Thus, a particular cell state, lineage, cell fate or cellular differentiation is more fully depicted by transcriptomic analysis than by delineating the genomic context at the single cell level. While the approach is conceptually sound and could be carried out with contemporary single cell RNA sequencing technology and data analysis pipelines, the relative instability of RNA in view of RNase in the environment makes sample preparation particularly challenging, as degradation of cellular RNA by extraneous factors could lead to a misinterpretation of the specific functions available to a cell type. Hence, because RNA is the de facto functional molecule of the cell, defining the proteomics landscape as well as the effector RNA repertoire, RNA transcriptomics at the single cell level is the way forward if the goal is to understand all available cell types, lineages, cell fates and cellular differentiation. Given that a cell state is defined by the functions encoded by functional molecules such as proteins and RNA, single cell RNA sequencing offers a larger contextual basis for understanding cellular decision making and functions; for example, proteins are increasingly known to work in concert with RNA effector molecules in enabling a function. Hence, while providing a view of the diverse cell types and lineages present in a body, single cell RNA sequencing is hampered only by the high sensitivity required to analyse the small amount of RNA available in single cells, as well as the perennial problem of RNA studies: how to prevent or reduce RNA degradation by environmental RNase enzymes. The ability to reduce RNA degradation would provide the cell biologist with a unique view of the functional landscape of different cells in the body through the language of RNA.


2020 ◽  
Vol 218 (1) ◽  
Author(s):  
Puneeth Guruprasad ◽  
Yong Gu Lee ◽  
Ki Hyun Kim ◽  
Marco Ruella

Immunotherapies such as immune checkpoint blockade and adoptive cell transfer have revolutionized cancer treatment, but further progress is hindered by our limited understanding of tumor resistance mechanisms. Emerging technologies now enable the study of tumors at the single-cell level, providing unprecedented high-resolution insights into the genetic makeup of the tumor microenvironment and immune system that bulk genomics cannot fully capture. Here, we highlight the recent key findings of the use of single-cell RNA sequencing to deconvolute heterogeneous tumors and immune populations during immunotherapy. Single-cell RNA sequencing has identified new crucial factors and cellular subpopulations that either promote tumor progression or leave tumors vulnerable to immunotherapy. We anticipate that the strategic use of single-cell analytics will promote the development of the next generation of successful, rationally designed immunotherapeutics.

