Fast and accurate single-cell RNA-Seq analysis by clustering of transcript-compatibility counts

2016 ◽  
Author(s):  
Vasilis Ntranos ◽  
Govinda M. Kamath ◽  
Jesse Zhang ◽  
Lior Pachter ◽  
David N. Tse

Current approaches to single-cell transcriptomic analysis are computationally intensive and require assay-specific modeling, which limits their scope and generality. We propose a novel method that departs from standard analysis pipelines, comparing and clustering cells based not on their transcript or gene quantifications but on their transcript-compatibility read counts. In a re-analysis of two landmark yet disparate single-cell RNA-Seq datasets, we show that our method is up to two orders of magnitude faster than previous approaches, provides accurate and in some cases improved results, and is directly applicable to data from a wide variety of assays.
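
The idea of comparing cells by transcript-compatibility counts can be illustrated with a minimal sketch. It assumes each read has already been assigned a set of compatible transcripts. The function names are hypothetical, and the use of Jensen-Shannon divergence as the distance between cells is an assumption of this sketch, since the abstract does not name the distance or the clustering algorithm.

```python
from collections import Counter
from math import log2


def tcc_vector(read_compat):
    """Count reads per equivalence class (the set of transcripts a read is compatible with)."""
    return Counter(frozenset(c) for c in read_compat if c)


def normalize(counts):
    """Turn raw class counts into a probability distribution over classes."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two sparse distributions."""
    div = 0.0
    for k in set(p) | set(q):
        pk, qk = p.get(k, 0.0), q.get(k, 0.0)
        mk = 0.5 * (pk + qk)
        if pk > 0:
            div += 0.5 * pk * log2(pk / mk)
        if qk > 0:
            div += 0.5 * qk * log2(qk / mk)
    return div


# Toy data (hypothetical): each cell is a list of per-read compatible transcript sets.
cell_a = [{"t1", "t2"}, {"t1", "t2"}, {"t3"}]
cell_b = [{"t3"}, {"t1", "t2"}, {"t2", "t1"}]
cell_c = [{"t4"}]

dist_ab = js_divergence(normalize(tcc_vector(cell_a)), normalize(tcc_vector(cell_b)))
dist_ac = js_divergence(normalize(tcc_vector(cell_a)), normalize(tcc_vector(cell_c)))
```

No transcript or gene abundances are estimated at any point, which is where the speed advantage described above would come from. The resulting pairwise distances could be passed to any standard clustering routine.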

2018 ◽  
Author(s):  
Jong-Eun Park ◽  
Krzysztof Polański ◽  
Kerstin Meyer ◽  
Sarah A. Teichmann

Abstract: Increasing numbers of large-scale single-cell RNA-Seq projects are leading to a data explosion, which can only be fully exploited through data integration. Therefore, efficient computational tools for combining diverse datasets are crucial for biology in the single-cell genomics era. A number of methods have been developed to assist data integration by removing technical batch effects, but most are computationally intensive. To overcome the challenge of enormous datasets, we have developed BBKNN, an extremely fast graph-based data integration method. We illustrate the power of BBKNN for dimensionality-reduced visualisation and clustering in multiple biological scenarios, including a massive integrative study over several murine atlases. BBKNN successfully connects cell populations across experimentally heterogeneous mouse scRNA-Seq datasets, which reveals global markers of cell type and organ specificity and provides the foundation for inferring the underlying transcription factor network. BBKNN is available at https://github.com/Teichlab/bbknn.
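
A graph-based integration step of this kind can be sketched as follows. The sketch assumes the batch-balanced idea is to find each cell's nearest neighbours separately within every batch and then merge them into one graph. The abstract does not spell this out, so the function below is a hypothetical illustration and not the BBKNN package's API.

```python
from math import dist


def batch_balanced_neighbors(points, batches, k=1):
    """For each cell, collect its k nearest neighbours from every batch separately."""
    by_batch = {}
    for i, b in enumerate(batches):
        by_batch.setdefault(b, []).append(i)

    graph = {}
    for i, p in enumerate(points):
        neighbors = []
        for b in sorted(by_batch):
            # Rank the other cells of this batch by Euclidean distance to cell i.
            candidates = [j for j in by_batch[b] if j != i]
            candidates.sort(key=lambda j: dist(p, points[j]))
            neighbors.extend(candidates[:k])
        graph[i] = neighbors
    return graph


# Toy 2D embedding (hypothetical): two batches that each contain two cell groups.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (0.2, 0.0), (5.1, 5.0), (0.05, 0.0)]
batches = ["a", "a", "a", "b", "b", "b"]
graph = batch_balanced_neighbors(points, batches, k=1)
```

Because every cell is forced to have neighbours in each batch, similar cells from different experiments end up connected even when a batch effect shifts them apart. This brute-force version scales quadratically and is only meant to show the idea.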


2020 ◽  
Author(s):  
Xiaomei Li ◽  
Lin Liu ◽  
Greg Goodall ◽  
Andreas Schreiber ◽  
Taosheng Xu ◽  
...  

Abstract: Breast cancer prognosis is challenging due to the heterogeneity of the disease. Various computational methods using bulk RNA-seq data have been proposed for breast cancer prognosis. However, these methods suffer from limited performance or ambiguous biological relevance because they neglect intra-tumor heterogeneity. Recently, single-cell RNA sequencing (scRNA-seq) has emerged for studying tumor heterogeneity at the cellular level. In this paper, we propose a novel method, scPrognosis, to improve breast cancer prognosis with scRNA-seq data. scPrognosis uses the scRNA-seq data of the biological process Epithelial-to-Mesenchymal Transition (EMT). It first infers the EMT pseudotime and a dynamic gene co-expression network, then uses an integrative model to select genes important in EMT based on their expression variation and differentiation in different stages of EMT, and their roles in the dynamic gene co-expression network. To validate and apply the selected signatures to breast cancer prognosis, we use them as the features to build a prediction model with bulk RNA-seq data. The experimental results show that scPrognosis outperforms other benchmark breast cancer prognosis methods that use bulk RNA-seq data. Moreover, the dynamic changes in the expression of the selected signature genes in EMT may provide clues to the link between EMT and clinical outcomes of breast cancer. scPrognosis will also be useful when applied to scRNA-seq datasets of biological processes other than EMT.

Author summary: Various computational methods have been developed for breast cancer prognosis. However, those methods mainly use gene expression data generated by bulk RNA sequencing techniques, which average the expression level of a gene across different cell types. As breast cancer is a heterogeneous disease, bulk gene expression may not be the ideal resource for cancer prognosis.
In this study, we propose a novel method to improve breast cancer prognosis using scRNA-seq data. The proposed method has been applied to the EMT scRNA-seq dataset to identify breast cancer signatures for prognosis. In comparison with existing methods for breast cancer prognosis based on bulk expression data, our method shows better performance. Our single-cell-based signatures provide clues to the relation between EMT and clinical outcomes of breast cancer. In addition, the proposed method can also be useful when applied to scRNA-seq datasets of biological processes other than EMT.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Boying Gong ◽  
Yun Zhou ◽  
Elizabeth Purdom

Abstract: A growing number of single-cell sequencing platforms enable joint profiling of multiple omics from the same cells. We present , a novel method that not only allows for analyzing the data from joint-modality platforms, but provides a coherent framework for the integration of multiple datasets measured on different modalities. We demonstrate its performance on multi-modality data of gene expression and chromatin accessibility and illustrate the integration abilities of by jointly analyzing this multi-modality data with single-cell RNA-seq and ATAC-seq datasets.


2012 ◽  
Vol 241 (10) ◽  
pp. 1584-1590 ◽  
Author(s):  
Scott Brouilette ◽  
Scott Kuersten ◽  
Charles Mein ◽  
Monika Bozek ◽  
Anna Terry ◽  
...  

2021 ◽  
Author(s):  
Marmar Moussa ◽  
Ion Mandoiu

Single-cell RNA-Seq (scRNA-Seq) is critical for studying cellular function and phenotypic heterogeneity as well as the development of tissues and tumors. Here, we present SC1, a web-based, highly interactive scRNA-Seq data analysis tool publicly accessible at https://sc1.engr.uconn.edu. The tool presents an integrated workflow for scRNA-Seq analysis, implements a novel method of selecting informative genes based on Term-Frequency Inverse-Document-Frequency (TF-IDF) scores, and provides a broad range of methods for clustering, differential expression analysis, gene enrichment, interactive visualization, and cell cycle analysis. The tool integrates other single-cell omics data modalities such as TCR-Seq and supports several single-cell sequencing technologies. In just a few steps, researchers can generate a comprehensive analysis and gain powerful insights from their scRNA-Seq data.
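
TF-IDF scoring of genes can be illustrated with a minimal sketch that treats each cell as a document and each gene as a term. The abstract does not say how SC1 computes or aggregates the scores, so the textbook TF-IDF definition, the averaging over cells, and the function name below are all assumptions of this sketch.

```python
from math import log


def tfidf_gene_scores(counts):
    """counts maps cell -> {gene: read count}; returns gene -> mean TF-IDF over cells."""
    n_cells = len(counts)
    # Only genes detected in at least one cell, so document frequency is never zero.
    genes = sorted({g for cell in counts.values() for g, c in cell.items() if c > 0})
    doc_freq = {g: sum(1 for cell in counts.values() if cell.get(g, 0) > 0) for g in genes}

    scores = {}
    for g in genes:
        idf = log(n_cells / doc_freq[g])  # zero for genes detected in every cell
        total = 0.0
        for cell in counts.values():
            depth = sum(cell.values())
            tf = cell.get(g, 0) / depth if depth else 0.0
            total += tf * idf
        scores[g] = total / n_cells
    return scores


# Toy counts (hypothetical): ACTB is detected everywhere, CD3E in a single cell.
counts = {
    "cell1": {"ACTB": 10, "CD3E": 5},
    "cell2": {"ACTB": 12},
    "cell3": {"ACTB": 9, "MS4A1": 4},
}
scores = tfidf_gene_scores(counts)
```

Under this definition a gene detected in every cell receives a score of zero, while genes restricted to a subset of cells are ranked higher, which matches the intuition of selecting informative genes.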


2018 ◽  
Author(s):  
Paul A. Reyfman ◽  
James M. Walter ◽  
Nikita Joshi ◽  
Kishore R. Anekalla ◽  
Alexandra C. McQuattie-Pimentel ◽  
...  

Abstract: Pulmonary fibrosis is a devastating disorder that results in the progressive replacement of normal lung tissue with fibrotic scar. Available therapies slow disease progression, but most patients go on to die or require lung transplantation. Single-cell RNA-seq is a powerful tool that can reveal cellular identity via analysis of the transcriptome, but its ability to provide biologically or clinically meaningful insights in a disease context is largely unexplored. Accordingly, we performed single-cell RNA-seq on lung tissue obtained from eight transplant donors and eight recipients with pulmonary fibrosis and one bronchoscopic cryobiopsy sample. Integrated single-cell transcriptomic analysis of donors and patients with pulmonary fibrosis identified the emergence of distinct populations of epithelial cells and macrophages that were common to all patients with lung fibrosis. Analysis of transcripts in the Wnt pathway suggested that within the same cell type, Wnt secretion and response are restricted to distinct non-overlapping cells, which was confirmed using in situ RNA hybridization. Single-cell RNA-seq revealed heterogeneity within alveolar macrophages from individual patients, which was confirmed by immunohistochemistry. These results support the feasibility of discovery-based approaches applying next-generation sequencing technologies to clinically obtained samples with a goal of developing personalized therapies.

One Sentence Summary: Single-cell RNA-seq applied to tissue from diseased and donor lungs and a living patient with pulmonary fibrosis identifies cell type-specific disease-associated molecular pathways.


Author(s):  
Dylan Kotliar ◽  
Andrés Colubri

Abstract

Motivation: Visualizing two-dimensional embeddings (such as UMAP or tSNE) is a useful step in interrogating single-cell RNA sequencing (scRNA-Seq) data. Subsequently, users typically iterate between programmatic analyses (including clustering and differential expression) and visual exploration (e.g. coloring cells by interesting features) to uncover biological signals in the data. Interactive tools exist to facilitate visual exploration of embeddings such as performing differential expression on user-selected cells. However, the practical utility of these tools is limited because they don’t support rapid movement of data and results to and from the programming environments where most of the data analysis takes place, interrupting the iterative process.

Results: Here, we present the Single-cell Interactive Viewer (Sciviewer), a tool that overcomes this limitation by allowing interactive visual interrogation of embeddings from within Python. Beyond differential expression analysis of user-selected cells, Sciviewer implements a novel method to identify genes varying locally along any user-specified direction on the embedding. Sciviewer enables rapid and flexible iteration between interactive and programmatic modes of scRNA-Seq exploration, illustrating a useful approach for analyzing high-dimensional data.

Availability and implementation: Code and examples are provided at https://github.com/colabobio/sciviewer.
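
One way to picture the search for genes that vary along a chosen direction is sketched below. It assumes the selected cells are projected onto the direction vector and each gene's expression is then correlated with the projected position. Whether Sciviewer uses this exact statistic is not stated in the abstract, and all function names are hypothetical and not part of the Sciviewer API.

```python
from math import sqrt


def project(coords, direction):
    """Scalar position of each 2D point along the unit vector of `direction`."""
    norm = sqrt(direction[0] ** 2 + direction[1] ** 2)
    ux, uy = direction[0] / norm, direction[1] / norm
    return [x * ux + y * uy for x, y in coords]


def pearson(a, b):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def genes_along_direction(coords, expression, direction):
    """Rank genes by absolute correlation between expression and projected position."""
    position = project(coords, direction)
    scores = {gene: pearson(position, values) for gene, values in expression.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Toy selection (hypothetical): four cells on a line, three genes.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
expression = {"up": [0, 1, 2, 3], "flat": [1, 1, 1, 1], "down": [3, 2, 1, 0]}
ranked = genes_along_direction(coords, expression, direction=(1.0, 0.0))
```

Genes that rise or fall steadily along the drawn direction score near +1 or -1, while genes that are constant across the selection score zero.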


2018 ◽  
Author(s):  
Gökcen Eraslan ◽  
Lukas M. Simon ◽  
Maria Mircea ◽  
Nikola S. Mueller ◽  
Fabian J. Theis

Abstract: Single-cell RNA sequencing (scRNA-seq) has enabled researchers to study gene expression at a cellular resolution. However, noise due to amplification and dropout may obstruct analyses, so scalable denoising methods for increasingly large but sparse scRNA-seq data are needed. We propose a deep count autoencoder network (DCA) to denoise scRNA-seq datasets. DCA takes the count distribution, overdispersion and sparsity of the data into account using a zero-inflated negative binomial noise model, and nonlinear gene-gene or gene-dispersion interactions are captured. Our method scales linearly with the number of cells and can therefore be applied to datasets of millions of cells. We demonstrate that DCA denoising improves a diverse set of typical scRNA-seq data analyses using simulated and real datasets. DCA outperforms existing methods for data imputation in quality and speed, enhancing biological discovery.
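
The zero-inflated negative binomial noise model can be made concrete with a short sketch. It uses the common parameterization by mean `mu`, dispersion `theta` and dropout probability `pi`. That parameterization and the function names are assumptions of this sketch, and it covers only the probability model, not the autoencoder that would predict these parameters.

```python
from math import exp, lgamma, log


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial with mean mu and dispersion theta."""
    return (
        lgamma(x + theta) - lgamma(theta) - lgamma(x + 1)
        + theta * log(theta / (theta + mu))
        + x * log(mu / (theta + mu))
    )


def zinb_pmf(x, mu, theta, pi):
    """Zero-inflated NB: with probability pi the count is a dropout zero, otherwise it is NB."""
    nb = exp(nb_log_pmf(x, mu, theta))
    if x == 0:
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


# Illustrative parameter values (hypothetical).
mu, theta, pi = 2.0, 1.5, 0.3
p_zero_nb = exp(nb_log_pmf(0, mu, theta))
p_zero_zinb = zinb_pmf(0, mu, theta, pi)
total_mass = sum(zinb_pmf(x, mu, theta, pi) for x in range(500))
```

The extra mass at zero captures sparsity from dropout, and `theta` captures overdispersion: the NB variance is `mu + mu**2 / theta`. A denoising model of this type would typically be trained by minimizing the negative log of `zinb_pmf` over the observed counts.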


2021 ◽  
Author(s):  
Dylan Kotliar ◽  
Andres Colubri

Visualizing two-dimensional (2D) embeddings (e.g. UMAP or tSNE) is a key step in interrogating single-cell RNA sequencing (scRNA-Seq) data. Subsequently, users typically iterate between programmatic analyses (e.g. clustering and differential expression) and visual exploration (e.g. coloring cells by interesting features) to uncover biological signals in the data. Interactive tools exist to facilitate visual exploration of embeddings such as performing differential expression on user-selected cells. However, the practical utility of existing tools is limited because they do not support rapid movement of data and results to and from the programming environments where the bulk of data analysis takes place, interrupting the iterative process. Here, we present the Single-cell Interactive Viewer (Sciviewer), a tool that overcomes this limitation by allowing interactive visual interrogation of embeddings from within Python. Beyond differential expression analysis of user-selected cells, Sciviewer implements a novel method to identify genes varying locally along any user-specified direction on the embedding. Sciviewer enables rapid and flexible iteration between interactive and programmatic modes of scRNA-Seq exploration, illustrating a useful approach for analyzing high-dimensional data.

