Reproducibility of Methods to Detect Differentially Expressed Genes from Single-Cell RNA Sequencing

2020 ◽  
Vol 10 ◽  
Author(s):  
Tian Mou ◽  
Wenjiang Deng ◽  
Fengyun Gu ◽  
Yudi Pawitan ◽  
Trung Nghia Vu


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Bobby Ranjan ◽  
Florian Schmidt ◽  
Wenjie Sun ◽  
Jinyu Park ◽  
Mohammad Amin Honardoost ◽  
...  

Abstract Background: Clustering is a crucial step in the analysis of single-cell data. Clusters identified in an unsupervised manner are typically annotated to cell types based on differentially expressed genes. In contrast, supervised methods use a reference panel of labelled transcriptomes to guide both clustering and cell type identification. Supervised and unsupervised clustering approaches have their distinct advantages and limitations. Therefore, they can lead to different but often complementary clustering results. Hence, a consensus approach leveraging the merits of both clustering paradigms could result in a more accurate clustering and a more precise cell type annotation. Results: We present scConsensus, an R framework for generating a consensus clustering by (1) integrating results from both unsupervised and supervised approaches and (2) refining the consensus clusters using differentially expressed genes. The value of our approach is demonstrated on several existing single-cell RNA sequencing datasets, including data from sorted PBMC sub-populations. Conclusions: scConsensus combines the merits of unsupervised and supervised approaches to partition cells with better cluster separation and homogeneity, thereby increasing our confidence in detecting distinct cell types. scConsensus is implemented in R and is freely available on GitHub at https://github.com/prabhakarlab/scConsensus.
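
Step (1) of the abstract above, integrating an unsupervised and a supervised labelling, can be illustrated with a small sketch. This is not the scConsensus implementation (which is written in R); it assumes, purely for illustration, that a consensus label is formed from each (unsupervised, supervised) label pair that covers at least `min_cells` cells, with rarer pairs falling back to the unsupervised label. The function name `consensus_labels`, the `min_cells` parameter, and the fallback rule are all hypothetical.

```python
from collections import Counter


def consensus_labels(unsupervised, supervised, min_cells=2):
    """Combine two per-cell labelings into one (illustrative assumption only).

    A cell gets the joint label "u|s" when its (u, s) pair is shared by at
    least `min_cells` cells; otherwise it keeps its unsupervised label.
    """
    if len(unsupervised) != len(supervised):
        raise ValueError("both labelings must cover the same cells")

    # Cross-tabulate the two labelings: how many cells fall in each pair.
    pair_counts = Counter(zip(unsupervised, supervised))

    consensus = []
    for u, s in zip(unsupervised, supervised):
        if pair_counts[(u, s)] >= min_cells:
            consensus.append(f"{u}|{s}")
        else:
            consensus.append(str(u))  # hypothetical fallback for rare pairs
    return consensus


# Toy data: six cells, two unsupervised clusters, two reference cell types.
unsup = ["c1", "c1", "c1", "c2", "c2", "c2"]
sup = ["T", "T", "B", "B", "B", "B"]
labels = consensus_labels(unsup, sup)
print(labels)
```

In this toy example the single cell labelled `c1` and `B` does not form its own consensus cluster, because only one cell carries that pair. The refinement step (2), which uses differentially expressed genes, is not covered by this sketch.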


2019 ◽  
Author(s):  
Florian Klimm ◽  
Enrique M. Toledo ◽  
Thomas Monfeuga ◽  
Fang Zhang ◽  
Charlotte M. Deane ◽  
...  

Abstract Recent advances in single-cell RNA sequencing (scRNA-seq) have allowed researchers to explore transcriptional function at a cellular level. In this study, we present scPPIN, a method for integrating single-cell RNA sequencing data with protein–protein interaction networks (PPINs) that detects active modules in cells of different transcriptional states. We achieve this by clustering RNA-sequencing data, identifying differentially expressed genes, constructing node-weighted PPINs, and finding the maximum-weight connected subgraphs with an exact Steiner-tree approach. As a case study, we investigate RNA-sequencing data from human liver spheroids, but the techniques described here are applicable to other organisms and tissues. scPPIN allows us to expand the output of differentially expressed gene analysis with information from protein interactions. We find that different transcriptional states have different significantly enriched subnetworks of the PPIN, which represent biological pathways. In these pathways, scPPIN also identifies proteins that are not differentially expressed but have a crucial biological function (e.g., as receptors) and therefore reveals biology beyond a standard differentially expressed gene analysis.


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Florian Klimm ◽  
Enrique M. Toledo ◽  
Thomas Monfeuga ◽  
Fang Zhang ◽  
Charlotte M. Deane ◽  
...  

Abstract Background: Recent advances in single-cell RNA sequencing have allowed researchers to explore transcriptional function at a cellular level. In particular, single-cell RNA sequencing reveals that there exist clusters of cells with similar gene expression profiles, representing different transcriptional states. Results: In this study, we present scPPIN, a method for integrating single-cell RNA sequencing data with protein–protein interaction networks that detects active modules in cells of different transcriptional states. We achieve this by clustering RNA-sequencing data, identifying differentially expressed genes, constructing node-weighted protein–protein interaction networks, and finding the maximum-weight connected subgraphs with an exact Steiner-tree approach. As case studies, we investigate two RNA-sequencing data sets from human liver spheroids and human adipose tissue, respectively. With scPPIN we expand the output of differentially expressed gene analysis with information from protein interactions. We find that different transcriptional states have different significantly enriched subnetworks of the protein–protein interaction networks, which represent biological pathways. In these pathways, scPPIN identifies proteins that are not differentially expressed but have a crucial biological function (e.g., as receptors) and therefore reveals biology beyond a standard differentially expressed gene analysis. Conclusions: The introduced scPPIN method can be used to systematically analyse differentially expressed genes in single-cell RNA sequencing data by integrating them with protein interaction data. The detected modules that characterise each cluster help to identify and hypothesise a biological function associated with those cells. Our analysis suggests the participation of unexpected proteins in these pathways that are undetectable from the single-cell RNA sequencing data alone. The techniques described here are applicable to other organisms and tissues.
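
The maximum-weight connected subgraph step named in the abstract above can be made concrete on a toy network. The sketch below does not use the exact Steiner-tree approach of scPPIN; it swaps in brute-force enumeration, which is only feasible for a handful of nodes. The node weights are invented for illustration (positive for a gene assumed to be differentially expressed, negative otherwise), and the function names are hypothetical.

```python
from itertools import combinations


def is_connected(nodes, adjacency):
    """Return True if `nodes` induce a connected subgraph (depth-first search)."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour in nodes and neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == nodes


def max_weight_connected_subgraph(weights, edges):
    """Brute-force search over all node subsets; exponential, toy sizes only."""
    adjacency = {node: set() for node in weights}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    best_nodes, best_score = frozenset(), float("-inf")
    for size in range(1, len(weights) + 1):
        for subset in combinations(sorted(weights), size):
            if not is_connected(subset, adjacency):
                continue
            score = sum(weights[node] for node in subset)
            if score > best_score:
                best_nodes, best_score = frozenset(subset), score
    return best_nodes, best_score


# Hypothetical path network A - B - C - D with made-up node weights.
weights = {"A": 2.0, "B": -0.5, "C": 1.5, "D": -3.0}
edges = [("A", "B"), ("B", "C"), ("C", "D")]
module, score = max_weight_connected_subgraph(weights, edges)
print(sorted(module), score)
```

Here node `B` carries a negative weight but is still part of the best module, because it connects the two positively weighted nodes `A` and `C`. This mirrors the abstract's point that a module can contain proteins that are not themselves differentially expressed.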


2021 ◽  
Author(s):  
Jiyeon Kim Denninger ◽  
Logan A Walker ◽  
Xi Chen ◽  
Altan M Turkoglu ◽  
Alexander Pan ◽  
...  

Multipotent neural stem cells (NSCs) are found in several isolated niches of the adult mammalian brain where they have unique potential to assist in tissue repair. Modern transcriptomics offers high-throughput methods for identifying disease or injury associated gene expression signatures in endogenous adult NSCs, but they require adaptation to accommodate the rarity of NSCs. Bulk RNA sequencing (RNAseq) of NSCs requires pooling several mice, which impedes application to labor-intensive injury models. Alternatively, single cell RNAseq can profile hundreds to thousands of cells from a single mouse and is increasingly used to study NSCs. The consequences of the low RNA input from a single NSC on downstream identification of differentially expressed genes (DEGs) remain largely unexplored. Here, to clarify the role that low RNA input plays in NSC DEG identification, we directly compared DEGs in an oxidative stress model of cultured NSCs by bulk and single cell sequencing. While both methods yielded DEGs that were replicable, single cell sequencing DEGs derived from genes with higher relative transcript counts compared to all detected genes and exhibited smaller fold changes than DEGs identified by bulk RNAseq. The loss of high fold-change DEGs in the single cell platform presents an important limitation for identifying disease-relevant genes. To facilitate identification of such genes, we determined an RNA-input threshold that enables transcriptional profiling of NSCs comparable to standard bulk sequencing and used it to establish a workflow for in vivo profiling of endogenous NSCs. We then applied this workflow to identify DEGs after lateral fluid percussion injury, a labor-intensive animal model of traumatic brain injury. Our work suggests that single cell RNA sequencing may underestimate the diversity of pathologic DEGs but population level transcriptomic analysis can be adapted to capture more of these DEGs with similar efficacy and diversity as standard bulk sequencing. Together, our data and workflow will be useful for investigators interested in understanding and manipulating adult hippocampal NSC responses to various stimuli.
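
For readers unfamiliar with the fold-change quantity compared in the abstract above, a minimal sketch of one common definition follows: the log2 ratio of mean counts between two conditions, with a pseudocount to avoid division by zero. The abstract does not state how fold changes were computed, so the formula, the pseudocount of 1, and the counts below are assumptions for illustration only.

```python
import math


def log2_fold_change(treated_counts, control_counts, pseudocount=1.0):
    """log2 ratio of mean counts between conditions (assumed definition)."""
    mean_treated = sum(treated_counts) / len(treated_counts)
    mean_control = sum(control_counts) / len(control_counts)
    # The pseudocount keeps the ratio finite when a mean is zero.
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


# Made-up counts for a single gene in three treated and three control samples.
lfc = log2_fold_change([30, 34, 29], [7, 6, 8])
print(lfc)
```

With these invented counts the treated mean is 31 and the control mean is 7, so the ratio after adding the pseudocount is 32 / 8 = 4, which is a log2 fold change of 2.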


2020 ◽  
Author(s):  
Bobby Ranjan ◽  
Florian Schmidt ◽  
Wenjie Sun ◽  
Jinyu Park ◽  
Mohammad Amin Honardoost ◽  
...  

Clustering is a crucial step in the analysis of single-cell data. Clusters identified using unsupervised clustering are typically annotated to cell types based on differentially expressed genes. In contrast, supervised methods use a reference panel of labelled transcriptomes to guide both clustering and cell type identification. Supervised and unsupervised clustering strategies have their distinct advantages and limitations. Therefore, they can lead to different but often complementary clustering results. Hence, a consensus approach leveraging the merits of both clustering paradigms could result in a more accurate clustering and a more precise cell type annotation. We present scConsensus, an R framework for generating a consensus clustering by (i) integrating the results from both unsupervised and supervised approaches and (ii) refining the consensus clusters using differentially expressed (DE) genes. The value of our approach is demonstrated on several existing single-cell RNA sequencing datasets, including data from sorted PBMC sub-populations. scConsensus is freely available on GitHub at https://github.com/prabhakarlab/scConsensus.

