Single-Cell Transcriptomics Unveils Gene Regulatory Network Plasticity

Mapping Intimacies ◽

10.1101/446104 ◽

2018 ◽

Cited By ~ 1

Author(s):

Giovanni Iacono ◽

Ramon Massoni-Badosa ◽

Holger Heyn

Keyword(s):

Single Cell ◽

Regulatory Network ◽

Regulatory Networks ◽

Large Scale ◽

Differential Expression Analysis ◽

Cellular Heterogeneity ◽

Computational Framework ◽

Holistic View ◽

Regulatory Changes ◽

Cell Data

SUMMARY: Single-cell RNA sequencing (scRNA-seq) plays a pivotal role in our understanding of cellular heterogeneity. Current analytical workflows are driven by categorizing principles that consider cells as individual entities and classify them into complex taxonomies. We have devised a conceptually different computational framework based on a holistic view, where single-cell datasets are used to infer global, large-scale regulatory networks. We developed correlation metrics that are specifically tailored to single-cell data, and then generated, validated and interpreted single-cell-derived regulatory networks from organs and perturbed systems, such as diabetes and Alzheimer’s disease. Using advanced tools from graph theory, we computed an unbiased quantification of a gene’s biological relevance, and accurately pinpointed key players in organ function and drivers of diseases. Our approach detected multiple latent regulatory changes that are invisible to single-cell workflows based on clustering or differential expression analysis. In summary, we have established the feasibility and value of regulatory network analysis using scRNA-seq datasets, which significantly broadens the biological insights that can be obtained with this leading technology.
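The summary above does not spell out the tailored correlation metrics or the graph-theory measures used, so the following is only a minimal sketch of the general idea: link genes whose expression is strongly correlated across cells, then rank genes by a network centrality. Plain Pearson correlation and degree centrality are swapped in as stand-ins, and the gene names, toy values and the 0.8 threshold are all hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sx * sy)


def correlation_network(expr, threshold=0.8):
    """Return an adjacency dict linking genes whose |correlation| >= threshold."""
    adj = {gene: set() for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            adj[g1].add(g2)
            adj[g2].add(g1)
    return adj


def degree_centrality(adj):
    """Fraction of the other genes each gene is connected to."""
    n = len(adj)
    return {gene: len(neigh) / (n - 1) for gene, neigh in adj.items()}


# Toy expression matrix: gene -> expression across six cells (hypothetical values).
expr = {
    "TF_A": [1, 2, 3, 4, 5, 6],
    "G1": [2, 4, 6, 8, 10, 12],   # rises with TF_A
    "G2": [6, 5, 4, 3, 2, 1],     # falls as TF_A rises
    "G3": [1, 3, 2, 3, 1, 2],     # unrelated to the others
}

adj = correlation_network(expr, threshold=0.8)
centrality = degree_centrality(adj)
for gene, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(gene, round(score, 2))
```

In this toy network the unrelated gene ends up with no edges and therefore the lowest centrality. A real analysis would need a correlation measure that copes with the sparsity of single-cell counts, which is the part the authors describe as specifically tailored.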


Leveraging high-powered RNA-Seq datasets to improve inference of regulatory activity in single-cell RNA-Seq data

10.1101/553040 ◽

2019 ◽

Cited By ~ 1

Author(s):

Ning Wang ◽

Andrew E. Teschendorff

Keyword(s):

Transcription Factors ◽

Single Cell ◽

Cell Fate ◽

Regulatory Networks ◽

Large Scale ◽

Single Cells ◽

Differential Expression Analysis ◽

Dropout Rate ◽

Rna Seq ◽

Regulatory Activity

Abstract: Inferring the activity of transcription factors in single cells is a key task to improve our understanding of development and complex genetic diseases. This task is, however, challenging due to the relatively large dropout rate and noisy nature of single-cell RNA-Seq data. Here we present a novel statistical inference framework called SCIRA (Single Cell Inference of Regulatory Activity), which leverages the power of large-scale bulk RNA-Seq datasets to infer high-quality tissue-specific regulatory networks, from which regulatory activity estimates in single cells can be subsequently obtained. We show that SCIRA can correctly infer regulatory activity of transcription factors affected by high technical dropouts. In particular, SCIRA can improve sensitivity by as much as 70% compared to differential expression analysis and current state-of-the-art methods. Importantly, SCIRA can reveal novel regulators of cell-fate in tissue-development, even for cell-types that only make up 5% of the tissue, and can identify key novel tumor suppressor genes in cancer at single cell resolution. In summary, SCIRA will be an invaluable tool for single-cell studies aiming to accurately map activity patterns of key transcription factors during development, and how these are altered in disease.
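To illustrate why a regulatory network can rescue a transcription factor whose own transcript is lost to dropout, here is a minimal sketch that scores each cell by the signed average of its z-scored regulon targets. This is not the SCIRA implementation; the scoring rule, the regulon and the toy values are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def zscore_rows(expr):
    """Z-score each gene across cells; constant genes become all zeros."""
    out = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        out[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return out


def regulon_activity(expr, regulon):
    """Score each cell as the signed mean of z-scored regulon targets.

    regulon maps target gene -> +1 (activated) or -1 (repressed).
    """
    z = zscore_rows(expr)
    n_cells = len(next(iter(expr.values())))
    targets = [g for g in regulon if g in z]
    return [mean(regulon[g] * z[g][c] for g in targets) for c in range(n_cells)]


# Toy data for four cells; the TF itself reads zero everywhere (dropout).
expr = {
    "TF": [0, 0, 0, 0],
    "T1": [5, 6, 1, 0],   # activated target
    "T2": [4, 5, 0, 1],   # activated target
    "T3": [0, 1, 5, 6],   # repressed target
}
regulon = {"T1": +1, "T2": +1, "T3": -1}  # hypothetical regulon for "TF"

activity = regulon_activity(expr, regulon)
print([round(a, 2) for a in activity])
```

Although the TF row carries no signal, the first two cells receive positive activity scores and the last two negative ones, because the score is driven entirely by the targets.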


EpiScanpy: integrated single-cell epigenomic analysis

10.1101/648097 ◽

2019 ◽

Cited By ~ 4

Author(s):

Anna Danese ◽

Maria L. Richter ◽

David S. Fischer ◽

Fabian J. Theis ◽

Maria Colomé-Tatché

Keyword(s):

Dna Methylation ◽

Single Cell ◽

Large Scale ◽

Feature Space ◽

Rna Seq ◽

Computational Framework ◽

Learning Techniques ◽

Multiple Feature ◽

The Many ◽

Cell Data

ABSTRACT: Epigenetic single-cell measurements reveal a layer of regulatory information not accessible to single-cell transcriptomics; however, single-cell omics analysis tools mainly focus on gene expression data. To address this issue, we present epiScanpy, a computational framework for the analysis of single-cell DNA methylation and single-cell ATAC-seq data. EpiScanpy makes the many existing RNA-seq workflows from scanpy available to large-scale single-cell data from other -omics modalities. We introduce and compare multiple feature space constructions for epigenetic data and show the feasibility of common clustering, dimension reduction and trajectory learning techniques. We benchmark epiScanpy by interrogating different single-cell mouse brain atlases of DNA methylation, ATAC-seq and transcriptomics. We find that differentially methylated and differentially open markers between cell clusters enrich transcriptome-based cell type labels by orthogonal epigenetic information.


PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells

Bioinformatics ◽

10.1093/bioinformatics/btaa042 ◽

2020 ◽

Vol 36 (9) ◽

pp. 2778-2786 ◽

Cited By ~ 5

Author(s):

Shobana V Stassen ◽

Dickson M D Siu ◽

Kelvin C M Lee ◽

Joshua W K Ho ◽

Hayden K H So ◽

...

Keyword(s):

Single Cell ◽

Large Scale ◽

Clustering Algorithm ◽

Single Cells ◽

Clustering Algorithms ◽

Cellular Heterogeneity ◽

Supplementary Information ◽

Phenotypic Data ◽

Scalable Algorithm ◽

Cell Data

Abstract. Motivation: New single-cell technologies continue to fuel the explosive growth in the scale of heterogeneous single-cell data. However, existing computational methods are inadequately scalable to large datasets and therefore cannot uncover the complex cellular heterogeneity. Results: We introduce a highly scalable graph-based clustering algorithm, PARC (Phenotyping by Accelerated Refined Community-partitioning), for large-scale, high-dimensional single-cell data (>1 million cells). Using large single-cell flow and mass cytometry, RNA-seq and imaging-based biophysical data, we demonstrate that PARC consistently outperforms state-of-the-art clustering algorithms without subsampling of cells, including Phenograph, FlowSOM and Flock, in terms of both speed and ability to robustly detect rare cell populations. For example, PARC can cluster a single-cell dataset of 1.1 million cells within 13 min, compared with >2 h for the next fastest graph-clustering algorithm. Our work presents a scalable algorithm to cope with increasingly large-scale single-cell analysis. Availability and implementation: https://github.com/ShobiStassen/PARC. Supplementary information: Supplementary data are available at Bioinformatics online.
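As a rough picture of what graph-based clustering involves, the sketch below builds a k-nearest-neighbour graph, prunes long edges, and groups the remaining graph into clusters. It is not the PARC algorithm: connected components stand in for community partitioning, the brute-force neighbour search would not scale to millions of cells, and the points, `k` and `max_dist` are hypothetical.

```python
from math import dist


def knn_graph(points, k):
    """Return undirected kNN edges as {(i, j): distance} with i < j."""
    edges = {}
    for i, p in enumerate(points):
        neighbours = sorted((dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in neighbours[:k]:
            edges[(min(i, j), max(i, j))] = d
    return edges


def prune(edges, max_dist):
    """Drop edges longer than max_dist (a simple stand-in for edge pruning)."""
    return {e: d for e, d in edges.items() if d <= max_dist}


def connected_components(n, edges):
    """Label each node with the smallest node index in its component (union-find)."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)
    return [find(i) for i in range(n)]


# Two well-separated groups of three cells each in a 2-D feature space.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
edges = prune(knn_graph(points, k=3), max_dist=2.0)
clusters = connected_components(len(points), edges)
print(clusters)
```

With `k=3` every cell is initially linked to one cell in the other group; pruning removes those long edges, which is what lets the two groups separate.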


PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells

10.1101/765628 ◽

2019 ◽

Author(s):

Shobana V. Stassen ◽

Dickson M. D. Siu ◽

Kelvin C. M. Lee ◽

Joshua W. K. Ho ◽

Hayden K. H. So ◽

...

Keyword(s):

Single Cell ◽

Large Scale ◽

Clustering Algorithm ◽

Single Cells ◽

Clustering Algorithms ◽

Cell Mass ◽

Cellular Heterogeneity ◽

Phenotypic Data ◽

Data Set ◽

Cell Data

Abstract. Motivation: New single-cell technologies continue to fuel the explosive growth in the scale of heterogeneous single-cell data. However, existing computational methods are inadequately scalable to large datasets and therefore cannot uncover the complex cellular heterogeneity. Results: We introduce a highly scalable graph-based clustering algorithm, PARC (phenotyping by accelerated refined community-partitioning), for ultralarge-scale, high-dimensional single-cell data (>1 million cells). Using large single-cell mass cytometry, RNA-seq and imaging-based biophysical data, we demonstrate that PARC consistently outperforms state-of-the-art clustering algorithms without sub-sampling of cells, including Phenograph, FlowSOM, and Flock, in terms of both speed and ability to robustly detect rare cell populations. For example, PARC can cluster a single-cell data set of 1.1M cells within 13 minutes, compared to >2 hours for the next fastest graph-clustering algorithm, Phenograph. Our work presents a scalable algorithm to cope with increasingly large-scale single-cell analysis. Availability and Implementation: https://github.com/ShobiStassen/PARC


Normalisr: normalization and association testing for single-cell CRISPR screen and co-expression

10.1101/2021.04.12.439500 ◽

2021 ◽

Author(s):

Lingfei Wang

Keyword(s):

Single Cell ◽

Regulatory Networks ◽

Large Scale ◽

High Sensitivity ◽

Statistical Hypothesis ◽

P Value ◽

Statistical Hypothesis Testing ◽

Experimental Conditions ◽

Library Size ◽

Association Testing

Abstract: Single-cell RNA sequencing (scRNA-seq) provides unprecedented technical and statistical potential to study gene regulation but is subject to technical variations and sparsity. Here we present Normalisr, a linear-model-based normalization and statistical hypothesis testing framework that unifies single-cell differential expression, co-expression, and CRISPR scRNA-seq screen analyses. By systematically detecting and removing nonlinear confounding from library size, Normalisr achieves high sensitivity, specificity, speed, and generalizability across multiple scRNA-seq protocols and experimental conditions with unbiased P-value estimation. We use Normalisr to reconstruct robust gene regulatory networks from trans-effects of gRNAs in large-scale CRISPRi scRNA-seq screens and gene-level co-expression networks from conventional scRNA-seq.
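The idea of removing library-size confounding with a linear model can be sketched as follows: log-transform the counts, fit a least-squares line against log library size, and keep the residuals for downstream testing. This is a deliberately simplified illustration rather than Normalisr itself: it uses a single linear covariate, whereas the abstract refers to nonlinear confounding, and the counts and the pseudocount of 1 are assumptions.

```python
from math import log
from statistics import mean


def residualize(y, x):
    """Return residuals of a simple least-squares fit y ~ a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx if sxx else 0.0
    a = my - b * mx
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Raw counts of one gene in five cells, and each cell's total counts (hypothetical).
counts = [2, 5, 7, 20, 12]
library_size = [1000, 2000, 4000, 8000, 3000]

log_expr = [log(c + 1) for c in counts]        # pseudocount avoids log(0)
log_lib = [log(s) for s in library_size]
resid = residualize(log_expr, log_lib)
print([round(r, 3) for r in resid])
```

By construction the residuals have zero mean and are uncorrelated with log library size, so any association later tested on them is not driven by that linear trend.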


SIN-KNO: A method of gene regulatory network inference using single-cell transcription and gene knockout data

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720019500355 ◽

2019 ◽

Vol 17 (06) ◽

pp. 1950035

Author(s):

Huiqing Wang ◽

Yuanyuan Lian ◽

Chun Li ◽

Yue Ma ◽

Zhiliang Yan ◽

...

Keyword(s):

Gene Expression ◽

Steady State ◽

Single Cell ◽

Gene Regulatory Network ◽

Regulatory Network ◽

Gene Knockout ◽

Cell Heterogeneity ◽

State Information ◽

Gene Regulatory ◽

Cell Data

As a tool for interpreting and analyzing genetic data, a gene regulatory network (GRN) can reveal regulatory relationships among genes, proteins, and small molecules, and can help explain physiological activities and functions within cells, pathway interactions, and how changes arise in the organism. Traditional GRN research analyzes regulatory relationships through gene expression averaged across cells, which makes it difficult to capture cell-to-cell heterogeneity in gene expression. Existing methods for inferring GRNs from single-cell transcriptional data lack expression information for genes at steady state, and the high dimensionality of single-cell data leads to high time and space complexity. To address these problems (missing cellular heterogeneity information, single-cell data complexity, and missing steady-state information), we propose a GRN inference method that uses single-cell transcription and gene knockout data, called SINgle-cell transcription data-KNOckout data (SIN-KNO), which combines the dynamic and steady-state information on regulatory relationships contained in gene expression. Capturing cell heterogeneity information helps reveal gene expression differences between cells, so gene expression changes can be observed more accurately. Gene knockout data show the steady-state expression levels of all other genes when one gene is knocked out. Classifying the genes before analyzing the single-cell data rules out a large number of non-existent regulatory relationships, greatly reducing the number that must be inferred. To demonstrate its efficiency, the proposed method was compared with several typical methods in this area, including GENIE3, JUMP3, and SINCERITIES. The evaluation results indicate that the proposed method can analyze the diverse information contained in the two types of data, build a more accurate gene regulatory network, and improve computational efficiency. The method offers a new approach to handling the large datasets and high computational complexity of single-cell data in GRN inference.


VoPo leverages cellular heterogeneity for predictive modeling of single-cell data

Nature Communications ◽

10.1038/s41467-020-17569-8 ◽

2020 ◽

Vol 11 (1) ◽

Cited By ~ 2

Author(s):

Natalie Stanley ◽

Ina A. Stelzer ◽

Amy S. Tsai ◽

Ramin Fallahzadeh ◽

Edward Ganio ◽

...

Keyword(s):

Single Cell ◽

Predictive Modeling ◽

Cellular Heterogeneity ◽

Cell Data


TISCH: a comprehensive web resource enabling interactive single-cell transcriptome visualization of tumor microenvironment

Nucleic Acids Research ◽

10.1093/nar/gkaa1020 ◽

2020 ◽

Vol 49 (D1) ◽

pp. D1420-D1430

Author(s):

Dongqing Sun ◽

Jin Wang ◽

Ya Han ◽

Xin Dong ◽

Jun Ge ◽

...

Keyword(s):

Gene Expression ◽

Tumor Microenvironment ◽

Single Cell ◽

Large Scale ◽

Differential Expression Analysis ◽

Enrichment Analysis ◽

Functional Enrichment ◽

Multiple Cancer ◽

Web Resource ◽

Cancer Types

Abstract: Cancer immunotherapy targeting co-inhibitory pathways by checkpoint blockade shows remarkable efficacy in a variety of cancer types. However, only a minority of patients respond to treatment due to the stochastic heterogeneity of the tumor microenvironment (TME). Recent advances in single-cell RNA-seq technologies enabled comprehensive characterization of the immune system heterogeneity in tumors but posed computational challenges in integrating and utilizing the massive published datasets to inform immunotherapy. Here, we present Tumor Immune Single Cell Hub (TISCH, http://tisch.comp-genomics.org), a large-scale curated database that integrates single-cell transcriptomic profiles of nearly 2 million cells from 76 high-quality tumor datasets across 27 cancer types. All the data were uniformly processed with a standardized workflow, including quality control, batch effect removal, clustering, cell-type annotation, malignant cell classification, differential expression analysis and functional enrichment analysis. TISCH provides interactive gene expression visualization across multiple datasets at the single-cell level or cluster level, allowing systematic comparison between different cell types, patients, tissue origins, treatment and response groups, and even different cancer types. In summary, TISCH provides a user-friendly interface for systematically visualizing, searching and downloading a gene expression atlas of the TME from multiple cancer types, enabling fast, flexible and comprehensive exploration of the TME.


SSCC: A Novel Computational Framework for Rapid and Accurate Clustering Large-scale Single Cell RNA-seq Data

Genomics Proteomics & Bioinformatics ◽

10.1016/j.gpb.2018.10.003 ◽

2019 ◽

Vol 17 (2) ◽

pp. 201-210 ◽

Cited By ~ 5

Author(s):

Xianwen Ren ◽

Liangtao Zheng ◽

Zemin Zhang

Keyword(s):

Single Cell ◽

Large Scale ◽

Rna Seq ◽

Computational Framework


A multiple genomic data fused SF2 prediction model, signature identification, and gene regulatory network inference for personalized radiotherapy

Technology in Cancer Research & Treatment ◽

10.1177/1533033820909112 ◽

2020 ◽

Vol 19 ◽

pp. 153303382090911

Author(s):

Qi-en He ◽

Yi-fan Tong ◽

Zhou Ye ◽

Li-xia Gao ◽

Yi-zhi Zhang ◽

...

Keyword(s):

Prediction Model ◽

Regulatory Network ◽

Regulatory Networks ◽

Large Scale ◽

Network Inference ◽

Treatment Options ◽

Genomic Data ◽

Full Potential ◽

Gene Regulatory Network Inference ◽

Signature Genes

Radiotherapy is one of the most important cancer treatments, but its response varies greatly among individual patients. Therefore, the prediction of radiosensitivity, identification of potential signature genes, and inference of their regulatory networks are important for clinical and oncological reasons. Here, we proposed a novel multiple genomic fused partial least squares deep regression method to simultaneously analyze multi-genomic data. Using 60 National Cancer Institute cell lines as examples, we aimed to identify signature genes by optimizing the radiosensitivity prediction model and uncovering regulatory relationships. A total of 113 signature genes were selected from more than 20,000 genes. The root mean square error of the model was only 0.0025, which was much lower than previously published results, suggesting that our method can predict radiosensitivity with the highest accuracy. Additionally, our regulatory network analysis identified 24 highly important ‘hub’ genes. The data analysis workflow we propose provides a unified computational framework to harness the full potential of large-scale integrated cancer genomic data for integrative signature discovery. Furthermore, the regression model, signature genes, and their regulatory network should provide a reliable quantitative reference for optimizing personalized treatment options, and may aid our understanding of cancer progression mechanisms.
