Comprehensive evaluation of computational cell-type quantification methods for immuno-oncology


10.1101/463828 ◽

2018 ◽

Cited By ~ 4

Author(s):

Gregor Sturm ◽

Francesca Finotello ◽

Florent Petitprez ◽

Jitao David Zhang ◽

Jan Baumbach ◽

...

Keyword(s):

Tumor Microenvironment ◽

Single Cell ◽

Computational Methods ◽

Immune Cell ◽

Comprehensive Evaluation ◽

Supplementary Information ◽

RNA Seq ◽

Cell Type ◽


Real World Datasets

Abstract Motivation The composition and density of immune cells in the tumor microenvironment profoundly influence tumor progression and success of anti-cancer therapies. Flow cytometry, immunohistochemistry staining, or single-cell sequencing is often unavailable such that we rely on computational methods to estimate the immune-cell composition from bulk RNA-sequencing (RNA-seq) data. Various methods have been proposed recently, yet their capabilities and limitations have not been evaluated systematically. A general guideline leading the research community through cell type deconvolution is missing. Results We developed a systematic approach for benchmarking such computational methods and assessed the accuracy of tools at estimating nine different immune- and stromal cells from bulk RNA-seq samples. We used a single-cell RNA-seq dataset of ∼11,000 cells from the tumor microenvironment to simulate bulk samples of known cell type proportions, and validated the results using independent, publicly available gold-standard estimates. This allowed us to analyze and condense the results of more than a hundred thousand predictions to provide an exhaustive evaluation across seven computational methods over nine cell types and ∼1,800 samples from five simulated and real-world datasets. We demonstrate that computational deconvolution performs at high accuracy for well-defined cell-type signatures and propose how fuzzy cell-type signatures can be improved. We suggest that future efforts should be dedicated to refining cell population definitions and finding reliable signatures. Availability A snakemake pipeline to reproduce the benchmark is available at https://github.com/grst/immune_deconvolution_benchmark. An R package allows the community to perform integrated deconvolution using different methods (https://grst.github.io/immunedeconv). Contact [email protected] Supplementary information Supplementary data are available at Bioinformatics online.


Comprehensive evaluation of transcriptome-based cell-type quantification methods for immuno-oncology

Bioinformatics ◽

10.1093/bioinformatics/btz363 ◽

2019 ◽

Vol 35 (14) ◽

pp. i436-i445 ◽

Cited By ~ 71

Author(s):

Gregor Sturm ◽

Francesca Finotello ◽

Florent Petitprez ◽

Jitao David Zhang ◽

Jan Baumbach ◽

...

Keyword(s):

Single Cell ◽

Computational Methods ◽

Immune Cell ◽

Comprehensive Evaluation ◽

Cell Types ◽

R Package ◽

Supplementary Information ◽

RNA Seq ◽

Cell Type ◽

Real World Datasets

Abstract Motivation The composition and density of immune cells in the tumor microenvironment (TME) profoundly influence tumor progression and success of anti-cancer therapies. Flow cytometry, immunohistochemistry staining or single-cell sequencing are often unavailable such that we rely on computational methods to estimate the immune-cell composition from bulk RNA-sequencing (RNA-seq) data. Various methods have been proposed recently, yet their capabilities and limitations have not been evaluated systematically. A general guideline leading the research community through cell type deconvolution is missing. Results We developed a systematic approach for benchmarking such computational methods and assessed the accuracy of tools at estimating nine different immune- and stromal cells from bulk RNA-seq samples. We used a single-cell RNA-seq dataset of ∼11 000 cells from the TME to simulate bulk samples of known cell type proportions, and validated the results using independent, publicly available gold-standard estimates. This allowed us to analyze and condense the results of more than a hundred thousand predictions to provide an exhaustive evaluation across seven computational methods over nine cell types and ∼1800 samples from five simulated and real-world datasets. We demonstrate that computational deconvolution performs at high accuracy for well-defined cell-type signatures and propose how fuzzy cell-type signatures can be improved. We suggest that future efforts should be dedicated to refining cell population definitions and finding reliable signatures. Availability and implementation A snakemake pipeline to reproduce the benchmark is available at https://github.com/grst/immune_deconvolution_benchmark. An R package allows the community to perform integrated deconvolution using different methods (https://grst.github.io/immunedeconv). Supplementary information Supplementary data are available at Bioinformatics online.
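The simulation step described in the abstract (building bulk samples of known cell-type proportions from single-cell profiles) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `simulate_bulk`, the toy expression values, the sampling with replacement, and the averaging of counts are all assumptions made for illustration.

```python
import random


def simulate_bulk(cells, fractions, n_cells=500, seed=0):
    """Build one pseudo-bulk profile from single-cell profiles.

    cells:     dict mapping cell type -> list of expression vectors
    fractions: dict mapping cell type -> target fraction (should sum to 1)
    Returns (bulk_profile, realized_fractions).
    """
    rng = random.Random(seed)
    n_genes = len(next(iter(cells.values()))[0])
    bulk = [0.0] * n_genes
    drawn = {}
    for cell_type, frac in fractions.items():
        k = round(frac * n_cells)  # number of cells drawn for this type
        drawn[cell_type] = k
        for _ in range(k):
            profile = rng.choice(cells[cell_type])  # sample with replacement
            for g, count in enumerate(profile):
                bulk[g] += count
    total = sum(drawn.values())  # assumed > 0 in this sketch
    bulk = [value / total for value in bulk]  # average over drawn cells
    realized = {t: k / total for t, k in drawn.items()}
    return bulk, realized


# Toy data (hypothetical): two cell types, three genes.
toy_cells = {
    "T cell": [[10, 0, 1], [12, 0, 1]],
    "B cell": [[0, 8, 1], [0, 10, 1]],
}
bulk, truth = simulate_bulk(toy_cells, {"T cell": 0.25, "B cell": 0.75}, n_cells=4)
```

The realized fractions in `truth` serve as the ground truth against which a deconvolution method's estimates for `bulk` could be compared.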


Flexible comparison of batch correction methods for single-cell RNA-seq using BatchBench

10.1101/2020.05.22.111211 ◽

2020 ◽

Author(s):

Ruben Chazarra-Gil ◽

Stijn van Dongen ◽

Vladimir Yu Kiselev ◽

Martin Hemberg

Keyword(s):

Single Cell ◽

Computational Methods ◽

RNA Seq ◽

Batch Effects ◽

Systematic Comparison ◽

Batch Correction ◽


Biological Signals ◽

The Cost

Abstract As the cost of single-cell RNA-seq experiments has decreased, an increasing number of datasets are now available. Combining newly generated and publicly accessible datasets is challenging due to non-biological signals, commonly known as batch effects. Although there are several computational methods available that can remove batch effects, evaluating which method performs best is not straightforward. Here we present BatchBench (https://github.com/cellgeni/batchbench), a modular and flexible pipeline for comparing batch correction methods for single-cell RNA-seq data. We apply BatchBench to eight methods, highlighting their methodological differences, and assess their performance and computational requirements through a compendium of well-studied datasets. This systematic comparison guides users in the choice of batch correction tool, and the pipeline makes it easy to evaluate other datasets.


Probabilistic cell-type assignment of single-cell RNA-seq for tumor microenvironment profiling

Nature Methods ◽

10.1038/s41592-019-0529-1 ◽

2019 ◽

Vol 16 (10) ◽

pp. 1007-1015 ◽

Cited By ~ 46

Author(s):

Allen W. Zhang ◽

Ciara O’Flanagan ◽

Elizabeth A. Chavez ◽

Jamie L. P. Lim ◽

Nicholas Ceglia ◽

...

Keyword(s):

Tumor Microenvironment ◽

Single Cell ◽

RNA Seq ◽

Cell Type ◽

Type Assignment


Deciphering the evolution of vertebrate immune cell types with single-cell RNA-seq

10.7287/peerj.preprints.26858 ◽

2018 ◽

Author(s):

Santiago J Carmona ◽

David Gfeller

Keyword(s):

Single Cell ◽

Immune Cell ◽

Adaptive Immune System ◽

Cell Types ◽

RNA Seq ◽

Cell Type ◽

Recent Developments ◽

Species Specific ◽

Development And Validation ◽

Whole Transcriptome

Single-cell RNA-seq is revolutionizing our understanding of cell type heterogeneity in many fields of biology, ranging from neuroscience to cancer to immunology. In immunology, one of the main promises of this approach is the ability to define cell types as clusters in the whole transcriptome space (i.e., without relying on specific surface markers), thereby providing an unbiased classification of immune cell types. So far, this technology has been mainly applied in mouse and human. However, technically it could be used for immune cell-type identification in any species without requiring the development and validation of species-specific antibodies for cell sorting. Here we review recent developments using single-cell RNA-seq to characterize immune cell populations in non-mammalian vertebrates, with a focus on zebrafish (Danio rerio). We advocate that single-cell RNA-seq technology is likely to provide key insights into our understanding of the evolution of the adaptive immune system.


Deciphering the evolution of vertebrate immune cell types with single-cell RNA-seq

10.7287/peerj.preprints.26858v1 ◽

2018 ◽

Author(s):

Santiago J Carmona ◽

David Gfeller

Keyword(s):

Single Cell ◽

Immune Cell ◽

Adaptive Immune System ◽

Cell Types ◽

RNA Seq ◽

Cell Type ◽

Recent Developments ◽

Species Specific ◽

Development And Validation ◽

Whole Transcriptome


Differential transcript usage analysis of bulk and single-cell RNA-seq data with DTUrtle

Bioinformatics ◽

10.1093/bioinformatics/btab629 ◽

2021 ◽

Author(s):

Tobias Tekath ◽

Martin Dugas

Keyword(s):

Single Cell ◽

Transcript Level ◽

R Package ◽

Supplementary Information ◽

Data Sets ◽

RNA Seq ◽

Cell Type ◽

Gene Level ◽

Analysis Workflow ◽

Usage Analysis

Abstract Motivation Each year, the number of published bulk and single-cell RNA-seq data sets is growing exponentially. Studies analyzing such data are commonly looking at gene-level differences, while the collected RNA-seq data inherently represents reads of transcript isoform sequences. Utilizing transcriptomic quantifiers, RNA-seq reads can be attributed to specific isoforms, allowing for analysis of transcript-level differences. A differential transcript usage (DTU) analysis is testing for proportional differences in a gene’s transcript composition, and has been of rising interest for many research questions, such as analysis of differential splicing or cell type identification. Results We present the R package DTUrtle, the first DTU analysis workflow for both bulk and single-cell RNA-seq data sets, and the first package to conduct a ‘classical’ DTU analysis in a single-cell context. DTUrtle extends established statistical frameworks, offers various result aggregation and visualization options and a novel detection probability score for tagged-end data. It has been successfully applied to bulk and single-cell RNA-seq data of human and mouse, confirming and extending key results. Additionally, we present novel potential DTU applications like the identification of cell type specific transcript isoforms as biomarkers. Availability The R package DTUrtle is available at https://github.com/TobiTekath/DTUrtle with extensive vignettes and documentation at https://tobitekath.github.io/DTUrtle/. Supplementary information Supplementary data are available at Bioinformatics online.
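The notion of "proportional differences in a gene's transcript composition" can be made concrete with a small sketch. It only computes per-transcript usage proportions and their shift between two conditions. It does not reproduce DTUrtle's statistical testing, and the function names and toy counts are hypothetical.

```python
def transcript_proportions(counts):
    """Convert per-transcript counts of one gene into usage proportions."""
    total = sum(counts.values())
    if total == 0:
        return {tx: 0.0 for tx in counts}
    return {tx: c / total for tx, c in counts.items()}


def usage_shift(counts_a, counts_b):
    """Difference in transcript usage (condition B minus condition A)."""
    prop_a = transcript_proportions(counts_a)
    prop_b = transcript_proportions(counts_b)
    transcripts = sorted(set(prop_a) | set(prop_b))
    return {tx: prop_b.get(tx, 0.0) - prop_a.get(tx, 0.0) for tx in transcripts}


# Toy counts (hypothetical): one gene with two isoforms in two conditions.
shift = usage_shift({"tx1": 80, "tx2": 20}, {"tx1": 40, "tx2": 60})
```

In this toy case the gene's total expression is the same in both conditions, yet usage moves from `tx1` to `tx2`, which is the kind of change a gene-level analysis would miss.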


Agile workflow for interactive analysis of mass cytometry data

10.1101/2020.05.28.120527 ◽

2020 ◽

Author(s):

Julia Casado ◽

Oskari Lehtonen ◽

Ville Rantanen ◽

Katja Kaipio ◽

Luca Pasquini ◽

...

Keyword(s):

Single Cell ◽

Peripheral Blood ◽

Large Scale ◽

Immune Cell ◽

Single Cell Analysis ◽

Supplementary Information ◽

Mass Cytometry ◽

Interactive Analysis ◽


Cell Subpopulations

Abstract Motivation Single-cell proteomics technologies, such as mass cytometry, have enabled characterization of cell-to-cell variation and cell populations at a single cell resolution. These large amounts of data, however, require dedicated, interactive tools for translating the data into knowledge. Results We present a comprehensive, interactive method called Cyto to streamline analysis of large-scale cytometry data. Cyto is a workflow-based open-source solution that automatizes the use of state-of-the-art single-cell analysis methods with interactive visualization. We show the utility of Cyto by applying it to mass cytometry data from peripheral blood and high-grade serous ovarian cancer (HGSOC) samples. Our results show that Cyto is able to reliably capture the immune cell subpopulations from peripheral blood as well as cellular compositions of unique immune- and cancer cell subpopulations in HGSOC tumor and ascites samples. Availability The method is available as a Docker container at https://hub.docker.com/r/anduril/cyto and the user guide and source code are available at https://bitbucket.org/anduril-dev/[email protected] Supplementary information Supplementary material is available and FCS files are hosted at flowrepository.org/id/FR-FCM-Z2LW


ACTINN: automated identification of cell types in single cell RNA sequencing

Bioinformatics ◽

10.1093/bioinformatics/btz592 ◽

2019 ◽

Cited By ~ 7

Author(s):

Feiyang Ma ◽

Matteo Pellegrini

Keyword(s):

Neural Network ◽

Single Cell ◽

RNA Sequencing ◽

Immune Cell ◽

Cell Types ◽

Mouse Cell ◽

Supplementary Information ◽

Cell Type ◽

Human T Cell ◽

Single Cell RNA Sequencing

Abstract Motivation Cell type identification is one of the major goals in single cell RNA sequencing (scRNA-seq). Current methods for assigning cell types typically involve the use of unsupervised clustering, the identification of signature genes in each cluster, followed by a manual lookup of these genes in the literature and databases to assign cell types. However, there are several limitations associated with these approaches, such as unwanted sources of variation that influence clustering and a lack of canonical markers for certain cell types. Here, we present ACTINN (Automated Cell Type Identification using Neural Networks), which employs a neural network with three hidden layers, trains on datasets with predefined cell types and predicts cell types for other datasets based on the trained parameters. Results We trained the neural network on a mouse cell type atlas (Tabula Muris Atlas) and a human immune cell dataset, and used it to predict cell types for mouse leukocytes, human PBMCs and human T cell subtypes. The results showed that our neural network is fast and accurate, and should therefore be a useful tool to complement existing scRNA-seq pipelines. Availability and implementation The code and datasets are available at https://figshare.com/articles/ACTINN/8967116. A tutorial is available at https://github.com/mafeiyang/ACTINN. All code is implemented in Python. Supplementary information Supplementary data are available at Bioinformatics online.
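As a rough illustration of a classifier "with three hidden layers", the sketch below runs the forward pass of a small feed-forward network that maps an expression vector to cell-type probabilities. It is not ACTINN's code: the layer sizes, ReLU activations, random untrained weights, and all function names are assumptions chosen for brevity, and training is omitted.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (untrained)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Affine transform: one output value per row of the weight matrix."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    """Numerically stable softmax over the output scores."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def build_network(n_genes, hidden_sizes, n_types, seed=0):
    rng = random.Random(seed)
    sizes = [n_genes, *hidden_sizes, n_types]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict_proba(network, expression):
    """Hidden layers use ReLU; the final layer feeds a softmax."""
    x = expression
    for layer in network[:-1]:
        x = relu(dense(x, layer))
    return softmax(dense(x, network[-1]))


# Hypothetical sizes: 6 genes, hidden layers of 8, 4 and 3 units, 3 cell types.
network = build_network(n_genes=6, hidden_sizes=(8, 4, 3), n_types=3)
probs = predict_proba(network, [1.0, 0.0, 2.0, 0.0, 0.5, 3.0])
```

With trained weights, the index of the largest entry in `probs` would be taken as the predicted cell type. Here the weights are random, so the output only demonstrates the shape of the computation.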


scTIM: seeking cell-type-indicative marker from single cell RNA-seq data by consensus optimization

Bioinformatics ◽

10.1093/bioinformatics/btz936 ◽

2019 ◽

Vol 36 (8) ◽

pp. 2474-2485 ◽

Cited By ~ 2

Author(s):

Zhanying Feng ◽

Xianwen Ren ◽

Yuan Fang ◽

Yining Yin ◽

Chutian Huang ◽

...

Keyword(s):

Single Cell ◽

Large Scale ◽

Cell Types ◽

Mouse Cell ◽

Supplementary Information ◽

RNA Seq ◽

Cell Type ◽

Robust Solution ◽

Development Trajectory ◽

Consensus Optimization

Abstract Motivation Single cell RNA-seq data offer a new resource and resolution for studying cell type identity and its conversion. However, data analyses are challenging in dealing with noise, sparsity and poor annotation at single cell resolution. Detecting cell-type-indicative markers is a promising way to help denoising, clustering and cell type annotation. Results We developed a new method, scTIM, to reveal cell-type-indicative markers. scTIM is based on a multi-objective optimization framework to simultaneously maximize gene specificity by considering the gene–cell relationship, maximize genes' ability to reconstruct the cell–cell relationship and minimize gene redundancy by considering the gene–gene relationship. Furthermore, consensus optimization is introduced for a robust solution. Experimental results on three diverse single cell RNA-seq datasets show scTIM’s advantages in identifying cell types (clustering), annotating cell types and reconstructing cell development trajectory. Applying scTIM to the large-scale mouse cell atlas data identifies critical markers for 15 tissues as a ‘mouse cell marker atlas’, which allows us to investigate identities of different tissues and subtle cell types within a tissue. scTIM will serve as a useful method for single cell RNA-seq data mining. Availability and implementation scTIM is freely available at https://github.com/Frank-Orwell/scTIM. Supplementary information Supplementary data are available at Bioinformatics online.


Interpretable factor models of single-cell RNA-seq via variational autoencoders

Bioinformatics ◽

10.1093/bioinformatics/btaa169 ◽

2020 ◽

Vol 36 (11) ◽

pp. 3418-3421 ◽

Cited By ~ 9

Author(s):

Valentine Svensson ◽

Adam Gayoso ◽

Nir Yosef ◽

Lior Pachter

Keyword(s):

Single Cell ◽

Factor Model ◽

Factor Models ◽

Supplementary Information ◽

RNA Seq ◽

Cell Type ◽

Massive Datasets ◽

Domain Specific ◽

Variational Autoencoder ◽

Inference Methods

Abstract Motivation Single-cell RNA-seq makes possible the investigation of variability in gene expression among cells, and dependence of variation on cell type. Statistical inference methods for such analyses must be scalable, and ideally interpretable. Results We present an approach based on a modification of a recently published highly scalable variational autoencoder framework that provides interpretability without sacrificing much accuracy. We demonstrate that our approach enables identification of gene programs in massive datasets. Our strategy, namely the learning of factor models with the auto-encoding variational Bayes framework, is not domain specific and may be useful for other applications. Availability and implementation The factor model is available in the scVI package hosted at https://github.com/YosefLab/scVI/. Contact [email protected] Supplementary information Supplementary data are available at Bioinformatics online.
