Inference and effects of barcode multiplets in droplet-based single-cell assays

2019 ◽  
Author(s):  
Caleb A. Lareau ◽  
Sai Ma ◽  
Fabiana M. Duarte ◽  
Jason D. Buenrostro

Abstract: A widespread assumption for single-cell analyses specifies that one cell’s nucleic acids are predominantly captured by one oligonucleotide barcode. However, we show that ∼13-21% of cell barcodes from the 10x Chromium scATAC-seq assay may have been derived from a droplet with more than one oligonucleotide sequence, which we call “barcode multiplets”. We demonstrate that barcode multiplets can be derived from at least two different sources. First, we confirm that ∼4% of droplets from the 10x platform may contain multiple beads. Additionally, we find that ∼5-7% of beads may contain multiple oligonucleotide barcodes. We show that this artifact can confound single-cell analyses, including the interpretation of clonal diversity and proliferation of intra-tumor lymphocytes. Overall, our work provides a conceptual and computational framework to identify and assess the impacts of barcode multiplets in single-cell data.

2019 ◽  
Author(s):  
Anna Danese ◽  
Maria L. Richter ◽  
David S. Fischer ◽  
Fabian J. Theis ◽  
Maria Colomé-Tatché

Abstract: Epigenetic single-cell measurements reveal a layer of regulatory information not accessible to single-cell transcriptomics; however, single-cell omics analysis tools mainly focus on gene expression data. To address this issue, we present epiScanpy, a computational framework for the analysis of single-cell DNA methylation and single-cell ATAC-seq data. EpiScanpy makes the many existing RNA-seq workflows from scanpy available to large-scale single-cell data from other -omics modalities. We introduce and compare multiple feature space constructions for epigenetic data and show the feasibility of common clustering, dimension reduction and trajectory learning techniques. We benchmark epiScanpy by interrogating different single-cell mouse brain atlases of DNA methylation, ATAC-seq and transcriptomics. We find that differentially methylated and differentially open markers between cell clusters enrich transcriptome-based cell type labels by orthogonal epigenetic information.


2021 ◽  
Author(s):  
Belinda Phipson ◽  
Choon Boon Sim ◽  
Enzo R. Porrello ◽  
Alex W Hewitt ◽  
Joseph Powell ◽  
...  

Single cell RNA sequencing (scRNA-seq) has rapidly gained popularity over the last few years for profiling the transcriptomes of thousands to millions of single cells. To date, more than a thousand software packages have been developed to analyse scRNA-seq data. These focus predominantly on visualization, dimensionality reduction and cell type identification. Single cell technology is now being used to analyse experiments with complex designs including biological replication. One question that can be asked of single cell experiments, and that could not be addressed with bulk RNA-seq data, is whether the cell type proportions differ between two or more experimental conditions. As well as gene expression changes, the relative depletion or enrichment of a particular cell type can be the functional consequence of disease or treatment. However, cell type proportion estimates from scRNA-seq data are variable, and statistical methods that correctly account for the different sources of variability are needed to confidently identify statistically significant shifts in cell type composition between experimental conditions. We present propeller, a robust and flexible method that leverages biological replication to find statistically significant differences in cell type proportions between groups. The propeller method is publicly available in the open source speckle R package (https://github.com/Oshlack/speckle).
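The question posed in this abstract, whether a cell type's proportion differs between conditions once biological replicates are taken into account, can be illustrated with a small sketch. This is not the propeller implementation and does not use the speckle API. It is a simplified stand-in that assumes per-replicate proportions, an arcsine square-root transform, and a plain Welch t statistic. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, variance


def sample_proportion(cell_labels, cell_type):
    """Fraction of cells in one replicate that carry the given label."""
    counts = Counter(cell_labels)
    return counts[cell_type] / len(cell_labels)


def arcsin_sqrt(p):
    """Variance-stabilising transform for a proportion in [0, 1]."""
    return math.asin(math.sqrt(p))


def welch_t(group_a, group_b):
    """Welch t statistic comparing two lists of per-replicate values."""
    se2 = variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    return (mean(group_a) - mean(group_b)) / math.sqrt(se2)


# Hypothetical toy data: each inner list holds the cell type labels
# of one biological replicate (100 cells each).
control = [
    ["T"] * 30 + ["B"] * 70,
    ["T"] * 35 + ["B"] * 65,
    ["T"] * 28 + ["B"] * 72,
]
treated = [
    ["T"] * 55 + ["B"] * 45,
    ["T"] * 60 + ["B"] * 40,
    ["T"] * 52 + ["B"] * 48,
]

# One proportion per replicate, so replicate-to-replicate variability
# enters the test instead of being hidden by pooling all cells.
control_props = [sample_proportion(s, "T") for s in control]
treated_props = [sample_proportion(s, "T") for s in treated]

t_stat = welch_t(
    [arcsin_sqrt(p) for p in treated_props],
    [arcsin_sqrt(p) for p in control_props],
)
print("control proportions:", control_props)
print("treated proportions:", treated_props)
print("Welch t statistic (treated vs control):", t_stat)
```

Treating each replicate as a single observation is the point of the sketch: pooling all cells per condition would ignore the between-sample variability that the abstract identifies as the main difficulty.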


2021 ◽  
Author(s):  
Zhi-Jie Cao ◽  
Ge Gao

With the ever-increasing amount of single-cell multi-omics data accumulated during the past years, effective and efficient computational integration is becoming a serious challenge. One major obstacle to unpaired multi-omics integration is the discrepancy in features among omics layers. Here, we propose a computational framework called GLUE (graph-linked unified embedding), which utilizes accessible prior knowledge about regulatory interactions to bridge the gaps between feature spaces. Systematic benchmarks demonstrated that GLUE is accurate, robust and scalable. We further employed GLUE for various challenging tasks, including triple-omics integration, model-based regulatory inference and multi-omics human cell atlas construction (over millions of cells) and found that GLUE achieved superior performance for each task. As a generalizable framework, GLUE features a modular design that can be flexibly extended and enhanced for new analysis tasks. The full package is available online at https://github.com/gao-lab/GLUE for the community.


2021 ◽  
Vol 12 ◽  
Author(s):  
David F. Stein ◽  
Huidong Chen ◽  
Michael E. Vinyard ◽  
Qian Qin ◽  
Rebecca D. Combs ◽  
...  

Single-cell assays have transformed our ability to model heterogeneity within cell populations. As these assays have advanced in their ability to measure various aspects of molecular processes in cells, computational methods to analyze and meaningfully visualize such data have required matched innovation. Independently, Virtual Reality (VR) has recently emerged as a powerful technology to dynamically explore complex data and shows promise for adaptation to challenges in single-cell data visualization. However, adopting VR for single-cell data visualization has thus far been hindered by the need for expensive hardware or advanced data preprocessing skills. To address current shortcomings, we present singlecellVR, a user-friendly web application for visualizing single-cell data, designed for cheap and easily available virtual reality hardware (e.g., Google Cardboard, ∼$8). singlecellVR can visualize data from a variety of sequencing-based technologies including transcriptomic, epigenomic, and proteomic data as well as combinations thereof. Analysis modalities supported include approaches to clustering as well as trajectory inference and visualization of dynamical changes discovered through modelling RNA velocity. We provide a companion software package, scvr, to streamline data conversion from the most widely-adopted single-cell analysis tools, as well as a growing database of pre-analyzed datasets to which users can contribute.


2021 ◽  
Author(s):  
Mark S Keller ◽  
Ilan Gold ◽  
Chuck McCallum ◽  
Trevor Manz ◽  
Peter V Kharchenko ◽  
...  

Vitessce is an open-source interactive visualization framework for exploration of multi-modal and spatially-resolved single-cell data, with a modular architecture compatible with transcriptomic, proteomic, genome-mapped, and imaging data types. Its modular, coordinated multiple view implementation facilitates a wide range of visualization tasks to support all common single-cell assays. Vitessce is a client-side web application designed to be integrated with computational analysis tools and data resources and does not require specialized server infrastructure. The software is available at http://vitessce.io.


2020 ◽  
Author(s):  
David F. Stein ◽  
Huidong Chen ◽  
Michael E. Vinyard ◽  
Luca Pinello

Abstract: Single-cell assays have transformed our ability to model heterogeneity within cell populations and tissues. Virtual Reality (VR) has recently emerged as a powerful technology to dynamically explore complex data. However, expensive hardware or advanced data preprocessing skills are required to adapt such technology to single-cell data. To address current shortcomings, we present singlecellVR, a user-friendly website for visualizing single-cell data, designed for cheap and easily available virtual reality hardware (e.g., Google Cardboard, ∼$8). We provide a companion package, scvr, to streamline data conversion from the most widely-adopted single-cell analysis tools, and a database of pre-analyzed datasets to which users can contribute.


2021 ◽  
Author(s):  
Yidi Deng ◽  
Jarny Choi ◽  
Kim-Anh Le Cao

Characterizing the molecular identity of a cell is an essential step in single cell RNA-sequencing (scRNA-seq) data analysis. Numerous tools exist for predicting cell identity using single cell reference atlases. However, many challenges remain, including correcting for inherent batch effects between reference and query data and insufficient phenotype data from the reference. One solution is to project single cell data onto established bulk reference atlases to leverage their rich phenotype information. Sincast is a computational framework to query scRNA-seq data based on bulk reference atlases. Prior to projection, single cell data are transformed to be directly comparable to bulk data, either with pseudo-bulk aggregation or graph-based imputation to address sparse single cell expression profiles. Sincast avoids batch effect correction, and cell identity is predicted along a continuum to highlight new cell states not found in the reference atlas. In several case study scenarios, we show that Sincast projects single cells into the correct biological niches in the expression space of the bulk reference atlas. We demonstrate the effectiveness of our imputation approach that was specifically developed for querying scRNA-seq data based on bulk reference atlases. We show that Sincast is an efficient and powerful tool for single cell profiling that will facilitate downstream analysis of scRNA-seq data.


2018 ◽  
Author(s):  
Giovanni Iacono ◽  
Ramon Massoni-Badosa ◽  
Holger Heyn

Summary: Single-cell RNA sequencing (scRNA-seq) plays a pivotal role in our understanding of cellular heterogeneity. Current analytical workflows are driven by categorizing principles that consider cells as individual entities and classify them into complex taxonomies. We have devised a conceptually different computational framework based on a holistic view, where single-cell datasets are used to infer global, large-scale regulatory networks. We developed correlation metrics that are specifically tailored to single-cell data, and then generated, validated and interpreted single-cell-derived regulatory networks from organs and perturbed systems, such as diabetes and Alzheimer’s disease. Using advanced tools from graph theory, we computed an unbiased quantification of a gene’s biological relevance, and accurately pinpointed key players in organ function and drivers of diseases. Our approach detected multiple latent regulatory changes that are invisible to single-cell workflows based on clustering or differential expression analysis. In summary, we have established the feasibility and value of regulatory network analysis using scRNA-seq datasets, which significantly broadens the biological insights that can be obtained with this leading technology.


2021 ◽  
Author(s):  
Jordan W. Squair ◽  
Michael A. Skinnider ◽  
Matthieu Gautier ◽  
Leonard J. Foster ◽  
Grégoire Courtine