scBASE: A Bayesian mixture model for the analysis of allelic expression in single cells

2018 ◽  
Author(s):  
Kwangbom Choi ◽  
Narayanan Raghupathy ◽  
Gary A. Churchill

Allele-specific expression (ASE) at single-cell resolution is a critical tool for understanding the stochastic and dynamic features of gene expression. However, low read coverage and high biological variability present challenges for analyzing ASE. We propose a new method for ASE analysis from single cell RNA-Seq data that accurately classifies allelic expression states and improves estimation of allelic proportions by pooling information across cells.

2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Kwangbom Choi ◽  
Narayanan Raghupathy ◽  
Gary A. Churchill

Allele-specific expression (ASE) at single-cell resolution is a critical tool for understanding the stochastic and dynamic features of gene expression. However, low read coverage and high biological variability present challenges for analyzing ASE. We demonstrate that discarding multi-mapping reads leads to higher variability in estimates of allelic proportions, increases the frequency of sampling zeros, and can lead to spurious findings of dynamic and monoallelic gene expression. Here, we report a method for ASE analysis from single-cell RNA-Seq data that accurately classifies allelic expression states and improves estimation of allelic proportions by pooling information across cells. We further demonstrate that combining information across cells using a hierarchical mixture model reduces sampling variability without sacrificing cell-to-cell heterogeneity. We applied our approach to re-evaluate the statistical independence of allelic bursting and track changes in the allele-specific expression patterns of cells sampled over a developmental time course.
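The idea of pooling information across cells can be illustrated with a much simpler stand-in than the hierarchical mixture model named in the abstract: a single Beta prior shared by all cells, updated with each cell's allele counts. This is a sketch only and is not the scBASE model; the function name, the prior parameters, and the read counts are all assumptions made for illustration.

```python
def shrunken_allelic_proportion(allele_a_reads, total_reads, prior_a=1.0, prior_b=1.0):
    """Posterior mean of the allele-A proportion under a Beta(prior_a, prior_b) prior.

    Hypothetical helper: the shared prior stands in for information pooled
    across cells. It is NOT the mixture model described in the abstract.
    """
    if total_reads < 0 or not 0 <= allele_a_reads <= total_reads:
        raise ValueError("need 0 <= allele_a_reads <= total_reads")
    return (allele_a_reads + prior_a) / (total_reads + prior_a + prior_b)


# Made-up cells: (allele-A reads, total reads) for one gene.
cells = [(0, 2), (3, 3), (48, 100)]
# Assumed prior, as if estimated from all cells (roughly balanced expression).
PRIOR_A, PRIOR_B = 5.0, 5.0

for a_reads, total in cells:
    raw = a_reads / total
    pooled = shrunken_allelic_proportion(a_reads, total, PRIOR_A, PRIOR_B)
    print(f"reads={a_reads}/{total}  raw={raw:.3f}  pooled={pooled:.3f}")
```

In this toy setting, low-coverage cells (0/2 and 3/3) are pulled strongly toward the prior mean, while the well-covered cell (48/100) barely moves, which is the qualitative behavior the abstract attributes to pooling.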


Genes ◽  
2020 ◽  
Vol 11 (3) ◽  
pp. 240 ◽  
Author(s):  
Prashant N. M. ◽  
Hongyu Liu ◽  
Pavlos Bousounis ◽  
Liam Spurr ◽  
Nawaf Alomran ◽  
...  

With the recent advances in single-cell RNA-sequencing (scRNA-seq) technologies, the estimation of allele expression from single cells is becoming increasingly reliable. Allele expression is both quantitative and dynamic and is an essential component of the genomic interactome. Here, we systematically estimate the allele expression from heterozygous single nucleotide variant (SNV) loci using scRNA-seq data generated on the 10x Genomics Chromium platform. We analyzed 26,640 human adipose-derived mesenchymal stem cells (from three healthy donors), sequenced to an average of 150K sequencing reads per cell (more than 4 billion scRNA-seq reads in total). High-quality SNV calls assessed in our study contained approximately 15% exonic and >50% intronic loci. To analyze the allele expression, we estimated the expressed variant allele fraction (VAF_RNA) from SNV-aware alignments and analyzed its variance and distribution (mono- and bi-allelic) at different minimum sequencing read thresholds. Our analysis shows that when assessing positions covered by a minimum of three unique sequencing reads, over 50% of the heterozygous SNVs show bi-allelic expression, while at a threshold of 10 reads, nearly 90% of the SNVs are bi-allelic. In addition, our analysis demonstrates the feasibility of scVAF_RNA estimation from current scRNA-seq datasets and shows that the 3′-based library generation protocol of 10x Genomics scRNA-seq data can be informative in SNV-based studies, including analyses of transcriptional kinetics.
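The expressed variant allele fraction and its dependence on a minimum read threshold can be sketched as follows. The function names, the rule that a fraction of exactly 0 or 1 counts as mono-allelic, and the read counts are assumptions for illustration; the abstract does not give the authors' exact classification criteria.

```python
def vaf_rna(variant_reads, reference_reads, min_reads=3):
    """Variant allele fraction at one SNV in one cell.

    Returns None when coverage is below min_reads (position not assessed).
    """
    total = variant_reads + reference_reads
    if total < min_reads:
        return None
    return variant_reads / total


def allelic_state(vaf):
    """Assumed rule: only one allele observed -> mono-allelic, else bi-allelic."""
    if vaf is None:
        return "not assessed"
    if vaf == 0.0 or vaf == 1.0:
        return "mono-allelic"
    return "bi-allelic"


# Made-up (variant, reference) read counts at heterozygous SNVs.
observations = [(0, 3), (2, 2), (4, 6), (1, 1)]
for threshold in (3, 10):
    states = [allelic_state(vaf_rna(v, r, threshold)) for v, r in observations]
    print(threshold, states)
```

With few reads, one allele can be missed by chance, so a low threshold tends to inflate apparent mono-allelic calls; this is consistent with the higher bi-allelic share the abstract reports at the 10-read threshold.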


2019 ◽  
Author(s):  
Charlotte A. Darby ◽  
Michael J. T. Stubbington ◽  
Patrick J. Marks ◽  
Álvaro Martínez Barrio ◽  
Ian T. Fiddes

Studies in bulk RNA sequencing data suggest cell-type and allele-specific expression of the human leukocyte antigen (HLA) genes. These loci are extremely diverse and they function as part of the major histocompatibility complex (MHC), which is responsible for antigen presentation. Mutation and/or misregulation of expression of HLA genes has implications in diseases, especially cancer. Immune responses to tumor cells can be evaded through HLA loss of function. However, bulk RNA-seq does not fully disentangle cell type specificity and allelic expression. Here we present scHLAcount, a workflow for computing allele-specific molecule counts of the HLA genes in single cells using an individualized reference. We demonstrate that scHLAcount can be used to find cell-type specific allelic expression of HLA genes in blood cells, and detect different allelic expression patterns between tumor and normal cells in patient biopsies. scHLAcount is available at https://github.com/10XGenomics/scHLAcount.


2018 ◽  
Author(s):  
Marco Garieri ◽  
Georgios Stamoulis ◽  
Emilie Falconnet ◽  
Pascale Ribaux ◽  
Christelle Borel ◽  
...  

In eutherian mammals, X chromosome inactivation (XCI) provides a dosage compensation mechanism where in each female cell one of the two X chromosomes is randomly silenced. However, some genes on the inactive X chromosome and outside the pseudoautosomal regions escape from XCI and are expressed from both alleles (escapees). Given the relevance of the escapees in biology and medicine, we investigated XCI at an unprecedented single-cell resolution. We combined deep single-cell RNA sequencing with whole genome sequencing to examine allele-specific expression (ASE) in 935 primary fibroblast and 48 lymphoblastoid single cells from five female individuals. In this framework we integrated an original method to identify and exclude doublets of cells. We have identified 55 genes as escapees including 5 novel escapee genes. Moreover, we observed that all genes exhibit a variable propensity to escape XCI in each cell and cell type, and that each cell displays a distinct expression profile of the escapee genes. We devised a novel metric, the Inactivation Score (IS), defined as the mean of the allelic expression profiles of the escapees per cell, and discovered a heterogeneous and continuous degree of cellular XCI with extremes represented by “inactive” cells, i.e., exclusively expressing the escaping genes from the active X chromosome, and “escaping” cells, expressing the escapees from both alleles. Intriguingly, we found that XIST is the major genetic determinant of IS, and that XIST expression, higher in G0 phase, is negatively correlated with the expression of escapees, inactivated and pseudoautosomal genes. In this study we use single-cell allele-specific expression to identify novel escapees in different tissues and provide evidence of an unexpected cellular heterogeneity of XCI driven by a possible regulatory activity of XIST.


2018 ◽  
Vol 115 (51) ◽  
pp. 13015-13020 ◽  
Author(s):  
Marco Garieri ◽  
Georgios Stamoulis ◽  
Xavier Blanc ◽  
Emilie Falconnet ◽  
Pascale Ribaux ◽  
...  

X-chromosome inactivation (XCI) provides a dosage compensation mechanism where, in each female cell, one of the two X chromosomes is randomly silenced. However, some genes on the inactive X chromosome and outside the pseudoautosomal regions escape from XCI and are expressed from both alleles (escapees). We investigated XCI at single-cell resolution combining deep single-cell RNA sequencing with whole-genome sequencing to examine allele-specific expression in 935 primary fibroblast and 48 lymphoblastoid single cells from five female individuals. In this framework we integrated an original method to identify and exclude doublets of cells. In fibroblast cells, we have identified 55 genes as escapees including five undescribed escapee genes. Moreover, we observed that all genes exhibit a variable propensity to escape XCI in each cell and cell type and that each cell displays a distinct expression profile of the escapee genes. A metric, the Inactivation Score, defined as the mean of the allelic expression profiles of the escapees per cell, enables us to discover a heterogeneous and continuous degree of cellular XCI with extremes represented by “inactive” cells, i.e., cells exclusively expressing the escaping genes from the active X chromosome, and “escaping” cells expressing the escapees from both alleles. We found that this effect is associated with cell-cycle phases and, independently, with the XIST expression level, which is higher in the quiescent phase (G0). Single-cell allele-specific expression is a powerful tool to identify novel escapees in different tissues and provides evidence of an unexpected cellular heterogeneity of XCI.
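The Inactivation Score is described as the mean of the per-cell allelic expression profiles of the escapees. A minimal sketch follows, assuming that each gene's allelic profile is the fraction of its reads that come from the active X chromosome. That definition of the per-gene ratio, the function name, and the gene labels and counts are assumptions; the abstract does not specify them.

```python
from statistics import mean


def inactivation_score(escapee_counts):
    """Mean active-X read fraction over escapee genes in one cell.

    escapee_counts maps gene -> (active_x_reads, inactive_x_reads).
    Genes with no reads in the cell are skipped; returns None if none remain.
    Under the assumed ratio, 1.0 means every escapee is expressed only
    from the active X; lower values mean more expression from both alleles.
    """
    ratios = [
        active / (active + inactive)
        for active, inactive in escapee_counts.values()
        if active + inactive > 0
    ]
    return mean(ratios) if ratios else None


# Made-up cells with hypothetical gene labels.
inactive_like_cell = {"GENE_A": (10, 0), "GENE_B": (4, 0)}
escaping_like_cell = {"GENE_A": (5, 5), "GENE_B": (3, 1), "GENE_C": (0, 0)}

print(inactivation_score(inactive_like_cell))
print(inactivation_score(escaping_like_cell))
```

Under this assumed definition, the two example cells correspond to the “inactive” and “escaping” extremes the abstract describes, with real cells expected to fall on a continuum between them.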


2021 ◽  
Vol 7 (8) ◽  
pp. eabe3610
Author(s):  
Conor J. Kearney ◽  
Stephin J. Vervoort ◽  
Kelly M. Ramsbottom ◽  
Izabela Todorovski ◽  
Emily J. Lelliott ◽  
...  

Multimodal single-cell RNA sequencing enables the precise mapping of transcriptional and phenotypic features of cellular differentiation states but does not allow for simultaneous integration of critical posttranslational modification data. Here, we describe SUrface-protein Glycan And RNA-seq (SUGAR-seq), a method that enables detection and analysis of N-linked glycosylation, extracellular epitopes, and the transcriptome at the single-cell level. Integrated SUGAR-seq and glycoproteome analysis identified tumor-infiltrating T cells with unique surface glycan properties that report their epigenetic and functional state.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
M. Joseph Tomlinson ◽  
Shawn W. Polson ◽  
Jing Qiu ◽  
Juniper A. Lake ◽  
William Lee ◽  
...  

Differential abundance of allelic transcripts in a diploid organism, commonly referred to as allele-specific expression (ASE), is a biologically significant phenomenon and can be examined using single nucleotide polymorphisms (SNPs) from RNA-seq. Quantifying ASE aids in our ability to identify and understand cis-regulatory mechanisms that influence gene expression, and thereby assists in identifying causal mutations. This study examines ASE in breast muscle, abdominal fat, and liver of commercial broiler chickens using variants called from a large subset of the samples (n = 68). ASE analysis was performed using a custom software tool called VCF ASE Detection Tool (VADT), which detects ASE of biallelic SNPs using a binomial test. On average ~174,000 SNPs in each tissue passed our filtering criteria and were considered informative, of which ~24,000 (~14%) showed ASE. Of all ASE SNPs, only 3.7% exhibited ASE in all three tissues, with ~83% showing ASE specific to a single tissue. When ASE genes (genes containing ASE SNPs) were compared between tissues, the overlap among all three tissues increased to 20.1%. Our results indicate that ASE genes show tissue-specific enrichment patterns, but all three tissues showed enrichment for pathways involved in translation.
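A binomial test for ASE at a biallelic SNP asks whether the reference and alternate read counts are compatible with balanced expression. The sketch below is a generic exact two-sided test and is not VADT's code; the function name, the null proportion of 0.5, and the read counts are assumptions for illustration.

```python
from math import exp, lgamma, log


def binomial_two_sided_p(ref_reads, total_reads, p_null=0.5):
    """Exact two-sided binomial p-value for ref_reads out of total_reads.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the null proportion p_null (0 < p_null < 1).
    """
    if not 0 <= ref_reads <= total_reads:
        raise ValueError("need 0 <= ref_reads <= total_reads")
    if not 0.0 < p_null < 1.0:
        raise ValueError("p_null must lie strictly between 0 and 1")

    def log_pmf(k):
        # log of C(n, k) * p^k * (1 - p)^(n - k), computed in log space
        log_choose = (lgamma(total_reads + 1) - lgamma(k + 1)
                      - lgamma(total_reads - k + 1))
        return log_choose + k * log(p_null) + (total_reads - k) * log(1.0 - p_null)

    observed = log_pmf(ref_reads)
    # Small tolerance so ties with the observed outcome are included.
    tail = sum(exp(lp) for lp in map(log_pmf, range(total_reads + 1))
               if lp <= observed + 1e-9)
    return min(1.0, tail)


# Made-up counts: 9 of 10 reads carry the reference allele.
print(binomial_two_sided_p(9, 10))
# Balanced counts give no evidence of ASE.
print(binomial_two_sided_p(5, 10))
```

With roughly 174,000 SNPs tested per tissue, p-values of this kind would normally be adjusted for multiple testing before calling ASE; the abstract does not say which correction, if any, was applied.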


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Asia Mendelevich ◽  
Svetlana Vinogradova ◽  
Saumya Gupta ◽  
Andrey A. Mironov ◽  
Shamil R. Sunyaev ◽  
...  

A sensitive approach to quantitative analysis of transcriptional regulation in diploid organisms is analysis of allelic imbalance (AI) in RNA sequencing (RNA-seq) data. A near-universal practice in such studies is to prepare and sequence only one library per RNA sample. We present theoretical and experimental evidence that data from a single RNA-seq library are insufficient for reliable quantification of the contribution of technical noise to the observed AI signal; consequently, reliance on a one-replicate experimental design can lead to unaccounted-for variation in error rates in allele-specific analysis. We develop a computational approach, Qllelic, that accurately accounts for technical noise by making use of replicate RNA-seq libraries. Testing on new and existing datasets shows that application of Qllelic greatly decreases the false positive rate in allele-specific analysis while conserving appropriate signal, and thus greatly improves reproducibility of AI estimates. We explore sources of technical overdispersion in the observed AI signal and conclude by discussing the design of RNA-seq studies addressing two biologically important questions: quantification of transcriptome-wide AI in one sample, and differential analysis of allele-specific expression between samples.


Genetics ◽  
2013 ◽  
Vol 195 (3) ◽  
pp. 1157-1166 ◽  
Author(s):  
Sandrine Lagarrigue ◽  
Lisa Martin ◽  
Farhad Hormozdiari ◽  
Pierre-François Roux ◽  
Calvin Pan ◽  
...  

2019 ◽  
Author(s):  
Ning Wang ◽  
Andrew E. Teschendorff

Inferring the activity of transcription factors in single cells is a key task to improve our understanding of development and complex genetic diseases. This task is, however, challenging due to the relatively large dropout rate and noisy nature of single-cell RNA-Seq data. Here we present a novel statistical inference framework called SCIRA (Single Cell Inference of Regulatory Activity), which leverages the power of large-scale bulk RNA-Seq datasets to infer high-quality tissue-specific regulatory networks, from which regulatory activity estimates in single cells can be subsequently obtained. We show that SCIRA can correctly infer regulatory activity of transcription factors affected by high technical dropouts. In particular, SCIRA can improve sensitivity by as much as 70% compared to differential expression analysis and current state-of-the-art methods. Importantly, SCIRA can reveal novel regulators of cell fate in tissue development, even for cell types that only make up 5% of the tissue, and can identify key novel tumor suppressor genes in cancer at single-cell resolution. In summary, SCIRA will be an invaluable tool for single-cell studies aiming to accurately map activity patterns of key transcription factors during development, and how these are altered in disease.

