Cellular deconvolution of GTEx tissues powers eQTL studies to discover thousands of novel disease and cell-type associated regulatory variants

2019 ◽  
Author(s):  
Margaret K. R. Donovan ◽  
Agnieszka D’Antonio-Chronowska ◽  
Matteo D’Antonio ◽  
Kelly A. Frazer

Abstract The Genotype-Tissue Expression (GTEx) resource has contributed a wealth of novel insights into the regulatory impact of genetic variation on gene expression across human tissues; however, it has not yet been used to study how variation acts at the resolution of the different cell types composing those tissues. To address this gap, using liver and skin as proof-of-concept tissues, we show that readily available signature genes based on expression profiles of mouse cell types can be used to deconvolute the cellular composition of human GTEx tissues. We then deconvoluted 6,829 bulk RNA-seq samples corresponding to 28 GTEx tissues and show that we are able to quantify cellular heterogeneity, determining both the different cell types present in each of the tissues and how their proportions vary between samples of the same tissue type. Conducting eQTL analyses for GTEx liver and skin samples using cell type composition estimates as interaction terms, we identified thousands of novel genetic associations that had lower effect sizes and were cell-type-associated. We further show that cell-type-associated eQTLs in skin colocalize with melanoma, malignant neoplasm, and infection signatures, indicating that variants that influence gene expression in distinct skin cell types play important roles in skin traits and disease. Overall, our study provides a framework to estimate the relative fractions of different cell types in GTEx tissues using signature genes from mouse cell types and to functionally characterize human genetic variation that impacts gene expression in a cell-type-specific manner.
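
Note: an eQTL analysis that uses cell type composition estimates as interaction terms is commonly written as a linear model of the following form. This is a sketch under stated assumptions, not the authors' exact specification: every symbol is introduced here for illustration, and technical covariates are omitted.

```latex
% E_i : expression of the gene in sample i
% G_i : genotype dosage (0, 1, 2) of the tested variant in sample i
% C_i : estimated fraction of one cell type in sample i
E_i = \beta_0 + \beta_g G_i + \beta_c C_i + \beta_{gc}\,(G_i \times C_i) + \varepsilon_i
```

Under this formulation, a nonzero interaction coefficient \beta_{gc} indicates that the genotype effect on expression changes with the abundance of the cell type, which is what marks an association as cell-type-associated.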

2020 ◽  
Vol 49 (D1) ◽  
pp. D1413-D1419 ◽  
Author(s):  
Tianyi Zhao ◽  
Shuxuan Lyu ◽  
Guilin Lu ◽  
Liran Juan ◽  
Xi Zeng ◽  
...  

Abstract SC2disease (http://easybioai.com/sc2disease/) is a manually curated database that aims to provide a comprehensive and accurate resource of gene expression profiles in various cell types for different diseases. With the development of single-cell RNA sequencing (scRNA-seq) technologies, uncovering cellular heterogeneity of different tissues for different diseases has become feasible by profiling transcriptomes across cell types at the cellular level. In particular, comparing gene expression profiles between different cell types and identifying cell-type-specific genes in various diseases offers new possibilities to address biological and medical questions. However, systematic, hierarchical and vast databases of gene expression profiles in human diseases at the cellular level are lacking. Thus, we reviewed the literature prior to March 2020 for studies that used scRNA-seq to study diseases with human samples, and developed the SC2disease database to summarize all the data by disease, tissue and cell type. SC2disease documents 946 481 entries, corresponding to 341 cell types, 29 tissues and 25 diseases. Each entry in the SC2disease database contains comparisons of differentially expressed genes between different cell types, tissues and disease-related health status. Furthermore, we reanalyzed the gene expression matrices using a unified pipeline to improve comparability between studies. For each disease, we also compare cell-type-specific genes with the corresponding genes of lead single nucleotide polymorphisms (SNPs) identified in genome-wide association studies (GWAS) to implicate cell type specificity of the traits.


eLife ◽  
2017 ◽  
Vol 6 ◽  
Author(s):  
Julien Racle ◽  
Kaat de Jonge ◽  
Petra Baumgaertner ◽  
Daniel E Speiser ◽  
David Gfeller

Immune cells infiltrating tumors can have important impact on tumor progression and response to therapy. We present an efficient algorithm to simultaneously estimate the fraction of cancer and immune cell types from bulk tumor gene expression data. Our method integrates novel gene expression profiles from each major non-malignant cell type found in tumors, renormalization based on cell-type-specific mRNA content, and the ability to consider uncharacterized and possibly highly variable cell types. Feasibility is demonstrated by validation with flow cytometry, immunohistochemistry and single-cell RNA-Seq analyses of human melanoma and colorectal tumor specimens. Altogether, our work not only improves accuracy but also broadens the scope of absolute cell fraction predictions from tumor gene expression data, and provides a unique novel experimental benchmark for immunogenomics analyses in cancer research (http://epic.gfellerlab.org).
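
Note: the renormalization based on cell-type-specific mRNA content can be illustrated with a short calculation. The sketch below assumes that a deconvolution step has already produced mRNA proportions per cell type and that each proportion is divided by a relative mRNA-per-cell factor before rescaling to sum to one. The function name and all numbers are hypothetical and are not taken from the paper.

```python
def renormalize_by_mrna_content(mrna_proportions, mrna_per_cell):
    """Convert mRNA proportions into cell fractions that sum to 1.

    Both arguments are dicts keyed by cell type. `mrna_per_cell` holds a
    relative amount of mRNA per cell (hypothetical values in this sketch).
    """
    # A cell type with more mRNA per cell contributes more signal per cell,
    # so its mRNA proportion overstates its share of cells.
    scaled = {
        cell: mrna_proportions[cell] / mrna_per_cell[cell]
        for cell in mrna_proportions
    }
    total = sum(scaled.values())
    return {cell: value / total for cell, value in scaled.items()}


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
mrna_proportions = {"T cell": 0.2, "cancer cell": 0.8}
mrna_per_cell = {"T cell": 0.5, "cancer cell": 1.0}
fractions = renormalize_by_mrna_content(mrna_proportions, mrna_per_cell)
```

With these inputs the T cell share rises from 20% of the mRNA to one third of the cells, because each T cell is assumed to carry half as much mRNA as a cancer cell.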


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Bárbara Andrade Barbosa ◽  
Saskia D. van Asten ◽  
Ji Won Oh ◽  
Arantza Farina-Sarasqueta ◽  
Joanne Verheij ◽  
...  

Abstract Deconvolution of bulk gene expression profiles into the cellular components is pivotal to portraying a tissue’s complex cellular make-up, such as the tumor microenvironment. However, the inherently variable nature of gene expression requires a comprehensive statistical model and reliable prior knowledge of individual cell types that can be obtained from single-cell RNA sequencing. We introduce BLADE (Bayesian Log-normAl Deconvolution), a unified Bayesian framework to estimate both cellular composition and gene expression profiles for each cell type. Unlike previous comprehensive statistical approaches, BLADE can handle > 20 types of cells owing to efficient variational inference. In an intensive evaluation with > 700 simulated and real datasets, BLADE demonstrated enhanced robustness against gene expression variability and better completeness than conventional methods, in particular in reconstructing gene expression profiles of each cell type. In summary, BLADE is a powerful tool to unravel heterogeneous cellular activity in complex biological systems from standard bulk gene expression data.


2020 ◽  
Author(s):  
Bárbara Andrade Barbosa ◽  
Saskia van Asten ◽  
Ji-won Oh ◽  
Arantza Fariña-Sarasqueta ◽  
Joanne Verheij ◽  
...  

Abstract High-resolution deconvolution of bulk gene expression profiles is pivotal for characterizing the complex cellular make-up of tissues, such as the tumor microenvironment. Single-cell RNA-seq provides reliable prior knowledge for deconvolution; however, a comprehensive statistical model is required to use it efficiently, owing to the inherently variable nature of gene expression. We introduce BLADE (Bayesian Log-normAl Deconvolution), a comprehensive probabilistic framework to estimate both cellular make-up and gene expression profiles of each cell type in each sample. Unlike previous comprehensive statistical approaches, BLADE can handle >20 cell types thanks to efficient variational inference. In an intensive evaluation using >700 datasets, BLADE showed enhanced robustness against gene expression variability and better completeness than conventional methods, in particular in reconstructing gene expression profiles of each cell type. All in all, BLADE is a powerful tool to unravel heterogeneous cellular activity in complex biological systems based on standard bulk gene expression data.


2018 ◽  
Author(s):  
Idan Nurick ◽  
Ron Shamir ◽  
Ran Elkon

Abstract Background: Our appreciation of the critical role of the 3D organization of the genome in gene regulation is steadily increasing. Recent 3C-based deep sequencing techniques elucidated a hierarchy of structures that underlie the spatial organization of the genome in the nucleus. At the top of this hierarchical organization are chromosomal territories and the megabase-scale A/B compartments that correlate with transcriptional activity within cells. Below them are the relatively cell-type invariant topologically associated domains (TADs), which are characterized by a high frequency of physical contacts between loci within the same TAD and are assumed to function as regulatory units. Within TADs, chromatin loops bring enhancers and target promoters into close spatial proximity. Yet, we still have only a rudimentary understanding of how differences in chromatin organization between different cell types affect cell-type specific gene expression programs that are executed under basal and challenged conditions. Results: Here, we carried out a large-scale meta-analysis that integrated Hi-C data from thirteen different cell lines and dozens of ChIP-seq and RNA-seq datasets measured on these cells, either under basal conditions or after treatment. Pairwise comparisons between cell lines demonstrated the strong association between modulation of A/B compartmentalization, differential gene expression and transcription factor (TF) binding events. Furthermore, integrating the analysis of transcriptomes of different cell lines in response to various challenges, we show that 3D organization of cells under basal conditions constrains not only gene expression programs and TF binding profiles that are active under the basal condition but also those induced in response to treatment. Conclusions: Our results further elucidate the role of dynamic genome organization in regulation of differential gene expression between different cell types, and indicate the impact of intra-TAD enhancer-promoter interactions that are established under basal conditions on both the basal and treatment-induced gene expression programs.


2017 ◽  
Author(s):  
Julien Racle ◽  
Kaat de Jonge ◽  
Petra Baumgaertner ◽  
Daniel E. Speiser ◽  
David Gfeller

Abstract Immune cells infiltrating tumors can have important impact on tumor progression and response to therapy. We present an efficient algorithm to simultaneously estimate the fraction of cancer and immune cell types from bulk tumor gene expression data. Our method integrates novel gene expression profiles from each major non-malignant cell type found in tumors, renormalization based on cell-type specific mRNA content, and the ability to consider uncharacterized and possibly highly variable cell types. Feasibility is demonstrated by validation with flow cytometry, immunohistochemistry and single-cell RNA-Seq analyses of human melanoma and colorectal tumor specimens. Altogether, our work not only improves accuracy but also broadens the scope of absolute cell fraction predictions from tumor gene expression data, and provides a unique novel experimental benchmark for immunogenomics analyses in cancer research.


2021 ◽  
Author(s):  
Laura Puente-Santamaria ◽  
Lucia Sanchez-Gonzalez ◽  
Barbara Pilar Gonzalez-Serrano ◽  
Nuria Pescador ◽  
Oscar Hernan Martinez-Costa ◽  
...  

Background: Integrating transcriptional profiles results in the identification of gene expression signatures that are more robust than those obtained for individual datasets. However, direct comparison of datasets derived from heterogeneous experimental conditions is not possible and their integration requires the application of specific meta-analysis techniques. The transcriptional response to hypoxia has been the focus of intense research due to its central role in tissue homeostasis and in prevalent diseases. Accordingly, a large number of studies have determined the gene expression profile of hypoxic cells. Yet, in spite of this wealth of information, little effort has been made to integrate these datasets to produce a robust hypoxic signature. Results: We applied a formal meta-analysis procedure to a dataset comprising 425 RNAseq samples derived from 42 individual studies including 33 different cell types, to derive a pooled estimate of the effect of hypoxia on gene expression. This approach revealed that a large proportion of the transcriptome (8556 genes out of 20888) is significantly regulated by hypoxia. However, only a small fraction of the differentially expressed genes (1265 genes, 15%) show an effect size that, according to comparisons to gene pathways known to be regulated by hypoxia, is likely to be biologically relevant. By focusing on genes ubiquitously expressed we identified a signature of 291 genes robustly and consistently regulated by hypoxia. Finally, by applying a moderator analysis we found that endothelial cells show a characteristic gene expression pattern that is significantly different from other cell types. Conclusion: By the application of a formal meta-analysis to hypoxic gene profiles, we have developed a robust gene signature that characterizes the transcriptomic response to low oxygen. In addition to identifying a universal set of hypoxia-responsive genes, we found a set of genes whose regulation is cell-type specific and suggest a unique metabolic response of endothelial cells to reduced oxygen tension.
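
Note: a pooled estimate of an effect across studies can be computed in several ways, and the abstract does not say which estimator was used. The sketch below shows one standard choice, the DerSimonian-Laird random-effects estimator, applied to per-study effect sizes for a single gene. The function name and the example numbers are hypothetical.

```python
def pooled_effect(effects, variances):
    """DerSimonian-Laird random-effects pooled estimate.

    effects   : per-study effect sizes for one gene (for example, log fold changes)
    variances : within-study variances of those effect sizes
    Returns (pooled estimate, standard error, between-study variance tau^2).
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures heterogeneity between studies.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    standard_error = (1.0 / sum_w_star) ** 0.5
    return pooled, standard_error, tau2


# Hypothetical example: two studies that disagree, with equal precision.
pooled, standard_error, tau2 = pooled_effect([0.0, 2.0], [1.0, 1.0])
```

When studies agree exactly, tau^2 is zero and the result reduces to the inverse-variance weighted mean. When they disagree, as in the example, the standard error widens to reflect the heterogeneity between studies.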


Author(s):  
Meng Zhang ◽  
Stephen W. Eichhorn ◽  
Brian Zingg ◽  
Zizhen Yao ◽  
Hongkui Zeng ◽  
...  

Abstract A mammalian brain is comprised of numerous cell types organized in an intricate manner to form functional neural circuits. Single-cell RNA sequencing provides a powerful approach to identify cell types based on their gene expression profiles and has revealed many distinct cell populations in the brain [1-3]. Single-cell epigenomic profiling [4,5] further provides information on gene-regulatory signatures of different cell types. Understanding how different cell types contribute to brain function, however, requires knowledge of their spatial organization and connectivity, which is not preserved in sequencing-based methods that involve cell dissociation [3,6]. Here, we used an in situ single-cell transcriptome-imaging method, multiplexed error-robust fluorescence in situ hybridization (MERFISH) [7], to generate a molecularly defined and spatially resolved cell atlas of the mouse primary motor cortex (MOp). We profiled ∼300,000 cells in the MOp, identified 95 neuronal and non-neuronal cell clusters, and revealed a complex spatial map in which not only excitatory neuronal clusters but also most inhibitory neuronal clusters adopted layered organizations. Notably, intratelencephalic (IT) cells, the largest branch of neurons in the MOp, formed a continuous spectrum of cells with gradual changes in both gene expression profiles and cortical depth positions in a highly correlated manner. Furthermore, we integrated MERFISH with retrograde tracing to probe the projection targets for different MOp neuronal cell types and found that projections of MOp neurons to other cortical regions formed a many-to-many network with each target region receiving input preferentially from a different composition of IT clusters. Overall, our results provide a high-resolution spatial and projection map of molecularly defined cell types in the MOp. We anticipate that the imaging platform described here can be broadly applied to create high-resolution cell atlases of a wide range of systems.


2019 ◽  
Author(s):  
David J. Forsthoefel ◽  
Nicholas I. Cejda ◽  
Umair W. Khan ◽  
Phillip A. Newmark

Abstract Organ regeneration requires precise coordination of new cell differentiation and remodeling of uninjured tissue to faithfully re-establish organ morphology and function. An atlas of gene expression and cell types in the uninjured state is therefore an essential pre-requisite for understanding how damage is repaired. Here, we use laser-capture microdissection (LCM) and RNA-Seq to define the transcriptome of the intestine of Schmidtea mediterranea, a planarian flatworm with exceptional regenerative capacity. Bioinformatic analysis of 1,844 intestine-enriched transcripts suggests extensive conservation of digestive physiology with other animals, including humans. Comparison of the intestinal transcriptome to purified absorptive intestinal cell (phagocyte) and published single-cell expression profiles confirms the identities of known intestinal cell types, and also identifies hundreds of additional transcripts with previously undetected intestinal enrichment. Furthermore, by assessing the expression patterns of 143 transcripts in situ, we discover unappreciated mediolateral regionalization of gene expression and cell-type diversity, especially among goblet cells. Demonstrating the utility of the intestinal transcriptome, we identify 22 intestine-enriched transcription factors, and find that several have distinct functional roles in the regeneration and maintenance of goblet cells. Furthermore, depletion of goblet cells inhibits planarian feeding and reduces viability. Altogether, our results show that LCM is a viable approach for assessing tissue-specific gene expression in planarians, and provide a new resource for further investigation of digestive tract regeneration, the physiological roles of intestinal cell types, and axial polarity.


2021 ◽  
Author(s):  
Fabio Sacher ◽  
Christian Feregrino ◽  
Patrick Tschopp ◽  
Collin Y. Ewald

Abstract Transcriptomic signatures based on cellular mRNA expression profiles can be used to categorize cell types and states. Yet whether different functional groups of genes perform better or worse in this process remains largely unexplored. Here we test the core matrisome - that is, all genes coding for structural proteins of the extracellular matrix - for its ability to delineate distinct cell types in embryonic single-cell RNA-sequencing (scRNA-seq) data. We show that even though expressed core matrisome genes correspond to less than 2% of an entire cellular transcriptome, their RNA expression levels suffice to recapitulate important aspects of cell type-specific clustering. Notably, using scRNA-seq data from the embryonic limb, we demonstrate that core matrisome gene expression outperforms random gene subsets of similar sizes and can match and exceed the predictive power of transcription factors. While transcription factor signatures generally perform better in predicting cell types at early stages of chicken and mouse limb development, i.e., when cells are less differentiated, the information content of the core matrisome signature increases in more differentiated cells. Our findings suggest that each cell type produces its own unique extracellular matrix, or matreotype, which becomes progressively more refined and cell type-specific as embryonic tissues mature.
Highlights: Cell types produce unique extracellular matrix compositions. Dynamic extracellular matrix gene expression profiles hold predictive power for cell type and cell state identification.

