A probabilistic framework to dissect functional cell-type-specific regulatory elements and risk loci underlying the genetics of complex traits

2016 ◽  
Author(s):  
Yue Li ◽  
Jose Davila-Velderrain ◽  
Manolis Kellis

Abstract Dissecting the physiological circuitry underlying diverse human complex traits associated with heritable common mutations is an ongoing effort. The primary challenge involves identifying the relevant cell types and the causal variants among the vast majority of the associated mutations in the noncoding regions. To address this challenge, we developed an efficient probabilistic framework. First, we propose a sparse group-guided learning algorithm to infer cell-type-specific enrichments. Second, we propose a Bayesian fine-mapping model that incorporates the sparse enrichments as Bayesian priors to infer risk variants. Applying the proposed framework to 32 complex human traits revealed meaningful tissue-specific epigenomic enrichments indicative of the relevant disease pathologies. The prioritized variants exhibit prominent tissue-specific epigenomic signatures and significant enrichments for eQTL and conserved elements. Together, we demonstrate the general benefits of the proposed integrative framework in elucidating meaningful tissue-specific epigenomic elements from large-scale correlated annotations and the implicated functional variants for future experimental interrogation.

2017 ◽  
Author(s):  
Can Wang ◽  
Shihua Zhang

Abstract Histone modifications have been widely shown to play vital roles in gene regulation and cell identity. The Roadmap Epigenomics Consortium generated a reference catalogue of several key histone modifications across more than 100 human cell types and tissues. Decoding these epigenomes into functional regulatory elements is a challenging task in computational biology. To this end, we adopted a differential chromatin modification analysis framework to comprehensively determine and characterize cell type-specific regulatory elements (CSREs) and their histone modification codes in the human epigenomes of five histone modifications across 127 tissues or cell types. The CSREs show significant relevance to cell type-specific biological functions, diseases and cell identity. Clustering of CSREs with their specificity signals reveals diverse histone codes, demonstrating the diversity of functional roles of CSREs within the same cell or tissue. Finally, the dynamics of CSREs from closely related cell types or tissues can give a detailed view of developmental processes such as normal tissue development and cancer occurrence.


2019 ◽  
Author(s):  
Alexi Nott ◽  
Inge R. Holtman ◽  
Nicole G. Coufal ◽  
Johannes C.M. Schlachetzki ◽  
Miao Yu ◽  
...  

Abstract Unique cell type-specific patterns of activated enhancers can be leveraged to interpret non-coding genetic variation associated with complex traits and diseases such as neurological and psychiatric disorders. Here, we have defined active promoters and enhancers for major cell types of the human brain. Whereas psychiatric disorders were primarily associated with regulatory regions in neurons, idiopathic Alzheimer’s disease (AD) variants were largely confined to microglia enhancers. Interactome maps connecting GWAS variants in cell type-specific enhancers to gene promoters revealed an extended microglia gene network in AD. Deletion of a microglia-specific enhancer harboring AD-risk variants ablated BIN1 expression in microglia but not in neurons or astrocytes. These findings revise and expand the genes likely to be influenced by non-coding variants in AD and suggest the probable brain cell types in which they function. One Sentence Summary Identification of cell type-specific regulatory elements in the human brain enables interpretation of non-coding GWAS risk variants.


2021 ◽  
Author(s):  
Justin Miller ◽  
Taylor Meurs ◽  
Matthew Hodgman ◽  
Benjamin Song ◽  
Mark Ebbert ◽  
...  

Abstract Translational ramp sequences are essential regulatory elements that have yet to be characterized in specific tissues. Ramp sequences increase gene expression by evenly spacing ribosomes and slowing initial translation. Therefore, the relative codon adaptiveness within different tissues changes the existence of a ramp sequence without altering the underlying genetic code. Here, we present the first comprehensive analysis of tissue and cell type-specific ramp sequences, and report 3,108 genes with ramp sequences that change between tissues and cell types. The Ramp Atlas (https://ramps.byu.edu/) is an accompanying web portal that allows researchers to query ramp sequences in 18,388 genes across 62 tissues and 66 cell types. We also identified seven SARS-CoV-2 genes and seven human SARS-CoV-2 entry factor genes with tissue-specific ramp sequences that may help explain viral proliferation within those tissues. We anticipate that The Ramp Atlas will facilitate future tissue-specific ramp sequence analyses to develop targeted therapeutics for human disease.


2021 ◽  
Author(s):  
Charles Breeze

Hundreds of epigenome-wide association studies (EWAS) have been performed, successfully identifying replicated epigenomic signals in processes such as ageing and smoking. Despite this progress, it remains a major challenge in EWAS to detect both cell type-specific and cell type confounding effects impacting study results. One way to identify these effects is through eFORGE (experimentally derived Functional element Overlap analysis of ReGions from EWAS), a published tool that uses 815 datasets from large-scale mapping studies to detect enriched tissues, cell types and genomic regions. Here, I show that eFORGE analysis can be extended to EWAS differentially variable positions (DVPs), identifying target cell types and tissues. In addition, I show that eFORGE tissue-specific enrichment can be detected for sites below the EWAS significance threshold. I expand on these and other analysis examples, extending our knowledge of eFORGE cell type- and tissue-specific enrichment results for different EWAS.


2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Charles E. Breeze ◽  
Eric Haugen ◽  
Alex Reynolds ◽  
Andrew Teschendorff ◽  
Jenny van Dongen ◽  
...  

Abstract Background Genome-wide association study (GWAS) single nucleotide polymorphisms (SNPs) are known to preferentially co-locate to active regulatory elements in tissues and cell types relevant to disease aetiology. Further characterisation of associated cell type-specific regulation can broaden our understanding of how GWAS signals may contribute to disease risk. Results To gain insight into potential functional mechanisms underlying GWAS associations, we developed FORGE2 (https://forge2.altiusinstitute.org/), which is an updated version of the FORGE web tool. FORGE2 uses an expanded atlas of cell type-specific regulatory element annotations, including DNase I hotspots, five histone mark categories and 15 hidden Markov model (HMM) chromatin states, to identify tissue- and cell type-specific signals. An analysis of 3,604 GWAS from the NHGRI-EBI GWAS catalogue yielded at least one significant disease/trait-tissue association for 2,057 GWAS, including > 400 associations specific to epigenomic marks in immune tissues and cell types, > 30 associations specific to heart tissue, and > 60 associations specific to brain tissue, highlighting the key potential of tissue- and cell type-specific regulatory elements. Importantly, we demonstrate that FORGE2 analysis can separate previously observed accessible chromatin enrichments into different chromatin states, such as enhancers or active transcription start sites, providing a greater understanding of underlying regulatory mechanisms. Interestingly, tissue-specific enrichments for repressive chromatin states and histone marks were also detected, suggesting a role for tissue-specific repressed regions in GWAS-mediated disease aetiology. Conclusion In summary, we demonstrate that FORGE2 has the potential to uncover previously unreported disease-tissue associations and identify new candidate mechanisms. FORGE2 is a transparent, user-friendly web tool for the integrative analysis of loci discovered from GWAS.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Sinisa Hrvatin ◽  
Christopher P Tzeng ◽  
M Aurel Nagy ◽  
Hume Stroud ◽  
Charalampia Koutsioumpa ◽  
...  

Enhancers are the primary DNA regulatory elements that confer cell type specificity of gene expression. Recent studies characterizing individual enhancers have revealed their potential to direct heterologous gene expression in a highly cell-type-specific manner. However, it has not yet been possible to systematically identify and test the function of enhancers for each of the many cell types in an organism. We have developed PESCA, a scalable and generalizable method that leverages ATAC- and single-cell RNA-sequencing protocols, to characterize cell-type-specific enhancers that should enable genetic access and perturbation of gene function across mammalian cell types. Focusing on the highly heterogeneous mammalian cerebral cortex, we apply PESCA to find enhancers and generate viral reagents capable of accessing and manipulating a subset of somatostatin-expressing cortical interneurons with high specificity. This study demonstrates the utility of this platform for developing new cell-type-specific viral reagents, with significant implications for both basic and translational research.


2020 ◽  
Vol 29 (11) ◽  
pp. 1922-1932
Author(s):  
Priyanka Nandakumar ◽  
Dongwon Lee ◽  
Thomas J Hoffmann ◽  
Georg B Ehret ◽  
Dan Arking ◽  
...  

Abstract Hundreds of loci have been associated with blood pressure (BP) traits from many genome-wide association studies. We identified an enrichment of these loci in aorta and tibial artery expression quantitative trait loci in our previous work in ~100 000 Genetic Epidemiology Research on Aging study participants. In the present study, we sought to fine-map known loci and identify novel genes by determining putative regulatory regions for these and other tissues relevant to BP. We constructed maps of putative cis-regulatory elements (CREs) using publicly available open chromatin data for the heart, aorta and tibial arteries, and multiple kidney cell types. Variants within these regions may be evaluated quantitatively for their tissue- or cell-type-specific regulatory impact using deltaSVM functional scores, as described in our previous work. We aggregate variants within these putative CREs within 50 Kb of the start or end of ‘expressed’ genes in these tissues or cell types using public expression data and use deltaSVM scores as weights in the group-wise sequence kernel association test to identify candidates. We test for association with both BP traits and expression within these tissues or cell types of interest and identify the candidates MTHFR, C10orf32, CSK, NOV, ULK4, SDCCAG8, SCAMP5, RPP25, HDGFRP3, VPS37B and PPCDC. Additionally, we examined two known QT interval genes, SCN5A and NOS1AP, in the Atherosclerosis Risk in Communities Study, as a positive control, and observed the expected heart-specific effect. Thus, our method identifies variants and genes for further functional testing using tissue- or cell-type-specific putative regulatory information.
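The variant-aggregation step described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Variant` and `Gene` types, the function name `variants_near_gene`, and the reading of "within 50 Kb of the start or end" as the gene body extended by 50 kb on each side are all hypothetical, and the `weight` field merely stands in for a deltaSVM score. The sequence kernel association test itself is not implemented here.

```python
from dataclasses import dataclass
from typing import List, Tuple

WINDOW_BP = 50_000  # "within 50 Kb" as stated in the abstract


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    weight: float  # hypothetical stand-in for a deltaSVM score


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int
    end: int


def variants_near_gene(
    gene: Gene,
    variants: List[Variant],
    cre_intervals: List[Tuple[str, int, int]],
    window: int = WINDOW_BP,
) -> List[Variant]:
    """Return variants that fall inside a putative CRE and lie within
    `window` bp of the gene (assumed: gene body extended on both sides)."""
    lo, hi = gene.start - window, gene.end + window
    selected = []
    for v in variants:
        # Skip variants on another chromosome or outside the extended gene window.
        if v.chrom != gene.chrom or not (lo <= v.pos <= hi):
            continue
        # Keep the variant only if it overlaps at least one CRE interval.
        if any(c == v.chrom and s <= v.pos <= e for c, s, e in cre_intervals):
            selected.append(v)
    return selected
```

The selected variants and their weights would then be passed to a group-wise association test; the linear scan over CRE intervals is chosen for readability and would likely be replaced by an interval index for genome-scale data.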


2019 ◽  
Author(s):  
Priyanka Nandakumar ◽  
Dongwon Lee ◽  
Thomas J. Hoffmann ◽  
Georg B. Ehret ◽  
Dan Arking ◽  
...  

Abstract Hundreds of loci have been associated with blood pressure traits from many genome-wide association studies. We identified an enrichment of these loci in aorta and tibial artery expression quantitative trait loci in our previous work in ∼100,000 Genetic Epidemiology Research on Aging (GERA) study participants. In the present study, we focused on determining putative regulatory regions for these and other tissues of relevance to blood pressure, both to fine-map these loci by pinpointing genes and variants of functional interest within them and to identify any novel genes. We constructed maps of putative cis-regulatory elements using publicly available open chromatin data for the heart, aorta and tibial arteries, and multiple kidney cell types. Sequence variants within these regions may be evaluated quantitatively for their tissue- or cell-type-specific regulatory impact using deltaSVM functional scores, as described in our previous work. To identify genes of interest, we aggregate the variants in these putative cis-regulatory elements within 50 Kb of the start or end of genes considered “expressed” in these tissues or cell types using publicly available gene expression data, and use the deltaSVM scores as weights in the well-known group-wise sequence kernel association test (SKAT). We test for association with both blood pressure traits and expression within these tissues or cell types of interest, and identify several genes, including MTHFR, C10orf32, CSK, NOV, ULK4, SDCCAG8, SCAMP5, RPP25, HDGFRP3, VPS37B, and PPCDC. Although our study centers on blood pressure traits, we additionally examined two known genes, SCN5A and NOS1AP, involved in the cardiac trait QT interval, in the Atherosclerosis Risk in Communities Study (ARIC) as a positive control, and observed an expected heart-specific effect.
Thus, our method may be used to identify variants and genes for further functional testing using tissue- or cell-type-specific putative regulatory information. Author Summary Sequence changes in genes (“variants”) are linked to the presence and severity of different traits or diseases. However, as genes may be expressed in different tissues and at different times and degrees, using this information is expected to more accurately identify genes of interest. Variants within genes are important, but so are variants in the sequences (“regulatory elements”) that control the genes’ expression in different tissues or cell types. In this study, we aim to use this information about expression and variants potentially involved in gene expression regulation to better pinpoint genes and variants in regulatory elements of interest for blood pressure regulation. We do so by taking advantage of publicly available data and by using methods that combine information about variants in aggregate within a gene’s putative regulatory elements in tissues thought to be relevant for blood pressure, and we identify several genes for experimental follow-up.


2020 ◽  
Author(s):  
Yupeng Wang ◽  
Rosario Jaime-Lara ◽  
Abhrarup Roy ◽  
Ying Sun ◽  
Xinyue Liu ◽  
...  

Abstract Objective Computational identification of cell type-specific regulatory elements on a genome-wide scale is very challenging. Results We propose SeqEnhDL, a deep learning framework for classifying cell type-specific enhancers based on sequence features. DNA sequences of “strong enhancer” chromatin states in nine cell types from the ENCODE project were retrieved to build and test enhancer classifiers. For any DNA sequence, sequential k-mer (k=5, 7, 9 and 11) fold changes relative to randomly selected non-coding sequences were used as features for deep learning models. Three deep learning models were implemented, including multi-layer perceptron (MLP), Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN). All models in SeqEnhDL outperform state-of-the-art enhancer classifiers including gkm-SVM and DanQ, with regard to distinguishing cell type-specific enhancers from randomly selected non-coding sequences. Moreover, SeqEnhDL is able to directly discriminate enhancers from different cell types, which has not been achieved by other enhancer classifiers. Our analysis suggests that both enhancers and their tissue-specificity can be accurately identified according to their sequence features. SeqEnhDL is publicly available at https://github.com/wyp1125/SeqEnhDL.
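To make the idea of sequential k-mer fold-change features more concrete, here is a minimal sketch. It is not the SeqEnhDL implementation: the function names, the add-one pseudocount, and the definition of fold change as the ratio of smoothed k-mer frequencies in an enhancer set versus a background set are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Iterable, List


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in a single sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_frequencies(
    seqs: Iterable[str], k: int, pseudocount: float = 1.0
) -> Callable[[str], float]:
    """Return a function giving the smoothed frequency of a k-mer in `seqs`.

    Assumption: additive smoothing over the 4**k possible DNA k-mers,
    so unseen k-mers never produce a zero frequency.
    """
    counts: Counter = Counter()
    for s in seqs:
        counts.update(kmer_counts(s, k))
    total = sum(counts.values())
    vocab = 4 ** k

    def freq(kmer: str) -> float:
        return (counts[kmer] + pseudocount) / (total + pseudocount * vocab)

    return freq


def fold_change_profile(
    seq: str,
    k: int,
    fg_freq: Callable[[str], float],
    bg_freq: Callable[[str], float],
) -> List[float]:
    """Fold change (foreground / background) for each k-mer, in sequence order."""
    seq = seq.upper()
    return [
        fg_freq(seq[i:i + k]) / bg_freq(seq[i:i + k])
        for i in range(len(seq) - k + 1)
    ]
```

For example, with `fg = kmer_frequencies(enhancer_seqs, 5)` and `bg = kmer_frequencies(background_seqs, 5)`, calling `fold_change_profile(seq, 5, fg, bg)` yields one value per position, which could serve as an ordered input vector for a downstream model.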


2021 ◽  
Author(s):  
Rujin Wang ◽  
Danyu Lin ◽  
Yuchao Jiang

More than a decade of genome-wide association studies (GWASs) has identified genetic risk variants that are significantly associated with complex traits. Emerging evidence suggests that trait-associated variants likely act in a tissue- or cell-type-specific fashion. Yet, it remains challenging to prioritize trait-relevant tissues or cell types to elucidate disease etiology. Here, we present EPIC (cEll tyPe enrIChment), a statistical framework that relates large-scale GWAS summary statistics to cell-type-specific omics measurements from single-cell sequencing. We derive powerful gene-level test statistics for common and rare variants, separately and jointly, and adopt generalized least squares to prioritize trait-relevant tissues or cell types while accounting for the correlation structures both within and between genes. Using enrichment of loci associated with four lipid traits in the liver and enrichment of loci associated with three neurological disorders in the brain as ground truths, we show that EPIC outperforms existing methods. We extend our framework to single-cell transcriptomic data and identify cell types underlying type 2 diabetes and schizophrenia. The enrichment is replicated using independent GWAS and single-cell datasets and further validated using PubMed search and existing bulk case-control testing results.

