Annotations capturing cell-type-specific TF binding explain a large fraction of disease heritability

10.1101/474684 ◽

2018 ◽

Cited By ~ 1

Author(s):

Bryce van de Geijn ◽

Hilary Finucane ◽

Steven Gazal ◽

Farhad Hormozdiari ◽

Tiffany Amariuta ◽

...

Keyword(s):

Gene Regulation ◽

Transcription Factors ◽

Complex Traits ◽

Complex Disease ◽

Specific Binding ◽

Large Fraction ◽

Genome Wide Association ◽

Cell Type ◽

Genome Wide ◽

Cell Type Specific

Abstract: It is widely known that regulatory variation plays a major role in complex disease and that cell-type-specific binding of transcription factors (TF) is critical to gene regulation, but genomic annotations from directly measured TF binding information are not currently available for most cell-type-TF pairs. Here, we construct cell-type-specific TF binding annotations by intersecting sequence-based TF binding predictions with cell-type-specific chromatin data; this strategy addresses both the limitation that identical sequences may be bound or unbound depending on surrounding chromatin context, and the limitation that sequence-based predictions are generally not cell-type-specific. We evaluated different combinations of sequence-based TF predictions and chromatin data by partitioning the heritability of 49 diseases and complex traits (average N=320K) using stratified LD score regression with the baseline-LD model (which is not cell-type-specific). We determined that 100 bp windows around MotifMap sequence-based TF binding predictions intersected with a union of six cell-type-specific chromatin marks (imputed using ChromImpute) performed best, with a 58% increase in heritability enrichment compared to the chromatin marks alone (11.6x vs 7.3x; P = 9 × 10^-14 for difference) and a 12% increase in cell-type-specific signal conditional on annotations from the baseline-LD model (P = 8 × 10^-11 for difference). Our results show that intersecting sequence-based TF predictions with cell-type-specific chromatin information can help refine genome-wide association signals.
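The annotation strategy described in this abstract (windows around predicted motif sites, intersected with a union of chromatin marks) can be illustrated with a minimal interval sketch. Everything here is an assumption made for illustration: the function names, the half-open coordinate convention, the choice to center each 100 bp window on the site, and the toy coordinates. It is not the authors' pipeline and does not read MotifMap or ChromImpute files.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def merge(intervals: List[Interval]) -> List[Interval]:
    """Return the union of the intervals as a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlapping or touching: extend the previous interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def windows(sites: List[int], width: int = 100) -> List[Interval]:
    """Windows of `width` bp centered on each site (centering is an assumption)."""
    half = width // 2
    return merge([(max(0, s - half), s + half) for s in sites])


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted, disjoint interval lists with a two-pointer sweep."""
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


# Hypothetical motif positions and two hypothetical chromatin-mark tracks.
motif_sites = [1000, 5000, 9000]
mark_tracks = [[(900, 1020)], [(1010, 1200), (8000, 8900)]]

mark_union = merge([iv for track in mark_tracks for iv in track])
annotation = intersect(windows(motif_sites), mark_union)
print(annotation)  # [(950, 1050)]
```

Only the first motif window survives, because it is the only one that falls inside a marked region in the toy cell type; the same sequence-based prediction would yield a different annotation with another cell type's chromatin tracks.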

Annotations capturing cell type-specific TF binding explain a large fraction of disease heritability

Human Molecular Genetics ◽

10.1093/hmg/ddz226 ◽

2019 ◽

Vol 29 (7) ◽

pp. 1057-1067 ◽

Cited By ~ 3

Author(s):

Bryce van de Geijn ◽

Hilary Finucane ◽

Steven Gazal ◽

Farhad Hormozdiari ◽

Tiffany Amariuta ◽

...

Keyword(s):

Binding Sites ◽

Complex Traits ◽

Complex Disease ◽

Specific Binding ◽

Large Fraction ◽

Cell Types ◽

Genome Wide Association ◽

Cell Type ◽

Genome Wide ◽

Cell Type Specific

Abstract: Regulatory variation plays a major role in complex disease, and cell type-specific binding of transcription factors (TF) is critical to gene regulation. However, assessing the contribution of genetic variation in TF-binding sites to disease heritability is challenging, as binding is often cell type-specific and annotations from directly measured TF binding are not currently available for most cell type-TF pairs. We investigate approaches to annotate TF binding, including directly measured chromatin data and sequence-based predictions. We find that TF-binding annotations constructed by intersecting sequence-based TF-binding predictions with cell type-specific chromatin data explain a large fraction of heritability across a broad set of diseases and corresponding cell types; this strategy of constructing annotations addresses both the limitation that identical sequences may be bound or unbound depending on surrounding chromatin context and the limitation that sequence-based predictions are generally not cell type-specific. We partitioned the heritability of 49 diseases and complex traits using stratified linkage disequilibrium (LD) score regression with the baseline-LD model (which is not cell type-specific) plus the new annotations. We determined that 100 bp windows around MotifMap sequence-based TF-binding predictions intersected with a union of six cell type-specific chromatin marks (imputed using ChromImpute) performed best, with a 58% increase in heritability enrichment compared to the chromatin marks alone (11.6× vs. 7.3×, P = 9 × 10^-14 for difference) and a 20% increase in cell type-specific signal conditional on annotations from the baseline-LD model (P = 8 × 10^-11 for difference). Our results show that TF-binding annotations explain substantial disease heritability and can help refine genome-wide association signals.
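The enrichment figures quoted in this abstract can be checked with a small worked calculation. In stratified LD score regression, the enrichment of an annotation is the proportion of heritability it explains divided by the proportion of SNPs it contains. The sketch below assumes that definition; the function names and the 1% / 11.6% example shares are hypothetical, chosen only so the result matches the reported 11.6× value.

```python
def enrichment(prop_h2: float, prop_snps: float) -> float:
    """Heritability enrichment: share of heritability / share of SNPs."""
    return prop_h2 / prop_snps


def relative_increase(new: float, old: float) -> float:
    """Fractional increase of `new` over `old`."""
    return new / old - 1.0


# Hypothetical annotation: 1% of SNPs explaining 11.6% of heritability.
example = enrichment(prop_h2=0.116, prop_snps=0.01)

# Reported enrichments: intersected annotation vs. chromatin marks alone.
rel = relative_increase(11.6, 7.3)
print(f"example enrichment: {example:.1f}x")
print(f"relative increase from rounded values: {rel:.1%}")
```

The rounded enrichments give a relative increase of about 59%, in line with the reported 58%; the small gap is what one would expect if the published percentage was computed from unrounded estimates, though the abstract does not say so.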

Inferring relevant tissues and cell types for complex traits in genome-wide association studies

10.1101/2021.06.09.447805 ◽

2021 ◽

Author(s):

Rujin Wang ◽

Danyu Lin ◽

Yuchao Jiang

Keyword(s):

Single Cell ◽

Complex Traits ◽

Association Studies ◽

Cell Types ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Cell Type ◽

Disease Etiology ◽

Genome Wide ◽

Cell Type Specific

More than a decade of genome-wide association studies (GWASs) have identified genetic risk variants that are significantly associated with complex traits. Emerging evidence suggests that the function of trait-associated variants likely acts in a tissue- or cell-type-specific fashion. Yet, it remains challenging to prioritize trait-relevant tissues or cell types to elucidate disease etiology. Here, we present EPIC (cEll tyPe enrIChment), a statistical framework that relates large-scale GWAS summary statistics to cell-type-specific omics measurements from single-cell sequencing. We derive powerful gene-level test statistics for common and rare variants, separately and jointly, and adopt generalized least squares to prioritize trait-relevant tissues or cell types while accounting for the correlation structures both within and between genes. Using enrichment of loci associated with four lipid traits in the liver and enrichment of loci associated with three neurological disorders in the brain as ground truths, we show that EPIC outperforms existing methods. We extend our framework to single-cell transcriptomic data and identify cell types underlying type 2 diabetes and schizophrenia. The enrichment is replicated using independent GWAS and single-cell datasets and further validated using PubMed search and existing bulk case-control testing results.

Cell-type specific and combinatorial usage of diverse transcription factors revealed by genome-wide binding studies in multiple human cells

Genome Research ◽

10.1101/gr.127597.111 ◽

2011 ◽

Vol 22 (1) ◽

pp. 9-24 ◽

Cited By ~ 77

Author(s):

B.-K. Lee ◽

A. A. Bhinge ◽

A. Battenhouse ◽

R. M. McDaniell ◽

Z. Liu ◽

...

Keyword(s):

Transcription Factors ◽

Human Cells ◽

Cell Type ◽

Binding Studies ◽

Genome Wide ◽

Cell Type Specific

Evaluating the contribution of cell-type specific alternative splicing to variation in lipid levels

10.1101/659326 ◽

2019 ◽

Author(s):

K.A.B. Gawronski ◽

W. Bone ◽

Y. Park ◽

E. Pashos ◽

X. Wang ◽

...

Keyword(s):

Alternative Splicing ◽

Quantitative Trait ◽

Cell Types ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Lipid Levels ◽

Cell Type ◽

Genome Wide ◽

Genetic Mechanisms ◽

Cell Type Specific

Abstract. Background: Genome-wide association studies have identified 150+ loci associated with lipid levels. However, the genetic mechanisms underlying most of these loci are not well understood. Recent work indicates that changes in the abundance of alternatively spliced transcripts contribute to complex trait variation. Consequently, identifying genetic loci that associate with alternative splicing in disease-relevant cell types and determining the degree to which these loci are informative for lipid biology is of broad interest. Methods and Results: We analyze gene splicing in 83 sample-matched induced pluripotent stem cell (iPSC) and hepatocyte-like cell (HLC) lines (n=166), as well as in an independent collection of primary liver tissues (n=96). We observe that transcript splicing is highly cell-type specific, and the genes that are differentially spliced between iPSCs and HLCs are enriched for metabolism pathway annotations. We identify 1,381 HLC splicing quantitative trait loci (sQTLs) and 1,462 iPSC sQTLs and find that sQTLs are often shared across cell types. To evaluate the contribution of sQTLs to variation in lipid levels, we conduct colocalization analysis using lipid genome-wide association data. We identify 19 lipid-associated loci that colocalize either with an HLC expression quantitative trait locus (eQTL) or sQTL. Only one locus colocalizes with both an sQTL and eQTL, indicating that sQTLs contribute information about GWAS loci that cannot be obtained by analysis of steady-state gene expression alone. Conclusions: These results provide an important foundation for future efforts that use iPSC and iPSC-derived cells to evaluate genetic mechanisms influencing both cardiovascular disease risk and complex traits in general.

Myeloid lncRNA LOUP Mediates Opposing Regulatory Effects of RUNX1 and RUNX1-ETO in t(8;21) AML

Blood ◽

10.1182/blood.2020007920 ◽

2021 ◽

Author(s):

Bon Q Trinh ◽

Simone Ummarino ◽

Yanzhou Zhang ◽

Alexander K Ebralidze ◽

Mahmoud A Bassal ◽

...

Keyword(s):

Transcription Factors ◽

Noncoding Rna ◽

Gene Activation ◽

Chromatin Accessibility ◽

Specific Gene ◽

Cell Type ◽

Genome Wide ◽

Therapeutic Development ◽

Cell Type Specific ◽

Regulatory Effects

The mechanism underlying cell type-specific gene induction conferred by ubiquitous transcription factors as well as disruptions caused by their chimeric derivatives in leukemia is not well understood. Here we investigate whether RNAs coordinate with transcription factors to drive myeloid gene transcription. In an integrated genome-wide approach surveying for gene loci exhibiting concurrent RNA- and DNA-interactions with the broadly expressed transcription factor RUNX1, we identified the long noncoding RNA LOUP. This myeloid-specific and polyadenylated lncRNA induces myeloid differentiation and inhibits cell growth, acting as a transcriptional inducer of the myeloid master regulator PU.1. Mechanistically, LOUP recruits RUNX1 to both the PU.1 enhancer and the promoter, leading to the formation of an active chromatin loop. In t(8;21) acute myeloid leukemia, wherein RUNX1 is fused to ETO, the resulting oncogenic fusion protein RUNX1-ETO limits chromatin accessibility at the LOUP locus, causing inhibition of LOUP and PU.1 expression. These findings highlight the important role of the interplay between cell type-specific RNAs and transcription factors as well as their oncogenic derivatives in modulating lineage-gene activation and raise the possibility that RNA regulators of transcription factors represent alternative targets for therapeutic development.

ECLIPSER: identifying causal cell types and genes for complex traits through single cell enrichment of e/sQTL-mapped genes in GWAS loci

10.1101/2021.11.24.469720 ◽

2021 ◽

Author(s):

John M Rouhana ◽

Jiali Wang ◽

Gokcen Eraslan ◽

Shankara Anand ◽

Andrew R Hamel ◽

...

Keyword(s):

Single Cell ◽

Complex Traits ◽

Complex Disease ◽

Skin Diseases ◽

Source Code ◽

Data Access ◽

Cell Types ◽

Cell Type ◽

Healthy Human ◽

Cell Type Specific

Summary: ECLIPSER was developed to identify pathogenic cell types and cell type-specific genes that may affect complex disease susceptibility and trait variation by integrating single cell data with known GWAS loci. ECLIPSER maps genes to GWAS loci for a given complex trait based on expression and splicing quantitative trait loci (e/sQTLs) and other functional data, and tests whether the mapped genes are enriched for cell type-specific expression in particular cell types using single-cell/nucleus RNA-seq data from one or more tissues of interest. A Bayesian Fisher's exact test is used to compute fold-enrichment significance. We demonstrate the application of ECLIPSER on various skin diseases and traits using snRNA-seq of healthy human skin samples. Availability and Implementation: The python source code and documentation for ECLIPSER and a Jupyter notebook for generating output tables and figures are available at https://github.com/segrelabgenomics/ECLIPSER. The source code for GWASvar2gene that maps genes to GWAS loci based on e/sQTLs is available at https://github.com/segrelabgenomics/GWASvar2gene. The analysis presented here used data from GTEx (https://gtexportal.org/home/datasets) and Open Targets Genetics (https://genetics-docs.opentargets.org/data-access/graphql-api), but can also be applied to other GWAS variant lists and QTL studies. Data used to reproduce the results of the paper are available in Supplementary data.
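The enrichment test mentioned in this summary can be illustrated with a classical one-sided Fisher's exact test, computed as a hypergeometric tail probability. This is a deliberate simplification: ECLIPSER is described as using a Bayesian Fisher's exact test, which this sketch does not reproduce, and none of the names below come from the ECLIPSER source code. The counts are hypothetical.

```python
from math import comb


def enrichment_p_value(k: int, n: int, K: int, N: int) -> float:
    """One-sided P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: all genes considered (assumed background set)
    K: genes with cell type-specific expression in the cell type
    n: genes mapped to GWAS loci for the trait
    k: mapped genes that are also cell type-specific
    """
    total = comb(N, n)
    tail = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, min(n, K) + 1))
    return tail / total


def fold_enrichment(k: int, n: int, K: int, N: int) -> float:
    """Observed fraction of specific genes among mapped genes vs. background."""
    return (k / n) / (K / N)


# Hypothetical counts: 8 of 20 mapped genes are cell type-specific,
# against a background where 100 of 1000 genes are.
p = enrichment_p_value(k=8, n=20, K=100, N=1000)
fold = fold_enrichment(k=8, n=20, K=100, N=1000)
print(f"fold enrichment: {fold:.1f}, one-sided p-value: {p:.2e}")
```

A Bayesian variant would typically replace the point estimate of fold enrichment with a posterior distribution, which helps when the number of mapped genes per trait is small; the classical version above is shown only because it needs nothing beyond the standard library.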

Integration of high-resolution promoter profiling assays reveals novel, cell type-specific transcription start sites across 115 human cell and tissue types

Genome Research ◽

10.1101/gr.275723.121 ◽

2021 ◽

pp. gr.275723.121

Author(s):

Jill E Moore ◽

Xiao-Ou Zhang ◽

Shaimae I Elhajjajy ◽

Kaili Fan ◽

Henry E Pratt ◽

...

Keyword(s):

Disease Gene ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Cell Type ◽

Transcription Start ◽

Transcription Start Sites ◽

Genome Wide ◽

Cell Type Specific ◽

Gwas Catalog

Accurate transcription start site (TSS) annotations are essential for understanding transcriptional regulation and its role in human disease. Gene collections such as GENCODE contain annotations for tens of thousands of TSSs, but not all of these annotations are experimentally validated, nor do they contain information on cell type-specific usage. Therefore, we sought to generate a collection of experimentally validated TSSs by integrating RNA Annotation and Mapping of Promoters for the Analysis of Gene Expression (RAMPAGE) data from 115 cell and tissue types, which resulted in a collection of approximately 50 thousand representative RAMPAGE peaks. These peaks were primarily proximal to GENCODE-annotated TSSs and were concordant with other transcription assays. Because RAMPAGE uses paired-end reads, we were then able to connect peaks to transcripts by analyzing the genomic positions of the 3' ends of read mates. Using this paired-end information, we classified the vast majority (37 thousand) of our RAMPAGE peaks as verified TSSs, updating TSS annotations for 20% of GENCODE genes. We also found that these updated TSS annotations were supported by epigenomic and other transcriptomic datasets. To demonstrate the utility of this RAMPAGE rPeak collection, we intersected it with the NHGRI/EBI genome-wide association studies (GWAS) catalog and identified new candidate GWAS genes. Overall, our work demonstrates the importance of integrating experimental data to further refine TSS annotations and provides a valuable resource for the biological community.

Identifying Cell Type-Specific Transcription Factors by Integrating ChIP-seq and eQTL Data-Application to Monocyte Gene Regulation

Gene Regulation and Systems Biology ◽

10.4137/grsb.s40768 ◽

2016 ◽

Vol 10 ◽

pp. GRSB.S40768 ◽

Cited By ~ 2

Author(s):

Mudra Choudhury ◽

Stephen A. Ramsey

Keyword(s):

Gene Regulation ◽

Transcription Factors ◽

Cell Type ◽

Data Application ◽

Cell Type Specific ◽

Eqtl Data

Predicting FOXM1-Mediated Gene Regulation through the Analysis of Genome-Wide FOXM1 Binding Sites in MCF-7, K562, SK-N-SH, GM12878 and ECC-1 Cell Lines

International Journal of Molecular Sciences ◽

10.3390/ijms21176141 ◽

2020 ◽

Vol 21 (17) ◽

pp. 6141

Author(s):

Keunsoo Kang ◽

Yoonjung Choi ◽

Hoo Hyun Kim ◽

Kyung Hyun Yoo ◽

Sungryul Yu

Keyword(s):

Cell Cycle ◽

Gene Regulation ◽

Cell Lines ◽

Binding Sites ◽

Binding Motif ◽

Cell Type ◽

Gene Set ◽

Genome Wide ◽

Cell Type Specific ◽

Mcf 7

Forkhead box protein M1 (FOXM1) is a key transcription factor (TF) that regulates a common set of genes related to the cell cycle in various cell types. However, the mechanism by which FOXM1 controls the common gene set in different cellular contexts is unclear. In this study, a comprehensive meta-analysis of genome-wide FOXM1 binding sites in ECC-1, GM12878, K562, MCF-7, and SK-N-SH cell lines was conducted to predict FOXM1-driven gene regulation. Consistent with previous studies, different TF binding motifs were identified at FOXM1 binding sites, while the NFY binding motif was found at 81% of common FOXM1 binding sites in promoters of cell cycle-related genes. The results indicated that FOXM1 might control the gene set through interaction with the NFY proteins, while cell type-specific genes were predicted to be regulated by enhancers with FOXM1 and cell type-specific TFs. We also found that a high expression level of FOXM1 was significantly associated with poor prognosis in nine types of cancer. Overall, these results suggest that FOXM1 functions as a master regulator of the cell cycle through interaction with NFY-family proteins, and that inhibition of FOXM1 could therefore be an attractive strategy for cancer therapy.

Faculty Opinions recommendation of Estimation of complex effect-size distributions using summary-level statistics from genome-wide association studies across 32 complex traits.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.733803377.793550136 ◽

2018 ◽

Author(s):

Mohan Liu

Keyword(s):

Effect Size ◽

Complex Traits ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Size Distributions ◽

Complex Effect ◽

Genome Wide ◽

Level Statistics
