Osteoporosis- and obesity-risk interrelationships: An epigenetic analysis of GWAS-derived SNPs at the developmental gene TBX15

Mapping Intimacies ◽

10.1101/766584 ◽

2019 ◽

Author(s):

Xiao Zhang ◽

Kenneth C. Ehrlich ◽

Fangtang Yu ◽

Xiaojun Hu ◽

Hong-Wen Deng ◽

...

Keyword(s):

Transcription Factor ◽

Bone Development ◽

Association Studies ◽

High Linkage Disequilibrium ◽

Causal Snps ◽

Genome Wide Association Studies ◽

Regulatory Variants ◽

Manual Curation ◽

Functional Variants ◽

Intron 1

Abstract A major challenge in translating findings from genome-wide association studies (GWAS) to biological mechanisms is pinpointing functional variants because only a very small percentage of variants associated with a given trait actually impact the trait. We used an extensive epigenetics, transcriptomics, and genetics analysis of the TBX15/WARS2 neighborhood to prioritize this region’s best-candidate causal variants for the genetic risk of osteoporosis (estimated bone density, eBMD) and obesity (waist-hip ratio or waist circumference adjusted for body mass index). TBX15 encodes a transcription factor that is important in bone development and adipose biology. Manual curation of 692 GWAS-derived variants gave eight strong candidates for causal SNPs that modulate TBX15 transcription in subcutaneous adipose tissue (SAT) or osteoblasts, which highly and specifically express this gene. None of these SNPs were prioritized by Bayesian fine-mapping. The eight regulatory causal SNPs were in enhancer or promoter chromatin seen preferentially in SAT or osteoblasts at TBX15 intron-1 or upstream. They overlap strongly predicted, allele-specific transcription factor binding sites. Our analysis suggests that these SNPs act independently of two missense SNPs in TBX15. Remarkably, five of the regulatory SNPs were associated with eBMD and obesity and had the same trait-increasing allele for both. We found that WARS2 obesity-related SNPs can be ascribed to high linkage disequilibrium with TBX15 intron-1 SNPs. Our findings from GWAS index, proxy, and imputed SNPs suggest that a few SNPs, including three in a 0.7-kb cluster, act as causal regulatory variants to fine-tune TBX15 expression and, thereby, affect both obesity and osteoporosis risk.

Hybrid allele-specific ChIP-Seq analysis links variation in transcription factor binding to traits in maize

10.21203/rs.3.rs-543958/v1 ◽

2021 ◽

Author(s):

Thomas Hartwig ◽

Michael Banf ◽

Gisele Prietsch ◽

Julia Engelhorn ◽

Jinliang Yang ◽

...

Keyword(s):

Transcription Factor ◽

Target Genes ◽

Phenotypic Diversity ◽

Association Studies ◽

Chromatin Accessibility ◽

Genome Wide Association Studies ◽

Functional Variants ◽

Genome Wide ◽

Allele Specific ◽

Hybrid Allele

Abstract Variation in transcriptional regulation is a major cause of phenotypic diversity. Genome-wide association studies (GWAS) have shown that most functional variants reside in non-coding regions, where they potentially affect transcription factor (TF) binding and chromatin accessibility to alter gene expression. Pinpointing such regulatory variations, however, remains challenging. Here, we developed a hybrid allele-specific chromatin binding sequencing (HASCh-seq) approach and identified variations in target binding of the brassinosteroid (BR) responsive transcription factor ZmBZR1 in maize. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) in B73xMo17 F1s identified thousands of target genes of ZmBZR1. Allele-specific ZmBZR1 binding (ASB) was observed for about 14.3% of target genes. It correlated with over 550 loci containing sequence variation in BZR1-binding motifs and over 340 loci with haplotype-specific DNA methylation, linking genetic and epigenetic variations to ZmBZR1 occupancy. Comparison with GWAS data linked hundreds of ASB loci to important yield, growth, and disease-related traits. Our study provides a robust method for analyzing genome-wide variations of transcription factor occupancy and identified genetic and epigenetic variations of the BR response transcription network in maize.

Functional regulatory variants implicate distinct transcriptional networks in dementia

10.1101/2021.06.14.448395 ◽

2021 ◽

Author(s):

Yonatan A. Cooper ◽

Jessica E. Davis ◽

Sriram Kosuri ◽

Giovanni Coppola ◽

Daniel H. Geschwind

Keyword(s):

Genetic Risk ◽

Disease Risk ◽

Specific Activity ◽

Association Studies ◽

Transcriptional Networks ◽

Genome Wide Association Studies ◽

Regulatory Variants ◽

Functional Variants ◽

Common Genetic Variants ◽

Complement 4

Predicting functionality of noncoding variation is one of the major challenges in modern genetics. We employed massively parallel reporter assays to screen 5,706 variants from genome-wide association studies for both Alzheimer's disease (AD) and Progressive Supranuclear Palsy (PSP). We identified 320 functional regulatory polymorphisms (SigVars) comprising 27 of 34 unique tested loci, including multiple independent signals across the complex 17q21.31 region. We identify novel risk genes including PLEKHM1 in PSP and APOC1 in AD, and perform gene-editing to validate four distinct causal loci, confirming complement 4 (C4A) as a novel genetic risk factor for AD. Moreover, functional variants preferentially disrupt transcription factor binding sites that converge on enhancers with differential cell-type specific activity in PSP and AD, implicating a neuronal SP1-driven regulatory network in PSP pathogenesis. These analyses support a novel mechanism underlying noncoding genetic risk, whereby common genetic variants drive disease risk via their aggregate activity on specific transcriptional programs.

Abstract 48: A newly identified rare variant (chr11:47227430) with possible functional activity is associated with fasting insulin at the chromosome 11p11.2-NR1H3 locus in the Cohorts for Heart and Aging Research in Genetic Epidemiology Targeted Sequencing Study (CHARGE-TSS).

Circulation ◽

10.1161/circ.129.suppl_1.48 ◽

2014 ◽

Vol 129 (suppl_1) ◽

Author(s):

Marco Dauriz ◽

Belinda K Cornes ◽

Jennifer A Brody ◽

Naghmeh Nikpoor ◽

Alanna C Morrison ◽

...

Keyword(s):

Transcription Factor ◽

Glucose Homeostasis ◽

Transcriptional Activity ◽

Rare Variants ◽

Association Studies ◽

Regulatory Function ◽

Genome Wide Association Studies ◽

Functional Studies ◽

Functional Variants ◽

Insulin Regulation

Aim: Common variation at the polygenic 11p11.2 locus has been associated with fasting glucose (FG) and insulin (FI) in genome-wide association studies. Further insights into the genetic pathways involved in glucose homeostasis and type 2 diabetes pathogenesis might rely on discovery of functional variants in genes or regulatory regions. Hypothesis: We hypothesized that high-throughput next-generation deep sequencing at the polygenic 11p11.2 locus might identify additional rare, potentially functional variants influencing FG and/or FI levels. Methods: We deeply sequenced (mean depth 38X) 16.1kb across the 11p11.2 locus in 3,566 non-diabetic individuals enrolled in the CHARGE Consortium (http://web.chargeconsortium.com/). We analyzed rare variants (minor allele frequency [MAF] <1%) in five gene regions, including MADD, ACP2, NR1H3, MYBPC3 and SPI1, with FI or FG using Sequence Kernel Association Test (SKAT). Predicted regulatory variants were then analyzed by conditioning in SKAT on two previously known variants at the MADD locus (rs7944584 and rs10838687 associated, respectively, with FG and FI). All analyses were adjusted for age, sex and study design variables. FI (adjusted for BMI) was naturally log-transformed to improve normality. Further functional studies were performed in human HepG2 hepatoma cells to unravel possible mechanistic pathways linked to functional variants. Results: We identified 653 allelic variants (including the known rs7944584 and rs10838687), 79.9% of which were rare and novel. At NR1H3, 53 rare variants were jointly associated with FI (p = 2.7 × 10⁻³); of these, seven were predicted to have regulatory function. Conditional analysis suggested more than two independent signals at the 11p11.2-MADD locus. One predicted regulatory variant, chr11:47227430 (hg18; MAF=0.0007), contributed 20.6% to the overall SKAT score at NR1H3, and lies in intron 2 of NR1H3, a predicted binding site of the FOXA1 enhancer, a transcription factor associated with insulin regulation. Functional studies in HepG2 cells showed that the chr11:47227430 variant disrupts FOXA1 binding and significantly reduces FOXA1-dependent transcriptional activity. Conclusions/interpretation: We confirmed known common FI-associated variants near the MADD gene and identified rare variation in an intron of NR1H3 associated with FI. Functional in vitro studies showed that the rare A allele of the chr11:47227430 variant at the NR1H3 locus might theoretically affect insulin regulation by interfering with transcription factor FOXA1 binding and, consequently, FOXA1-dependent transcriptional activity. Our targeted deep resequencing approach proved valuable in identifying new rare functional variants; quantitation of their actual impact on glucose homeostasis needs further confirmation.

GERV: A Statistical Method for Generative Evaluation of Regulatory Variants for Transcription Factor Binding

10.1101/017392 ◽

2015 ◽

Cited By ~ 1

Author(s):

Haoyang Zeng ◽

Tatsunori Hashimoto ◽

Daniel D. Kang ◽

David K. Gifford

Keyword(s):

Transcription Factor ◽

Specific Binding ◽

Association Studies ◽

Transcription Factor Binding ◽

Computational Method ◽

Breast Cancer Cell Lines ◽

Genome Wide Association Studies ◽

Factor Binding ◽

Regulatory Variants ◽

Causal Variants

The majority of disease-associated variants identified in genome-wide association studies (GWAS) reside in noncoding regions of the genome with regulatory roles. Thus being able to interpret the functional consequence of a variant is essential for identifying causal variants in the analysis of GWAS studies. We present GERV (Generative Evaluation of Regulatory Variants), a novel computational method for predicting regulatory variants that affect transcription factor binding. GERV learns a k-mer based generative model of transcription factor binding from ChIP-seq and DNase-seq data, and scores variants by computing the change of predicted ChIP-seq reads between the reference and alternate allele. The k-mers learned by GERV capture more sequence determinants of transcription factor binding than a motif-based approach alone, including both a transcription factor's canonical motif as well as associated co-factor motifs. We show that GERV outperforms existing methods in predicting SNPs associated with allele-specific binding. GERV correctly predicts a validated causal variant among linked SNPs, and prioritizes the variants previously reported to modulate the binding of FOXA1 in breast cancer cell lines. Thus, GERV provides a powerful approach for functionally annotating and prioritizing causal variants for experimental follow-up analysis.
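The scoring idea in this abstract (compare a model's predicted signal for the reference and alternate allele) can be illustrated with a deliberately simplified sketch. This is not the GERV model: GERV learns its k-mer effects from ChIP-seq and DNase-seq data, whereas the weights below are made-up numbers, and the function names `kmer_score` and `variant_effect` are hypothetical.

```python
from typing import Dict


def kmer_score(seq: str, weights: Dict[str, float], k: int) -> float:
    """Sum the (hypothetical) weight of every k-mer found in seq."""
    return sum(weights.get(seq[i:i + k], 0.0) for i in range(len(seq) - k + 1))


def variant_effect(ref_seq: str, pos: int, alt_base: str,
                   weights: Dict[str, float], k: int) -> float:
    """Return score(alt) - score(ref) over the k-mers that overlap pos (0-based)."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    # Only k-mers overlapping the variant can change, so score that window alone.
    start = max(0, pos - k + 1)
    end = min(len(ref_seq), pos + k)
    return (kmer_score(alt_seq[start:end], weights, k)
            - kmer_score(ref_seq[start:end], weights, k))


# Toy weights, invented for illustration only (not learned from any data).
toy_weights = {"TGT": 1.5, "GTT": 2.0, "TTT": 0.5}
effect = variant_effect("ACTGTTTAC", 4, "A", toy_weights, 3)
print(effect)  # negative: the alternate allele removes all weighted k-mers
```

A negative value here means the alternate allele lowers the toy score relative to the reference; a real model would rank variants by the magnitude of such a change.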

A comprehensive integrated post-GWAS analysis of Type 1 diabetes reveals enhancer-based immune dysregulation

PLoS ONE ◽

10.1371/journal.pone.0257265 ◽

2021 ◽

Vol 16 (9) ◽

pp. e0257265

Author(s):

Seung-Soo Kim ◽

Adam D. Hudgins ◽

Jiping Yang ◽

Yizhou Zhu ◽

Zhidong Tu ◽

...

Keyword(s):

Type 1 Diabetes ◽

Target Genes ◽

Association Studies ◽

Regulatory Elements ◽

Immune Dysregulation ◽

Specific Gene ◽

Genome Wide Association Studies ◽

Gwas Analysis ◽

Regulatory Variants

Type 1 diabetes (T1D) is an organ-specific autoimmune disease, whereby immune cell-mediated killing leads to loss of the insulin-producing β cells in the pancreas. Genome-wide association studies (GWAS) have identified over 200 genetic variants associated with risk for T1D. The majority of the GWAS risk variants reside in the non-coding regions of the genome, suggesting that gene regulatory changes substantially contribute to T1D. However, identification of causal regulatory variants associated with T1D risk and their affected genes is challenging due to incomplete knowledge of non-coding regulatory elements and the cellular states and processes in which they function. Here, we performed a comprehensive integrated post-GWAS analysis of T1D to identify functional regulatory variants in enhancers and their cognate target genes. Starting with 1,817 candidate T1D SNPs defined from the GWAS catalog and LDlink databases, we conducted functional annotation analysis using genomic data from various public databases. These include 1) Roadmap Epigenomics, ENCODE, and RegulomeDB for epigenome data; 2) GTEx for tissue-specific gene expression and expression quantitative trait loci data; and 3) lncRNASNP2 for long non-coding RNA data. Our results indicated a prevalent enhancer-based immune dysregulation in T1D pathogenesis. We identified 26 high-probability causal enhancer SNPs associated with T1D, and 64 predicted target genes. The majority of the target genes play major roles in antigen presentation and immune response and are regulated through complex transcriptional regulatory circuits, including those in HLA (6p21) and non-HLA (16p11.2) loci. These candidate causal enhancer SNPs are supported by strong evidence and warrant functional follow-up studies.

MORFEE: a new tool for detecting and annotating single nucleotide variants creating premature ATG codons from VCF files

10.1101/2020.03.29.012054 ◽

2020 ◽

Cited By ~ 1

Author(s):

Dylan Aïssi ◽

Omar Soukarieh ◽

Carole Proust ◽

Beatrice Jaspard-Vinassa ◽

Pierre Fautrad ◽

...

Keyword(s):

Stop Codon ◽

Association Studies ◽

Premature Stop Codon ◽

Open Reading Frames ◽

Strong Impact ◽

Genome Wide Association Studies ◽

Single Nucleotide Variants ◽

Functional Variants ◽

Genome Wide ◽

Upstream Open Reading Frames

Abstract Summary: Variants in 5’UTR regions that create upstream translation initiation AUG codons are a class of neglected non-coding variations. When they associate with a premature stop codon and create upstream open reading frames (uORFs) whose translation competes with that of natural proteins, they can have a strong impact on human diseases. We here describe MORFEE, a new bioinformatics tool that detects, annotates and predicts, from a standard VCF file, the creation of uORFs by any 5’UTR variant. MORFEE was applied to two genomic resources and identified candidate functional variants that could explain statistical association signals observed in the context of Genome Wide Association Studies or could be responsible for rare forms of diseases. In conclusion, MORFEE is an easy-to-use tool complementary to existing ones that can help resolve genetic investigations that have so far remained unfruitful. Availability and implementation: MORFEE is written in R with code and package available at https://github.com/daissi/[email protected]; [email protected]
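The kind of check described above (does a single-nucleotide variant create a new upstream ATG, and is there an in-frame stop codon after it?) can be sketched as follows. This is only an illustration under stated assumptions, not MORFEE's implementation: MORFEE is an R package that works from VCF files and transcript annotation, while this sketch takes a plain sense-strand DNA string, and the names `new_upstream_atgs` and `first_inframe_stop` as well as the example sequence are hypothetical.

```python
from typing import List, Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def new_upstream_atgs(utr_ref: str, pos: int, alt: str) -> List[int]:
    """Return 0-based start indices of ATG codons created by the variant at pos."""
    utr_alt = utr_ref[:pos] + alt + utr_ref[pos + 1:]
    created = []
    # A new ATG must overlap the variant, so only three start offsets are possible.
    for start in range(max(0, pos - 2), min(pos, len(utr_ref) - 3) + 1):
        if utr_alt[start:start + 3] == "ATG" and utr_ref[start:start + 3] != "ATG":
            created.append(start)
    return created


def first_inframe_stop(seq: str, atg_start: int) -> Optional[int]:
    """Return the index of the first stop codon in frame with atg_start, if any."""
    for i in range(atg_start + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Invented example sequence: a C>T change at index 3 turns ACG into ATG.
ref = "CCACGGCATAAGG"
alt_seq = ref[:3] + "T" + ref[4:]
starts = new_upstream_atgs(ref, 3, "T")
stop = first_inframe_stop(alt_seq, starts[0])
print(starts, stop)
```

In this toy case the new ATG is followed by an in-frame stop within the supplied sequence, which is the pattern that defines a candidate uORF; a real tool would also need strand handling and the position of the annotated start codon.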

Abstract 18374: Targeted Sequencing and Massively Parallel Reporter Assay Identify the Functional Variation Underlying the 4q25 Locus for Atrial Fibrillation

Circulation ◽

10.1161/circ.132.suppl_3.18374 ◽

2015 ◽

Vol 132 (suppl_3) ◽

Author(s):

Nathan R Tucker ◽

Jiangchuan Ye ◽

Honghuang Lin ◽

Michael A McLellan ◽

Emelia J Benjamin ◽

...

Keyword(s):

Atrial Fibrillation ◽

Association Studies ◽

Targeted Sequencing ◽

Massively Parallel ◽

Genome Wide Association Studies ◽

Sequencing Analysis ◽

Reporter Assay ◽

Enhancer Activity ◽

Functional Variants ◽

Massively Parallel Reporter Assay

Introduction: Genome-wide association studies have identified 14 independent loci for atrial fibrillation (AF). The 4q25 locus upstream of the left-right asymmetry gene PITX2 is, by far, the strongest association signal for AF. However, as with most GWAS loci, the functional variants are noncoding, presumed to be regulatory, and remain unknown. We therefore sought to rapidly identify the functional variants at an AF locus by combining high throughput sequencing and massively parallel reporter assays. Methods and Results: We sequenced a ~750kb region encompassing the PITX2 locus in 462 individuals with early-onset AF from the MGH AF Study and 464 referents from the Framingham Heart Study. The SNP most significantly associated with AF in our sequenced sample was rs2129983, which is 140kb from PITX2 (OR=2.43, P = 8.9 × 10⁻¹⁶). rs2129983 is approximately 1.7kb from the most significantly associated SNP in a prior AF GWAS, rs6817105 (r² = 0.52). From the targeted sequencing analysis, we identified 262 SNVs with a MAF >0.5% within a genomic region bounded by SNPs with an r² greater than 0.4 with the top variant. To identify functional variants, we then utilized a massively parallel reporter assay (MPRA) in order to measure enhancer activity at each SNP across the entire AF locus. In both HL-1 and C2C12 myoblasts, MPRA identified many distinct SNP regions with differential enhancer activity. Using AF-association status as a standard, we were able to identify a series of variants that have both differential activity in either cell line tested and also a high level of association (rs17042076, rs4469143). Mechanistically, these functional SNPs are predicted to alter transcription factor binding. Conclusions: We have comprehensively identified the AF-associated variation at 4q25 and determined which of these variants are functional through differential enhancer activity. Here, in addition to identifying the causative variation for AF at 4q25, we provide a generalizable pathway for translating this work to other loci, a method that could expedite the identification of causative genetic variants at other disease loci.

Identification of molecular markers for starch content in barley (Hordeum vulgare L.) by genome-wide association studies based on bulked samples

Plant Genetic Resources ◽

10.1017/s1479262120000143 ◽

2020 ◽

Vol 18 (3) ◽

pp. 111-119

Author(s):

Yinghu Zhang ◽

Haiye Luan ◽

Hui Zang ◽

Hongyan Yang ◽

Xiao Xu ◽

...

Keyword(s):

Molecular Markers ◽

Association Studies ◽

Starch Content ◽

Principal Component ◽

Mixed Linear Model ◽

High Linkage Disequilibrium ◽

Snp Markers ◽

Genome Wide Association Studies ◽

Hordeum Vulgare L ◽

Growing Seasons

Abstract Starch content is an important trait in barley. To evaluate the genetic diversity and identify molecular markers of starch content in barley, 40 cultivated barley genotypes collected from different regions, including genotypes whose starch content is at either the high or low end of the spectrum (15), were used in this study. All the genotypes were re-sequenced by the double-digest-restriction associated DNA sequencing method, and a total of 299,103 single-nucleotide polymorphism (SNP) markers were obtained. The genotypes were divided into four sub-populations based on FASTSTRUCTURE, principal component analysis and neighbour-joining tree analysis. All four sub-populations had a high linkage disequilibrium, especially group 3, whose members were recently bred for malting in the Jiangsu coastal area. The starch content of the barley lines was evaluated during three growing seasons (2014–2017), and the average values of starch content across the three growing seasons at the low and high ends were 51.5 and 55.0%, respectively. The starch content was affected by population structure; the barley in group 2 had a low starch content, while the barley in group 4 had a high starch content. Twenty-six SNP markers were identified as being significantly associated with starch content (P ⩽ 0.001) based on the average values across the three growing seasons using the mixed linear model method. These SNP markers were located on chromosomes 1H and 4H, and were considered loci of qSC1-1 and qSC4-1, respectively. The major identified QTLs for starch content are helpful for further research on carbohydrates and for barley breeding.

Genetics of COPD

Annual Review of Physiology ◽

10.1146/annurev-physiol-021317-121224 ◽

2020 ◽

Vol 82 (1) ◽

pp. 413-431 ◽

Cited By ~ 6

Author(s):

Edwin K. Silverman

Keyword(s):

Association Studies ◽

Chronic Obstructive ◽

Genome Wide Association Studies ◽

Obstructive Pulmonary Disease ◽

Functional Variants ◽

Genome Wide ◽

Antitrypsin Deficiency ◽

Alpha 1 Antitrypsin Deficiency ◽

Alpha 1 Antitrypsin ◽

Genomic Regions

Although chronic obstructive pulmonary disease (COPD) risk is strongly influenced by cigarette smoking, genetic factors are also important determinants of COPD. In addition to Mendelian syndromes such as alpha-1 antitrypsin deficiency, many genomic regions that influence COPD susceptibility have been identified in genome-wide association studies. Similarly, multiple genomic regions associated with COPD-related phenotypes, such as quantitative emphysema measures, have been found. Identifying the functional variants and key genes within these association regions remains a major challenge. However, newly identified COPD susceptibility genes are already providing novel insights into COPD pathogenesis. Network-based approaches that leverage these genetic discoveries have the potential to assist in decoding the complex genetic architecture of COPD.

Chronic lymphocytic leukemia (CLL) risk is mediated by multiple enhancer variants within CLL risk loci

Human Molecular Genetics ◽

10.1093/hmg/ddaa165 ◽

2020 ◽

Vol 29 (16) ◽

pp. 2761-2774

Author(s):

Huihuang Yan ◽

Shulan Tian ◽

Geffen Kleinstern ◽

Zhiquan Wang ◽

Jeong-Heon Lee ◽

...

Keyword(s):

Chronic Lymphocytic Leukemia ◽

Immune Cell ◽

Association Studies ◽

Cell Types ◽

Lymphocytic Leukemia ◽

Type I ◽

Genome Wide Association Studies ◽

Gene Promoters ◽

Functional Variants ◽

Increased Risk

Abstract Chronic lymphocytic leukemia (CLL) is the most common adult leukemia in Western countries. It has a strong genetic basis, showing a ~8-fold increased risk of CLL in first-degree relatives. Genome-wide association studies (GWAS) have identified 41 risk variants across 41 loci. However, for a majority of the loci, the functional variants and the mechanisms underlying their causal roles remain undefined. Here, we examined the genetic and epigenetic features associated with 12 index variants, along with any correlated (r² ≥ 0.5) variants, at the CLL risk loci located outside of gene promoters. Based on publicly available ChIP-seq and chromatin accessibility data as well as our own ChIP-seq data from CLL patients, we identified six candidate functional variants at six loci and at least two candidate functional variants at each of the remaining six loci. The functional variants are predominantly located within enhancers or super-enhancers, including bi-directionally transcribed enhancers, which are often restricted to immune cell types. Furthermore, we found that, at 78% of the functional variants, the alternative alleles altered the transcription factor binding motifs or histone modifications, indicating the involvement of these variants in the change of local chromatin state. Finally, the enhancers carrying functional variants physically interacted with genes enriched in the type I interferon signaling pathway, apoptosis, or TP53 network that are known to play key roles in CLL. These results support the regulatory roles for inherited noncoding variants in the pathogenesis of CLL.
