Which genetic variants in DNase I sensitive regions are functional?

2014 ◽  
Author(s):  
Gregory A Moyerbrailean ◽  
Chris T Harvey ◽  
Cynthia A Kalita ◽  
Xiaoquan Wen ◽  
Francesca Luca ◽  
...  

Ongoing large experimental characterization is crucial to determine all regulatory sequences, yet we do not know which genetic variants in those regions are non-silent. Here, we present a novel analysis integrating sequence and DNase I footprinting data for 653 samples to predict the impact of a sequence change on transcription factor binding for a panel of 1,372 motifs. Most genetic variants in footprints (5,810,227) do not show evidence of allele-specific binding (ASB). In contrast, functional genetic variants predicted by our computational models are highly enriched for ASB (3,217 SNPs at 20% FDR). Comparing silent to functional non-coding genetic variants, the latter are 1.22-fold enriched for GWAS traits, have lower allele frequencies, and affect footprints more distal to promoters or active in fewer tissues. Finally, integration of the annotations into 18 GWAS meta-studies improves identification of likely causal SNPs and transcription factors relevant for complex traits.
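As a rough illustration of what testing for allele-specific binding at a fixed FDR can look like, the sketch below applies an exact two-sided binomial test to reference and alternate read counts at each SNP and then a Benjamini-Hochberg correction. This is not the authors' model: the function names, the input layout, and the choice of a plain binomial test (which ignores over-dispersion and any prior from sequence models) are assumptions made for the example.

```python
from math import comb


def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # A small relative tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def call_asb(counts, fdr=0.20):
    """counts: dict mapping a SNP id to (ref_reads, alt_reads).
    Returns the SNP ids whose adjusted p-value is at or below `fdr`."""
    ids = list(counts)
    pvals = [binom_two_sided_p(counts[s][0], sum(counts[s])) for s in ids]
    qvals = benjamini_hochberg(pvals)
    return [s for s, q in zip(ids, qvals) if q <= fdr]
```

With hypothetical counts such as `{"snpA": (30, 2), "snpB": (10, 11)}`, only the strongly imbalanced SNP would be reported. Real footprint data usually need a model that accounts for over-dispersion and mapping bias, which this sketch deliberately leaves out.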

2018 ◽  
Author(s):  
Cynthia A. Kalita ◽  
Christopher D. Brown ◽  
Andrew Freiman ◽  
Jenna Isherwood ◽  
Xiaoquan Wen ◽  
...  

Many variants associated with complex traits are in non-coding regions, and contribute to phenotypes by disrupting regulatory sequences. To characterize these variants, we developed a streamlined protocol for a high-throughput reporter assay, BiT-STARR-seq (Biallelic Targeted STARR-seq), that identifies allele-specific expression (ASE) while accounting for PCR duplicates through unique molecular identifiers. We tested 75,501 oligos (43,500 SNPs) and identified 2,720 SNPs with significant ASE (FDR 10%). To validate disruption of binding as one of the mechanisms underlying ASE, we developed a new high-throughput allele-specific binding assay for NFKB-p50. We identified 2,951 SNPs with allele-specific binding (ASB) (FDR 10%); 173 of these SNPs also had ASE (OR=1.97, p-value=0.0006). Of variants associated with complex traits, 1,531 resulted in ASE and 1,662 showed ASB. For example, we found that the Crohn’s disease risk variant at rs3810936 increases NFKB binding and results in altered gene expression.
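To make the idea of accounting for PCR duplicates through unique molecular identifiers concrete, here is a minimal sketch that collapses reads sharing the same SNP, allele, and UMI before counting. The tuple layout and the function name are hypothetical, and the exact-match collapsing (no tolerance for sequencing errors in the UMI) is a simplifying assumption, not the published protocol.

```python
from collections import defaultdict


def allele_counts_by_umi(reads):
    """reads: iterable of (snp_id, allele, umi) tuples, one per sequenced read,
    where allele is "ref" or "alt".
    Returns a dict mapping snp_id to {"ref": n, "alt": n}, counting each
    distinct UMI once per SNP and allele."""
    seen = defaultdict(set)
    for snp_id, allele, umi in reads:
        # Reads with the same UMI are treated as PCR copies of one molecule.
        seen[(snp_id, allele)].add(umi)

    counts = defaultdict(lambda: {"ref": 0, "alt": 0})
    for (snp_id, allele), umis in seen.items():
        counts[snp_id][allele] = len(umis)
    return dict(counts)
```

The deduplicated counts, rather than raw read counts, would then feed whatever allelic-imbalance test is used downstream.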


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Lauren L. Schmitz ◽  
Julia Goodwin ◽  
Jiacheng Miao ◽  
Qiongshi Lu ◽  
Dalton Conley

Unemployment shocks from the COVID-19 pandemic have reignited concerns over the long-term effects of job loss on population health. Past research has highlighted the corrosive effects of unemployment on health and health behaviors. This study examines whether the effects of job loss on changes in body mass index (BMI) are moderated by genetic predisposition using data from the U.S. Health and Retirement Study (HRS). To improve detection of gene-by-environment (G × E) interplay, we interacted layoffs from business closures, a plausibly exogenous environmental exposure, with whole-genome polygenic scores (PGSs) that capture genetic contributions to both the population mean (mPGS) and variance (vPGS) of BMI. Results show evidence of genetic moderation using a vPGS (as opposed to an mPGS) and indicate genome-wide summary measures of phenotypic plasticity may further our understanding of how environmental stimuli modify the distribution of complex traits in a population.


2009 ◽  
Vol 296 (5) ◽  
pp. L713-L725 ◽  
Author(s):  
Li Gao ◽  
Kathleen C. Barnes

It has been well established that acute lung injury (ALI), and the more severe presentation of acute respiratory distress syndrome (ARDS), constitute complex traits characterized by a multigenic and multifactorial etiology. Identification and validation of genetic variants contributing to disease susceptibility and severity has been hampered by the profound heterogeneity of the clinical phenotype and the role of environmental factors, which includes treatment, on outcome. The critical nature of ALI and ARDS, compounded by the impact of phenotypic heterogeneity, has rendered the amassing of sufficiently powered studies especially challenging. Nevertheless, progress has been made in the identification of genetic variants in select candidate genes, which has enhanced our understanding of the specific pathways involved in disease manifestation. Identification of novel candidate genes for which genetic association studies have confirmed a role in disease has been greatly aided by the powerful tool of high-throughput expression profiling. This article will review these studies to date, summarizing candidate genes associated with ALI and ARDS, acknowledging those that have been replicated in independent populations, with a special focus on the specific pathways for which candidate genes identified so far can be clustered.


1998 ◽  
Vol 18 (11) ◽  
pp. 6767-6776 ◽  
Author(s):  
Piroska E. Szabó ◽  
Gerd P. Pfeifer ◽  
Jeffrey R. Mann

ABSTRACT Genomic imprinting results in parent-specific monoallelic expression of a small number of genes in mammals. The identity of imprints is unknown, but much evidence points to a role for DNA methylation. The maternal alleles of the imprinted H19 gene are active and hypomethylated; the paternal alleles are inactive and hypermethylated. Roles for other epigenetic modifications are suggested by allele-specific differences in nuclease hypersensitivity at particular sites. To further analyze the possible epigenetic mechanisms determining monoallelic expression of H19, we have conducted in vivo dimethylsulfate and DNase I footprinting of regions upstream of the coding sequence in parthenogenetic and androgenetic embryonic stem cells. These cells carry only maternally and paternally derived alleles, respectively. We observed the presence of maternal-allele-specific dimethylsulfate and DNase I footprints at the promoter indicative of protein-DNA interactions at a CCAAT box and at binding sites for transcription factors Sp1 and AP-2. Also, at the boundary of a region further upstream for which existent differential methylation has been suggested to constitute an imprint, we observed a number of strand-specific dimethylsulfate reactivity differences specific to the maternal allele, along with an unusual chromatin structure in that both strands of maternally derived DNA were strongly hypersensitive to DNase I cutting over a distance of 100 nucleotides. We therefore reveal the existence of novel parent-specific epigenetic modifications, which in addition to DNA methylation, could constitute imprints or maintain monoallelic expression of H19.


Science ◽  
2020 ◽  
Vol 369 (6503) ◽  
pp. 561-565 ◽  
Author(s):  
Siwei Zhang ◽  
Hanwen Zhang ◽  
Yifan Zhou ◽  
Min Qiao ◽  
Siming Zhao ◽  
...  

Most neuropsychiatric disease risk variants are in noncoding sequences and lack functional interpretation. Because regulatory sequences often reside in open chromatin, we reasoned that neuropsychiatric disease risk variants may affect chromatin accessibility during neurodevelopment. Using human induced pluripotent stem cell (iPSC)–derived neurons that model developing brains, we identified thousands of genetic variants exhibiting allele-specific open chromatin (ASoC). These neuronal ASoCs were partially driven by altered transcription factor binding, overrepresented in brain gene enhancers and expression quantitative trait loci, and frequently associated with distal genes through chromatin contacts. ASoCs were enriched for genetic variants associated with brain disorders, enabling identification of functional schizophrenia risk variants and their cis-target genes. This study highlights ASoC as a functional mechanism of noncoding neuropsychiatric risk variants, providing a powerful framework for identifying disease causal variants and genes.


2001 ◽  
Vol 1 ◽  
pp. 218-224 ◽  
Author(s):  
Subhasis Banerjee ◽  
Alan Smallwood ◽  
Scott Lamond ◽  
Stuart Campbell ◽  
Geeta Nargund

The imprinting control region (ICR) located far upstream of the H19 gene, in conjunction with enhancers, modulates the transcription of Igf2 and H19 genes in an allele-specific manner. On paternal inheritance, the methylated ICR silences the H19 gene and indirectly facilitates transcription from the distant Igf2 promoter, whereas on the maternal chromosome the unmethylated ICR, together with enhancers, activates transcription of the H19 gene and thereby contributes to the repression of Igf2. This repression of maternal Igf2 has recently been postulated to be due to a chromatin boundary or insulator function of the unmethylated ICR. Central to the insulator model is the site-specific binding of a ubiquitous nuclear factor CTCF which exhibits remarkable flexibility in functioning as transcriptional activator or silencer. We suggest that the ICR positioned close to the enhancers in an episomal context might function as a transcriptional silencer by virtue of interaction of CTCF with its modifiers such as SIN3A and histone deacetylases. Furthermore, a localised folded chromatin structure resulting from juxtaposition of two disparate regulatory sequences (enhancer ICR) could be the mechanistic basis of ICR-mediated position-dependent (ICR-promoter) transcriptional repression in transgenic Drosophila.


2018 ◽  
Author(s):  
Ei-Wen Yang ◽  
Jae Hoon Bahn ◽  
Esther Yun-Hua Hsiao ◽  
Boon Xin Tan ◽  
Yiwei Sun ◽  
...  

Allele-specific protein-RNA binding is an essential aspect that may reveal functional genetic variants influencing RNA processing and gene expression phenotypes. Recently, genome-wide detection of in vivo binding sites of RNA binding proteins (RBPs) has been greatly facilitated by the enhanced UV crosslinking and immunoprecipitation (eCLIP) protocol. Hundreds of eCLIP-Seq data sets were generated from HepG2 and K562 cells during the ENCODE3 phase. These data afford a valuable opportunity to examine allele-specific binding (ASB) of RBPs. To this end, we developed a new computational algorithm, called BEAPR (Binding Estimation of Allele-specific Protein-RNA interaction). In identifying statistically significant ASB sites, BEAPR takes into account UV cross-linking induced sequence propensity and technical variations between replicated experiments. Using simulated data and actual eCLIP-Seq data, we show that BEAPR largely outperforms the commonly used chi-squared and Fisher’s exact tests. Importantly, BEAPR overcomes the inherent over-dispersion problem of the other methods. Complemented by experimental validations, we demonstrate that ASB events are significantly associated with genetic regulation of splicing and mRNA abundance, supporting the usage of this method to pinpoint functional genetic variants in post-transcriptional gene regulation. Many variants with ASB patterns of RBPs were found as genetic variants with cancer or other disease relevance. About 38% of ASB variants were in linkage disequilibrium with single nucleotide polymorphisms from genome-wide association studies. Overall, our results suggest that BEAPR is an effective method to reveal ASB patterns in eCLIP and can inform functional interpretation of disease-related genetic variants.


2021 ◽  
Author(s):  
Nastassia Gobet ◽  
Maxime Jan ◽  
Paul Franken ◽  
Ioannis Xenarios

Genetic variations affect behavior and cause disease but understanding how these variants drive complex traits is still an open question. A common approach is to link the genetic variants to intermediate molecular phenotypes such as the transcriptome using RNA-sequencing (RNA-seq). Paradoxically, these variants between the samples are usually ignored at the beginning of RNA-seq analyses of many model organisms. This can skew the transcriptome estimates that are used later for downstream analyses, such as expression quantitative trait locus (eQTL) detection. Here, we assessed the impact of reference-based analysis on the transcriptome and eQTLs in a widely-used mouse genetic population: the BXD panel of recombinant inbred lines. We highlight existing reference bias in the transcriptome data analysis and propose practical solutions which combine available genetic variants, genotypes, and genome reference sequence. The use of custom BXD line references improved downstream analysis compared to classical genome reference. These insights would likely benefit genetic studies with a transcriptomic component and demonstrate that genome references might need to be reassessed and improved.
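One simple way to picture a custom line-specific reference is to substitute known SNP alleles into the standard reference sequence before read alignment. The sketch below does only that, under stated assumptions: positions are 0-based, only single-base substitutions are handled (no indels or structural variants), and the function name and input layout are hypothetical rather than taken from the study.

```python
def apply_snps(reference, snps):
    """reference: reference sequence as a string.
    snps: iterable of (pos, ref_base, alt_base) with 0-based positions.
    Returns a new sequence with each alternate base substituted in."""
    seq = list(reference)
    for pos, ref_base, alt_base in snps:
        # Refuse to edit if the variant record disagrees with the reference.
        if seq[pos] != ref_base:
            raise ValueError(
                f"reference mismatch at {pos}: expected {ref_base}, found {seq[pos]}"
            )
        seq[pos] = alt_base
    return "".join(seq)
```

For example, `apply_snps("ACGTAC", [(1, "C", "T"), (4, "A", "G")])` yields a sequence carrying both alternate alleles. A production workflow would typically rely on established variant-aware tooling and would also need to handle indels and coordinate shifts.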

