allelic effect
Recently Published Documents

2020
Author(s): Kushal K. Dey, Samuel S. Kim, Steven Gazal, Joseph Nasser, Jesse M. Engreitz, ...

Abstract: Deep learning models have achieved great success in predicting genome-wide regulatory effects from DNA sequence, but recent work has reported that SNP annotations derived from these predictions contribute limited unique information for human complex disease. Here, we explore three integrative approaches to improve the disease informativeness of allelic-effect annotations (predicted difference between reference and variant alleles) constructed using two previously trained deep learning models, DeepSEA and Basenji. First, we employ gradient boosting to learn optimal combinations of deep learning annotations, using (off-chromosome) fine-mapped SNPs and matched control SNPs for training. Second, we improve the specificity of these annotations by restricting them to SNPs implicated by (proximal and distal) SNP-to-gene (S2G) linking strategies, e.g. prioritizing SNPs involved in gene regulation. Third, we predict gene expression (and derive allelic-effect annotations) from deep learning annotations at SNPs implicated by S2G linking strategies, generalizing the previously proposed ExPecto approach, which incorporates deep learning annotations based on distance to TSS. We evaluated these approaches with stratified LD score regression, using functional data in blood and focusing on 11 autoimmune diseases and blood-related traits (average N=306K). We determined that the three approaches produced SNP annotations that were uniquely informative for these diseases/traits, even though linear combinations of the underlying DeepSEA and Basenji blood annotations were not. Our results highlight the benefits of integrating SNP annotations produced by deep learning models with other types of data, including data linking SNPs to genes.
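The second approach in this abstract, restricting an annotation to SNPs implicated by an S2G linking strategy, can be illustrated with a minimal sketch. It assumes the restriction is a simple masking step (annotation values are kept at linked SNPs and set to zero elsewhere); the function name `restrict_to_s2g`, the SNP identifiers, and the numeric values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Set


def restrict_to_s2g(annotation: Dict[str, float], linked_snps: Set[str]) -> Dict[str, float]:
    """Keep an annotation value only at SNPs linked to a gene; zero it elsewhere.

    Assumption: restriction is modeled as masking by a binary S2G indicator.
    """
    return {
        snp: (value if snp in linked_snps else 0.0)
        for snp, value in annotation.items()
    }


# Toy, made-up inputs for illustration only.
allelic_effect = {"rs1": 0.42, "rs2": 0.10, "rs3": 0.77}
s2g_linked = {"rs1", "rs3"}  # SNPs implicated by a hypothetical S2G strategy

restricted = restrict_to_s2g(allelic_effect, s2g_linked)
print(restricted)
```

Masking keeps the restricted annotation defined on the same SNP set as the original, which is convenient when several annotations are later analyzed jointly.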


2020
Author(s): Christine H. Diepenbrock, Daniel C. Ilut, Maria Magallanes-Lundback, Catherine B. Kandianis, Alexander E. Lipka, ...

Abstract: Vitamin A deficiency remains prevalent in parts of Asia, Latin America and sub-Saharan Africa where maize is a food staple. Extensive natural variation exists for carotenoids in maize grain; to understand its genetic basis, we conducted a joint linkage and genome-wide association study in the U.S. maize nested association mapping panel. Eleven of the 44 detected quantitative trait loci (QTL) were resolved to individual genes. Six of these were expression QTL (eQTL), showing strong correlations between RNA-seq expression abundances and QTL allelic effect estimates across six stages of grain development. These six eQTL also had the largest percent phenotypic variance explained, and in major part comprised the three to five loci capturing the bulk of genetic variation for each trait. Most of these eQTL had highly correlated QTL allelic effect estimates across multiple traits, suggesting that pleiotropy within this pathway is largely regulated at the expression level. Significant pairwise epistatic interactions were also detected. These findings provide the most comprehensive genome-level understanding of the genetic and molecular control of carotenoids in any plant system, and a roadmap to accelerate breeding for provitamin A and other priority carotenoid traits in maize grain that should be readily extensible to other cereals.
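The correlation between expression abundance and QTL allelic effect estimates mentioned in this abstract can be sketched as a Pearson correlation over matched values. This is only an illustration under the assumption that the two quantities are paired one-to-one (for example, one value per parental line); the abstract does not state which correlation measure was used, and the helper `pearson_r` and the toy numbers are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Toy, made-up values: expression abundance and allelic effect estimate
# for the same four (hypothetical) parental lines.
expression = [1.0, 2.0, 3.0, 4.0]
allelic_effects = [0.10, 0.25, 0.30, 0.50]

r = pearson_r(expression, allelic_effects)
print(f"r = {r:.3f}")
```

A value of r close to 1 or -1 at a locus would be the kind of strong correlation the abstract describes for the six eQTL.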


2019
Author(s): Kushal K. Dey, Bryce Van de Geijn, Samuel Sungil Kim, Farhad Hormozdiari, David R. Kelley, ...

Abstract: Deep learning models have shown great promise in predicting genome-wide regulatory effects from DNA sequence, but their informativeness for human complex diseases and traits is not fully understood. Here, we evaluate the disease informativeness of allelic-effect annotations (absolute value of the predicted difference between reference and variant alleles) constructed using two previously trained deep learning models, DeepSEA and Basenji. We apply stratified LD score regression (S-LDSC) to 41 independent diseases and complex traits (average N=320K) to evaluate each annotation's informativeness for disease heritability conditional on a broad set of coding, conserved, regulatory and LD-related annotations from the baseline-LD model and other sources; as a secondary metric, we also evaluate the accuracy of models that incorporate deep learning annotations in predicting disease-associated or fine-mapped SNPs. We aggregated annotations across all tissues (resp. blood cell types or brain tissues) in meta-analyses across all 41 traits (resp. 11 blood-related traits or 8 brain-related traits). These allelic-effect annotations were highly enriched for disease heritability, but produced only limited conditionally significant results: only Basenji-H3K4me3 in meta-analyses across all 41 traits and brain-specific Basenji-H3K4me3 in meta-analyses across 8 brain-related traits. We conclude that deep learning models have yet to achieve their full potential to provide a considerable amount of unique information for complex disease, and that the informativeness of deep learning models for disease beyond established functional annotations cannot be inferred from metrics based on their accuracy in predicting regulatory annotations.
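The allelic-effect annotation defined in this abstract (absolute value of the predicted difference between reference and variant alleles) can be written as a short sketch. The per-SNP definition follows the abstract; the aggregation across tissues by a simple mean is an assumption made here for illustration, since the abstract does not say how annotations were aggregated. The function names and all numeric values are hypothetical, and no real DeepSEA or Basenji API is used.

```python
from typing import Dict, List


def allelic_effect(ref_pred: List[float], alt_pred: List[float]) -> List[float]:
    """Per-SNP absolute difference between variant- and reference-allele predictions."""
    if len(ref_pred) != len(alt_pred):
        raise ValueError("ref_pred and alt_pred must have the same length")
    return [abs(alt - ref) for ref, alt in zip(ref_pred, alt_pred)]


def mean_across_tissues(per_tissue: Dict[str, List[float]]) -> List[float]:
    """Average per-SNP annotations over tissues.

    Assumption: a plain mean; the source does not specify the aggregation rule.
    """
    tracks = list(per_tissue.values())
    if not tracks:
        raise ValueError("at least one tissue is required")
    n_snps = len(tracks[0])
    if any(len(t) != n_snps for t in tracks):
        raise ValueError("all tissues must cover the same SNPs")
    return [sum(t[i] for t in tracks) / len(tracks) for i in range(n_snps)]


# Toy, made-up model outputs for three SNPs in two hypothetical tissues.
effects = {
    "tissue_a": allelic_effect([0.25, 0.5, 0.75], [0.5, 0.5, 0.25]),
    "tissue_b": allelic_effect([0.0, 1.0, 0.5], [0.25, 0.5, 0.5]),
}
aggregated = mean_across_tissues(effects)
print(aggregated)
```

Taking the absolute value discards the direction of the predicted change, so the annotation measures only how strongly a variant is predicted to alter the regulatory signal.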

