Improving the informativeness of Mendelian disease-derived pathogenicity scores for common disease

Mapping Intimacies ◽

10.1101/2020.01.02.890657 ◽

2020 ◽

Author(s):

Samuel S. Kim ◽

Kushal K. Dey ◽

Omer Weissbrod ◽

Carla Marquez-Luna ◽

Steven Gazal ◽

...

Keyword(s):

Candidate Gene ◽

Fine Mapping ◽

Complex Traits ◽

High Potential ◽

Model Fit ◽

Considerable Progress ◽

Gradient Boosting ◽

Common Disease ◽

Mendelian Disease ◽

Functional Annotations

Abstract Despite considerable progress on pathogenicity scores prioritizing both coding and noncoding variants for Mendelian disease, little is known about the utility of these pathogenicity scores for common disease. Here, we sought to assess the informativeness of Mendelian disease-derived pathogenicity scores for common disease, and to improve upon existing scores. We first applied stratified LD score regression to assess the informativeness of annotations defined by top variants from published Mendelian disease-derived pathogenicity scores across 41 independent common diseases and complex traits (average N = 320K). Several of the resulting annotations were informative for common disease, even after conditioning on a broad set of coding, conserved, regulatory and LD-related annotations from the baseline-LD model. We then improved upon the published pathogenicity scores by developing AnnotBoost, a gradient boosting-based framework to impute and denoise pathogenicity scores using functional annotations from the baseline-LD model. AnnotBoost substantially increased the informativeness for common disease of both previously uninformative and previously informative pathogenicity scores, implying pervasive variant-level overlap between Mendelian disease and common disease. The boosted scores also produced significant improvements in heritability model fit and in classifying disease-associated, fine-mapped SNPs. Our boosted scores have high potential to improve candidate gene discovery and fine-mapping for common disease.
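
The gradient-boosting idea in this abstract can be illustrated with a small sketch. It is not the AnnotBoost implementation: it assumes that top-ranked variants of a published score are used as positive labels and that binary functional annotations are the features. The toy annotations, labels, and function names below are all hypothetical. Boosted decision stumps then give every variant a score driven by its annotations, which is one way an unscored variant could be imputed and a noisy label smoothed.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Pick the binary annotation whose split best fits the residuals (least squares)."""
    best = None
    for j in range(len(X[0])):
        left = [r for x, r in zip(X, residuals) if x[j] == 0]
        right = [r for x, r in zip(X, residuals) if x[j] == 1]
        if not left or not right:
            continue  # annotation is constant on this data, no split possible
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = sum((r - mean_l) ** 2 for r in left) + sum((r - mean_r) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, j, mean_l, mean_r)
    if best is None:
        raise ValueError("no annotation varies across variants")
    return best[1:]

def boost(X, y, n_rounds=50, lr=0.3):
    """Gradient boosting with logistic loss: each stump fits the residual y - p."""
    f = [0.0] * len(X)  # current log-odds per variant
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(fi) for yi, fi in zip(y, f)]
        j, mean_l, mean_r = fit_stump(X, residuals)
        stumps.append((j, mean_l, mean_r))
        f = [fi + lr * (mean_r if x[j] else mean_l) for fi, x in zip(f, X)]
    return stumps

def predict(stumps, x, lr=0.3):
    """Boosted score in (0, 1) for one variant's annotation vector."""
    return sigmoid(sum(lr * (mr if x[j] else ml) for j, ml, mr in stumps))

# Hypothetical annotations per variant: [conserved, coding, enhancer].
X = [[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 0], [0, 0, 0],
     [0, 0, 1], [0, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
# Hypothetical labels: 1 = among the top variants of the published score.
# The fifth variant is a "noisy" positive with no supporting annotation.
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

stumps = boost(X, y)
print("conserved coding variant:", round(predict(stumps, [1, 1, 0]), 3))
print("unannotated variant:     ", round(predict(stumps, [0, 0, 0]), 3))
```

In this toy setting the unannotated variant ends up with a low boosted score even though one such variant was labeled positive, because the score depends only on annotations shared across variants.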

Improving the informativeness of Mendelian disease-derived pathogenicity scores for common disease

Nature Communications ◽

10.1038/s41467-020-20087-2 ◽

2020 ◽

Vol 11 (1) ◽

Author(s):

Samuel S. Kim ◽

Kushal K. Dey ◽

Omer Weissbrod ◽

Carla Márquez-Luna ◽

Steven Gazal ◽

...

Keyword(s):

Machine Learning ◽

Linkage Disequilibrium ◽

Candidate Gene ◽

Fine Mapping ◽

Complex Traits ◽

Model Fit ◽

Common Disease ◽

Mendelian Disease ◽

Functional Annotations ◽

Learning Framework

Abstract Despite considerable progress on pathogenicity scores prioritizing variants for Mendelian disease, little is known about the utility of these scores for common disease. Here, we assess the informativeness of Mendelian disease-derived pathogenicity scores for common disease and improve upon existing scores. We first apply stratified linkage disequilibrium (LD) score regression to evaluate published pathogenicity scores across 41 common diseases and complex traits (average N = 320K). Several of the resulting annotations are informative for common disease, even after conditioning on a broad set of functional annotations. We then improve upon published pathogenicity scores by developing AnnotBoost, a machine learning framework to impute and denoise pathogenicity scores using a broad set of functional annotations. AnnotBoost substantially increases the informativeness for common disease of both previously uninformative and previously informative pathogenicity scores, implying that Mendelian and common disease variants share similar properties. The boosted scores also produce improvements in heritability model fit and in classifying disease-associated, fine-mapped SNPs. Our boosted scores may improve fine-mapping and candidate gene discovery for common disease.

Cross-population Joint Analysis of eQTLs: Fine Mapping and Functional Annotation

10.1101/008797 ◽

2014 ◽

Cited By ~ 3

Author(s):

Xiaoquan Wen ◽

Francesca Luca ◽

Roger Pique-Regi

Keyword(s):

Fine Mapping ◽

Complex Traits ◽

Statistical Approach ◽

Molecular Level ◽

Joint Analysis ◽

P Value ◽

Analysis Framework ◽

Population Groups ◽

Functional Annotations ◽

Eqtl Data

Mapping expression quantitative trait loci (eQTLs) has been shown to be a powerful tool to uncover the genetic underpinnings of many complex traits at the molecular level. In this paper, we present an integrative analysis approach that leverages eQTL data collected from multiple population groups. In particular, our approach effectively identifies multiple independent cis-eQTL signals that are consistently present across populations, accounting for heterogeneity in allele frequencies and patterns of linkage disequilibrium. Furthermore, our analysis framework enables integrating high-resolution functional annotations into the analysis of eQTLs. We applied our statistical approach to analyze the GEUVADIS data consisting of samples from five population groups. From this analysis, we concluded that i) joint analysis across population groups greatly improves the power of eQTL discovery and the resolution of fine mapping of causal eQTLs; ii) many genes harbor multiple independent eQTLs in their cis regions; iii) genetic variants that disrupt transcription factor binding are significantly enriched in eQTLs (p-value = 4.93 × 10^-22).

Combining SNP-to-gene linking strategies to pinpoint disease genes and assess disease omnigenicity

10.1101/2021.08.02.21261488 ◽

2021 ◽

Author(s):

Steven Gazal ◽

Omer Weissbrod ◽

Farhad Hormozdiari ◽

Kushal Dey ◽

Joseph Nasser ◽

...

Keyword(s):

Fine Mapping ◽

Complex Traits ◽

Target Genes ◽

Disease Risk ◽

Association Studies ◽

Common Disease ◽

Disease Genes ◽

Genome Wide Association Studies ◽

Functional Interpretation ◽

Genome Wide

Although genome-wide association studies (GWAS) have identified thousands of disease-associated common SNPs, these SNPs generally do not implicate the underlying target genes, as most disease SNPs are regulatory. Many SNP-to-gene (S2G) linking strategies have been developed to link regulatory SNPs to the genes that they regulate in cis, but it is unclear how these strategies should be applied in the context of interpreting common disease risk variants. We developed a framework for evaluating and combining different S2G strategies to optimize their informativeness for common disease risk, leveraging polygenic analyses of disease heritability to define and estimate their precision and recall. We applied our framework to GWAS summary statistics for 63 diseases and complex traits (average N=314K), evaluating 50 S2G strategies. Our optimal combined S2G strategy (cS2G) included 7 constituent S2G strategies (Exon, Promoter, 2 fine-mapped cis-eQTL strategies, EpiMap enhancer-gene linking, Activity-By-Contact (ABC), and Cicero), and achieved a precision of 0.75 and a recall of 0.33, more than doubling the precision and/or recall of any individual strategy; this implies that 33% of SNP-heritability can be linked to causal genes with 75% confidence. We applied cS2G to fine-mapping results for 49 UK Biobank diseases/traits to predict 7,111 causal SNP-gene-disease triplets (with S2G-derived functional interpretation) with high confidence. Finally, we applied cS2G to genome-wide fine-mapping results for these traits (not restricted to GWAS loci) to rank genes by the heritability linked to each gene, providing an empirical assessment of disease omnigenicity; averaging across traits, we determined that the top 200 (1%) of ranked genes explained roughly half of the heritability linked to all genes. 
Our results highlight the benefits of our cS2G strategy in providing functional interpretation of GWAS findings; we anticipate that precision and recall will increase further under our framework as improved functional assays lead to improved S2G strategies.

SparsePro: an efficient genome-wide fine-mapping method integrating summary statistics and functional annotations

10.1101/2021.10.04.463133 ◽

2021 ◽

Author(s):

Wenmin Zhang ◽

Hamed S Najafabadi ◽

Yue Li

Keyword(s):

Fine Mapping ◽

Complex Traits ◽

Genetic Architecture ◽

Association Studies ◽

Computational Cost ◽

Mapping Method ◽

Genome Wide Association Studies ◽

Functional Annotations ◽

Genome Wide ◽

Causal Variants

Identifying causal variants from genome-wide association studies (GWASs) is challenging due to widespread linkage disequilibrium (LD). Functional annotations of the genome may help prioritize variants that are biologically relevant and thus improve fine-mapping of GWAS results. However, classical fine-mapping methods have a high computational cost, particularly when the underlying genetic architecture and LD patterns are complex. Here, we propose a novel approach, SparsePro, to efficiently conduct functionally informed statistical fine-mapping. Our method enjoys two major innovations: First, by creating a sparse low-dimensional projection of the high-dimensional genotype, we enable a linear search of causal variants instead of an exponential search of causal configurations used in existing methods; Second, we adopt a probabilistic framework with a highly efficient variational expectation-maximization algorithm to integrate statistical associations and functional priors. We evaluate SparsePro through extensive simulations using resources from the UK Biobank. Compared to state-of-the-art methods, SparsePro achieved more accurate and well-calibrated posterior inference with greatly reduced computation time. We demonstrate the utility of SparsePro by investigating the genetic architecture of five functional biomarkers of vital organs. We identify potential causal variants contributing to the genetically encoded coordination mechanisms between vital organs and pinpoint target genes with potential pleiotropic effects. In summary, we have developed an efficient genome-wide fine-mapping method with the ability to integrate functional annotations. Our method may have wide utility in understanding the genetics of complex traits as well as in increasing the yield of functional follow-up studies of GWASs.
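
To see why replacing a search over causal configurations with a search over single variants matters, the two counts can be compared directly. The sketch below is only arithmetic under an assumed setup: a locus with m variants and at most K causal variants, with the linear search modeled as K passes over the m variants. It does not reproduce SparsePro itself, and the function names are hypothetical.

```python
from math import comb

def n_configurations(m, max_causal):
    """Non-empty causal configurations with at most max_causal causal variants among m."""
    return sum(comb(m, k) for k in range(1, max_causal + 1))

def n_linear_evaluations(m, max_causal):
    """Evaluations if each of max_causal effects scans the m variants once (assumed model)."""
    return m * max_causal

# Compare the two search sizes for loci of increasing size, allowing up to 5 causal variants.
for m in (10, 100, 1000):
    print(m, n_configurations(m, 5), n_linear_evaluations(m, 5))
```

Under these assumptions the number of configurations grows polynomially of degree K in m, while the linear search grows in proportion to m.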

Pharmacogenetics in Jewish populations

Drug Metabolism and Drug Interactions ◽

10.1515/dmdi-2013-0069 ◽

2014 ◽

Vol 29 (4) ◽

Cited By ~ 13

Author(s):

Yao Yang ◽

Inga Peter ◽

Stuart A. Scott

Keyword(s):

Complex Traits ◽

Jewish Population ◽

Jewish People ◽

Common Disease ◽

Disease Genes ◽

Mendelian Disease ◽

Racial Groups ◽

Founder Mutations ◽

Genomic Studies ◽

Jewish Populations

Abstract Spanning over 2000 years, the Jewish population has a long history of migration, population bottlenecks, expansions, and geographical isolation, which has resulted in a unique genetic architecture among the Jewish people. As such, many Mendelian disease genes and founder mutations for autosomal recessive diseases have been discovered in several Jewish groups, which have prompted recent genomic studies in the Jewish population on common disease susceptibility and other complex traits. Although few studies on the genetic determinants of drug response variability have been reported in the Jewish population, a number of unique pharmacogenetic variants have been discovered that are more common in Jewish populations than in other major racial groups. Notable examples identified in the Ashkenazi Jewish (AJ) population include the vitamin K epoxide reductase complex subunit 1 (

Functionally-informed fine-mapping and polygenic localization of complex trait heritability

10.1101/807792 ◽

2019 ◽

Cited By ~ 10

Author(s):

Omer Weissbrod ◽

Farhad Hormozdiari ◽

Christian Benner ◽

Ran Cui ◽

Jacob Ulirsch ◽

...

Keyword(s):

Fine Mapping ◽

Functional Data ◽

Complex Traits ◽

Model Misspecification ◽

Complex Trait ◽

Hair Color ◽

Mapping Accuracy ◽

Functional Annotations ◽

Genome Wide ◽

Causal Probability

Abstract Fine-mapping aims to identify causal variants impacting complex traits. Several recent methods improve fine-mapping accuracy by prioritizing variants in enriched functional annotations. However, these methods can only use information at genome-wide significant loci (or a small number of functional annotations), severely limiting the benefit of functional data. We propose PolyFun, a computationally scalable framework to improve fine-mapping accuracy using genome-wide functional data for a broad set of coding, conserved, regulatory and LD-related annotations. PolyFun prioritizes variants in enriched functional annotations by specifying prior causal probabilities for fine-mapping methods such as SuSiE or FINEMAP, employing special procedures to ensure robustness to model misspecification and winner’s curse. In simulations with in-sample LD, PolyFun + SuSiE and PolyFun + FINEMAP were well-calibrated and identified >20% more variants with posterior causal probability >0.95 than their non-functionally informed counterparts (and >33% more fine-mapped variants than previous functionally-informed fine-mapping methods). In simulations with mismatched reference LD, PolyFun + SuSiE remained well-calibrated when reducing the maximum number of assumed causal SNPs per locus, which reduces absolute power but still produces large relative improvements. In analyses of 49 UK Biobank traits (average N=318K) with in-sample LD, PolyFun + SuSiE identified 3,025 fine-mapped variant-trait pairs with posterior causal probability >0.95, a >32% improvement vs. SuSiE; 223 variants were fine-mapped for multiple genetically uncorrelated traits, indicating pervasive pleiotropy. We used posterior mean per-SNP heritabilities from PolyFun + SuSiE to perform polygenic localization, constructing minimal sets of common SNPs causally explaining 50% of common SNP heritability; these sets ranged in size from 28 (hair color) to 3,400 (height) to 2 million (number of children).
In conclusion, PolyFun prioritizes variants for functional follow-up and provides insights into complex trait architectures.
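
The polygenic localization step described in this abstract (the smallest set of SNPs that explains a target share of SNP heritability) can be sketched as a ranking and a running sum. This is a minimal illustration, assuming per-SNP heritability estimates are already available as a plain mapping. The SNP names, values, and function name are hypothetical, and nothing here is taken from the PolyFun code.

```python
def minimal_snp_set(per_snp_h2, target_fraction=0.5):
    """Smallest set of SNPs whose summed heritability reaches target_fraction of the total.

    Taking SNPs in decreasing order of heritability gives the fewest SNPs
    needed to reach a given cumulative sum.
    """
    total = sum(per_snp_h2.values())
    ranked = sorted(per_snp_h2.items(), key=lambda kv: kv[1], reverse=True)
    chosen, running = [], 0.0
    for snp, h2 in ranked:
        if running >= target_fraction * total:
            break  # target already reached, stop adding SNPs
        chosen.append(snp)
        running += h2
    return chosen

# Hypothetical posterior mean per-SNP heritabilities (arbitrary units).
toy = {"snp1": 0.30, "snp2": 0.15, "snp3": 0.10,
       "snp4": 0.05, "snp5": 0.05, "snp6": 0.05}
print(minimal_snp_set(toy))
```

A trait whose heritability is concentrated in a few SNPs yields a short list, while a highly polygenic trait needs many SNPs to reach the same fraction, which matches the range of set sizes reported in the abstract.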

SparsePro: an efficient genome-wide fine-mapping method integrating summary statistics and functional annotations

10.21203/rs.3.rs-1160063/v1 ◽

2022 ◽

Author(s):

Wenmin Zhang ◽

Hamed Najafabadi ◽

Yue Li

Keyword(s):

Fine Mapping ◽

Complex Traits ◽

Genetic Architecture ◽

Association Studies ◽

Computational Cost ◽

Mapping Method ◽

Genome Wide Association Studies ◽

Functional Annotations ◽

Genome Wide ◽

Causal Variants

Abstract Identifying causal variants from genome-wide association studies (GWASs) is challenging due to widespread linkage disequilibrium (LD). Functional annotations of the genome may help prioritize variants that are biologically relevant and thus improve fine-mapping of GWAS results. However, classical fine-mapping methods have a high computational cost, particularly when the underlying genetic architecture and LD patterns are complex. Here, we propose a novel approach, SparsePro, to efficiently conduct genome-wide fine-mapping. Our method enjoys two major innovations: First, by creating a sparse low-dimensional projection of the high-dimensional genotype data, we enable a linear search of causal variants instead of a combinatorial search of causal configurations used in most existing methods; Second, we adopt a probabilistic framework with a highly efficient variational expectation-maximization algorithm to integrate statistical associations and functional priors. We evaluate SparsePro through extensive simulations using resources from the UK Biobank. Compared to state-of-the-art methods, SparsePro achieved more accurate and well-calibrated posterior inference with greatly reduced computation time. We demonstrate the utility of SparsePro by investigating the genetic architecture of five functional biomarkers of vital organs. We show that, compared to other methods, the causal variants identified by SparsePro are highly enriched for expression quantitative trait loci and explain a larger proportion of trait heritability. We also identify potential causal variants contributing to the genetically encoded coordination mechanisms between vital organs, and pinpoint target genes with potential pleiotropic effects. In summary, we have developed an efficient genome-wide fine-mapping method with the ability to integrate functional annotations. 
Our method may have wide utility in understanding the genetics of complex traits as well as in increasing the yield of functional follow-up studies of GWASs. SparsePro software is available on GitHub at https://github.com/zhwm/SparsePro.

Fine mapping of QTL conferring Cercospora leaf spot disease resistance in mungbean revealed TAF5 as candidate gene for the resistance

Theoretical and Applied Genetics ◽

10.1007/s00122-020-03724-8 ◽

2020 ◽

Author(s):

Chutintorn Yundaeng ◽

Prakit Somta ◽

Jingbin Chen ◽

Xingxing Yuan ◽

Sompong Chankaew ◽

...

Keyword(s):

Disease Resistance ◽

Candidate Gene ◽

Fine Mapping ◽

Leaf Spot ◽

Leaf Spot Disease ◽

Cercospora Leaf Spot ◽

Spot Disease

Using Heterogeneous Stocks for Fine-Mapping Genetically Complex Traits

Methods in Molecular Biology - Rat Genomics ◽

10.1007/978-1-4939-9581-3_11 ◽

2019 ◽

pp. 233-247 ◽

Cited By ~ 8

Author(s):

Leah C. Solberg Woods ◽

Abraham A. Palmer

Keyword(s):

Fine Mapping ◽

Complex Traits

Fine Mapping and Candidate Gene Discovery of the Soybean Mosaic Virus Resistance Gene, Rsv4

The Plant Genome ◽

10.3835/plantgenome2009.07.0020 ◽

2010 ◽

Vol 3 (1) ◽

Cited By ~ 53

Author(s):

M. A. Saghai Maroof ◽

Dominic M. Tucker ◽

Jeffrey A. Skoneczka ◽

Brian C. Bowman ◽

Sucheta Tripathy ◽

...

Keyword(s):

Candidate Gene ◽

Mosaic Virus ◽

Resistance Gene ◽

Fine Mapping ◽

Virus Resistance ◽

Soybean Mosaic Virus ◽

Gene Discovery ◽

Virus Resistance Gene
