scholarly journals MR-Base: a platform for systematic causal inference across the phenome using billions of genetic associations

2016 ◽  
Author(s):  
Gibran Hemani ◽  
Jie Zheng ◽  
Kaitlin H Wade ◽  
Charles Laurin ◽  
Benjamin Elsworth ◽  
...  

AbstractPublished genetic associations can be used to infer causal relationships between phenotypes, bypassing the need for individual-level genotype or phenotype data. We have curated complete summary data from 1094 genome-wide association studies (GWAS) on diseases and other complex traits into a centralised database, and developed an analytical platform that uses these data to perform Mendelian randomization (MR) tests and sensitivity analyses (MR-Base, http://www.mrbase.org). Combined with curated data of published GWAS hits for phenomic measures, the MR-Base platform enables millions of potential causal relationships to be evaluated. We use the platform to predict the impact of lipid lowering on human health. While our analysis provides evidence that reducing LDL-cholesterol, lipoprotein(a) or triglyceride levels reduce coronary disease risk, it also suggests causal effects on a number of other non-vascular outcomes, indicating potential for adverse-effects or drug repositioning of lipid-lowering therapies.

eLife ◽  
2018 ◽  
Vol 7 ◽  
Author(s):  
Gibran Hemani ◽  
Jie Zheng ◽  
Benjamin Elsworth ◽  
Kaitlin H Wade ◽  
Valeriia Haberland ◽  
...  

Results from genome-wide association studies (GWAS) can be used to infer causal relationships between phenotypes, using a strategy known as 2-sample Mendelian randomization (2SMR) and bypassing the need for individual-level data. However, 2SMR methods are evolving rapidly and GWAS results are often insufficiently curated, undermining efficient implementation of the approach. We therefore developed MR-Base (http://www.mrbase.org): a platform that integrates a curated database of complete GWAS results (no restrictions according to statistical significance) with an application programming interface, web app and R packages that automate 2SMR. The software includes several sensitivity analyses for assessing the impact of horizontal pleiotropy and other violations of assumptions. The database currently comprises 11 billion single nucleotide polymorphism-trait associations from 1673 GWAS and is updated on a regular basis. Integrating data with software ensures more rigorous application of hypothesis-driven analyses and allows millions of potential causal relationships to be efficiently evaluated in phenome-wide association studies.


2019 ◽  
Author(s):  
Tom G Richardson ◽  
Gibran Hemani ◽  
Tom R Gaunt ◽  
Caroline L Relton ◽  
George Davey Smith

AbstractBackgroundDeveloping insight into tissue-specific transcriptional mechanisms can help improve our understanding of how genetic variants exert their effects on complex traits and disease. By applying the principles of Mendelian randomization, we have undertaken a systematic analysis to evaluate transcriptome-wide associations between gene expression across 48 different tissue types and 395 complex traits.ResultsOverall, we identified 100,025 gene-trait associations based on conventional genome-wide corrections (P < 5 × 10−08) that also provided evidence of genetic colocalization. These results indicated that genetic variants which influence gene expression levels in multiple tissues are more likely to influence multiple complex traits. We identified many examples of tissue-specific effects, such as genetically-predicted TPO, NR3C2 and SPATA13 expression only associating with thyroid disease in thyroid tissue. Additionally, FBN2 expression was associated with both cardiovascular and lung function traits, but only when analysed in heart and lung tissue respectively.We also demonstrate that conducting phenome-wide evaluations of our results can help flag adverse on-target side effects for therapeutic intervention, as well as propose drug repositioning opportunities. Moreover, we find that exploring the tissue-dependency of associations identified by genome-wide association studies (GWAS) can help elucidate the causal genes and tissues responsible for effects, as well as uncover putative novel associations.ConclusionsThe atlas of tissue-dependent associations we have constructed should prove extremely valuable to future studies investigating the genetic determinants of complex disease. The follow-up analyses we have performed in this study are merely a guide for future research. Conducting similar evaluations can be undertaken systematically at http://mrcieu.mrsoftware.org/Tissue_MR_atlas/.


Author(s):  
Io Ieong Chan ◽  
Man Ki Kwok ◽  
C Mary Schooling

Abstract Introduction Observational studies suggest earlier puberty is associated with higher adulthood blood pressure (BP), but these findings have not been replicated using Mendelian randomization (MR). We examined this question sex-specifically using larger genome-wide association studies (GWAS) with more extensive measures of pubertal timing. Methods We obtained genetic instruments proxying pubertal maturation (age at menarche (AAM) or voice breaking (AVB)) from the largest published GWAS. We applied them to summary sex-specific genetic associations with systolic and diastolic BP z-scores, and self-reported hypertension in women (n=194174) and men (n=167020) from the UK Biobank, using inverse-variance weighting meta-analysis. We conducted sensitivity analyses using other MR methods, including multivariable MR adjusted for childhood obesity proxied by body mass index (BMI). We used late pubertal growth as a validation outcome. Results AAM (beta per one-year later = -0.030 [95% confidence interval (CI) -0.055, -0.005] and AVB (beta -0.058 [95% CI -0.100, -0.015]) were inversely associated with systolic BP independent of childhood BMI, as were diastolic BP (-0.035 [95% CI -0.060, -0.009] for AAM and -0.046 [95% CI -0.089, -0.004] for AVB) and self-reported hypertension (odds ratios 0.89 [95% CI 0.84, 0.95] for AAM and 0.87 [95% CI 0.79, 0.96] for AVB). AAM and AVB were positively associated with late pubertal growth, as expected. The results were robust to sensitivity analysis using other MR methods. Conclusion Timing of pubertal maturation was associated with adulthood BP independent of childhood BMI, highlighting the role of pubertal maturation timing in midlife BP.


2017 ◽  
Vol 242 (13) ◽  
pp. 1325-1334 ◽  
Author(s):  
Yizhou Zhu ◽  
Cagdas Tazearslan ◽  
Yousin Suh

Genome-wide association studies have shown that the far majority of disease-associated variants reside in the non-coding regions of the genome, suggesting that gene regulatory changes contribute to disease risk. To identify truly causal non-coding variants and their affected target genes remains challenging but is a critical step to translate the genetic associations to molecular mechanisms and ultimately clinical applications. Here we review genomic/epigenomic resources and in silico tools that can be used to identify causal non-coding variants and experimental strategies to validate their functionalities. Impact statement Most signals from genome-wide association studies (GWASs) map to the non-coding genome, and functional interpretation of these associations remained challenging. We reviewed recent progress in methodologies of studying the non-coding genome and argued that no single approach allows one to effectively identify the causal regulatory variants from GWAS results. By illustrating the advantages and limitations of each method, our review potentially provided a guideline for taking a combinatorial approach to accurately predict, prioritize, and eventually experimentally validate the causal variants.


2020 ◽  
Author(s):  
Jingshu Wang ◽  
Qingyuan Zhao ◽  
Jack Bowden ◽  
Gilbran Hemani ◽  
George Davey Smith ◽  
...  

Over a decade of genome-wide association studies have led to the finding that significant genetic associations tend to spread across the genome for complex traits. The extreme polygenicity where "all genes affect every complex trait" complicates Mendelian Randomization studies, where natural genetic variations are used as instruments to infer the causal effect of heritable risk factors. We reexamine the assumptions of existing Mendelian Randomization methods and show how they need to be clarified to allow for pervasive horizontal pleiotropy and heterogeneous effect sizes. We propose a comprehensive framework GRAPPLE (Genome-wide mR Analysis under Pervasive PLEiotropy) to analyze the causal effect of target risk factors with heterogeneous genetic instruments and identify possible pleiotropic patterns from data. By using summary statistics from genome-wide association studies, GRAPPLE can efficiently use both strong and weak genetic instruments, detect the existence of multiple pleiotropic pathways, adjust for confounding risk factors, and determine the causal direction. With GRAPPLE, we analyze the effect of blood lipids, body mass index, and systolic blood pressure on 25 disease outcomes, gaining new information on their causal relationships and the potential pleiotropic pathways.


2014 ◽  
Author(s):  
Feng Gao ◽  
Diana Chang ◽  
Arjun Biddanda ◽  
Li Ma ◽  
Yingjie Guo ◽  
...  

XWAS is a new software suite for the analysis of the X chromosome in association studies and similar studies. The X chromosome plays an important role in human disease, especially those with sexually dimorphic characteristics. Special attention needs to be given to its analysis due to the unique inheritance pattern, which leads to analytical complications that have resulted in the majority of genome-wide association studies (GWAS) either not considering X or mishandling it with toolsets that had been designed for non-sex chromosomes. We hence developed XWAS to fill the need for tools that are specially designed for analysis of X. Following extensive, stringent, and X-specific quality control, XWAS offers an array of statistical tests of association, including: (1) the standard test between a SNP (single nucleotide polymorphism) and disease risk, including after first stratifying individuals by sex, (2) a test for a differential effect of a SNP on disease between males and females, (3) motivated by X-inactivation, a test for higher variance of a trait in heterozygous females as compared to homozygous females, and (4) for all tests, a version that allows for combining evidence from all SNPs across a gene. We applied the toolset analysis pipeline to 16 GWAS datasets of immune-related disorders and 7 risk factors of coronary artery disease, and discovered several new X-linked genetic associations. XWAS will provide the tools and incentive for others to incorporate the X chromosome into GWAS, hence enabling discoveries of novel loci implicated in many diseases and in their sexual dimorphism.


2016 ◽  
Author(s):  
Jimmy Z Liu ◽  
Yaniv Erlich ◽  
Joseph K Pickrell

AbstractThe case-control association study is a powerful method for identifying genetic variants that influence disease risk. However, the collection of cases can be time-consuming and expensive; if a disease occurs late in life or is rapidly lethal, it may be more practical to identify family members of cases. Here, we show that replacing cases with their first-degree relatives enables genome-wide association studies by proxy (GWAX). In randomly-ascertained cohorts, this approach enables previously infeasible studies of diseases that are absent (or nearly absent) in the cohort. As an illustration, we performed GWAX of 12 common diseases in 116,196 individuals from the UK Biobank. By combining these results with published GWAS summary statistics in a meta-analysis, we replicated established risk loci and identified 17 newly associated risk loci: four in Alzheimer’s disease, eight in coronary artery disease, and five in type 2 diabetes. In addition to informing disease biology, our results demonstrate the utility of association mapping using family history of disease as a phenotype to be mapped. We anticipate that this approach will prove useful in future genetic studies of complex traits in large population cohorts.


Genes ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 87
Author(s):  
Sean M. Burnard ◽  
Rodney A. Lea ◽  
Miles Benton ◽  
David Eccles ◽  
Daniel W. Kennedy ◽  
...  

Conventional genome-wide association studies (GWASs) of complex traits, such as Multiple Sclerosis (MS), are reliant on per-SNP p-values and are therefore heavily burdened by multiple testing correction. Thus, in order to detect more subtle alterations, ever increasing sample sizes are required, while ignoring potentially valuable information that is readily available in existing datasets. To overcome this, we used penalised regression incorporating elastic net with a stability selection method by iterative subsampling to detect the potential interaction of loci with MS risk. Through re-analysis of the ANZgene dataset (1617 cases and 1988 controls) and an IMSGC dataset as a replication cohort (1313 cases and 1458 controls), we identified new association signals for MS predisposition, including SNPs above and below conventional significance thresholds while targeting two natural killer receptor loci and the well-established HLA loci. For example, rs2844482 (98.1% iterations), otherwise ignored by conventional statistics (p = 0.673) in the same dataset, was independently strongly associated with MS in another GWAS that required more than 40 times the number of cases (~45 K). Further comparison of our hits to those present in a large-scale meta-analysis, confirmed that the majority of SNPs identified by the elastic net model reached conventional statistical GWAS thresholds (p < 5 × 10−8) in this much larger dataset. Moreover, we found that gene variants involved in oxidative stress, in addition to innate immunity, were associated with MS. Overall, this study highlights the benefit of using more advanced statistical methods to (re-)analyse subtle genetic variation among loci that have a biological basis for their contribution to disease risk.


2019 ◽  
Author(s):  
Cong Guo ◽  
Karsten B. Sieber ◽  
Jorge Esparza-Gordillo ◽  
Mark R. Hurle ◽  
Kijoung Song ◽  
...  

AbstractIdentifying the effector genes from genome-wide association studies (GWAS) is a crucial step towards understanding the biological mechanisms underlying complex traits and diseases. Colocalization of expression and protein quantitative trait loci (eQTL and pQTL, hereafter collectively called “xQTL”) can be effective for mapping associations to genes in many loci. However, existing colocalization methods require full single-variant summary statistics which are often not readily available for many published GWAS or xQTL studies. Here, we present PICCOLO, a method that uses minimum SNP p-values within a locus to determine if pairs of genetic associations are colocalized. This method greatly expands the number of GWAS and xQTL datasets that can be tested for colocalization. We applied PICCOLO to 10,759 genome-wide significant associations across the NHGRI-EBI GWAS Catalog with xQTLs from 28 studies. We identified at least one colocalized gene-xQTL in at least one tissue for 30% of associations, and we pursued multiple lines of evidence to demonstrate that these mappings are biologically meaningful. PICCOLO genes are significantly enriched for biologically relevant tissues, and 4.3-fold enriched for targets of approved drugs.


2021 ◽  
Author(s):  
Steven Gazal ◽  
Omer Weissbrod ◽  
Farhad Hormozdiari ◽  
Kushal Dey ◽  
Joseph Nasser ◽  
...  

Although genome-wide association studies (GWAS) have identified thousands of disease-associated common SNPs, these SNPs generally do not implicate the underlying target genes, as most disease SNPs are regulatory. Many SNP-to-gene (S2G) linking strategies have been developed to link regulatory SNPs to the genes that they regulate in cis, but it is unclear how these strategies should be applied in the context of interpreting common disease risk variants. We developed a framework for evaluating and combining different S2G strategies to optimize their informativeness for common disease risk, leveraging polygenic analyses of disease heritability to define and estimate their precision and recall. We applied our framework to GWAS summary statistics for 63 diseases and complex traits (average N=314K), evaluating 50 S2G strategies. Our optimal combined S2G strategy (cS2G) included 7 constituent S2G strategies (Exon, Promoter, 2 fine-mapped cis-eQTL strategies, EpiMap enhancer-gene linking, Activity-By-Contact (ABC), and Cicero), and achieved a precision of 0.75 and a recall of 0.33, more than doubling the precision and/or recall of any individual strategy; this implies that 33% of SNP-heritability can be linked to causal genes with 75% confidence. We applied cS2G to fine-mapping results for 49 UK Biobank diseases/traits to predict 7,111 causal SNP-gene-disease triplets (with S2G-derived functional interpretation) with high confidence. Finally, we applied cS2G to genome-wide fine-mapping results for these traits (not restricted to GWAS loci) to rank genes by the heritability linked to each gene, providing an empirical assessment of disease omnigenicity; averaging across traits, we determined that the top 200 (1%) of ranked genes explained roughly half of the heritability linked to all genes. Our results highlight the benefits of our cS2G strategy in providing functional interpretation of GWAS findings; we anticipate that precision and recall will increase further under our framework as improved functional assays lead to improved S2G strategies. 


Sign in / Sign up

Export Citation Format

Share Document