Automating Mendelian randomization through machine learning to construct a putative causal map of the human phenome

2017 ◽  
Author(s):  
Gibran Hemani ◽  
Jack Bowden ◽  
Philip Haycock ◽  
Jie Zheng ◽  
Oliver Davis ◽  
...  

Abstract A major application for genome-wide association studies (GWAS) has been the emerging field of causal inference using Mendelian randomization (MR), where the causal effect between a pair of traits can be estimated using only summary level data. MR depends on SNPs exhibiting vertical pleiotropy, where the SNP influences an outcome phenotype only through an exposure phenotype. Issues arise when this assumption is violated due to SNPs exhibiting horizontal pleiotropy. We demonstrate that across a range of pleiotropy models, instrument selection will be increasingly liable to selecting invalid instruments as GWAS sample sizes continue to grow. Methods have been developed in an attempt to protect MR from different patterns of horizontal pleiotropy, and here we have designed a mixture-of-experts machine learning framework (MR-MoE 1.0) that predicts the most appropriate model to use for any specific causal analysis, improving on both power and false discovery rates. Using the approach, we systematically estimated the causal effects amongst 2407 phenotypes. Almost 90% of causal estimates indicated some level of horizontal pleiotropy. The causal estimates are organised into a publicly available graph database (http://eve.mrbase.org), and we use it here to highlight the numerous challenges that remain in automated causal inference.
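As a rough illustration of how a causal effect can be estimated from summary-level data alone, the sketch below combines per-SNP Wald ratios with fixed-effect inverse-variance weights. This is a generic textbook estimator, not the MR-MoE framework described in the abstract; the function names, the first-order standard error, and the toy numbers are all assumptions made for the example.

```python
import math

def wald_ratios(beta_exp, beta_out, se_out):
    """Per-SNP Wald ratio (outcome effect / exposure effect) and its
    first-order standard error, ignoring uncertainty in beta_exp."""
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]
    ses = [so / abs(be) for be, so in zip(beta_exp, se_out)]
    return ratios, ses

def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect inverse-variance-weighted (IVW) mean of the Wald ratios."""
    ratios, ses = wald_ratios(beta_exp, beta_out, se_out)
    weights = [1.0 / s ** 2 for s in ses]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    return estimate, math.sqrt(1.0 / total)

# Hypothetical summary statistics for three SNPs (illustrative values only).
estimate, std_err = ivw_estimate(
    beta_exp=[0.1, 0.2, 0.4],
    beta_out=[0.05, 0.1, 0.2],
    se_out=[0.01, 0.01, 0.01],
)
print(f"IVW estimate = {estimate:.3f}, SE = {std_err:.4f}")
```

The estimate is only interpretable as causal if every SNP is a valid instrument; horizontal pleiotropy, as the abstract notes, breaks that assumption.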

2019 ◽  
Author(s):  
Jia Zhao ◽  
Jingsi Ming ◽  
Xianghong Hu ◽  
Gang Chen ◽  
Jin Liu ◽  
...  

Abstract Motivation The results from Genome-Wide Association Studies (GWAS) on thousands of phenotypes provide an unprecedented opportunity to infer the causal effect of one phenotype (exposure) on another (outcome). Mendelian randomization (MR), an instrumental variable (IV) method, has been introduced for causal inference using GWAS data. Due to the polygenic architecture of complex traits/diseases and the ubiquity of pleiotropy, however, MR has many unique challenges compared to conventional IV methods. Results We propose a Bayesian weighted Mendelian randomization (BWMR) method for causal inference to address these challenges. In our BWMR model, the uncertainty of weak effects owing to polygenicity is taken into account and the violation of the IV assumption due to pleiotropy is addressed through outlier detection by Bayesian weighting. To make causal inference based on BWMR computationally stable and efficient, we developed a variational expectation-maximization (VEM) algorithm. Moreover, we derived an exact closed-form formula to correct the posterior covariance, which is often underestimated in variational inference. Through comprehensive simulation studies, we evaluated the performance of BWMR, demonstrating its advantage over its competitors. We then applied BWMR to make causal inference between 130 metabolites and 93 complex human traits, uncovering novel causal relationships between exposure and outcome traits. Availability and implementation The BWMR software is available at https://github.com/jiazhao97/BWMR. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Fernando Pires Hartwig ◽  
Kate Tilling ◽  
George Davey Smith ◽  
Deborah A Lawlor ◽  
Maria Carolina Borges

Abstract Background Two-sample Mendelian randomization (MR) allows the use of freely accessible summary association results from genome-wide association studies (GWAS) to estimate causal effects of modifiable exposures on outcomes. Some GWAS adjust for heritable covariables in an attempt to estimate direct effects of genetic variants on the trait of interest. One, both or neither of the exposure GWAS and outcome GWAS may have been adjusted for covariables. Methods We performed a simulation study comprising different scenarios that could motivate covariable adjustment in a GWAS and analysed real data to assess the influence of using covariable-adjusted summary association results in two-sample MR. Results In the absence of residual confounding between exposure and covariable, between exposure and outcome, and between covariable and outcome, using covariable-adjusted summary associations for two-sample MR eliminated bias due to horizontal pleiotropy. However, covariable adjustment led to bias in the presence of residual confounding (especially between the covariable and the outcome), even in the absence of horizontal pleiotropy (when the genetic variants would be valid instruments without covariable adjustment). In an analysis using real data from the Genetic Investigation of ANthropometric Traits (GIANT) consortium and UK Biobank, the causal effect estimate of waist circumference on blood pressure changed direction upon adjustment of waist circumference for body mass index. Conclusions Our findings indicate that using covariable-adjusted summary associations in MR should generally be avoided. When that is not possible, careful consideration of the causal relationships underlying the data (including potentially unmeasured confounders) is required to direct sensitivity analyses and interpret results with appropriate caution.


2019 ◽  
Author(s):  
Simon Haworth ◽  
Pik Fang Kho ◽  
Pernilla Lif Holgerson ◽  
Liang-Dar Hwang ◽  
Nicholas J. Timpson ◽  
...  

Abstract Background Hypothesis-free Mendelian randomization studies provide a way to assess the causal relevance of a trait across the human phenome but can be limited by statistical power or complicated by horizontal pleiotropy. The recently described latent causal variable (LCV) approach provides an alternative method for causal inference which might be useful in hypothesis-free experiments. Methods We developed an automated pipeline for phenome-wide tests using the LCV approach, including steps to estimate partial genetic causality, filter to a meaningful set of estimates, apply correction for multiple testing and then present the findings in a graphical summary termed a causal architecture plot. We apply this process to body mass index and lipid traits as exemplars of traits where there is strong prior expectation for causal effects, and to dental caries and periodontitis as exemplars of traits where there is a need for causal inference. Results The results for lipids and BMI suggest that these traits are best viewed as creating consequences on a multitude of traits and conditions, thus providing additional evidence that supports viewing these traits as targets for interventions to improve health. On the other hand, caries and periodontitis are best viewed as a downstream consequence of other traits and diseases rather than a cause of ill health. Conclusions The automated process is available as part of the MASSIVE pipeline from the Complex-Traits Genetics Virtual Lab (https://vl.genoma.io) and results are available at https://view.genoma.io. We propose causal architecture plots based on phenome-wide partial genetic causality estimates as a way of visualizing the overall causal map of the human phenome.
Key messages The latent causal variable approach uses summary statistics from genome-wide association studies to estimate a parameter termed genetic causality proportion. Systematic estimation of genetic causality proportion for many pairs of traits provides an alternative method for phenome-wide causal inference with some theoretical and practical advantages compared to phenome-wide Mendelian randomization. Using this approach, we confirm that lipid traits are an upstream risk factor for other traits and diseases, and we identify that dental diseases are predominantly a downstream consequence of other traits rather than a cause of poor systemic health. The method assumes no bidirectional causality and no confounding by environmental correlates of genotypes, so care is needed when these assumptions are not met. We developed an automated and accessible pipeline for estimating phenome-wide causal relationships and generating interactive visual summaries.


2021 ◽  
Vol 12 ◽  
Author(s):  
Jie Yang ◽  
Tianyi Chen ◽  
Yahong Zhu ◽  
Mingxia Bai ◽  
Xingang Li

Background Previous epidemiological studies have shown significant associations between chronic periodontitis (CP) and chronic kidney disease (CKD), but the causal relationship remains uncertain. Aiming to examine the causal relationship between these two diseases, we conducted a bidirectional two-sample Mendelian randomization (MR) analysis with multiple MR methods. Methods For the causal effect of CP on CKD, we selected seven single-nucleotide polymorphisms (SNPs) specific to CP as genetic instrumental variables from the genome-wide association studies (GWAS) in the GLIDE Consortium. The summary statistics of complementary kidney function measures, i.e., estimated glomerular filtration rate (eGFR) and blood urea nitrogen (BUN), were derived from the GWAS in the CKDGen Consortium. For the reversed causal inference, six SNPs associated with eGFR and nine with BUN from the CKDGen Consortium were included and the summary statistics were extracted from the GLIDE Consortium. Results No significant causal association between genetically determined CP and eGFR or BUN was found (all p > 0.05). Based on the conventional inverse variance-weighted method, one of seven instrumental variables supported genetically predicted CP being associated with a higher risk of eGFR (estimate = 0.019, 95% CI: 0.012–0.026, p < 0.001). Conclusion Evidence from our bidirectional causal inference does not support a causal relation between CP and CKD risk and therefore suggests that associations reported by previous observational studies may represent confounding.


2021 ◽  
Vol 12 ◽  
Author(s):  
Yuquan Wang ◽  
Tingting Li ◽  
Liwan Fu ◽  
Siqian Yang ◽  
Yue-Qing Hu

Mendelian randomization makes use of genetic variants as instrumental variables to eliminate the influence of unknown confounders on causal estimation in epidemiological studies. However, with the soaring number of genetic variants identified in genome-wide association studies, pleiotropy and linkage disequilibrium among genetic variants are unavoidable and may produce severe bias in causal inference. In this study, by modeling the pleiotropic effect as a normally distributed random effect, we propose a novel mixed-effects regression model-based method, PLDMR (pleiotropy and linkage disequilibrium adaptive Mendelian randomization), which takes linkage disequilibrium into account and also corrects for the pleiotropic effect in causal effect estimation and statistical inference. We conduct extensive simulation studies to evaluate the performance of the proposed and existing methods. Simulation results illustrate the validity and advantage of the novel method compared with other methods, especially in the case of linkage disequilibrium and directional pleiotropic effects. In addition, by applying this novel method to data from the Atherosclerosis Risk in Communities Study, we conclude that body mass index has a significant causal effect on systolic blood pressure and thus might be a potential risk factor for it. The novel method is implemented in R and the corresponding R code is provided for free download.


2020 ◽  
Vol 36 (15) ◽  
pp. 4374-4376
Author(s):  
Ninon Mounier ◽  
Zoltán Kutalik

Abstract Summary Increasing sample size is not the only strategy to improve discovery in Genome Wide Association Studies (GWASs), and we propose here an approach that leverages published studies of related traits to improve inference. Our Bayesian GWAS method derives informative prior effects by leveraging GWASs of related risk factors and their causal effect estimates on the focal trait using multivariable Mendelian randomization. These prior effects are combined with the observed effects to yield Bayes Factors, posterior and direct effects. The approach not only increases power, but also has the potential to dissect direct and indirect biological mechanisms. Availability and implementation The bGWAS package is freely available under a GPL-2 License, and can be accessed, along with user guides and tutorials, from https://github.com/n-mounier/bGWAS. Supplementary information Supplementary data are available at Bioinformatics online.


2018 ◽  
Vol 21 (6) ◽  
pp. 485-494 ◽  
Author(s):  
Subhi Arafat ◽  
Camelia C. Minică

The Barker hypothesis states that low birth weight (BW) is associated with higher risk of adult onset diseases, including mental disorders like schizophrenia, major depressive disorder (MDD), and attention deficit hyperactivity disorder (ADHD). The main criticism of this hypothesis is that evidence for it comes from observational studies. Specifically, observational evidence does not suffice for inferring causality, because the associations might reflect the effects of confounders. Mendelian randomization (MR), a novel method that tests causality on the basis of genetic data, creates the unprecedented opportunity to probe the causality in the association between BW and mental disorders seen in observational studies. We used MR and summary statistics from recent large genome-wide association studies to test whether the association between BW and MDD, schizophrenia and ADHD is causal. We employed the inverse variance weighted (IVW) method in conjunction with several other approaches that are robust to possible assumption violations. MR-Egger was used to rule out horizontal pleiotropy. IVW showed that the association between BW and MDD, schizophrenia and ADHD is not causal (all p > .05). The results of all the other MR methods were similar and highly consistent. MR-Egger provided no evidence for pleiotropic effects biasing the estimates of the effects of BW on MDD (intercept = -0.004, SE = 0.005, p = .372), schizophrenia (intercept = 0.003, SE = 0.01, p = .769), or ADHD (intercept = 0.009, SE = 0.01, p = .357). Based on the current evidence, we refute the Barker hypothesis concerning the fetal origins of adult mental disorders. The discrepancy between our results and the results from observational studies may be explained by the effects of confounders in the observational studies, or by the existence of a small causal effect not detected in our study due to weak instruments. Our power analyses suggested that the upper bound for a potential causal effect of BW on mental disorders would likely not exceed an odds ratio of 1.2.
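The MR-Egger intercept reported above can be read as the intercept of a weighted regression of SNP-outcome effects on SNP-exposure effects. The sketch below shows that regression in its usual textbook form, assuming weights of 1/se_out² and SNPs oriented so that exposure effects are positive. It is not the authors' code; the function name `mr_egger` and the input values are hypothetical, and standard errors and p-values are omitted for brevity.

```python
def mr_egger(beta_exp, beta_out, se_out):
    """Weighted least squares of outcome effects on exposure effects with
    an intercept. Returns (intercept, slope): the intercept estimates the
    average directional pleiotropy, the slope the causal effect."""
    # Orient every SNP so that its exposure effect is positive.
    x = [abs(be) for be in beta_exp]
    y = [bo if be >= 0 else -bo for be, bo in zip(beta_exp, beta_out)]
    w = [1.0 / s ** 2 for s in se_out]  # inverse-variance weights

    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical effects constructed with a built-in pleiotropic offset.
intercept, slope = mr_egger(
    beta_exp=[0.1, -0.2, 0.3, 0.4],
    beta_out=[0.07, -0.12, 0.17, 0.22],
    se_out=[0.02, 0.02, 0.02, 0.02],
)
print(f"Egger intercept = {intercept:.3f}, slope = {slope:.3f}")
```

An intercept close to zero, as in the results quoted above, is consistent with no directional pleiotropy, although a non-significant intercept from few or weak instruments is limited evidence of its absence.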


2020 ◽  
Author(s):  
Jingshu Wang ◽  
Qingyuan Zhao ◽  
Jack Bowden ◽  
Gibran Hemani ◽  
George Davey Smith ◽  
...  

Over a decade of genome-wide association studies have led to the finding that significant genetic associations tend to spread across the genome for complex traits. The extreme polygenicity where "all genes affect every complex trait" complicates Mendelian Randomization studies, where natural genetic variations are used as instruments to infer the causal effect of heritable risk factors. We reexamine the assumptions of existing Mendelian Randomization methods and show how they need to be clarified to allow for pervasive horizontal pleiotropy and heterogeneous effect sizes. We propose a comprehensive framework GRAPPLE (Genome-wide mR Analysis under Pervasive PLEiotropy) to analyze the causal effect of target risk factors with heterogeneous genetic instruments and identify possible pleiotropic patterns from data. By using summary statistics from genome-wide association studies, GRAPPLE can efficiently use both strong and weak genetic instruments, detect the existence of multiple pleiotropic pathways, adjust for confounding risk factors, and determine the causal direction. With GRAPPLE, we analyze the effect of blood lipids, body mass index, and systolic blood pressure on 25 disease outcomes, gaining new information on their causal relationships and the potential pleiotropic pathways.


PLoS ONE ◽  
2021 ◽  
Vol 16 (12) ◽  
pp. e0261020
Author(s):  
Masahiro Yoshikawa ◽  
Kensuke Asaba ◽  
Tomohiro Nakayama

Chronic kidney disease (CKD) and atrial fibrillation are both major burdens on the health care system worldwide. Several observational studies have reported clinical associations between CKD and atrial fibrillation; however, causal relationships between these conditions remain to be elucidated due to possible bias by confounders and reverse causation. Here, we conducted bidirectional two-sample Mendelian randomization analyses using publicly available summary statistics of genome-wide association studies (the CKDGen consortium and the UK Biobank) to investigate causal associations between CKD and atrial fibrillation/flutter in the European population. Our study suggested a causal effect of the risk of atrial fibrillation/flutter on the decrease in serum creatinine-based estimated glomerular filtration rate (eGFR) and revealed a causal effect of the risk of atrial fibrillation/flutter on the risk of CKD (odds ratio, 9.39 per doubling odds ratio of atrial fibrillation/flutter; 95% confidence interval, 2.39–37.0; P = 0.001), while a causal effect of the decrease in eGFR on the risk of atrial fibrillation/flutter was unlikely. However, careful interpretation and further studies are warranted, as the underlying mechanisms remain unknown. Further, our sample size was relatively small and selection bias was possible.


2021 ◽  
Vol 12 ◽  
Author(s):  
Haoxin Peng ◽  
Xiangrong Wu ◽  
Yaokai Wen ◽  
Yiyuan Ao ◽  
Yutian Li ◽  
...  

Background: Leisure sedentary behaviors (LSB) are widespread, and observational studies have provided emerging evidence that LSB play a role in the development of lung cancer (LC). However, the causal relationship between LSB and LC remains unknown. Methods: We utilized univariable (UVMR) and multivariable two-sample Mendelian randomization (MVMR) analysis to disentangle the effects of LSB on the risk of LC. MR analysis was conducted with genetic variants from genome-wide association studies of LSB (408,815 persons from UK Biobank), containing 152 single-nucleotide polymorphisms (SNPs) for television (TV) watching, 37 SNPs for computer use, and four SNPs for driving, and LC from the International Lung Cancer Consortium (11,348 cases and 15,861 controls). Multiple sensitivity analyses were further performed to verify the causality. Results: UVMR demonstrated that a genetically predisposed 1.5-h increase in LSB spent on watching TV increased the odds of LC by 90% [odds ratio (OR), 1.90; 95% confidence interval (CI), 1.44–2.50; p < 0.001]. Similar trends were observed for squamous cell lung cancer (OR, 1.97; 95% CI, 1.31–2.94; p = 0.0010) and lung adenocarcinoma (OR, 1.64; 95% CI, 1.12–2.39; p = 0.0110). The causal effects remained significant after adjusting for education (OR, 1.97; 95% CI, 1.44–2.68; p < 0.001) and body mass index (OR, 1.86; 95% CI, 1.36–2.54; p < 0.001) through the MVMR approach. No association was found between prolonged LSB spent on computer use or driving and LC risk. Genetically predisposed prolonged LSB was additionally correlated with smoking (OR, 1.557; 95% CI, 1.287–1.884; p < 0.001) and alcohol consumption (OR, 1.010; 95% CI, 1.004–1.016; p = 0.0016). Consistency of results across complementary sensitivity MR methods further strengthened the causality. Conclusion: Robust evidence was demonstrated for an independent, causal effect of LSB spent on watching TV in increasing the risk of LC. Further work is necessary to investigate the potential mechanisms.
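Reported odds ratios and confidence intervals such as those above can be sanity-checked by moving to the log-odds scale. The sketch below recovers an approximate standard error from the 95% CI and a two-sided p-value under a normal approximation, using the headline TV-watching estimate (OR 1.90, CI 1.44–2.50) as input. The helper names are made up for this example, and the approximation is an assumption, not necessarily how the authors computed their p-values.

```python
import math

def se_from_ci(or_lower, or_upper, z_crit=1.96):
    """Approximate SE of log(OR), assuming a symmetric Wald interval."""
    return (math.log(or_upper) - math.log(or_lower)) / (2 * z_crit)

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

log_or = math.log(1.90)       # point estimate on the log-odds scale
se = se_from_ci(1.44, 2.50)   # standard error implied by the 95% CI
z = log_or / se
p = two_sided_p(z)
print(f"log(OR) = {log_or:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.2g}")
```

The resulting p-value falls well below 0.001, in line with the reported p < 0.001.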

