Bayesian meta-analysis across genome-wide association studies of diverse phenotypes

2018 ◽  
Author(s):  
Holly Trochet ◽  
Matti Pirinen ◽  
Gavin Band ◽  
Luke Jostins ◽  
Gilean McVean ◽  
...  

Abstract Genome-wide association studies (GWAS) are a powerful tool for understanding the genetic basis of diseases and traits, but most studies have been conducted in isolation, focusing on either a single phenotype or a set of closely related phenotypes. We describe MetABF, a simple Bayesian framework for performing integrative meta-analysis across multiple GWAS using summary statistics. The approach is applicable across a wide range of study designs and can increase power by 50% compared to standard frequentist tests when only a subset of studies have a true effect. We demonstrate its utility in a meta-analysis of 20 diverse GWAS that were part of the Wellcome Trust Case-Control Consortium 2. The novelty of the approach is its ability to explore, and assess the evidence for, a range of possible true patterns of association across studies in a computationally efficient framework.
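The idea of weighing the evidence for different patterns of association across studies can be illustrated with a minimal sketch. It is not MetABF's implementation: it assumes a Wakefield-style approximate Bayes factor per study, independent effects between studies, a uniform prior over subsets, and an arbitrary prior standard deviation of 0.2. The function names `log_abf_single` and `subset_posteriors` are hypothetical.

```python
import math
from itertools import combinations


def log_abf_single(beta, se, prior_sd):
    """Log approximate Bayes factor (effect vs. no effect) for one study.

    Assumes beta_hat ~ N(0, se^2) under the null and
    beta_hat ~ N(0, se^2 + prior_sd^2) under the alternative.
    """
    v = se ** 2
    w = prior_sd ** 2
    z2 = (beta / se) ** 2
    return 0.5 * math.log(v / (v + w)) + 0.5 * z2 * w / (v + w)


def subset_posteriors(betas, ses, prior_sd=0.2):
    """Posterior probability of each subset of studies having a true effect.

    Assumption: effects are independent across studies, so the log ABF of a
    subset is the sum of the per-study log ABFs of its members. The empty
    subset is the global null (log ABF = 0). All subsets get equal prior weight.
    """
    n = len(betas)
    per_study = [log_abf_single(b, s, prior_sd) for b, s in zip(betas, ses)]
    log_abfs = {}
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            log_abfs[subset] = sum(per_study[i] for i in subset)
    # Normalise on the log scale to avoid overflow.
    top = max(log_abfs.values())
    total = sum(math.exp(v - top) for v in log_abfs.values())
    return {s: math.exp(v - top) / total for s, v in log_abfs.items()}


# Illustrative (made-up) summary statistics for three studies.
posteriors = subset_posteriors([0.30, 0.00, 0.25], [0.05, 0.05, 0.05])
best = max(posteriors, key=posteriors.get)
print(best, round(posteriors[best], 3))
```

Enumerating all subsets costs 2^n evaluations, which is fine for a handful of studies; correlated-effect priors would need a multivariate version of the same Bayes factor.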

2020 ◽  
Author(s):  
Reza Nasirigerdeh ◽  
Reihaneh Torkzadehmahani ◽  
Julian Matschinske ◽  
Tobias Frisch ◽  
Markus List ◽  
...  

Abstract Genome-wide association studies (GWAS) have been widely used to unravel connections between genetic variants and diseases. Larger sample sizes in GWAS can lead to discovering more associations and more accurate genetic predictors. However, sharing and combining distributed genomic data to increase the sample size is often challenging or even impossible due to privacy concerns and privacy protection laws such as the GDPR. While meta-analysis has been established as an effective approach to combine summary statistics of several GWAS, its accuracy can be attenuated in the presence of cross-study heterogeneity. Here, we present sPLINK (safe PLINK), a user-friendly tool that performs federated GWAS on distributed datasets while preserving the privacy of the data and the accuracy of the results. sPLINK neither exchanges raw data nor relies on summary statistics. Instead, it performs model training in a federated manner, communicating only model parameters between cohorts and a central server. We verify that the federated results from sPLINK are the same as those from aggregated analyses conducted with PLINK. We demonstrate that sPLINK is robust against heterogeneous distributions of data (phenotype and confounding factors) across cohorts, while existing meta-analysis tools lose considerable accuracy in such scenarios. We also show that sPLINK achieves practical runtime, on the order of minutes or hours, and acceptable network bandwidth consumption for chi-square and linear/logistic regression tests. Federated analysis with sPLINK thus has the potential to replace meta-analysis as the gold standard for collaborative GWAS. The user-friendly, readily usable sPLINK tool is available at https://exbio.wzw.tum.de/splink.
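Why a federated fit can match a pooled fit exactly is easiest to see for simple linear regression. The sketch below is an illustration under stated assumptions, not sPLINK's protocol: it fits a phenotype on a single SNP plus intercept, each cohort shares only a few local sums, and the hypothetical functions `local_aggregates` and `server_fit` stand in for the cohort and server roles.

```python
def local_aggregates(genotypes, phenotypes):
    """Runs inside one cohort; only these sums leave the site, not raw rows."""
    return {
        "n": len(genotypes),
        "sx": sum(genotypes),
        "sy": sum(phenotypes),
        "sxx": sum(g * g for g in genotypes),
        "sxy": sum(g * p for g, p in zip(genotypes, phenotypes)),
    }


def server_fit(aggregates):
    """Runs on the server: add up the cohort sums and solve the 2x2
    normal equations for intercept and slope."""
    total = {key: sum(a[key] for a in aggregates) for key in aggregates[0]}
    n, sx, sy = total["n"], total["sx"], total["sy"]
    sxx, sxy = total["sxx"], total["sxy"]
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("genotype has no variance; slope is undefined")
    slope = (n * sxy - sx * sy) / det
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Two made-up cohorts with genotype dosages 0/1/2.
x1, y1 = [0, 1, 2, 1], [1.0, 2.1, 2.9, 2.0]
x2, y2 = [2, 0, 1], [3.2, 0.9, 1.8]

federated = server_fit([local_aggregates(x1, y1), local_aggregates(x2, y2)])
pooled = server_fit([local_aggregates(x1 + x2, y1 + y2)])
print(federated, pooled)
```

Because the normal equations depend on the data only through sums, adding the per-cohort sums gives the same estimate as pooling the rows, regardless of how phenotypes are distributed across cohorts. Logistic regression needs an iterative version of the same exchange.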


2020 ◽  
Author(s):  
Jiangming Sun ◽  
Yunpeng Wang

Abstract Summary Post-GWAS studies using the results from large consortium meta-analyses often need to correctly handle the overlapping-sample issue. The gold standard approach for resolving this issue is to rerun the GWAS or meta-analysis excluding the overlapping participants. However, such an approach is time-consuming and is sometimes restricted by the available data. deMeta provides a user-friendly and computationally efficient command-line implementation for removing the effect of a contributing sub-study to a consortium from the meta-analysis results. Only the summary statistics of the meta-analysis and of the sub-study to be removed are required. In addition, deMeta can generate contrasting Manhattan and quantile-quantile plots for users to visualize the impact of the sub-study on the meta-analysis results. Availability and Implementation The Python source code, examples and documentation of deMeta are publicly available at https://github.com/Computational-NeuroGenetics/[email protected] (J. Sun); [email protected] (Y. Wang). Supplementary information None.
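The arithmetic behind removing a sub-study can be sketched as follows, assuming the consortium result came from a fixed-effect, inverse-variance-weighted meta-analysis. Whether deMeta uses exactly this formula is not stated in the abstract, and the function name `remove_substudy` is hypothetical.

```python
import math


def remove_substudy(beta_meta, se_meta, beta_sub, se_sub):
    """Back out one sub-study from an inverse-variance-weighted estimate.

    With weights w = 1 / se^2, a fixed-effect meta-analysis satisfies
        w_meta = w_rest + w_sub
        beta_meta = (w_rest * beta_rest + w_sub * beta_sub) / w_meta
    so the remaining studies' estimate follows by solving for beta_rest.
    """
    w_meta = 1.0 / se_meta ** 2
    w_sub = 1.0 / se_sub ** 2
    w_rest = w_meta - w_sub
    if w_rest <= 0:
        raise ValueError("sub-study carries at least as much weight as the meta-analysis")
    beta_rest = (w_meta * beta_meta - w_sub * beta_sub) / w_rest
    se_rest = math.sqrt(1.0 / w_rest)
    return beta_rest, se_rest


# Made-up example: study A (beta 0.10, se 0.05) and study B (beta 0.20, se 0.10)
# combine to beta 0.12 with se sqrt(1/500); removing B should recover A.
beta_rest, se_rest = remove_substudy(0.12, math.sqrt(1 / 500), 0.20, 0.10)
print(round(beta_rest, 4), round(se_rest, 4))
```

The subtraction is only valid per variant and only if both inputs refer to the same effect allele; genomic-control corrections applied to the published meta-analysis would need to be accounted for separately.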


2016 ◽  
Author(s):  
Xiang Zhu ◽  
Matthew Stephens

Bayesian methods for large-scale multiple regression provide attractive approaches to the analysis of genome-wide association studies (GWAS). For example, they can estimate heritability of complex traits, allowing for both polygenic and sparse models; and by incorporating external genomic data into the priors they can increase power and yield new biological insights. However, these methods require access to individual genotypes and phenotypes, which are often not easily available. Here we provide a framework for performing these analyses without individual-level data. Specifically, we introduce a “Regression with Summary Statistics” (RSS) likelihood, which relates the multiple regression coefficients to univariate regression results that are often easily available. The RSS likelihood requires estimates of correlations among covariates (SNPs), which can also be obtained from public databases. We perform Bayesian multiple regression analysis by combining the RSS likelihood with previously proposed prior distributions, sampling posteriors by Markov chain Monte Carlo. In a wide range of simulations RSS performs similarly to analyses using the individual data, both for estimating heritability and detecting associations. We apply RSS to a GWAS of human height that contains 253,288 individuals typed at 1.06 million SNPs, for which analyses of individual-level data are practically impossible. Estimates of heritability (52%) are consistent with, but more precise than, previous results using subsets of these data. We also identify many previously unreported loci that show evidence for association with height in our analyses. Software is available at https://github.com/stephenslab/rss.
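For orientation, the RSS likelihood is usually written in the following form. The notation is not given in the abstract and is supplied here as an assumption: \hat{\beta} is the vector of univariate effect estimates, S a diagonal matrix built from their standard errors, R the SNP correlation (LD) matrix, and \beta the vector of multiple regression coefficients.

```latex
\hat{\beta} \mid S, R, \beta \;\sim\; \mathcal{N}\!\left( S R S^{-1} \beta,\; S R S \right)
```

Read this way, the univariate estimates are a noisy, LD-smeared view of the joint effects: the mean mixes neighbouring coefficients through R, and the covariance S R S carries both the sampling error and its correlation across SNPs.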


Genetics ◽  
2021 ◽  
Author(s):  
Ravi V Mural ◽  
Marcin Grzybowski ◽  
Chenyong Miao ◽  
Alyssa Damke ◽  
Sirjan Sapkota ◽  
...  

Abstract Community association populations are composed of phenotypically and genetically diverse accessions. Once these populations are genotyped, the resulting marker data can be reused by different groups investigating the genetic basis of different traits. Because the same genotypes are observed and scored for a wide range of traits in different environments, these populations represent a unique resource to investigate pleiotropy. Here we assembled a set of 234 separate trait datasets for the Sorghum Association Panel, a group of 406 sorghum genotypes widely employed by the sorghum genetics community. Comparison of genome-wide association studies conducted with two independently generated marker sets for this population demonstrates that existing genetic marker sets do not saturate the genome and likely capture only 35-43% of potentially detectable loci controlling variation for traits scored in this population. While limited evidence for pleiotropy was apparent in cross-GWAS comparisons, a multivariate adaptive shrinkage approach recovered both known pleiotropic effects of existing loci and new pleiotropic effects, particularly significant impacts of known dwarfing genes on root architecture. In addition, we identified new loci with pleiotropic effects consistent with known trade-offs in sorghum development. These results demonstrate the potential for mining existing trait datasets from widely used community association populations to enable new discoveries as new, denser genetic marker datasets are generated for these populations.


2020 ◽  
Vol 46 (Supplement_1) ◽  
pp. S103-S103
Author(s):  
Tim Bigdeli ◽  
Ayman Fanous ◽  
Nallakkandi Rajeevan ◽  
Frederick Sayward ◽  
Yuli Li ◽  
...  

Abstract Background Schizophrenia and bipolar disorder are debilitating neuropsychiatric illnesses that collectively affect 2% of the world’s population and cause tremendous human suffering for patients, their families and their communities. Recognizing the major impact of these disorders on the psychosocial function of more than 200,000 US Veterans, the Department of Veterans Affairs (VA) recently completed genotyping of nearly 9,000 veterans with schizophrenia or bipolar I disorder in Cooperative Studies Program (CSP) #572: “Genetics of Functional Disability in Schizophrenia and Bipolar Illness”, all of whom were extensively assessed for neurocognitive function and disability, and genotyped using a custom Affymetrix Axiom Biobank array. Methods Primary genome-wide association studies (GWAS) of schizophrenia and bipolar disorder were performed across and within ancestry groups, with attempted replication in matched subjects from the PGC and Genomic Psychiatry Cohort (GPC). We combined results for CSP#572 with available summary statistics from the PGC, Indonesia Schizophrenia Consortium and Genetic REsearch on schizophreniA neTwork-China and Netherland (GREAT-CN) study, and multi-ethnic GPC cohorts, yielding some of the largest and most diverse studies of these disorders to date. Results Polygenic risk scores based on published PGC summary statistics for schizophrenia or bipolar disorder were significantly associated with case status among European-American (EA; P<10^-30) and African-American (AA; P<0.0005) participants in CSP#572. Our primary analyses of schizophrenia yielded a single genome-wide significant association with variants in CHD7 at 8q12.2 for EA participants, which remained significant in a joint analysis of EA and AA subjects (P=4.62e-08).
While no genome-wide significant associations were detected by our within-ancestry analyses of bipolar disorder, a cross-ancestry meta-analysis of CSP#572 participants yielded a significant finding at 10q25 with variants in SORCS3 (P=2.62e-08). Among loci attaining P<0.0001 in our within-ancestry analyses, 4 and 8 subsequently achieved genome-wide significance, respectively, when jointly analyzed with matched subjects from the PGC and GPC. Combining our results with published summary statistics, we performed a cross-ancestry GWAS meta-analysis of 69,280 schizophrenia cases and 138,379 controls, identifying 200 genome-wide significant loci of which 76 are newly reported here. Cross-ancestry analysis of 28,326 bipolar cases and 90,570 controls identified 24 genome-wide significant loci, including novel associations with common variants in PAX5, DOCK2, MACROD2, BRE, KCNG1, and LINC01378. Discussion We newly describe genome-wide analyses in a diverse cohort of US Veterans with schizophrenia or bipolar disorder, benchmarking the predictive value of polygenic risk scores based on published GWAS findings. Leveraging available summary statistics from studies of global populations, we add to burgeoning lists of genomic loci implicated in the etiologies of these disorders.


2018 ◽  
Author(s):  
Loic Yengo ◽  
Julia Sidorenko ◽  
Kathryn E. Kemper ◽  
Zhili Zheng ◽  
Andrew R. Wood ◽  
...  

Genome-wide association studies (GWAS) stand as powerful experimental designs for identifying DNA variants associated with complex traits and diseases. In the past decade, both the number of such studies and their sample sizes have increased dramatically. Recent GWAS of height and body mass index (BMI) in ∼250,000 European participants have led to the discovery of ∼700 and ∼100 nearly independent SNPs associated with these traits, respectively. Here we combine summary statistics from those two studies with GWAS of height and BMI performed in ∼450,000 UK Biobank participants of European ancestry. Overall, our combined GWAS meta-analysis reaches N∼700,000 individuals and substantially increases the number of GWAS signals associated with these traits. We identified 3,290 and 716 near-independent SNPs associated with height and BMI, respectively (at a revised genome-wide significance threshold of p<1 × 10^-8), including 1,185 height-associated SNPs and 554 BMI-associated SNPs located within loci not previously identified by these two GWAS. The genome-wide significant SNPs explain ∼24.6% of the variance of height and ∼5% of the variance of BMI in an independent sample from the Health and Retirement Study (HRS). Correlations between polygenic scores based upon these SNPs and actual height and BMI in HRS participants were 0.44 and 0.20, respectively. From analyses integrating GWAS and eQTL data by Summary-data based Mendelian Randomization (SMR), we identified an enrichment of eQTLs amongst lead height and BMI signals, prioritising 684 and 134 genes, respectively. Our study demonstrates that, as previously predicted, increasing GWAS sample sizes continues to deliver, by discovery of new loci, increasing prediction accuracy and providing additional data to achieve deeper insight into complex trait biology. All summary statistics are made available for follow-up studies.

