methylSCOPA and META-methylSCOPA: software for the analysis and aggregation of epigenome-wide association studies of multiple correlated phenotypes

2019 ◽  
Author(s):  
Harmen Draisma ◽  
Jun Liu ◽  
Igor Pupko ◽  
Ayşe Demirkan ◽  
Zhanna Balkhiyarova ◽  
...  

Abstract
Background: Multi-phenotype genome-wide association studies (MP-GWAS) of correlated traits have greater power to detect genotype–phenotype associations than single-trait GWAS. However, no multi-phenotype analysis method exists for epigenome-wide association studies (EWAS).
Results: We extended our previously developed SCOPA approach into the C++ software “methylSCOPA”, which ‘reversely’ regresses DNA hyper/hypo-methylation information on a linear combination of phenotypes. We evaluated two models of association between DNA methylation and fasting glucose (FG) and insulin (FI) levels: Model 1, including FG, FI, and three measured potential confounders (body mass index [BMI], fasting serum triglyceride levels [TG], and waist/hip ratio [WHR]), and Model 2, including FG and FI corrected for the effects of BMI, TG, and WHR. Both models were additionally corrected for participant sex and smoking status (current/ever/never). We meta-analyzed the cohort-specific MP-EWAS results with our novel software META-methylSCOPA, mapped genomic locations to GRCh37/hg19, and adopted P<1×10⁻⁷ to denote epigenome-wide significance. We used the Illumina Infinium HumanMethylation450K BeadChip array data from the Northern Finland Birth Cohorts (NFBC) 1966/1986. We quality-controlled the data, regressed out the effects of measured potential confounders, and normalized the methylation signal intensity and FI data. The MP-EWAS included data for 643 and 457 individuals from NFBC1966 and NFBC1986, respectively (total N=1,100). In Model 1, we detected an epigenome-wide significant association in the MP-EWAS meta-analysis at cg13708645 (chr12:121,974,305; P=1.2×10⁻⁸) within the KDM2B gene. Single-trait effects within KDM2B were on FI, BMI, and WHR. The model with effects on BMI and WHR showed the strongest association at this locus, while the effect on FI in single-phenotype analysis was driven by the effect of adiposity. In Model 2, the strongest association was at cg05063096 (chr3:143,689,810; P=2.3×10⁻⁷), annotated to C3orf58, with the strongest effect on FI in single-trait analysis and a multi-phenotype effect on FI and WHR within Model 1. We characterized the effects of established EWAS loci for diabetes and its risk factors and, by dissecting the multi-phenotype effects in Model 1, detected suggestive (p<0.01) associations at six markers, including PHGDH, TXNIP, SLC7A11, CPT1A, MYO5C and ABCG1.
Conclusions: We implemented MP-EWAS in methylSCOPA and demonstrated its enhanced power over single-trait EWAS for correlated phenotypes in large-scale data.
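As a rough illustration of the ‘reverse’ regression idea in the abstract above, the following sketch fits a methylation value as the outcome and a linear combination of phenotypes as the predictors, using ordinary least squares. It is not the methylSCOPA implementation (which is written in C++ and whose internals are not described here); the function names and the toy FG/FI values are hypothetical, and covariate adjustment, model selection, and significance testing are omitted.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b]; copy rows so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def reverse_regression(methylation, phenotypes):
    """OLS fit of methylation ~ intercept + phenotypes.

    Returns [intercept, beta_1, ..., beta_k] via the normal equations.
    """
    design = [[1.0] + list(row) for row in phenotypes]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, methylation)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy data: (FG, FI) for five individuals, and a methylation
# value constructed as an exact linear combination of the two phenotypes.
phenotypes = [(5.1, 1.0), (5.6, 2.0), (4.9, 1.5), (6.0, 3.0), (5.3, 2.5)]
methylation = [0.2 + 0.1 * fg - 0.05 * fi for fg, fi in phenotypes]
coefficients = reverse_regression(methylation, phenotypes)
```

Because the toy methylation values are built without noise, the fitted coefficients recover the intercept and the two phenotype effects used to generate them; with real data each CpG site would be fitted separately.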

2018 ◽  
Vol 21 (2) ◽  
pp. 84-88 ◽  
Author(s):  
W. David Hill

Intelligence and educational attainment are strongly genetically correlated. This relationship can be exploited by Multi-Trait Analysis of GWAS (MTAG) to add power to Genome-wide Association Studies (GWAS) of intelligence. MTAG allows the user to meta-analyze GWASs of different phenotypes, based on their genetic correlations, to identify associations specific to the trait of choice. An MTAG analysis using GWAS data sets on intelligence and education was conducted by Lam et al. (2017), who reported 70 loci that they described as ‘trait specific’ to intelligence. This article examines whether the analysis conducted by Lam et al. (2017) has resulted in genetic information about a phenotype that is more similar to education than intelligence.


Genes ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 87
Author(s):  
Sean M. Burnard ◽  
Rodney A. Lea ◽  
Miles Benton ◽  
David Eccles ◽  
Daniel W. Kennedy ◽  
...  

Conventional genome-wide association studies (GWASs) of complex traits, such as Multiple Sclerosis (MS), are reliant on per-SNP p-values and are therefore heavily burdened by multiple testing correction. Thus, in order to detect more subtle alterations, ever-increasing sample sizes are required, while potentially valuable information that is readily available in existing datasets is ignored. To overcome this, we used penalised regression incorporating elastic net with a stability selection method by iterative subsampling to detect the potential interaction of loci with MS risk. Through re-analysis of the ANZgene dataset (1617 cases and 1988 controls) and an IMSGC dataset as a replication cohort (1313 cases and 1458 controls), we identified new association signals for MS predisposition, including SNPs above and below conventional significance thresholds while targeting two natural killer receptor loci and the well-established HLA loci. For example, rs2844482 (98.1% iterations), otherwise ignored by conventional statistics (p = 0.673) in the same dataset, was independently strongly associated with MS in another GWAS that required more than 40 times the number of cases (~45 K). Further comparison of our hits to those present in a large-scale meta-analysis confirmed that the majority of SNPs identified by the elastic net model reached conventional statistical GWAS thresholds (p < 5 × 10⁻⁸) in this much larger dataset. Moreover, we found that gene variants involved in oxidative stress, in addition to innate immunity, were associated with MS. Overall, this study highlights the benefit of using more advanced statistical methods to (re-)analyse subtle genetic variation among loci that have a biological basis for their contribution to disease risk.


2016 ◽  
Vol 17 (10) ◽  
pp. 1363-1373 ◽  
Author(s):  
Puya Gharahkhani ◽  
Rebecca C Fitzgerald ◽  
Thomas L Vaughan ◽  
Claire Palles ◽  
Ines Gockel ◽  
...  

2018 ◽  
Author(s):  
Brenton R. Swenson ◽  
Tin Louie ◽  
Henry J. Lin ◽  
Raúl Méndez-Giráldez ◽  
Jennifer E Below ◽  
...  

Abstract
Background: The electrocardiographically quantified QRS duration measures ventricular depolarization and conduction. QRS prolongation has been associated with poor heart failure prognosis and cardiovascular mortality, including sudden death. While previous genome-wide association studies (GWAS) have identified 32 QRS SNPs across 26 loci among European, African, and Asian-descent populations, the genetics of QRS among Hispanics/Latinos has not been previously explored.
Methods: We performed a GWAS of QRS duration among Hispanic/Latino ancestry populations (n=15,124) from four studies using 1000 Genomes imputed genotype data (adjusted for age, sex, global ancestry, clinical and study-specific covariates). Study-specific results were combined using fixed-effects, inverse variance-weighted meta-analysis.
Results: We identified six loci associated with QRS (P<5×10⁻⁸), including two novel loci: MYOCD, a nuclear protein expressed in the heart, and SYT1, an integral membrane protein. The top association in the MYOCD locus, intronic SNP rs16946539, was found in Hispanics/Latinos with a minor allele frequency (MAF) of 0.04, but is monomorphic in European and African descent populations. The most significant QRS duration association was for intronic SNP rs3922344 (P=8.56×10⁻²⁶) in SCN5A/SCN10A. Three additional previously identified loci, CDKN1A, VTI1A, and HAND1, also exceeded the GWAS significance threshold among Hispanics/Latinos. A total of 27 of 32 previously identified QRS duration SNPs were shown to generalize in Hispanics/Latinos.
Conclusions: Our QRS duration GWAS, the first in Hispanic/Latino populations, identified two new loci, underscoring the utility of extending large-scale genomic studies to currently under-examined populations.
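The fixed-effects, inverse variance-weighted meta-analysis mentioned in the Methods above can be sketched for a single variant as follows. This is a minimal textbook version, not the study's pipeline: the function name and the example effect sizes are hypothetical, and the two-sided p-value assumes a normal approximation for the pooled z-statistic.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Combine per-study effect estimates with inverse-variance weights.

    Returns (pooled_beta, pooled_se, z, p), where p is a two-sided
    p-value under a standard normal approximation.
    """
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical per-study estimates for one variant (not taken from the study).
pooled_beta, pooled_se, z, p = fixed_effects_meta([1.0, 3.0], [1.0, 1.0])
```

With equal standard errors the pooled estimate reduces to the simple mean of the study effects; studies with smaller standard errors otherwise pull the pooled estimate toward their own value.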


2016 ◽  
Author(s):  
G.V. Roshchupkin ◽  
H.H.H. Adams ◽  
M.W. Vernooij ◽  
A. Hofman ◽  
C.M. Van Duijn ◽  
...  

Abstract
Large-scale data collection and processing have facilitated scientific discoveries in fields such as genomics and imaging, but cross-investigations between multiple big datasets remain impractical. Computational requirements of high-dimensional association studies are often too demanding for individual sites. Additionally, the sheer size of intermediate results is unfit for collaborative settings where summary statistics are exchanged for meta-analyses. Here we introduce the HASE framework to perform high-dimensional association studies with a dramatic reduction in both the computational burden and the storage requirements of intermediate results. We implemented a novel meta-analytical method that yields the same power as pooled analyses without the need to share individual participant data. The efficiency of the framework is illustrated by associating 9 million genetic variants with 1.5 million brain imaging voxels in three cohorts (total N=4,034) followed by meta-analysis, on a standard computational infrastructure. These experiments indicate that HASE facilitates high-dimensional association studies, enabling large multicenter association studies for future discoveries.


2012 ◽  
Vol 15 (3) ◽  
pp. 414-418 ◽  
Author(s):  
Nic M. Novak ◽  
Jason L. Stein ◽  
Sarah E. Medland ◽  
Derrek P. Hibar ◽  
Paul M. Thompson ◽  
...  

In an attempt to increase power to detect genetic associations with brain phenotypes derived from human neuroimaging data, we recently conducted a large-scale, genome-wide association meta-analysis of hippocampal, brain, and intracranial volume through the Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA) consortium. Here, we present a freely available online interactive tool, EnigmaVis, which makes it easy to visualize the association results generated by the consortium alongside allele frequency, genes, and functional annotations. EnigmaVis runs natively within the web browser, and generates plots that show the level of association between brain phenotypes at user-specified genomic positions. Uniquely, EnigmaVis is dynamic; users can interact with elements on the plot in real time. This software will be useful when exploring the effect on brain structure of particular genetic variants influencing neuropsychiatric illness and cognitive function. Future projects of the consortium and updates to EnigmaVis will also be displayed on the site. EnigmaVis is freely available online at http://enigma.loni.ucla.edu/enigma-vis/


2021 ◽  
Author(s):  
Gabriel Hoffman ◽  
Biao Zeng ◽  
Jaroslav Bendl ◽  
Roman Kosoy ◽  
John Fullard ◽  
...  

Abstract
While large-scale genome-wide association studies (GWAS) have identified hundreds of loci associated with neuropsychiatric and neurodegenerative traits, identifying the variants, genes and molecular mechanisms underlying these traits remains challenging. Integrating GWAS results with expression quantitative trait loci (eQTLs) and identifying shared genetic architecture has been widely adopted to nominate genes and candidate causal variants. However, this integrative approach is often limited by the sample size, the statistical power of the eQTL dataset, and the strong linkage disequilibrium between variants. Here we developed the multivariate multiple QTL (mmQTL) approach and applied it to perform a large-scale trans-ethnic eQTL meta-analysis to increase power and fine-mapping resolution. Importantly, this method also increases power to identify conditional eQTLs that are enriched for cell-type-specific regulatory effects. Analysis of 3,188 RNA-seq samples from 2,029 donors, including 444 non-European individuals, yields an effective sample size of 2,974, which is substantially larger than previous brain eQTL efforts. Joint statistical fine-mapping of eQTL and GWAS identified 301 variant-trait pairs for 23 brain-related traits driven by 189 unique candidate causal variants for 179 unique genes. This integrative analysis identifies novel disease genes and elucidates potential regulatory mechanisms for genes underlying schizophrenia, bipolar disorder and Alzheimer’s disease.


2021 ◽  
Author(s):  
Biao Zeng ◽  
Jaroslav Bendl ◽  
Roman Kosoy ◽  
John F. Fullard ◽  
Gabriel E. Hoffman ◽  
...  

Abstract
While large-scale genome-wide association studies (GWAS) have identified hundreds of loci associated with neuropsychiatric and neurodegenerative traits, identifying the variants, genes and molecular mechanisms underlying these traits remains challenging. Integrating GWAS results with expression quantitative trait loci (eQTLs) and identifying shared genetic architecture has been widely adopted to nominate genes and candidate causal variants. However, this integrative approach is often limited by the sample size, the statistical power of the eQTL dataset, and the strong linkage disequilibrium between variants. Here we developed the multivariate multiple QTL (mmQTL) approach and applied it to perform a large-scale trans-ethnic eQTL meta-analysis to increase power and fine-mapping resolution. Importantly, this method also increases power to identify conditional eQTLs that are enriched for cell-type-specific regulatory effects. Analysis of 3,188 RNA-seq samples from 2,029 donors, including 444 non-European individuals, yields an effective sample size of 2,974, which is substantially larger than previous brain eQTL efforts. Joint statistical fine-mapping of eQTL and GWAS identified 301 variant-trait pairs for 23 brain-related traits driven by 189 unique candidate causal variants for 179 unique genes. This integrative analysis identifies novel disease genes and elucidates potential regulatory mechanisms for genes underlying schizophrenia, bipolar disorder and Alzheimer’s disease.

