Reworking GWAS Data to Understand the Role of Nongenetic Factors in MS Etiopathogenesis

Genes ◽  
2020 ◽  
Vol 11 (1) ◽  
pp. 97
Author(s):  
Rosella Mechelli ◽  
Renato Umeton ◽  
Grazia Manfrè ◽  
Silvia Romano ◽  
Maria Chiara Buscarinu ◽  
...  

Genome-wide association studies have identified more than 200 multiple sclerosis (MS)-associated loci across the human genome over the last decade, suggesting complexity in the disease etiology. This complexity poses at least two challenges: the definition of an etiological model that includes the impact of nongenetic factors, and the clinical translation of genomic data that may point to new druggable targets. We reviewed studies dealing with single genes of interest to understand how MS-associated single nucleotide polymorphism (SNP) variants affect the expression and function of those genes. We then surveyed studies on the bioinformatic reworking of genome-wide association study (GWAS) data, with aggregate analyses of many GWAS loci, each contributing a small effect to the overall disease predisposition. These investigations uncovered new information, especially when combined with nongenetic factors having possible roles in the disease etiology. In this context, the interactome approach, defined as “modules of genes whose products are known to physically interact with environmental or human factors with plausible relevance for MS pathogenesis”, is reported in detail. As a future perspective, a polygenic risk score, defined as a cumulative risk derived from aggregating the contributions of many DNA variants associated with a complex trait, may be integrated with data on environmental factors affecting disease risk or protection.
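
The polygenic risk score described above is commonly computed as a weighted sum of risk-allele counts, with each variant weighted by the log of its odds ratio. The following is a minimal sketch of that idea, not the authors' method; the function name `polygenic_risk_score`, the odds ratios, and the dosages are all hypothetical values chosen for illustration.

```python
from math import log


def polygenic_risk_score(dosages, odds_ratios):
    """Weighted sum of risk-allele counts; each weight is a log odds ratio."""
    if len(dosages) != len(odds_ratios):
        raise ValueError("one dosage per variant is required")
    return sum(d * log(odds) for d, odds in zip(dosages, odds_ratios))


# Hypothetical per-variant odds ratios (illustrative, not taken from any study).
odds_ratios = [1.20, 1.10, 0.90]
# Risk-allele counts (0, 1, or 2) for one hypothetical individual.
dosages = [2, 1, 0]

score = polygenic_risk_score(dosages, odds_ratios)
print(f"polygenic risk score: {score:.4f}")
```

Because the weights are log odds ratios, a protective variant (odds ratio below 1) lowers the score, and an individual carrying no risk alleles scores zero.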

2019 ◽  
Author(s):  
Jonas Patron ◽  
Arnau Serra-Cayuela ◽  
Beomsoo Han ◽  
Carin Li ◽  
David Scott Wishart

To date, more than 3700 genome-wide association studies (GWAS) have been published that look at the genetic contributions of single nucleotide polymorphisms (SNPs) to human conditions or human phenotypes. Through these studies, many highly significant SNPs have been identified for hundreds of diseases or medical conditions. However, the extent to which GWAS-identified SNPs or combinations of SNP biomarkers can predict disease risk is not well known. One of the most commonly used approaches to assess the performance of predictive biomarkers is to determine the area under the receiver operating characteristic curve (AUROC). We have developed an R package called G-WIZ to generate ROC curves and calculate the AUROC using summary-level GWAS data. We first tested the performance of G-WIZ by using AUROC values derived from patient-level SNP data, as well as literature-reported AUROC values. We found that G-WIZ predicts the AUROC with <3% error. Next, we used the summary-level GWAS data from GWAS Central to determine the ROC curves and AUROC values for 569 different GWA studies spanning 219 different conditions. Using these data, we found a small number of GWA studies with SNP-derived risk predictors that have very high AUROCs (>0.75). On the other hand, the average GWA study produces a multi-SNP risk predictor with an AUROC of 0.55. Detailed AUROC comparisons indicate that most SNP-derived risk predictions are not as good as clinically based disease risk predictors. All our calculations (ROC curves, AUROCs, explained heritability) are in a publicly accessible database called GWAS-ROCS (http://gwasrocs.ca). The G-WIZ code is freely available for download at https://github.com/jonaspatronjp/GWIZ-Rscript/.
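
For readers unfamiliar with the AUROC, it equals the probability that a randomly chosen case receives a higher risk score than a randomly chosen control, with ties counted as one half. The sketch below computes it that way from individual-level scores. It is not G-WIZ, which works from summary-level data in R; the function name `auroc` and the scores are hypothetical.

```python
def auroc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores for three cases and three controls.
cases = [0.9, 0.6, 0.4]
controls = [0.5, 0.3, 0.4]

value = auroc(cases, controls)
print(f"AUROC: {value:.3f}")
```

An AUROC of 0.5 corresponds to a predictor no better than chance, which puts the reported average of 0.55 in context.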


2017 ◽  
Vol 18 (17) ◽  
pp. 1563-1574 ◽  
Author(s):  
Olga Kulakova ◽  
Vitalina Bashinskaya ◽  
Ivan Kiselev ◽  
Natalia Baulina ◽  
Ekaterina Tsareva ◽  
...  

Author(s):  
Huaqing Zhao ◽  
Nandita Mitra ◽  
Peter A. Kanetsky ◽  
Katherine L. Nathanson ◽  
Timothy R. Rebbeck

Genome-wide association studies (GWAS) are susceptible to bias due to population stratification (PS). The most widely used method to correct bias due to PS is principal components (PCs) analysis (PCA), but there is no objective method to guide which PCs to include as covariates. Often, the ten PCs with the highest eigenvalues are included to adjust for PS. This selection is arbitrary, and patterns of local linkage disequilibrium may affect PCA corrections. To address these limitations, we estimate genomic propensity scores based on all statistically significant PCs selected by the Tracy-Widom (TW) statistic. We compare a principal components and propensity scores (PCAPS) approach to PCA and EMMAX using simulated GWAS data under no, moderate, and severe PS. PCAPS reduced spurious genetic associations regardless of the degree of PS, resulting in odds ratio (OR) estimates closer to the true OR. We illustrate our PCAPS method using GWAS data from a study of testicular germ cell tumors. PCAPS provided a more conservative adjustment than PCA. Advantages of the PCAPS approach include reduction of bias compared to PCA, consistent selection of propensity scores to adjust for PS, the potential ability to handle outliers, and ease of implementation using existing software packages.
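
To make the PCA step concrete, the sketch below extracts the first principal component's scores from a small genotype matrix using plain power iteration. It is an illustration only: it does not implement Tracy-Widom selection, propensity scores, or PCAPS, and the function names and toy genotypes (two hypothetical subpopulations of three individuals each) are assumptions.

```python
def center_columns(matrix):
    """Subtract each SNP's mean genotype so every column has mean zero."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]


def top_pc_scores(genotypes, iterations=200):
    """Unit-length PC1 scores, one per individual, via power iteration."""
    x = center_columns(genotypes)
    n = len(x)
    # Individual-by-individual Gram matrix of the centered genotypes.
    gram = [[sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
            for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0.0:
            raise ValueError("genotype matrix has no variance")
        v = [c / norm for c in w]
    return v


# Rows are individuals, columns are SNPs (risk-allele counts 0, 1, or 2).
genotypes = [
    [2, 2, 0, 0],
    [2, 1, 0, 0],
    [1, 2, 0, 1],
    [0, 0, 2, 2],
    [0, 1, 2, 2],
    [0, 0, 1, 2],
]

scores = top_pc_scores(genotypes)
print([round(s, 3) for s in scores])
```

In this toy data the first three individuals receive scores of one sign and the last three the opposite sign, which is the kind of ancestry axis that is then entered as a covariate. The overall sign of a principal component is arbitrary.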


2013 ◽  
Vol 37 (4) ◽  
pp. 383-392 ◽  
Author(s):  
Karla J. Lindquist ◽  
Eric Jorgenson ◽  
Thomas J. Hoffmann ◽  
John S. Witte

2017 ◽  
Vol 242 (13) ◽  
pp. 1325-1334 ◽  
Author(s):  
Yizhou Zhu ◽  
Cagdas Tazearslan ◽  
Yousin Suh

Genome-wide association studies have shown that the vast majority of disease-associated variants reside in the non-coding regions of the genome, suggesting that gene regulatory changes contribute to disease risk. Identifying truly causal non-coding variants and their affected target genes remains challenging but is a critical step in translating genetic associations into molecular mechanisms and, ultimately, clinical applications. Here we review genomic/epigenomic resources and in silico tools that can be used to identify causal non-coding variants, as well as experimental strategies to validate their functionality. Impact statement: Most signals from genome-wide association studies (GWASs) map to the non-coding genome, and functional interpretation of these associations remains challenging. We review recent progress in methodologies for studying the non-coding genome and argue that no single approach allows one to effectively identify the causal regulatory variants from GWAS results. By illustrating the advantages and limitations of each method, our review provides a potential guideline for taking a combinatorial approach to accurately predict, prioritize, and eventually experimentally validate the causal variants.


2017 ◽  
Author(s):  
Shrayashi Biswas ◽  
Soumen Pal ◽  
Samsiddhi Bhattacharjee

Traditional unbiased genome-wide association studies (GWAS) have successfully identified thousands of loci associated with various complex diseases, but there is evidence to suggest that many variants were missed at stringent genome-wide thresholds. Fortunately, there is a rapidly increasing amount of prior knowledge in publicly available genomic datasets and biological databases that can be harnessed to enhance the power of discovering SNPs/genes from existing or new GWAS datasets. For most diseases, many of the identified loci tend to cluster into a few specific biological pathways/networks. From the point of view of disease etiology, such clustering is generally to be expected. This phenomenon can be exploited to conduct a more powerful genome-wide scan that is tailored to identify loci that are interconnected in pathways. We propose a scalable regression-based analytical framework to enable such a pathway-guided GWAS and demonstrate that it provides significant gains in power to detect disease-associated SNPs. Our method requires two inputs, namely a) genome-wide summary-level data (e.g., SNP p-values) and b) a grouping of genes into biologically meaningful categories (e.g., a database of pathways). It automatically adjusts the input p-values by incorporating the knowledge derived adaptively from the data and the pathways specified. The method involves a regularized logistic regression analysis to derive priors for each SNP and then re-weights the p-values of SNPs so as to maximize the overall power of making discoveries. It increases the power to discover SNPs co-clustering into some of these pathways, while maintaining the global type-1 error (FWER) at the desired level. We used whole-genome simulations and summary data from real GWA studies of psoriasis, SLE, coronary artery disease and type-2 diabetes to illustrate the power improvement achieved by pathway-guided search. Our pipeline, implemented as an R package, can flexibly handle a large number of prior annotations, possibly derived from multiple databases.
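
One standard way to re-weight p-values while keeping the family-wise error rate under control is the weighted Bonferroni procedure: SNP i is declared significant when p_i <= w_i * alpha / m, where the non-negative weights w_i average to 1 across the m tests. The sketch below shows that general technique only. It is not the authors' pipeline, which derives priors by regularized logistic regression; the function name, p-values, and weights are hypothetical.

```python
def weighted_bonferroni(p_values, weights, alpha=0.05):
    """Return one rejection flag per test, using weights rescaled to average 1."""
    m = len(p_values)
    if m == 0 or len(weights) != m:
        raise ValueError("need one weight per p-value")
    if any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("weights must be non-negative and not all zero")
    mean_weight = sum(weights) / m
    normalized = [w / mean_weight for w in weights]  # now averages to 1
    threshold = alpha / m  # the usual unweighted Bonferroni threshold
    # A test with zero weight can never be rejected.
    return [w > 0 and p <= threshold * w for p, w in zip(p_values, normalized)]


# Hypothetical SNP p-values; the first SNP sits in a pathway judged relevant,
# so it is given a larger prior weight than the others.
p_values = [0.02, 0.02, 0.5, 0.9]
pathway_weights = [3.0, 0.5, 0.25, 0.25]

weighted = weighted_bonferroni(p_values, pathway_weights)
unweighted = weighted_bonferroni(p_values, [1.0, 1.0, 1.0, 1.0])
print(weighted, unweighted)
```

The first SNP passes only under the weighted rule, while the second SNP, with the same p-value but a low weight, does not. Up-weighting some tests necessarily down-weights others, which is how the overall error rate is preserved.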


2021 ◽  
Author(s):  
Rujin Wang ◽  
Danyu Lin ◽  
Yuchao Jiang

More than a decade of genome-wide association studies (GWASs) have identified genetic risk variants that are significantly associated with complex traits. Emerging evidence suggests that trait-associated variants likely act in a tissue- or cell-type-specific fashion. Yet, it remains challenging to prioritize trait-relevant tissues or cell types to elucidate disease etiology. Here, we present EPIC (cEll tyPe enrIChment), a statistical framework that relates large-scale GWAS summary statistics to cell-type-specific omics measurements from single-cell sequencing. We derive powerful gene-level test statistics for common and rare variants, separately and jointly, and adopt generalized least squares to prioritize trait-relevant tissues or cell types while accounting for the correlation structures both within and between genes. Using enrichment of loci associated with four lipid traits in the liver and enrichment of loci associated with three neurological disorders in the brain as ground truths, we show that EPIC outperforms existing methods. We extend our framework to single-cell transcriptomic data and identify cell types underlying type 2 diabetes and schizophrenia. The enrichment is replicated using independent GWAS and single-cell datasets and further validated using PubMed search and existing bulk case-control testing results.

