Reworking GWAS Data to Understand the Role of Nongenetic Factors in MS Etiopathogenesis

Rosella Mechelli; Renato Umeton; Grazia Manfrè; Silvia Romano; Maria Chiara Buscarinu; Virginia Rinaldi; Gianmarco Bellucci; Rachele Bigi; Michela Ferraldeschi; Marco Salvetti; Giovanni Ristori

doi:10.3390/genes11010097

Genes ◽

10.3390/genes11010097 ◽

2020 ◽

Vol 11 (1) ◽

pp. 97

Author(s):

Rosella Mechelli ◽

Renato Umeton ◽

Grazia Manfrè ◽

Silvia Romano ◽

Maria Chiara Buscarinu ◽

...

Keyword(s):

Disease Risk ◽

Association Studies ◽

GWAS Data ◽

Genome Wide Association ◽

Future Perspective ◽

Polygenic Risk Score ◽

Genome Wide Association Studies ◽

Disease Etiology ◽

Genome Wide ◽

The Impact

Genome-wide association studies have identified more than 200 multiple sclerosis (MS)-associated loci across the human genome over the last decade, suggesting complexity in the disease etiology. This complexity poses at least two challenges: the definition of an etiological model including the impact of nongenetic factors, and the clinical translation of genomic data that may be drivers for new druggable targets. We reviewed studies dealing with single genes of interest, to understand how MS-associated single nucleotide polymorphism (SNP) variants affect the expression and the function of those genes. We then surveyed studies on the bioinformatic reworking of genome-wide association studies (GWAS) data, with aggregate analyses of many GWAS loci, each contributing a small effect to the overall disease predisposition. These investigations uncovered new information, especially when combined with nongenetic factors having possible roles in the disease etiology. In this context, the interactome approach, defined as “modules of genes whose products are known to physically interact with environmental or human factors with plausible relevance for MS pathogenesis”, will be reported in detail. For a future perspective, a polygenic risk score, defined as a cumulative risk derived from aggregating the contributions of many DNA variants associated with a complex trait, may be integrated with data on environmental factors affecting the disease risk or protection.
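
The polygenic risk score described above can be illustrated with a minimal sketch. It assumes the common additive form, in which each variant contributes its risk-allele count (0, 1, or 2) multiplied by the natural log of its odds ratio. The function name, allele counts, and odds ratios below are hypothetical and are not taken from the reviewed studies.

```python
import math


def polygenic_risk_score(allele_counts, odds_ratios):
    """Additive score: sum of risk-allele counts weighted by log odds ratio.

    Assumes one count (0, 1, or 2) and one odds ratio per variant,
    listed in the same order.
    """
    if len(allele_counts) != len(odds_ratios):
        raise ValueError("need one odds ratio per variant")
    return sum(count * math.log(odds)
               for count, odds in zip(allele_counts, odds_ratios))


# Hypothetical individual genotyped at three variants (illustrative values only).
example_counts = [0, 1, 2]
example_odds_ratios = [1.10, 1.25, 1.05]
example_score = polygenic_risk_score(example_counts, example_odds_ratios)
```

A variant with zero risk alleles adds nothing to the score, so only the second and third variants contribute in this example. Integrating environmental factors, as the abstract proposes, would require additional terms that this sketch does not attempt to model.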

Assessing the performance of genome-wide association studies for predicting disease risk

10.1101/701086 ◽

2019 ◽

Author(s):

Jonas Patron ◽

Arnau Serra-Cayuela ◽

Beomsoo Han ◽

Carin Li ◽

David Scott Wishart

Keyword(s):

Disease Risk ◽

Association Studies ◽

ROC Curves ◽

GWAS Data ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Link Type ◽

Genome Wide ◽

Risk Predictors ◽

GWA Studies

To date, more than 3700 genome-wide association studies (GWAS) have been published that look at the genetic contributions of single nucleotide polymorphisms (SNPs) to human conditions or human phenotypes. Through these studies many highly significant SNPs have been identified for hundreds of diseases or medical conditions. However, the extent to which GWAS-identified SNPs or combinations of SNP biomarkers can predict disease risk is not well known. One of the most commonly used approaches to assess the performance of predictive biomarkers is to determine the area under the receiver-operator characteristic curve (AUROC). We have developed an R package called G-WIZ to generate ROC curves and calculate the AUROC using summary-level GWAS data. We first tested the performance of G-WIZ by using AUROC values derived from patient-level SNP data, as well as literature-reported AUROC values. We found that G-WIZ predicts the AUROC with <3% error. Next, we used the summary level GWAS data from GWAS Central to determine the ROC curves and AUROC values for 569 different GWA studies spanning 219 different conditions. Using these data we found a small number of GWA studies with SNP-derived risk predictors that have very high AUROCs (>0.75). On the other hand, the average GWA study produces a multi-SNP risk predictor with an AUROC of 0.55. Detailed AUROC comparisons indicate that most SNP-derived risk predictions are not as good as clinically based disease risk predictors. All our calculations (ROC curves, AUROCs, explained heritability) are in a publicly accessible database called GWAS-ROCS (http://gwasrocs.ca). The G-WIZ code is freely available for download at https://github.com/jonaspatronjp/GWIZ-Rscript/.
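
To make the AUROC concrete, here is a minimal sketch of the generic rank-based definition: the probability that a randomly chosen case receives a higher risk score than a randomly chosen control, with ties counted as one half. This is not the G-WIZ procedure, which works from summary-level GWAS data. It assumes individual-level scores are available, and the function name is hypothetical.

```python
def auroc(case_scores, control_scores):
    """Area under the ROC curve from individual-level risk scores.

    Computed as the fraction of (case, control) pairs in which the case
    has the higher score; tied pairs count as one half.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

Under this definition a predictor that separates cases from controls perfectly scores 1.0 and an uninformative one scores 0.5, which puts the reported average of 0.55 for multi-SNP predictors only slightly above chance.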

Pharmacogenetics of glatiramer acetate therapy for multiple sclerosis: the impact of genome-wide association studies identified disease risk loci

Pharmacogenomics ◽

10.2217/pgs-2017-0058 ◽

2017 ◽

Vol 18 (17) ◽

pp. 1563-1574 ◽

Cited By ~ 3

Author(s):

Olga Kulakova ◽

Vitalina Bashinskaya ◽

Ivan Kiselev ◽

Natalia Baulina ◽

Ekaterina Tsareva ◽

...

Keyword(s):

Multiple Sclerosis ◽

Glatiramer Acetate ◽

Disease Risk ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

The Impact

The Impact of Incomplete Linkage Disequilibrium and Genetic Model Choice on the Analysis and Interpretation of Genome-wide Association Studies

Annals of Human Genetics ◽

10.1111/j.1469-1809.2010.00579.x ◽

2010 ◽

Vol 74 (4) ◽

pp. 375-379 ◽

Cited By ~ 6

Author(s):

Mark M. Iles

Keyword(s):

Linkage Disequilibrium ◽

Genetic Model ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Model Choice ◽

Genome Wide ◽

The Impact

Improvement in Prediction of Coronary Heart Disease Risk over Conventional Risk Factors Using SNPs Identified in Genome-Wide Association Studies

PLoS ONE ◽

10.1371/journal.pone.0057310 ◽

2013 ◽

Vol 8 (2) ◽

pp. e57310 ◽

Cited By ~ 16

Author(s):

Jennifer L. Bolton ◽

Marlene C. W. Stewart ◽

James F. Wilson ◽

Niall Anderson ◽

Jackie F. Price

Keyword(s):

Risk Factors ◽

Coronary Heart Disease ◽

Heart Disease ◽

Disease Risk ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Heart Disease Risk ◽

Conventional Risk Factors

A practical approach to adjusting for population stratification in genome-wide association studies: principal components and propensity scores (PCAPS)

Statistical Applications in Genetics and Molecular Biology ◽

10.1515/sagmb-2017-0054 ◽

2018 ◽

Vol 17 (6) ◽

Cited By ~ 2

Author(s):

Huaqing Zhao ◽

Nandita Mitra ◽

Peter A. Kanetsky ◽

Katherine L. Nathanson ◽

Timothy R. Rebbeck

Keyword(s):

Principal Components ◽

Population Stratification ◽

Propensity Scores ◽

Association Studies ◽

Germ Cell Tumors ◽

GWAS Data ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Testicular Germ Cell ◽

Genome Wide

Genome-wide association studies (GWAS) are susceptible to bias due to population stratification (PS). The most widely used method to correct bias due to PS is principal components (PCs) analysis (PCA), but there is no objective method to guide which PCs to include as covariates. Often, the ten PCs with the highest eigenvalues are included to adjust for PS. This selection is arbitrary, and patterns of local linkage disequilibrium may affect PCA corrections. To address these limitations, we estimate genomic propensity scores based on all statistically significant PCs selected by the Tracy-Widom (TW) statistic. We compare a principal components and propensity scores (PCAPS) approach to PCA and EMMAX using simulated GWAS data under no, moderate, and severe PS. PCAPS reduced spurious genetic associations regardless of the degree of PS, resulting in odds ratio (OR) estimates closer to the true OR. We illustrate our PCAPS method using GWAS data from a study of testicular germ cell tumors. PCAPS provided a more conservative adjustment than PCA. Advantages of the PCAPS approach include reduction of bias compared to PCA, consistent selection of propensity scores to adjust for PS, the potential ability to handle outliers, and ease of implementation using existing software packages.

A Review on the Impact of Genetics and Genome Wide Association Studies in Autoimmunity

MOJ Proteomics & Bioinformatics ◽

10.15406/mojpb.2017.06.00203 ◽

2017 ◽

Vol 6 (4) ◽

Author(s):

Harishchander Anandaram

Keyword(s):

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

The Impact

The Impact of Improved Microarray Coverage and Larger Sample Sizes on Future Genome-Wide Association Studies

Genetic Epidemiology ◽

10.1002/gepi.21724 ◽

2013 ◽

Vol 37 (4) ◽

pp. 383-392 ◽

Cited By ~ 16

Author(s):

Karla J. Lindquist ◽

Eric Jorgenson ◽

Thomas J. Hoffmann ◽

John S. Witte

Keyword(s):

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Sample Sizes ◽

Genome Wide ◽

The Impact ◽

Larger Sample

Challenges and progress in interpretation of non-coding genetic variants associated with human disease

Experimental Biology and Medicine ◽

10.1177/1535370217713750 ◽

2017 ◽

Vol 242 (13) ◽

pp. 1325-1334 ◽

Cited By ~ 19

Author(s):

Yizhou Zhu ◽

Cagdas Tazearslan ◽

Yousin Suh

Keyword(s):

Molecular Mechanisms ◽

Target Genes ◽

Disease Risk ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genetic Associations ◽

Functional Interpretation ◽

Genome Wide ◽

Coding Variants

Genome-wide association studies have shown that the vast majority of disease-associated variants reside in the non-coding regions of the genome, suggesting that gene regulatory changes contribute to disease risk. To identify truly causal non-coding variants and their affected target genes remains challenging but is a critical step to translate the genetic associations to molecular mechanisms and ultimately clinical applications. Here we review genomic/epigenomic resources and in silico tools that can be used to identify causal non-coding variants and experimental strategies to validate their functionalities.

Impact statement: Most signals from genome-wide association studies (GWASs) map to the non-coding genome, and functional interpretation of these associations remained challenging. We reviewed recent progress in methodologies of studying the non-coding genome and argued that no single approach allows one to effectively identify the causal regulatory variants from GWAS results. By illustrating the advantages and limitations of each method, our review potentially provided a guideline for taking a combinatorial approach to accurately predict, prioritize, and eventually experimentally validate the causal variants.

A Regression-based Framework for Scalable Pathway-guided Search in Genome-wide Association Studies

10.1101/241265 ◽

2017 ◽

Author(s):

Shrayashi Biswas ◽

Soumen Pal ◽

Samsiddhi Bhattacharjee

Keyword(s):

Association Studies ◽

Analytical Framework ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Biological Databases ◽

Disease Etiology ◽

P Values ◽

Guided Search ◽

Genome Wide ◽

A Genome

Traditional unbiased genome-wide association studies (GWAS) have successfully identified thousands of loci associated with various complex diseases but there is evidence to suggest that many variants were missed at stringent genome-wide thresholds. Fortunately, there is a rapidly increasing amount of prior knowledge in publicly available genomic datasets and biological databases that can be harnessed to enhance the power of discovering SNPs/Genes from existing or new GWAS datasets. For most diseases, many of the identified loci tend to cluster into a few specific biological pathways/networks. From the point of view of disease etiology, such clustering is generally to be expected. This phenomenon can be exploited to conduct a more powerful genome-wide scan that is tailored to identify loci that are interconnected in pathways. We propose a scalable regression-based analytical framework to enable such a pathway-guided GWAS and demonstrate that it provides significant gains in power to detect disease associated SNPs. Our method requires two inputs, namely a) genome-wide summary level data (e.g., SNP p-values) and b) a grouping of genes into biologically meaningful categories (e.g., a database of pathways). It automatically adjusts the input p-values by incorporating the knowledge derived adaptively from the data and the pathways specified. The method involves a regularized logistic regression analysis to derive priors of each SNP and then re-weights the p-values of SNPs so as to maximize overall power of making discoveries. It increases the power to discover SNPs co-clustering into some of these pathways, while maintaining the global type-1 error (FWER) at the desired level. We used whole-genome simulations and summary data from real GWA studies of psoriasis, SLE, coronary artery disease and type-2 diabetes to illustrate the power improvement achieved by pathway-guided search. Our pipeline, implemented as an R package, can flexibly handle a large number of prior annotations possibly derived from multiple databases.
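
The general idea of re-weighting p-values while holding the family-wise error rate fixed can be sketched with the classical weighted Bonferroni rule. This is a generic illustration, not the authors' regularized-regression framework. It assumes the weights are fixed in advance (for example from pathway membership) rather than learned adaptively from the data, and the function name and inputs are hypothetical.

```python
def weighted_bonferroni(p_values, weights, alpha=0.05):
    """Weighted Bonferroni: reject test i when p_i <= alpha * w_i / m.

    Weights are rescaled to average 1, so the per-test thresholds still
    sum to alpha. Assumes the weights were chosen independently of the
    p-values being tested.
    """
    m = len(p_values)
    if m == 0 or m != len(weights):
        raise ValueError("need one non-negative weight per p-value")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    mean_weight = sum(weights) / m
    return [p <= alpha * (w / mean_weight) / m
            for p, w in zip(p_values, weights)]
```

Giving a SNP in a prioritized pathway a weight above 1 relaxes its threshold, while SNPs with weights below 1 face a stricter one. That trade-off is how prior knowledge can add power for pathway-linked loci without inflating the overall error rate.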

Inferring relevant tissues and cell types for complex traits in genome-wide association studies

10.1101/2021.06.09.447805 ◽

2021 ◽

Author(s):

Rujin Wang ◽

Danyu Lin ◽

Yuchao Jiang

Keyword(s):

Single Cell ◽

Complex Traits ◽

Association Studies ◽

Cell Types ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Cell Type ◽

Disease Etiology ◽

Genome Wide ◽

Cell Type Specific

More than a decade of genome-wide association studies (GWASs) have identified genetic risk variants that are significantly associated with complex traits. Emerging evidence suggests that the function of trait-associated variants likely acts in a tissue- or cell-type-specific fashion. Yet, it remains challenging to prioritize trait-relevant tissues or cell types to elucidate disease etiology. Here, we present EPIC (cEll tyPe enrIChment), a statistical framework that relates large-scale GWAS summary statistics to cell-type-specific omics measurements from single-cell sequencing. We derive powerful gene-level test statistics for common and rare variants, separately and jointly, and adopt generalized least squares to prioritize trait-relevant tissues or cell types while accounting for the correlation structures both within and between genes. Using enrichment of loci associated with four lipid traits in the liver and enrichment of loci associated with three neurological disorders in the brain as ground truths, we show that EPIC outperforms existing methods. We extend our framework to single-cell transcriptomic data and identify cell types underlying type 2 diabetes and schizophrenia. The enrichment is replicated using independent GWAS and single-cell datasets and further validated using PubMed search and existing bulk case-control testing results.
