Impute.me: an open source, non-profit tool for using data from DTC genetic testing to calculate and interpret polygenic risk scores

Mapping Intimacies ◽

10.1101/861831 ◽

2019 ◽

Author(s):

Lasse Folkersen ◽

Oliver Pain ◽

Andres Ingasson ◽

Thomas Werge ◽

Cathryn M. Lewis ◽

...

Keyword(s):

Open Source ◽

Public Perception ◽

Disease Risk ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Polygenic Risk ◽

Complex Disorders ◽

Risk Variants ◽

Genome Wide ◽

The Impact

Abstract To date, interpretation of genomic information has focused on single variants conferring disease risk, but most disorders of major public concern have a polygenic architecture. Polygenic risk scores (PRS) give a single measure of disease liability by summarising disease risk across hundreds of thousands of genetic variants. They can be calculated in any genome-wide genotype data-source, using a prediction model based on genome-wide summary statistics from external studies. As genome-wide association studies increase in power, the predictive ability for disease risk will also increase. While PRS are unlikely ever to be fully diagnostic, they may give valuable medical information for risk stratification, prognosis, or treatment response prediction. Public engagement is therefore becoming important on the potential use and acceptability of PRS. However, the current public perception of genetics is that it provides ‘Yes/No’ answers about the presence/absence of a condition, or the potential for developing a condition, which is not the case for common, complex disorders with polygenic architecture. Meanwhile, unregulated third-party applications are being developed to satisfy consumer demand for information on the impact of lower risk variants on common diseases that are highly polygenic. Often applications report results from single SNPs and disregard effect size, which is highly inappropriate for common, complex disorders where everybody carries risk variants. Tools are therefore needed to communicate our understanding of genetic predisposition as a continuous trait, where a genetic liability confers risk for disease. Impute.me is one such tool, whose focus is on education and information on common, complex disorders with polygenic architecture. Its research-focused open-source website allows users to upload consumer genetics data to obtain PRS, with results reported on a population-level normal distribution. Diseases can only be browsed by ICD10-chapter-location or alphabetically, thus prompting the user to consider genetic risk scores in a medical context of relevance to the individual. Here we present an overview of the implementation of the impute.me site, along with analysis of typical usage-patterns, which may advance public perception of genomic risk and precision medicine.
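The idea of a PRS as a single summary across many variants, reported against a population-level normal distribution, can be illustrated with a minimal sketch. This is not the impute.me implementation: the function names, the three toy variants, their effect sizes, and the population mean and standard deviation are all hypothetical values chosen for illustration.

```python
from statistics import NormalDist


def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2) by per-variant effect size."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def population_percentile(score, pop_mean, pop_sd):
    """Place a raw score on a standard normal scale; return (z, percentile)."""
    z = (score - pop_mean) / pop_sd
    return z, 100 * NormalDist().cdf(z)


# Hypothetical effect sizes (e.g. log odds ratios from GWAS summary statistics)
weights = [0.10, -0.05, 0.20]
# Hypothetical genotype of one individual: count of effect alleles per variant
dosages = [2, 1, 0]

score = polygenic_score(dosages, weights)
# Hypothetical reference population parameters for the raw score
z, percentile = population_percentile(score, pop_mean=0.15, pop_sd=0.10)
print(f"score={score:.3f} z={z:.2f} percentile={percentile:.1f}")
```

A real score would sum over many thousands of variants, and the reference mean and standard deviation would come from a matched population rather than being fixed by hand.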


Evaluating the Utility of Polygenic Risk Scores in Identifying High-Risk Individuals for Eight Common Cancers

JNCI Cancer Spectrum ◽

10.1093/jncics/pkaa021 ◽

2020 ◽

Vol 4 (3) ◽

Cited By ~ 3

Author(s):

Guochong Jia ◽

Yingchang Lu ◽

Wanqing Wen ◽

Jirong Long ◽

Ying Liu ◽

...

Keyword(s):

Cancer Risk ◽

Association Studies ◽

Elevated Risk ◽

Genome Wide Association ◽

Risk Scores ◽

Cancer Case ◽

Genome Wide Association Studies ◽

Polygenic Risk ◽

Risk Variants ◽

Genome Wide

Abstract Background Genome-wide association studies have identified common genetic risk variants in many loci associated with multiple cancers. We sought to systematically evaluate the utility of these risk variants in identifying high-risk individuals for eight common cancers. Methods We constructed polygenic risk scores (PRS) using genome-wide association studies–identified risk variants for each cancer. Using data from 400 812 participants of European descent in a population-based cohort study, UK Biobank, we estimated hazard ratios associated with PRS using Cox proportional hazard models and evaluated the performance of the PRS in cancer risk prediction and their ability to identify individuals at more than a twofold elevated risk, a risk level comparable to a moderate-penetrance mutation in known cancer predisposition genes. Results During a median follow-up of 5.8 years, 14 584 incident case patients of cancers were identified (ranging from 358 epithelial ovarian cancer case patients to 4430 prostate cancer case patients). Compared with those at an average risk, individuals among the highest 5% of the PRS had a two- to threefold elevated risk for cancer of the prostate, breast, pancreas, colorectal, or ovary, and an approximately 1.5-fold elevated risk of cancer of the lung, bladder, or kidney. The areas under the curve ranged from 0.567 to 0.662. Using PRS, 40.4% of the study participants can be classified as having more than a twofold elevated risk for at least one site-specific cancer. Conclusions A large proportion of the general population can be identified at an elevated cancer risk by PRS, supporting the potential clinical utility of PRS for personalized cancer risk prediction.


Pathway specific polygenic risk scores identify pathways and patient clusters associated with inflammatory bowel disease risk, severity and treatment response

10.1101/2021.11.19.21266549 ◽

2021 ◽

Author(s):

Corneliu A Bodea ◽

Michael Macoritto ◽

Yingchun Liu ◽

Wenliang Zhang ◽

Jozsef Karman ◽

...

Keyword(s):

Treatment Response ◽

Genetic Risk ◽

Disease Risk ◽

Clinical Care ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Polygenic Risk ◽

Genome Wide ◽

Stratification Method ◽

Inflammatory Bowel

Crohn's disease (CD) and ulcerative colitis (UC) are inflammatory bowel diseases (IBD) with a strong genetic component. Genome-wide association studies (GWAS) have successfully identified over 240 genetic loci that are statistically associated with risk of developing IBD, and these associations provide valuable insights into disease pathobiology. Building on GWAS findings, conventional polygenic risk scores (cPRS) aim to quantify the aggregated disease risk based on DNA variation, and these scores can identify individuals at high risk. While stratifying individuals based on cPRS has the potential to inform clinical care, the development of novel therapeutics requires deep insight into how aggregated genetic risk leads to disruption of specific biological pathways. Here, we developed a pathway-specific PRS (pPRS) methodology to assess IBD common variant genetic risk burden across 31 manually curated pathways. We first prioritized 206 genes based on comprehensive fine-mapping and eQTL colocalization analyses of genome-wide significant IBD GWAS loci and 58 highly penetrant genes based on their involvement in early onset IBD or autoimmunity-related colitis. These 264 genes were assigned to at least one of the 31 pathways based on Gene Ontology annotations and manual curation. Finally, we integrated these inputs into a novel pPRS model and performed an extensive investigation of IBD disease risk, severity, complications, and anti-TNF treatment response by applying our pPRS approach to three complementary datasets encompassing IBD cases and controls. Our analysis identified multiple promising pathways that can inform drug target discovery and provides a patient stratification method that offers insights into the biology of treatment response.


E-MAGMA: an eQTL-informed method to identify risk genes using genome-wide association study summary statistics

Bioinformatics ◽

10.1093/bioinformatics/btab115 ◽

2021 ◽

Author(s):

Zachary F Gerring ◽

Angela Mina-Vargas ◽

Eric R Gamazon ◽

Eske M Derks

Keyword(s):

Genome Wide Association ◽

Chromosome 1 ◽

Supplementary Information ◽

Genome Wide Association Studies ◽

Summary Statistics ◽

Tissue Specific ◽

Complex Disorders ◽

Risk Variants ◽

Genome Wide ◽

Causal Genes

Abstract Motivation Genome-wide association studies have successfully identified multiple independent genetic loci that harbour variants associated with human traits and diseases, but the exact causal genes are largely unknown. Common genetic risk variants are enriched in non-protein-coding regions of the genome and often affect gene expression (expression quantitative trait loci, eQTL) in a tissue-specific manner. To address this challenge, we developed a methodological framework, E-MAGMA, which converts genome-wide association summary statistics into gene-level statistics by assigning risk variants to their putative genes based on tissue-specific eQTL information. Results We compared E-MAGMA to three eQTL informed gene-based approaches using simulated phenotype data. Phenotypes were simulated based on eQTL reference data using GCTA for all genes with at least one eQTL at chromosome 1. We performed 10 simulations per gene. The eQTL-h2 (i.e., the proportion of variation explained by the eQTLs) was set at 1%, 2%, and 5%. We found E-MAGMA outperforms other gene-based approaches across a range of simulated parameters (e.g. the number of identified causal genes). When applied to genome-wide association summary statistics for five neuropsychiatric disorders, E-MAGMA identified more putative candidate causal genes compared to other eQTL-based approaches. By integrating tissue-specific eQTL information, these results show E-MAGMA will help to identify novel candidate causal genes from genome-wide association summary statistics and thereby improve the understanding of the biological basis of complex disorders. Availability A tutorial and input files are made available in a github repository: https://github.com/eskederks/eMAGMA-tutorial. Supplementary information Supplementary data are available at Bioinformatics online.


Age-related late-onset disease heritability patterns and implications for genome-wide association studies

10.1101/349019 ◽

2018 ◽

Cited By ~ 1

Author(s):

Roman Teo Oliynyk

Keyword(s):

Old Age ◽

Cumulative Incidence ◽

Late Onset ◽

Association Studies ◽

Genome Wide Association ◽

Risk Scores ◽

Cerebral Stroke ◽

Genome Wide Association Studies ◽

Polygenic Risk ◽

Genome Wide

Abstract Background Genome-wide association studies and other computational biology techniques are gradually discovering the causal gene variants that contribute to late-onset human diseases. After more than a decade of genome-wide association study efforts, these can account for only a fraction of the heritability implied by familial studies, the so-called “missing heritability” problem. Methods Computer simulations of polygenic late-onset diseases in an aging population have quantified the risk allele frequency decrease at older ages caused by individuals with higher polygenic risk scores becoming ill proportionately earlier. This effect is most prominent for diseases characterized by high cumulative incidence and high heritability, examples of which include Alzheimer’s disease, coronary artery disease, cerebral stroke, and type 2 diabetes. Results The incidence rate for late-onset diseases grows exponentially for decades after early onset ages, guaranteeing that the cohorts used for genome-wide association studies overrepresent older individuals with lower polygenic risk scores, whose disease cases are disproportionately due to environmental causes such as old age itself. This mechanism explains the decline in clinical predictive power with age and the lower discovery power of familial studies of heritability and genome-wide association studies. It also explains the relatively constant-with-age heritability found for late-onset diseases of lower prevalence, exemplified by cancers. Conclusions For late-onset polygenic diseases showing high cumulative incidence together with high initial heritability, rather than using relatively old age-matched cohorts, study cohorts combining the youngest possible cases with the oldest possible controls may significantly improve the discovery power of genome-wide association studies.


Genome‐wide association studies and polygenic risk scores for skin cancer: clinically useful yet?

British Journal of Dermatology ◽

10.1111/bjd.17917 ◽

2019 ◽

Vol 181 (6) ◽

pp. 1146-1155 ◽

Cited By ~ 12

Author(s):

M.R. Roberts ◽

M.M. Asgari ◽

A.E. Toland

Keyword(s):

Skin Cancer ◽

Association Studies ◽

Genome Wide Association ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Polygenic Risk ◽

Genome Wide


Leveraging effect size distributions to improve polygenic risk scores derived from summary statistics of genome-wide association studies

PLoS Computational Biology ◽

10.1371/journal.pcbi.1007565 ◽

2020 ◽

Vol 16 (2) ◽

pp. e1007565 ◽

Cited By ~ 1

Author(s):

Shuang Song ◽

Wei Jiang ◽

Lin Hou ◽

Hongyu Zhao

Keyword(s):

Effect Size ◽

Association Studies ◽

Genome Wide Association ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Size Distributions ◽

Summary Statistics ◽

Polygenic Risk ◽

Genome Wide


Multiplex melanoma families are enriched for polygenic risk

Human Molecular Genetics ◽

10.1093/hmg/ddaa156 ◽

2020 ◽

Vol 29 (17) ◽

pp. 2976-2985

Author(s):

Matthew H Law ◽

Lauren G Aoude ◽

David L Duffy ◽

Georgina V Long ◽

Peter A Johansson ◽

...

Keyword(s):

Cutaneous Melanoma ◽

Cumulative Effect ◽

Risk Scores ◽

Polygenic Risk Score ◽

Single Family ◽

Polygenic Risk ◽

Complex Disorders ◽

Risk Variants ◽

High Penetrance ◽

Genetic Risks

Abstract Cancers, including cutaneous melanoma, can cluster in families. In addition to environmental etiological factors such as ultraviolet radiation, cutaneous melanoma has a strong genetic component. Genetic risks for cutaneous melanoma range from rare, high-penetrance mutations to common, low-penetrance variants. Known high-penetrance mutations account for only about half of all densely affected cutaneous melanoma families, and the causes of familial clustering in the remainder are unknown. We hypothesize that some clustering is due to the cumulative effect of a large number of variants of individually small effect. Common, low-penetrance genetic risk variants can be combined into polygenic risk scores. We used a polygenic risk score for cutaneous melanoma to compare families without known high-penetrance mutations with unrelated melanoma cases and melanoma-free controls. Family members had significantly higher mean polygenic load for cutaneous melanoma than unrelated cases or melanoma-free healthy controls (Bonferroni-corrected t-test P = 1.5 × 10^−5 and 6.3 × 10^−45, respectively). Whole genome sequencing of germline DNA from 51 members of 21 families with low polygenic risk for melanoma identified a CDKN2A p.G101W mutation in a single family but no other candidate high-penetrance melanoma susceptibility genes. This work provides further evidence that melanoma, like many other common complex disorders, can arise from the joint action of multiple predisposing factors, including rare high-penetrance mutations, as well as via a combination of large numbers of alleles of small effect.


Perspective: The Clinical Use of Polygenic Risk Scores: Race, Ethnicity, and Health Disparities

Ethnicity & Disease ◽

10.18865/ed.29.3.513 ◽

2019 ◽

Vol 29 (3) ◽

pp. 513-516 ◽

Cited By ~ 2

Author(s):

Megan C. Roberts ◽

Muin J. Khoury ◽

George A. Mensah

Keyword(s):

Precision Medicine ◽

Association Studies ◽

Clinical Care ◽

Genomic Research ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Polygenic Risk ◽

Adverse Health Outcomes ◽

Genome Wide ◽

Disease Risks

Polygenic risk scores (PRS) are an emerging precision medicine tool based on multiple gene variants that, taken alone, have weak associations with disease risks, but collectively may enhance disease predictive value in the population. However, the benefit of PRS may not be equal among non-European populations, as they are under-represented in genome-wide association studies (GWAS) that serve as the basis for PRS development. In this perspective, we discuss a path forward, which includes: 1) inclusion of underrepresented populations in PRS research; 2) global efforts to build capacity for genomic research; 3) equitable implementation of these tools in clinical practice; and 4) traditional public health approaches to reduce risk of adverse health outcomes as an important component to precision health. As precision medicine is implemented in clinical care, researchers must ensure that advances from PRS research will benefit all. Ethn Dis. 2019;29(3):513-516; doi:10.18865/ed.29.3.513.


Performance of polygenic risk scores for cancer prediction in an academic biobank.

Journal of Clinical Oncology ◽

10.1200/jco.2020.38.15_suppl.1528 ◽

2020 ◽

Vol 38 (15_suppl) ◽

pp. 1528-1528

Author(s):

Heena Desai ◽

Anh Le ◽

Ryan Hausler ◽

Shefali Verma ◽

Anurag Verma ◽

...

Keyword(s):

Risk Score ◽

Genetic Variants ◽

Association Studies ◽

Risk Scores ◽

Polygenic Risk Score ◽

Genome Wide Association Studies ◽

Polygenic Risk ◽

European Americans ◽

Genome Wide ◽

Common Genetic Variants

1528 Background: The discovery of rare genetic variants associated with cancer has a tremendous impact on reducing cancer morbidity and mortality when identified; however, rare variants are found in less than 5% of cancer patients. Genome wide association studies (GWAS) have identified hundreds of common genetic variants significantly associated with a number of cancers, but the clinical utility of individual variants or a polygenic risk score (PRS) derived from multiple variants is still unclear. Methods: We tested the ability of polygenic risk score (PRS) models developed from genome-wide significant variants to differentiate cases versus controls in the Penn Medicine Biobank. Cases for 15 different cancers and cancer-free controls were identified using electronic health record billing codes for 11,524 European American and 5,994 African American individuals from the Penn Medicine Biobank. Results: The discriminatory ability of the 15 PRS models to distinguish their respective cancer cases versus controls ranged from 0.68-0.79 in European Americans and 0.74-0.93 in African Americans. Seven of the 15 cancer PRS trended towards an association with their cancer at a p<0.05 (Table), and PRS for prostate, thyroid and melanoma were significantly associated with their cancers at a Bonferroni-corrected p<0.003 with OR 1.3-1.6 in European Americans. Conclusions: Our data demonstrate that common variants with significant associations from GWAS can distinguish cancer cases versus controls for some cancers in an unselected biobank population. Given the small effects, future studies are needed to determine how best to incorporate PRS with other risk factors in the precision prediction of cancer risk. [Table: see text]
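The Bonferroni-corrected threshold quoted above is consistent with dividing a conventional significance level by the number of PRS models tested. The sketch below shows that arithmetic; the abstract does not state how the threshold was derived, so the choice of alpha = 0.05 and 15 tests is an assumption, and the function name is illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumption: alpha of 0.05 spread over the 15 cancer PRS models
threshold = bonferroni_threshold(0.05, 15)
print(f"per-test threshold: {threshold:.4f}")
```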


Proportion of idiopathic pulmonary fibrosis risk explained by known genetic loci

10.1101/2020.08.14.20172528 ◽

2020 ◽

Author(s):

Olivia C Leavy ◽

Shwu-Fan Ma ◽

Philip L Molyneaux ◽

Toby M Maher ◽

Justin M Oldham ◽

...

Keyword(s):

Idiopathic Pulmonary Fibrosis ◽

Pulmonary Fibrosis ◽

Genetic Architecture ◽

Association Studies ◽

Genome Wide Association Studies ◽

Genetic Loci ◽

Risk Variants ◽

Genome Wide ◽

The Impact ◽

Insight Into

Genome-wide association studies have identified 14 genetic loci associated with susceptibility to idiopathic pulmonary fibrosis (IPF), a devastating lung disease with poor prognosis. Of these, the variant with the strongest association, rs35705950, is located in the promoter region of the MUC5B gene and has a risk allele (T) frequency of 30-35% in IPF cases. Here we present estimates of the proportion of disease liability explained by each of the 14 IPF risk variants as well as estimates of the proportion of cases that can be attributed to each variant. We estimate that rs35705950 explains 5.9-9.4% of disease liability, which is much lower than previously reported estimates. Of every 100,000 individuals with the rs35705950_GG genotype we estimate 30 will have IPF, whereas for every 100,000 individuals with the rs35705950_GT genotype 152 will have IPF. Quantifying the impact of genetic risk factors on disease liability improves our understanding of the underlying genetic architecture of IPF and provides insight into the impact of genetic factors in risk prediction modelling.
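The per-genotype figures above can be turned into a crude prevalence ratio, as sketched below. This is only simple arithmetic on the two quoted numbers, not the liability-scale estimation the authors describe; the variable and function names are illustrative.

```python
def prevalence(cases_per_100k):
    """Convert a count of cases per 100,000 individuals to a proportion."""
    return cases_per_100k / 100_000


risk_gg = prevalence(30)   # rs35705950 GG genotype, as quoted in the abstract
risk_gt = prevalence(152)  # rs35705950 GT genotype, as quoted in the abstract

# Ratio of the two prevalences: how many times more common IPF is in GT than GG
prevalence_ratio = risk_gt / risk_gg
print(f"GT vs GG prevalence ratio: {prevalence_ratio:.2f}")
```

Even with a roughly fivefold ratio, the absolute prevalence in GT carriers stays well under 1%, which fits the abstract's point that a strongly associated variant can still explain a modest share of disease liability.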
