Common genetic variants and health outcomes appear geographically structured in the UK Biobank sample: Old concerns returning and their implications

Mapping Intimacies ◽

10.1101/294876 ◽

2018 ◽

Cited By ~ 12

Author(s):

Simon Haworth ◽

Ruth Mitchell ◽

Laura Corbin ◽

Kaitlin H Wade ◽

Tom Dudding ◽

...

Keyword(s):

Genetic Variants ◽

Complex Traits ◽

Large Scale ◽

Genetic Data ◽

Population Based ◽

Risk Scores ◽

Phenotypic Variance ◽

UK Biobank ◽

Common Genetic Variants ◽

The UK

Introductory paragraph: The inclusion of genetic data in large studies has enabled the discovery of genetic contributions to complex traits and their use in applied analyses, including those using genetic risk scores (GRS) for the prediction of phenotypic variance. If genotypes show structure by location and coincident structure exists for the trait of interest, analyses can be biased. Having illustrated structure in an apparently homogeneous collection, we aimed to a) test for geographical stratification of genotypes in UK Biobank and b) assess whether stratification might induce bias in genetic association analysis. We found that single genetic variants are associated with birth location within UK Biobank and that geographic structure in genetic data could not be accounted for using routine adjustment for study centre and principal components (PCs) derived from genotype data. We found that GRS for complex traits do appear geographically structured and analysis using GRS can yield biased associations. We discuss the likely origins of these observations and potential implications for analysis within large-scale population-based genetic studies.


Phenome-wide Heritability Analysis of the UK Biobank

10.1101/070177 ◽

2016 ◽

Author(s):

Tian Ge ◽

Chia-Yen Chen ◽

Benjamin M. Neale ◽

Mert R. Sabuncu ◽

Jordan W. Smoller

Keyword(s):

Complex Traits ◽

Large Scale ◽

Prediction Models ◽

Population Characteristics ◽

UK Biobank ◽

Multiple Traits ◽

Common Genetic Variants ◽

Heritability Estimation ◽

Heritability Analysis ◽

The UK

Heritability estimation provides important information about the relative contribution of genetic and environmental factors to phenotypic variation, and provides an upper bound for the utility of genetic risk prediction models. Recent technological and statistical advances have enabled the estimation of additive heritability attributable to common genetic variants (SNP heritability) across a broad phenotypic spectrum. However, assessing the comparative heritability of multiple traits estimated in different cohorts may be misleading due to the population-specific nature of heritability. Here we report the SNP heritability for 551 complex traits derived from the large-scale, population-based UK Biobank, comprising both quantitative phenotypes and disease codes, and examine the moderating effect of three major demographic variables (age, sex and socioeconomic status) on the heritability estimates. Our study represents the first comprehensive phenome-wide heritability analysis in the UK Biobank, and underscores the importance of considering population characteristics in comparing and interpreting heritability.


Multifactorial disorders and polygenic risk scores: predicting common diseases and the possibility of adverse selection in life and protection insurance

Annals of Actuarial Science ◽

10.1017/s1748499520000226 ◽

2020 ◽

pp. 1-16 ◽

Cited By ~ 1

Author(s):

Jessye M. Maxwell ◽

Richard A. Russell ◽

Hei Man Wu ◽

Natasha Sharapova ◽

Peter Banthorpe ◽

...

Keyword(s):

Risk Factors ◽

Adverse Selection ◽

Genetic Variants ◽

Risk Information ◽

Risk Scores ◽

UK Biobank ◽

Polygenic Risk ◽

Common Genetic Variants ◽

Multifactorial Disorders ◽

The UK

Abstract: During the past decade, genetics research has allowed scientists and clinicians to explore the human genome in detail and reveal many thousands of common genetic variants associated with disease. Genetic risk scores, known as polygenic risk scores (PRSs), aggregate risk information from the most important genetic variants into a single score that describes an individual’s genetic predisposition to a given disease. This article reviews recent developments in the predictive utility of PRSs in relation to a person’s susceptibility to breast cancer and coronary artery disease. Prognostic models for these disorders are built using data from the UK Biobank, controlling for typical clinical and underwriting risk factors. Furthermore, we explore the possibility of adverse selection where genetic information about multifactorial disorders is available for insurance purchasers but not for underwriters. We demonstrate that prediction of multifactorial diseases, using PRSs, provides population risk information additional to that captured by normal underwriting risk factors. This research using the UK Biobank is in the public interest as it contributes to our understanding of predicting risk of disease in the population. Further research is imperative to understand how PRSs could cause adverse selection if consumers use this information to alter their insurance purchasing behaviour.
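The aggregation step described in this abstract is commonly implemented as a weighted sum of risk-allele counts. The following is a minimal sketch under that assumption, not the authors' model: the function name `polygenic_risk_score`, the dosages, and the weights are all hypothetical toy values.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies) for one person.

    `weights` are per-variant effect sizes (for example log odds ratios
    from a GWAS); the values used below are made up for illustration.
    """
    if len(dosages) != len(weights):
        raise ValueError("one weight is needed per variant")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical toy data: three individuals genotyped at four variants.
toy_weights = [0.10, 0.05, 0.20, 0.02]
toy_dosages = [
    [0, 1, 2, 0],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
]
scores = [polygenic_risk_score(row, toy_weights) for row in toy_dosages]
```

In practice such scores are usually standardised within the cohort before being entered into a prognostic model alongside clinical covariates.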


Analysis of genetic dominance in the UK Biobank

10.1101/2021.08.15.456387 ◽

2021 ◽

Author(s):

Duncan S Palmer ◽

Wei Zhou ◽

Liam Abbott ◽

Nik Baya ◽

Claire Churchhouse ◽

...

Keyword(s):

Complex Traits ◽

Multiple Testing ◽

Model Organisms ◽

Systematic Evaluation ◽

Hair Color ◽

Phenotypic Variance ◽

Additive Effects ◽

UK Biobank ◽

Genome Wide ◽

The UK

In classical statistical genetic theory, a dominance effect is defined as the deviation from a purely additive genetic effect for a biallelic variant. Dominance effects are well documented in model organisms. However, evidence in humans is limited to a handful of traits, particularly those with strong single locus effects such as hair color. We carried out the largest systematic evaluation of dominance effects on phenotypic variance in the UK Biobank. We curated and tested over 1,000 phenotypes for dominance effects through GWAS scans, identifying 175 loci at genome-wide significance correcting for multiple testing (P < 4.7 × 10⁻¹¹). Power to detect non-additive loci is much lower than power to detect additive effects for complex traits: based on the relative effect sizes at genome-wide significant additive loci, we estimate a factor of 20-30 increase in sample size will be necessary to capture clear evidence of dominance similar to that currently observed for additive effects. However, these localised dominance hits do not extend to a significant aggregate contribution to phenotypic variance genome-wide. By deriving a version of LD-score regression to detect dominance effects tagged by common variation genome-wide (minor allele frequency > 0.05), we found no strong evidence of a contribution to phenotypic variance when accounting for multiple testing. Across the 267 continuous and 793 binary traits the median contribution was 5.73 × 10⁻⁴, with unbiased point estimates ranging from -0.261 to 0.131. Finally, we introduce dominance fine-mapping to explore whether the more rapid decay of dominance LD can be leveraged to find causal variants. These results provide the most comprehensive assessment of dominance trait variation in humans to date.
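The significance threshold quoted in this abstract is numerically consistent with a Bonferroni correction of the conventional genome-wide level across the traits tested. The arithmetic can be sketched as follows; the abstract does not state how the threshold was derived, so the base level of 5 × 10⁻⁸ and the choice of divisor are assumptions.

```python
# Assumption: conventional genome-wide significance level for a single GWAS.
GENOME_WIDE_ALPHA = 5e-8

# Trait counts as reported in the abstract.
n_continuous = 267
n_binary = 793
n_traits = n_continuous + n_binary  # 1060 traits in total

# Bonferroni correction: divide the per-scan level by the number of scans.
threshold = GENOME_WIDE_ALPHA / n_traits
print(f"{threshold:.2e}")
```

Under these assumptions the result rounds to the reported 4.7 × 10⁻¹¹.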


Integration of rare large-effect expression variants improves polygenic risk prediction

10.1101/2020.12.02.20242990 ◽

2020 ◽

Author(s):

Craig Smail ◽

Nicole M. Ferraro ◽

Matthew G. Durrant ◽

Abhiram S. Rao ◽

Matthew Aguirre ◽

...

Keyword(s):

Genetic Variants ◽

Rare Variants ◽

Complex Trait ◽

Risk Scores ◽

Multiple Traits ◽

Polygenic Risk ◽

Common Genetic Variants ◽

Using Data ◽

The UK ◽

The Impact

Summary: Polygenic risk scores (PRS) aim to quantify the contribution of multiple genetic loci to an individual’s likelihood of a complex trait or disease. However, existing PRS estimate genetic liability using common genetic variants, excluding the impact of rare variants. We identified rare, large-effect variants in individuals with outlier gene expression from the GTEx project and then assessed their impact on PRS predictions in the UK Biobank (UKB). We observed large deviations from the PRS-predicted phenotypes for carriers of multiple outlier rare variants; for example, individuals classified as “low-risk” but in the top 1% of outlier rare variant burden had a 6-fold higher rate of severe obesity. We replicated these findings using data from the NHLBI Trans-Omics for Precision Medicine (TOPMed) biobank and the Million Veteran Program, and demonstrated that PRS across multiple traits will significantly benefit from the inclusion of rare genetic variants.


Significant Sparse Polygenic Risk Scores across 428 traits in UK Biobank

10.1101/2021.09.02.21262942 ◽

2021 ◽

Author(s):

Yosuke Tanigawa ◽

Junyang Qian ◽

Guhan Ram Venkataraman ◽

Johanne M. Justesen ◽

Ruilin Li ◽

...

Keyword(s):

Genetic Variants ◽

Quantitative Traits ◽

Predictive Performance ◽

Risk Scores ◽

Polygenic Risk Score ◽

UK Biobank ◽

Polygenic Risk ◽

Systematic Assessment ◽

Phenotype Data ◽

The UK

We present a systematic assessment of polygenic risk score (PRS) prediction across more than 1,600 traits using genetic and phenotype data in the UK Biobank. We report 428 sparse PRS models with significant (p < 2.5e-5) incremental predictive performance when compared against the covariate-only model that considers age, sex, and the genotype principal components. We report a significant correlation between the number of genetic variants selected in the sparse PRS model and the incremental predictive performance in quantitative traits (Spearman's ρ = 0.54, p = 1.4e-15), but not in binary traits (ρ = 0.059, p = 0.35). The sparse PRS model trained on European individuals showed limited transferability when evaluated on non-European individuals in the UK Biobank. We provide the PRS model weights on the Global Biobank Engine (https://biobankengine.stanford.edu/prs).


Within-family studies for Mendelian randomization: avoiding dynastic, assortative mating, and population stratification biases

10.1101/602516 ◽

2019 ◽

Cited By ~ 30

Author(s):

Ben Brumpton ◽

Eleanor Sanderson ◽

Fernando Pires Hartwig ◽

Sean Harrison ◽

Gunnhild Åberge Vie ◽

...

Keyword(s):

Population Stratification ◽

Mendelian Randomization ◽

Genetic Data ◽

Population Based ◽

Health Study ◽

UK Biobank ◽

Family Effects ◽

Family Based ◽

Using Data ◽

The UK

Abstract: Mendelian randomization (MR) is a widely used method for causal inference using genetic data. Mendelian randomization studies of unrelated individuals may be susceptible to bias from family structure, for example, through dynastic effects which occur when parental genotypes directly affect offspring phenotypes. Here we describe methods for within-family Mendelian randomization and through simulations show that family-based methods can overcome bias due to dynastic effects. We illustrate these issues empirically using data from 61,008 siblings from the UK Biobank and Nord-Trøndelag Health Study. Both within-family and population-based Mendelian randomization analyses reproduced established effects of lower BMI reducing risk of diabetes and high blood pressure. However, while MR estimates from population-based samples of unrelated individuals suggested that taller height and lower BMI increase educational attainment, these effects largely disappeared in within-family MR analyses. We found differences between population-based and within-family based estimates, indicating the importance of controlling for family effects and population structure in Mendelian randomization studies.


Polygenic prediction of breast cancer: comparison of genetic predictors and implications for screening

10.1101/448597 ◽

2018 ◽

Author(s):

Kristi Läll ◽

Maarja Lepamets ◽

Marili Palover ◽

Tõnu Esko ◽

Andres Metspalu ◽

...

Keyword(s):

Breast Cancer ◽

Genetic Risk ◽

Odds Ratio ◽

Population Based ◽

Risk Scores ◽

Full Potential ◽

Genome Wide Association Studies ◽

UK Biobank ◽

Genetic Risk Scores ◽

The UK

Abstract. Background: Published genetic risk scores for breast cancer (BC) so far have been based on a relatively small number of markers and are not necessarily using the full potential of large-scale Genome-Wide Association Studies. This study aims to identify an efficient polygenic predictor for BC based on best available evidence and to assess its potential for personalized risk prediction and screening strategies. Methods: Four different genetic risk scores (two already published and two newly developed) and their combinations (metaGRS) are compared in the subsets of two population-based biobank cohorts: the UK Biobank (UKBB, 3157 BC cases, 43,827 controls) and Estonian Biobank (EstBB, 317 prevalent and 308 incident BC cases in 32,557 women). In addition, correlations between different genetic risk scores and their associations with BC risk factors are studied in both cohorts. Results: The metaGRS that combines two genetic risk scores (metaGRS2 - based on 75 and 898 Single Nucleotide Polymorphisms, respectively) has the strongest association with prevalent BC status in both cohorts. One standard deviation difference in the metaGRS2 corresponds to an Odds Ratio = 1.6 (95% CI 1.54 to 1.66, p = 9.7 × 10⁻¹³⁵) in the UK Biobank and accounting for family history marginally attenuates the effect (Odds Ratio = 1.58, 95% CI 1.53 to 1.64, p = 9.1 × 10⁻¹²⁹). In the EstBB cohort, the hazard ratio of incident BC for the women in the top 5% of the metaGRS2 compared to women in the lowest 50% is 4.2 (95% CI 2.8 to 6.2, p = 8.1 × 10⁻¹³). The different GRSs are only moderately correlated with each other and are associated with different known predictors of BC. The classification of genetic risk for the same individual may vary considerably depending on the chosen GRS. Conclusions: We have shown that metaGRS2, which combines the effects of more than 900 SNPs, provides the best predictive ability for breast cancer in two different population-based cohorts. The strength of the effect of metaGRS2 indicates that the GRS could potentially be used to develop more efficient strategies for breast cancer screening for genotyped women.


Genetically determined risk of keratinocyte carcinoma and risk of other cancers

International Journal of Epidemiology ◽

10.1093/ije/dyaa265 ◽

2020 ◽

Author(s):

Jean Claude Dusingize ◽

Catherine M Olsen ◽

Jiyuan An ◽

Nirmala Pandeya ◽

Upekha E Liyanage ◽

...

Keyword(s):

Genetic Variants ◽

Genome Wide Association Study ◽

Meta Analysis ◽

Epidemiological Studies ◽

Risk Scores ◽

Cancer Site ◽

UK Biobank ◽

Individual Level ◽

Increased Risk ◽

The UK

Abstract. Background: Epidemiological studies have consistently documented an increased risk of developing primary non-cutaneous malignancies among people with a history of keratinocyte carcinoma (KC). However, the mechanisms underlying this association remain unclear. We conducted two separate analyses to test whether genetically predicted KC is related to the risk of developing cancers at other sites. Methods: In the first approach (one-sample), we calculated the polygenic risk scores (PRS) for KC using individual-level data in the UK Biobank (n = 394 306) and QSkin cohort (n = 16 896). The association between the KC PRS and each cancer site was assessed using logistic regression. In the secondary (two-sample) approach, we used genome-wide association study (GWAS) summary statistics identified from the most recent GWAS meta-analysis of KC and obtained GWAS data for each cancer site from the UK Biobank participants only. We used inverse-variance-weighted methods to estimate risks across all genetic variants. Results: Using the one-sample approach, we found that the risks of cancer at other sites increased monotonically with KC PRS quartiles, with an odds ratio (OR) of 1.16, 95% confidence interval (CI): 1.13–1.19 for those in KC PRS quartile 4 compared with those in quartile 1. In the two-sample approach, the pooled risk of developing other cancers was statistically significantly elevated, with an OR of 1.05, 95% CI: 1.03–1.07 per doubling in the odds of KC. We observed similar trends of increasing cancer risk with increasing KC PRS in the QSkin cohort. Conclusion: Two different genetic approaches provide compelling evidence that an instrumental variable for KC constructed from genetic variants predicts the risk of cancers at other sites.
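For readers unfamiliar with the inverse-variance-weighted (IVW) method named in the two-sample approach, the standard fixed-effect estimator can be sketched as below. This is an illustration of the general technique, not the authors' pipeline; the function name `ivw_estimate` and the summary statistics are hypothetical toy values.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted causal estimate.

    Each variant contributes the ratio beta_outcome / beta_exposure,
    weighted by beta_exposure**2 / se_outcome**2. Returns the pooled
    estimate and its standard error.
    """
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical summary statistics for three variants (log-odds scale).
toy_bx = [0.1, 0.2, 0.3]      # variant effects on the exposure
toy_by = [0.05, 0.10, 0.15]   # variant effects on the outcome
toy_se = [0.01, 0.01, 0.01]   # standard errors of the outcome effects
estimate, std_error = ivw_estimate(toy_bx, toy_by, toy_se)
```

Because every toy variant has an outcome effect exactly half its exposure effect, the pooled estimate here is 0.5 on the log-odds scale; exponentiating such an estimate gives an odds ratio per unit increase in the log odds of the exposure.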


Polygenic modulation of lipoprotein(a)-associated cardiovascular risk

10.1101/2020.02.22.20026757 ◽

2020 ◽

Author(s):

Mark Trinder ◽

Liam R. Brunham

Keyword(s):

Myocardial Infarction ◽

Cardiovascular Risk ◽

Genetic Variants ◽

UK Biobank ◽

Lower Quintile ◽

Lipoprotein A ◽

Common Genetic Variants ◽

Artery Disease ◽

The UK ◽

Genomic Risk

Abstract. Aims: Elevated levels of lipoprotein(a) are one of the strongest inherited risk factors for coronary artery disease (CAD). However, there is variability in cardiovascular risk among individuals with elevated lipoprotein(a). The sources of this variability are incompletely understood. We assessed the effects of a genomic risk score (GRS) for CAD on risk of myocardial infarction among individuals with elevated lipoprotein(a). Methods: We calculated CAD GRSs for 408,896 individuals of British white ancestry from the UK Biobank using 6.27 million common genetic variants. Lipoprotein(a) levels were measured in 310,020 individuals. The prevalence and risk of myocardial infarction versus CAD GRS percentiles were compared for individuals with and without elevated lipoprotein(a) defined as ≥120 or 168 nmol/L (≈50 or 70 mg/dL, respectively). Results: Individuals with elevated lipoprotein(a) displayed significantly greater CAD GRSs than individuals without elevated lipoprotein(a), which was largely dependent on the influence of genetic variants within or near the LPA gene. Continuous levels of CAD GRS percentile were significantly associated with risk of myocardial infarction for individuals with elevated lipoprotein(a). Notably, the risk of myocardial infarction for males with elevated lipoprotein(a) levels, but a CAD GRS percentile in the lower quintile (<20th percentile), was less than the overall risk of myocardial infarction for males with non-elevated lipoprotein(a) levels (hazard ratio [95% CI]: 0.79 [0.64-0.97], p=0.02). Similar results were observed for females. Conclusion: These data suggest that CAD genomic scores influence cardiovascular risk among individuals with elevated lipoprotein(a) and may aid in identifying candidates for preventive therapies.


The molecular genetics of hand preference revisited

10.1101/447177 ◽

2018 ◽

Cited By ~ 1

Author(s):

Carolien G.F. de Kovel ◽

Clyde Francks

Keyword(s):

Complex Traits ◽

Large Population ◽

Enrichment Analysis ◽

Hand Preference ◽

Population Based ◽

Gene Set Enrichment Analysis ◽

UK Biobank ◽

Genome Wide ◽

A Genome ◽

The UK

Abstract: Hand preference is a prominent behavioural trait linked to human brain asymmetry. A handful of genetic variants have been reported to associate with hand preference or quantitative measures related to it. Most of these reports were on the basis of limited sample sizes, by current standards for genetic analysis of complex traits. Here we performed a genome-wide association analysis of hand preference in the large, population-based UK Biobank cohort (N=331,037). We used gene-set enrichment analysis to investigate whether genes involved in visceral asymmetry are particularly relevant to hand preference, following one previous report. We found no evidence implicating any specific candidate variants previously reported. We also found no evidence that genes involved in visceral laterality play a role in hand preference. It remains possible that some of the previously reported genes or pathways are relevant to hand preference as assessed in other ways, or else are relevant within specific disorder populations. However, some or all of the earlier findings are likely to be false positives, and none of them appear relevant to hand preference as defined categorically in the general population. Within the UK Biobank itself, a significant association implicates the gene MAP2 in handedness.
