How genetic disease risks can be misestimated across global populations

Mapping Intimacies ◽

10.1101/195768 ◽

2017 ◽

Author(s):

Michelle S Kim ◽

Kane P Patel ◽

Andrew K Teng ◽

Ali J Berens ◽

Joseph Lachance

Keyword(s):

Genetic Disease ◽

Risk Allele ◽

Association Studies ◽

Allele Frequencies ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Risk Alleles ◽

Disease Associations ◽

Disease Risks

Abstract Background: Accurate assessment of health disparities requires unbiased knowledge of genetic risks in different populations. Unfortunately, most genome-wide association studies use genotyping arrays and European samples. Here, we integrate whole genome sequence data from global populations, results from thousands of GWAS, and extensive computer simulations to identify how genetic disease risks can be misestimated. Results: In contrast to null expectations, we find that risk allele frequencies at known disease loci are significantly different for African populations compared to other continents. Strikingly, ancestral risk alleles are found at 9.51% higher frequency in Africa and derived risk alleles are found at 5.40% lower frequency in Africa. By simulating GWAS with different study populations, we find that non-African cohorts yield disease associations that have biased allele frequencies and that African cohorts yield disease associations that are relatively free of bias. We also find empirical evidence that genotyping arrays and SNP ascertainment bias contribute to continental differences in risk allele frequencies. Because of these causes, polygenic risk scores can be grossly misestimated for individuals of African descent. Importantly, continental differences in risk allele frequencies are only moderately reduced if GWAS use whole genome sequences and hundreds of thousands of cases and controls. Finally, comparisons between uncorrected and corrected genetic risk scores reveal the benefits of considering whether risk alleles are ancestral or derived. Conclusions: Our results imply that caution must be taken when extrapolating GWAS results from one population to predict disease risks in another population.


Perspective: The Clinical Use of Polygenic Risk Scores: Race, Ethnicity, and Health Disparities

Ethnicity & Disease ◽

10.18865/ed.29.3.513 ◽

2019 ◽

Vol 29 (3) ◽

pp. 513-516 ◽

Cited By ~ 2

Author(s):

Megan C. Roberts ◽

Muin J. Khoury ◽

George A. Mensah

Keyword(s):

Precision Medicine ◽

Association Studies ◽

Clinical Care ◽

Genomic Research ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Polygenic Risk ◽

Adverse Health Outcomes ◽

Genome Wide ◽

Disease Risks

Polygenic risk scores (PRS) are an emerging precision medicine tool based on multiple gene variants that, taken alone, have weak associations with disease risks, but collectively may enhance disease predictive value in the population. However, the benefit of PRS may not be equal among non-European populations, as they are under-represented in genome-wide association studies (GWAS) that serve as the basis for PRS development. In this perspective, we discuss a path forward, which includes: 1) inclusion of underrepresented populations in PRS research; 2) global efforts to build capacity for genomic research; 3) equitable implementation of these tools in clinical practice; and 4) traditional public health approaches to reduce risk of adverse health outcomes as an important component to precision health. As precision medicine is implemented in clinical care, researchers must ensure that advances from PRS research will benefit all. Ethn Dis. 2019;29(3):513-516; doi:10.18865/ed.29.3.513.


Evaluating the Potential of Younger Cases and Older Controls Cohorts to Improve Discovery Power in Genome-wide Association Studies of Late-onset Diseases

10.1101/693622 ◽

2019 ◽

Author(s):

Roman Teo Oliynyk

Keyword(s):

Cumulative Incidence ◽

Late Onset ◽

Association Studies ◽

Genome Wide Association ◽

Risk Scores ◽

Cerebral Stroke ◽

Genome Wide Association Studies ◽

Risk Alleles ◽

Genome Wide ◽

Artery Disease

Abstract For more than a decade, genome-wide association studies have been making steady progress in discovering the causal gene variants that contribute to late-onset human diseases. Polygenic late-onset diseases in an aging population display a risk allele frequency decrease at older ages, caused by individuals with higher polygenic risk scores becoming ill proportionately earlier and bringing about a change in the distribution of risk alleles between new cases and the as-yet-unaffected population. This phenomenon is most prominent for diseases characterized by high cumulative incidence and high heritability, examples of which include Alzheimer’s disease, coronary artery disease, cerebral stroke, and type 2 diabetes, while for late-onset diseases with relatively lower prevalence and heritability, exemplified by cancers, the effect is significantly lower. Computer simulations have determined that genome-wide association studies of the late-onset polygenic diseases showing high cumulative incidence together with high initial heritability will benefit from using the youngest possible age-matched cohorts. Moreover, rather than using age-matched cohorts, study cohorts combining the youngest possible cases with the oldest possible controls may significantly improve the discovery power of genome-wide association studies.


Evaluating the Potential of Younger Cases and Older Controls Cohorts to Improve Discovery Power in Genome-Wide Association Studies of Late-Onset Diseases

Journal of Personalized Medicine ◽

10.3390/jpm9030038 ◽

2019 ◽

Vol 9 (3) ◽

pp. 38 ◽

Cited By ~ 1

Author(s):

Roman Teo Oliynyk

Keyword(s):

Cumulative Incidence ◽

Late Onset ◽

Association Studies ◽

Genome Wide Association ◽

Risk Scores ◽

Cerebral Stroke ◽

Genome Wide Association Studies ◽

Risk Alleles ◽

Genome Wide ◽

Artery Disease

For more than a decade, genome-wide association studies have been making steady progress in discovering the causal gene variants that contribute to late-onset human diseases. Polygenic late-onset diseases in an aging population display a risk allele frequency decrease at older ages, caused by individuals with higher polygenic risk scores becoming ill proportionately earlier and bringing about a change in the distribution of risk alleles between new cases and the as-yet-unaffected population. This phenomenon is most prominent for diseases characterized by high cumulative incidence and high heritability, examples of which include Alzheimer’s disease, coronary artery disease, cerebral stroke, and type 2 diabetes, while for late-onset diseases with relatively lower prevalence and heritability, exemplified by cancers, the effect is significantly lower. In this research, computer simulations have demonstrated that genome-wide association studies of late-onset polygenic diseases showing high cumulative incidence together with high initial heritability will benefit from using the youngest possible age-matched cohorts. Moreover, rather than using age-matched cohorts, study cohorts combining the youngest possible cases with the oldest possible controls may significantly improve the discovery power of genome-wide association studies.


Schizophrenia Risk Alleles Often Affect The Expression of Many Genes and Each Gene May Have a Different Effect On The Risk; A Mediation Analysis

10.1101/2020.01.27.904680 ◽

2020 ◽

Author(s):

Xi Peng ◽

Joel S. Bader ◽

Dimitrios Avramopoulos

Keyword(s):

Gene Expression ◽

Mediation Analysis ◽

Target Genes ◽

Risk Allele ◽

Association Studies ◽

Cell Types ◽

Developmental Time ◽

Genome Wide Association Studies ◽

Expression Change ◽

Risk Alleles

Abstract Variants identified by genome-wide association studies (GWAS) are often expression quantitative trait loci (eQTLs), suggesting they are proxies or are themselves regulatory. Across many datasets, analyses show that variants often affect multiple genes. Because data are lacking on many tissue types, developmental time points and homogeneous cell types, the extent of this one-to-many relationship is underestimated. This raises questions on whether a disease eQTL target gene explains the genetic association or is a by-stander, and puts into question the direction of the effect of expression on the risk, since the many variant-regulated genes may have opposing effects, imperfectly balancing each other. We used two brain gene expression datasets (CommonMind and BrainSeq) for mediation analysis of schizophrenia-associated variants. We confirm that eQTL target genes often mediate risk, but the direction in which expression affects risk is often different from that in which the risk allele changes expression. Of 38 mediator genes significant in both datasets, 33 showed consistent mediation direction (Chi2 test P = 6 × 10^-6). One might expect that the expression would correlate with the risk allele in the same direction it correlates with disease. For 15 of these 33 (45%), however, the expression change associated with the risk allele was protective, suggesting the likely presence of other target genes with overriding effects. Our results identify specific risk-mediating genes and suggest caution in interpreting the biological consequences of targeted modifications of gene expression, as not all eQTL targets may be relevant to disease, while those that are might act in directions different from those expected.


Population history of the Sardinian people inferred from whole-genome sequencing

10.1101/092148 ◽

2016 ◽

Cited By ~ 5

Author(s):

Charleston W K Chiang ◽

Joseph H Marcus ◽

Carlo Sidore ◽

Hussein Al-Asadi ◽

Magdalena Zoledziewska ◽

...

Keyword(s):

Bronze Age ◽

Disease Risk ◽

Association Studies ◽

Demographic History ◽

Population History ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Risk Alleles ◽

Mediterranean Island ◽

History Of

Abstract The population of the Mediterranean island of Sardinia has made important contributions to genome-wide association studies of traits and diseases. The history of the Sardinian population has also been the focus of much research, and in recent ancient DNA (aDNA) studies, Sardinia has provided unique insight into the peopling of Europe and the spread of agriculture. In this study, we analyze whole-genome sequences of 3,514 Sardinians to address hypotheses regarding the founding of Sardinia and its relation to the peopling of Europe, including examining fine-scale substructure, population size history, and signals of admixture. We find the population of the mountainous Gennargentu region shows elevated genetic isolation with higher levels of ancestry associated with mainland Neolithic farmers and depleted ancestry associated with more recent Bronze Age Steppe migrations on the mainland. Notably, the Gennargentu region also has elevated levels of pre-Neolithic hunter-gatherer ancestry and increased affinity to Basque populations. Further, allele sharing with pre-Neolithic and Neolithic mainland populations is larger on the X chromosome compared to the autosome, providing evidence for a sex-biased demographic history in Sardinia. These results give new insight to the demography of ancestral Sardinians and help further the understanding of sharing of disease risk alleles between Sardinia and mainland populations.


Polygenic Link Between Blood Lipids And Amyotrophic Lateral Sclerosis

10.1101/138156 ◽

2017 ◽

Cited By ~ 1

Author(s):

Xu Chen ◽

Solmaz Yazdani ◽

Fredrik Piehl ◽

Patrik K.E. Magnusson ◽

Fang Fang

Keyword(s):

Amyotrophic Lateral Sclerosis ◽

Blood Lipids ◽

Association Studies ◽

Density Lipoprotein ◽

Lipoprotein Cholesterol ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Risk Alleles ◽

Lateral Sclerosis

Abstract Dyslipidemia is common among patients with amyotrophic lateral sclerosis (ALS). We aimed to test the association and causality between blood lipids and ALS, using polygenic analyses on the summary results of genome-wide association studies. Polygenic risk scores (PRS) based on low-density lipoprotein cholesterol (LDL-C) and total cholesterol (TC) risk alleles were significantly associated with a higher risk of ALS. Using single nucleotide polymorphisms (SNPs) specifically associated with LDL-C and TC as the instrumental variables, statistically significant causal effects of LDL-C and TC on ALS risk were identified in Mendelian randomization analysis. No significant association was noted between PRS based on triglycerides or high-density lipoprotein cholesterol risk alleles and ALS, and the PRS based on ALS risk alleles were not associated with any studied lipids. This study supports that high levels of LDL-C and TC are risk factors for ALS, and it also suggests a causal relationship of LDL-C and TC to ALS.


A custom genotyping array reveals population-level heterogeneity for the genetic risks of prostate cancer and other cancers in Africa

10.1101/702910 ◽

2019 ◽

Author(s):

Maxine Harlemon ◽

Olabode Ajayi ◽

Paidamoyo Kachambwa ◽

Michelle S. Kim ◽

Corinne N. Simonti ◽

...

Keyword(s):

Prostate Cancer ◽

Association Studies ◽

Genomic Medicine ◽

Sub Saharan Africa ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Genotyping Array ◽

Common Genetic Variants ◽

African Populations ◽

Disease Associations

Abstract Although prostate cancer is the leading cause of cancer mortality for African men, the vast majority of known disease associations have been detected in European study cohorts. Furthermore, most genome-wide association studies have used genotyping arrays that are hindered by SNP ascertainment bias. To overcome these disparities in genomic medicine, the Men of African Descent and Carcinoma of the Prostate (MADCaP) Network has developed a genotyping array that is optimized for African populations. The MADCaP Array contains more than 1.5 million markers and an imputation backbone that successfully tags over 94% of common genetic variants in African populations. This array also has a high density of markers in genomic regions associated with cancer susceptibility, including 8q24. We assessed the effectiveness of the MADCaP Array by genotyping 399 prostate cancer cases and 403 controls from seven urban study sites in sub-Saharan Africa. We find that samples from Ghana and Nigeria cluster together, while samples from Senegal and South Africa yield distinct ancestry clusters. Using the MADCaP array, we identified cancer-associated loci that have large allele frequency differences across African populations. Polygenic risk scores were also generated for each genome in the MADCaP pilot dataset, and we found that predicted risks of CaP are lower in Senegal and higher in Nigeria. Significance: We have developed an Africa-specific genotyping array which enables investigators to identify novel disease associations and to fine-map genetic loci that are associated with prostate and other cancers.


Validity of polygenic risk scores: are we measuring what we think we are?

Human Molecular Genetics ◽

10.1093/hmg/ddz205 ◽

2019 ◽

Vol 28 (R2) ◽

pp. R143-R150 ◽

Cited By ~ 5

Author(s):

A Cecile J W Janssens

Keyword(s):

Association Studies ◽

Theoretical Perspective ◽

Risk Scores ◽

Weighted Sums ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Polygenic Risk ◽

Risk Alleles ◽

Sum Scores ◽

Selection Of

Abstract Polygenic risk scores (PRSs) have become the standard for quantifying genetic liability in the prediction of disease risks. PRSs are generally constructed as weighted sum scores of risk alleles using effect sizes from genome-wide association studies as their weights. The construction of PRSs is being improved with more appropriate selection of independent single-nucleotide polymorphisms (SNPs) and optimized estimation of their weights but is rarely reflected upon from a theoretical perspective, focusing on the validity of the risk score. Borrowing from psychometrics, this paper discusses the validity of PRSs and introduces the three main types of validity that are considered in the evaluation of tests and measurements: construct, content, and criterion validity. This introduction is followed by a discussion of three topics that challenge the validity of PRS, namely, their claimed independence of clinical risk factors, the consequences of relaxing SNP inclusion thresholds and the selection of SNP weights. This discussion of the validity of PRS reminds us that we need to keep questioning if weighted sums of risk alleles are measuring what we think they are in the various scenarios in which PRSs are used and that we need to keep exploring alternative modeling strategies that might better reflect the underlying biological pathways.
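
The weighted sum score described in this abstract can be illustrated with a minimal sketch. It assumes each SNP is coded as a risk allele count (0, 1, or 2) and that each weight is a per-allele GWAS effect size such as a log odds ratio. The function names and the example values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of risk allele counts, one weight per SNP.

    dosages: risk allele count per SNP (0, 1, or 2); hypothetical coding.
    weights: per-allele effect sizes, e.g. log odds ratios from a GWAS.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def unweighted_risk_score(dosages: Sequence[float]) -> float:
    """Simple count of risk alleles (every weight equal to 1)."""
    return sum(dosages)


if __name__ == "__main__":
    # Hypothetical individual genotyped at four independent SNPs.
    example_dosages = [0, 1, 2, 1]
    # Hypothetical effect sizes for illustration only.
    example_weights = [0.10, 0.25, 0.05, 0.40]
    print(polygenic_risk_score(example_dosages, example_weights))
    print(unweighted_risk_score(example_dosages))
```

The sketch makes the abstract's validity questions concrete: the score depends entirely on which SNPs enter the sum and on how their weights were estimated.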


Genetic architecture of schizophrenia: a review of major advancements

Psychological Medicine ◽

10.1017/s0033291720005334 ◽

2021 ◽

pp. 1-10

Author(s):

Sophie E. Legge ◽

Marcos L. Santoro ◽

Sathish Periyasamy ◽

Adeniran Okewole ◽

Arsalan Arsalan ◽

...

Keyword(s):

Genetic Architecture ◽

Association Studies ◽

Copy Number Variants ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Loss Of Function ◽

Genetic Loci ◽

Significant Enrichment ◽

High Heritability ◽

Coding Variants

Abstract Schizophrenia is a severe psychiatric disorder with high heritability. Consortia efforts and technological advancements have led to a substantial increase in knowledge of the genetic architecture of schizophrenia over the past decade. In this article, we provide an overview of the current understanding of the genetics of schizophrenia, outline remaining challenges, and summarise future directions of research. World-wide collaborations have resulted in genome-wide association studies (GWAS) in over 56 000 schizophrenia cases and 78 000 controls, which identified 176 distinct genetic loci. The latest GWAS from the Psychiatric Genetics Consortium, available as a pre-print, indicates that 270 distinct common genetic loci have now been associated with schizophrenia. Polygenic risk scores can currently explain around 7.7% of the variance in schizophrenia case-control status. Rare variant studies have implicated eight rare copy-number variants, and an increased burden of loss-of-function variants in SETD1A, as increasing the risk of schizophrenia. The latest exome sequencing study, available as a pre-print, implicates a burden of rare coding variants in a further nine genes. Gene-set analyses have demonstrated significant enrichment of both common and rare genetic variants associated with schizophrenia in synaptic pathways. To address current challenges, future genetic studies of schizophrenia need increased sample sizes from more diverse populations. Continued expansion of international collaboration will likely identify new genetic regions, improve fine-mapping to identify causal variants, and increase our understanding of the biology and mechanisms of schizophrenia.


The Epidemiology and Genetics of Hyperuricemia and Gout across Major Racial Groups: A Literature Review and Population Genetics Secondary Database Analysis

Journal of Personalized Medicine ◽

10.3390/jpm11030231 ◽

2021 ◽

Vol 11 (3) ◽

pp. 231

Author(s):

Faven Butler ◽

Ali Alghubayshi ◽

Youssef Roman

Keyword(s):

Literature Review ◽

Risk Allele ◽

Statistical Significance ◽

Elevated Serum ◽

The United States ◽

Allele Frequencies ◽

Racial Groups ◽

1000 Genomes Project ◽

1000 Genomes ◽

Risk Alleles

Gout is an inflammatory condition caused by elevated serum urate (SU), a condition known as hyperuricemia (HU). Genetic variations, including single nucleotide polymorphisms (SNPs), can alter the function of urate transporters, leading to differential HU and gout prevalence across different populations. In the United States (U.S.), gout prevalence differentially affects certain racial groups. The objective of this proposed analysis is to compare the frequency of urate-related genetic risk alleles between Europeans (EUR) and the following major racial groups: Africans in Southwest U.S. (ASW), Han-Chinese (CHS), Japanese (JPT), and Mexican (MXL) from the 1000 Genomes Project. The Ensembl genome browser of the 1000 Genomes Project was used to conduct cross-population allele frequency comparisons of 11 SNPs across 11 genes, physiologically involved and significantly associated with SU levels and gout risk. Gene/SNP pairs included: ABCG2 (rs2231142), SLC2A9 (rs734553), SLC17A1 (rs1183201), SLC16A9 (rs1171614), GCKR (rs1260326), SLC22A11 (rs2078267), SLC22A12 (rs505802), INHBC (rs3741414), RREB1 (rs675209), PDZK1 (rs12129861), and NRXN2 (rs478607). Allele frequencies were compared to EUR using Chi-Square or Fisher’s Exact test, when appropriate. Bonferroni correction for multiple comparisons was used, with p < 0.0045 for statistical significance. Risk alleles were defined as the allele that is associated with baseline or higher HU and gout risks. The cumulative HU or gout risk allele index of the 11 SNPs was estimated for each population. The prevalence of HU and gout in U.S. and non-U.S. populations was evaluated using published epidemiological data and literature review. Compared with EUR, the SNP frequencies of 7/11 in ASW, 9/11 in MXL, 9/11 in JPT, and 11/11 in CHS were significantly different. HU or gout risk allele indices were 5, 6, 9, and 11 in ASW, MXL, CHS, and JPT, respectively. Out of the 11 SNPs, the percentage of risk alleles in CHS and JPT was 100%. Compared to non-U.S. populations, the prevalence of HU and gout appears to be higher in western world countries. Compared with EUR, CHS and JPT populations had the highest HU or gout risk allele frequencies, followed by MXL and ASW. These results suggest that individuals of Asian descent are at higher HU and gout risk, which may partly explain the nearly three-fold higher gout prevalence among Asians versus Caucasians in ambulatory care settings. Furthermore, gout remains a disease of developed countries with a marked global rise.
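
The significance threshold quoted in this abstract is consistent with a Bonferroni correction of a 0.05 level over the 11 SNPs (0.05 / 11 ≈ 0.0045). The following is a minimal sketch of that calculation together with a Pearson chi-square test on a 2×2 table of allele counts. The allele counts are hypothetical, the 0.05 starting level is an assumption, and the sketch omits the continuity correction and the Fisher's exact alternative that the abstract mentions for small counts.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    Table layout (allele counts):
                    risk allele   other allele
    population 1         a             b
    population 2         c             d
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column must have a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    threshold = bonferroni_threshold(0.05, 11)  # 11 SNPs tested
    # Hypothetical allele counts for one SNP in two populations.
    chi2, p = chi_square_2x2(30, 170, 60, 140)
    print(f"threshold={threshold:.4f} chi2={chi2:.3f} significant={p < threshold}")
```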
