A Novel Approach for the Simultaneous Analysis of Common and Rare Variants in Complex Traits

Bioinformatics and Biology Insights ◽

10.4137/bbi.s8852 ◽

2012 ◽

Vol 6 ◽

pp. BBI.S8852 ◽

Cited By ~ 4

Author(s):

Ao Yuan ◽

Guanjie Chen ◽

Yanxun Zhou ◽

Amy Bentley ◽

Charles Rotimi

Keyword(s):

Complex Traits ◽

Rare Variants ◽

Sequence Data ◽

Association Studies ◽

Simultaneous Analysis ◽

Genome Wide Association Studies ◽

Common Variants ◽

Disease Etiology ◽

Novel Approach ◽

Common Genetic Variants

Genome-wide association studies (GWAS) have been successful in detecting common genetic variants underlying common traits and diseases. Despite these successes, the percentage of trait variance explained by GWAS signals has been modest at best, a gap often called the “missing heritability”. The predictive power of common variants identified by GWAS has also not been encouraging. Given these observations, the fact that the effects of rare variants are often, by design, unaccounted for by GWAS, and the availability of sequence data, there is a growing need for robust analytic approaches to evaluate the contribution of rare variants to common complex diseases. Here we propose a new method that enables the simultaneous analysis of rare and common variants for association with disease. We refer to this method as SCARVA (simultaneous common and rare variants analysis). SCARVA is simple to use and efficient. We used SCARVA to analyze two independent real datasets to identify rare and common variants underlying variation in obesity among participants in the Africa America Diabetes Mellitus (AADM) study and in plasma triglyceride levels in the Dallas Heart Study (DHS). We found common and rare variants associated with both traits, consistent with published results.

Abstract 367: Extreme High-Density Lipoprotein Cholesterol Genetics: An Assortment of Large and Small Polygenic Effects

Arteriosclerosis Thrombosis and Vascular Biology ◽

10.1161/atvb.37.suppl_1.367 ◽

2017 ◽

Vol 37 (suppl_1) ◽

Author(s):

Jacqueline S Dron ◽

Jian Wang ◽

Cécile Low-Kam ◽

Sumeet A Khetarpal ◽

John F Robinson ◽

...

Keyword(s):

Large Scale ◽

Genetic Basis ◽

Rare Variants ◽

Association Studies ◽

Density Lipoprotein ◽

Copy Number Variations ◽

Genome Wide Association Studies ◽

Common Variants ◽

Targeted Next Generation Sequencing ◽

Common Genetic Variants

Rationale: Although HDL-C levels are known to have a complex genetic basis, most studies have focused solely on identifying rare variants with large phenotypic effects to explain extreme HDL-C phenotypes. Objective: Here we concurrently evaluate the contribution of both rare and common genetic variants, as well as large-scale copy number variations (CNVs), towards extreme HDL-C concentrations. Methods: In clinically ascertained patients with low (N = 136) and high (N = 119) HDL-C profiles, we applied our targeted next-generation sequencing panel (LipidSeq™) to sequence genes involved in HDL metabolism, which were subsequently screened for rare variants and CNVs. We also developed a novel polygenic trait score (PTS) to assess patients’ genetic accumulations of common variants that have been shown by genome-wide association studies to associate primarily with HDL-C levels. Two additional cohorts of patients with extremely low and high HDL-C (total N = 1,746 and N = 1,139, respectively) were used for PTS validation. Results: In the discovery cohort, 32.4% of low HDL-C patients carried rare variants or CNVs in primary (ABCA1, APOA1, LCAT) and secondary (LPL, LMF1, GPD1, APOE) HDL-C–altering genes. Additionally, 13.4% of high HDL-C patients carried rare variants or CNVs in primary (SCARB1, CETP, LIPC, LIPG) and secondary (APOC3, ANGPTL4) HDL-C–altering genes. For polygenic effects, patients with abnormal HDL-C profiles but without rare variants or CNVs were ~2-fold more likely to have an extreme PTS compared to normolipidemic individuals, indicating an increased frequency of common HDL-C–associated variants in these patients. Similar results in the two validation cohorts demonstrate that this novel PTS successfully quantifies common variant accumulation, further characterizing the polygenic basis for extreme HDL-C phenotypes.
Conclusions: Patients with extreme HDL-C levels have various combinations of rare variants, common variants, or CNVs driving their phenotypes. Fully characterizing the genetic basis of HDL-C levels must extend to encompass multiple types of genetic determinants, not just rare variants, to further our understanding of this complex, controversial quantitative trait.
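
The abstract does not say how the PTS is computed. As a rough illustration only, a polygenic score is commonly formed as a weighted sum of allele dosages; the sketch below assumes that form, and the function name, the dosage coding (0, 1 or 2 copies of the trait-raising allele) and the per-variant weights are all hypothetical rather than taken from the study.

```python
def polygenic_trait_score(dosages, weights):
    """Weighted sum of allele dosages for one individual.

    dosages: copies (0, 1 or 2) of the trait-raising allele at each variant.
    weights: per-variant weights, e.g. GWAS effect sizes (an assumption here;
             the abstract does not state the weighting scheme).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for d, w in zip(dosages, weights))


def unweighted_score(dosages):
    """Simplest variant of the idea: count trait-raising alleles."""
    return sum(dosages)
```

For example, `polygenic_trait_score([2, 1, 0], [0.5, 1.0, 2.0])` sums 2 × 0.5 + 1 × 1.0 + 0 × 2.0. Whether a score counts as "extreme" would then depend on cutoffs from a reference distribution, which the abstract does not give.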

The contribution of rare whole genome sequencing variants to plasma protein levels and to the missing heritability

10.21203/rs.3.rs-625433/v1 ◽

2021 ◽

Author(s):

Marcin Kierczak ◽

Nima Rafati ◽

Julia Höglund ◽

Hadrien Gourle ◽

Daniel Schmitz ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Genetic Variants ◽

Complex Traits ◽

Rare Variants ◽

Association Studies ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Missing Heritability ◽

Common Genetic Variants

Despite the success in identifying effects of common genetic variants using genome-wide association studies (GWAS), much of the genetic contribution to complex traits remains unexplained. Here, we analysed high-coverage whole-genome sequencing (WGS) data to evaluate the contribution of rare genetic variants to 414 plasma proteins. The frequency distribution of genetic variants was skewed towards the rare spectrum, and damaging variants were more often rare. However, only 2.24% of the heritability was estimated to be explained by rare variants. A gene-based approach, developed to also capture the effect of rare variants, identified associations for 249 of the proteins, 25% more than a GWAS. Of those, 24 associations were driven by rare variants, clearly highlighting the capacity of aggregated tests and WGS data. We conclude that, while many rare variants have considerable phenotypic effects, their contribution to the missing heritability is limited by their low frequencies.

Targeted sequencing of Parkinson’s disease loci genes highlights SYT11, FGF20 and other associations

Brain ◽

10.1093/brain/awaa401 ◽

2020 ◽

Author(s):

Uladzislau Rudakou ◽

Eric Yu ◽

Lynne Krohn ◽

Jennifer A Ruskey ◽

Farnaz Asayesh ◽

...

Keyword(s):

Parkinson's Disease ◽

Linkage Disequilibrium ◽

Rare Variants ◽

Association Studies ◽

Strong Linkage Disequilibrium ◽

Genome Wide Association Studies ◽

Common Variants ◽

Common Genetic Variants

Genome-wide association studies (GWAS) have identified numerous loci associated with Parkinson’s disease. The specific genes and variants that drive the associations within the vast majority of these loci are unknown. We aimed to perform a comprehensive analysis of selected genes to determine the potential role of rare and common genetic variants within these loci. We fully sequenced 32 genes from 25 loci previously associated with Parkinson’s disease in 2657 patients and 3647 controls from three cohorts. Capture was done using molecular inversion probes targeting the exons, exon-intron boundaries and untranslated regions (UTRs) of the genes of interest, followed by sequencing. Quality control was performed to include only high-quality variants. We examined the role of rare variants (minor allele frequency < 0.01) using optimized sequence kernel association tests. The association of common variants was estimated using regression models adjusted for age, sex and ethnicity as required in each cohort, followed by a meta-analysis. After Bonferroni correction, we identified a burden of rare variants in SYT11, FGF20 and GCH1 associated with Parkinson’s disease. Nominal associations were identified in 21 additional genes. Previous reports suggested that the SYT11 GWAS association is driven by variants in the nearby GBA gene. However, the association of SYT11 was mainly driven by a rare 3′ UTR variant (rs945006601) and was independent of GBA variants (P = 5.23 × 10⁻⁵ after exclusion of all GBA variant carriers). The association of FGF20 was driven by a rare 5′ UTR variant (rs1034608171) located in the promoter region. The previously reported association of GCH1 with Parkinson’s disease is driven by rare non-synonymous variants, some of which are known to cause dopamine-responsive dystonia. We also identified two LRRK2 variants, p.Arg793Met and p.Gln1353Lys, in 10 and eight controls, respectively, but not in patients.
We identified common variants associated with Parkinson’s disease in MAPT, TMEM175, BST1, SNCA and GPNMB, which are all in strong linkage disequilibrium with known GWAS hits in their respective loci. A common coding PM20D1 variant, p.Ile149Val, was nominally associated with reduced risk of Parkinson’s disease (odds ratio 0.73, 95% confidence interval 0.60–0.89, P = 1.161 × 10⁻³). This variant is not in linkage disequilibrium with the top GWAS hits within this locus and may represent a novel association. These results further demonstrate the importance of fine mapping of GWAS loci, and suggest that SYT11, FGF20, and potentially PM20D1, BST1 and GPNMB should be considered for future studies as possible Parkinson’s disease-related genes.
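
The rare/common split described above (minor allele frequency < 0.01) can be sketched as follows. This is a minimal illustration, not the study's pipeline: it assumes diploid genotypes coded as alternate-allele dosages (0, 1 or 2), and the function names and the dictionary layout are hypothetical.

```python
def minor_allele_frequency(dosages):
    """MAF for one variant from per-individual dosages (0, 1 or 2)."""
    allele_freq = sum(dosages) / (2 * len(dosages))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(allele_freq, 1 - allele_freq)


def split_by_maf(variants, threshold=0.01):
    """Partition variant names into (rare, common) using MAF < threshold.

    variants: dict mapping a variant name to its list of dosages.
    Note: monomorphic variants have MAF 0 and land in the rare list;
    a real pipeline would usually drop them beforehand.
    """
    rare, common = [], []
    for name, dosages in variants.items():
        if minor_allele_frequency(dosages) < threshold:
            rare.append(name)
        else:
            common.append(name)
    return rare, common
```

In this scheme the rare list would go to a gene-level aggregate test such as SKAT-O, and the common list to per-variant regression, mirroring the two analysis tracks in the abstract.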

Simultaneous Analysis of Common and Rare Variants in Complex Traits: Application to SNPs (SCARVAsnp)

Bioinformatics and Biology Insights ◽

10.4137/bbi.s9966 ◽

2012 ◽

Vol 6 ◽

pp. BBI.S9966 ◽

Cited By ~ 2

Author(s):

Guanjie Chen ◽

Ao Yuan ◽

Yanxun Zhou ◽

Amy R. Bentley ◽

Jie Zhou ◽

...

Keyword(s):

Complex Traits ◽

Statistical Power ◽

Large Scale ◽

Rare Variants ◽

Real Data ◽

Simultaneous Analysis ◽

Common Variants ◽

Disease Etiology ◽

Modified Method ◽

Log Likelihood

Advances in technology and reduced costs are facilitating large-scale sequencing of genes and exomes as well as entire genomes. Recently, we described an approach based on haplotypes called SCARVA¹ that enables the simultaneous analysis of the association between rare and common variants in disease etiology. Here, we describe an extension of SCARVA that evaluates individual markers instead of haplotypes. This modified method (SCARVAsnp) is implemented in four stages. First, all common variants in a pre-specified region (e.g., a gene) are evaluated individually. Second, a union procedure is used to combine all rare variants (RVs) in the index region, and the ratio of the log likelihood with one RV excluded to the log likelihood of a model with all the collapsed RVs is calculated. On the basis of previously reported simulation studies,¹ a likelihood ratio ≥ 1.3 is considered statistically significant. Third, the direction of the association of the removed RV is determined by evaluating the change in λ values with the inclusion and exclusion of that RV. Lastly, significant common and rare variants, along with covariates, are included in a final regression model to evaluate the association between the trait and variants in that region. We apply the method to simulated and real data sets to show that it is simple to use, computationally efficient, and able to accurately identify both common and rare risk variants. This method overcomes several limitations of existing methods. For example, SCARVAsnp limits loss of statistical power by not including variants that are not associated with the trait of interest in the final model. Also, SCARVAsnp takes into consideration the direction of association by effectively modelling positively and negatively associated variants.
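
The second stage (union collapsing with one RV left out at a time) can be sketched roughly as below. This is an illustration under stated assumptions, not the published implementation: the abstract does not give the trait model, so a Gaussian linear model with a single collapsed carrier indicator is assumed, covariates are omitted, and all function names are hypothetical. Only the ratio definition and the 1.3 threshold come from the abstract.

```python
import math


def collapse_union(genotypes, exclude=None):
    """Union-collapse rare variants into one carrier indicator per individual.

    genotypes: rows are individuals, columns are rare variants (minor-allele counts).
    exclude:   optional column index of the RV to leave out.
    """
    return [
        1 if any(g > 0 for j, g in enumerate(row) if j != exclude) else 0
        for row in genotypes
    ]


def gaussian_loglik(x, y):
    """Maximised log likelihood of y = a + b*x + noise (assumed Gaussian model).

    Requires a non-zero residual sum of squares.
    """
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0  # constant x: intercept-only fit
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)


def leave_one_out_ratios(genotypes, y):
    """Ratio of the log likelihood with RV j excluded to that with all RVs collapsed."""
    ll_all = gaussian_loglik(collapse_union(genotypes), y)
    n_variants = len(genotypes[0])
    return [
        gaussian_loglik(collapse_union(genotypes, exclude=j), y) / ll_all
        for j in range(n_variants)
    ]
```

Each returned ratio would then be compared against the 1.3 threshold quoted in the abstract. The later stages (direction via the change in λ, and the final regression with covariates) are not shown because the abstract does not describe them in enough detail to sketch.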

Targeted sequencing of Parkinson's disease loci genes highlights SYT11, FGF20 and other associations

10.1101/2020.05.29.20116111 ◽

2020 ◽

Author(s):

Uladzislau Rudakou ◽

Eric Yu ◽

Lynne M Krohn ◽

Jennifer A Ruskey ◽

Farnaz Asayesh ◽

...

Keyword(s):

Parkinson's Disease ◽

Rare Variants ◽

Association Studies ◽

Meta Analysis ◽

Strong Linkage Disequilibrium ◽

Genome Wide Association Studies ◽

Common Variants ◽

Common Genetic Variants

Genome-wide association studies (GWAS) have identified numerous loci associated with Parkinson's disease. The specific genes and variants that drive the associations within the vast majority of these loci are unknown. We aimed to perform a comprehensive analysis of selected genes to determine the potential role of rare and common genetic variants within these loci. We fully sequenced 32 genes from 25 loci previously associated with Parkinson's disease in 2,657 patients and 3,647 controls from three cohorts. Capture was done using molecular inversion probes targeting the exons, exon-intron boundaries and untranslated regions (UTRs) of the genes of interest, followed by sequencing. Quality control was performed to include only high-quality variants. We examined the role of rare variants (minor allele frequency < 0.01) using optimized sequence kernel association tests (SKAT-O). The association of common variants was estimated using regression models adjusted for age, sex and ethnicity as required in each cohort, followed by a meta-analysis. After Bonferroni correction, we identified a burden of rare variants in SYT11, FGF20 and GCH1 associated with Parkinson's disease. Nominal associations were identified in 21 additional genes. Previous reports suggested that the SYT11 GWAS association is driven by variants in the nearby GBA gene. However, the association of SYT11 was mainly driven by a rare 3' UTR variant (rs945006601) and was independent of GBA variants (p=5.23E-05 after exclusion of all GBA variant carriers). The association of FGF20 was driven by a rare 5' UTR variant (rs1034608171) located in the promoter region. The previously reported association of GCH1 with Parkinson's disease is driven by rare nonsynonymous variants, some of which are known to cause dopamine-responsive dystonia. We also identified two LRRK2 variants, p.Arg793Met and p.Gln1353Lys, in ten and eight controls, respectively, but not in patients.
We identified common variants associated with Parkinson's disease in MAPT, TMEM175, BST1, SNCA and GPNMB, which are all in strong linkage disequilibrium (LD) with known GWAS hits in their respective loci. A common coding PM20D1 variant, p.Ile149Val, was nominally associated with reduced risk of Parkinson's disease (OR 0.73, 95% CI 0.60-0.89, p=1.161E-03). This variant is not in LD with the top GWAS hits within this locus and may represent a novel association. These results further demonstrate the importance of fine mapping of GWAS loci, and suggest that SYT11, FGF20, and potentially PM20D1, BST1 and GPNMB should be considered for future studies as possible Parkinson's disease-related genes.

Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program

Nature ◽

10.1038/s41586-021-03205-y ◽

2021 ◽

Vol 590 (7845) ◽

pp. 290-299 ◽

Cited By ~ 22

Author(s):

Daniel Taliun ◽

Daniel N. Harris ◽

Michael D. Kessler ◽

Jedidiah Carlson ◽

...

Keyword(s):

Rare Variants ◽

Sequence Data ◽

Association Studies ◽

Genotype Imputation ◽

Genome Wide Association Studies ◽

Phenotypic Data ◽

Treatment And Prevention ◽

Genome Wide ◽

Diverse Backgrounds ◽

Unmapped Reads

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes)¹. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.

GPA-MDS: A Visualization Approach to Investigate Genetic Architecture among Phenotypes Using GWAS Results

International Journal of Genomics ◽

10.1155/2016/6589843 ◽

2016 ◽

Vol 2016 ◽

pp. 1-6 ◽

Cited By ~ 2

Author(s):

Wei Wei ◽

Paula S. Ramos ◽

Kelly J. Hunt ◽

Bethany J. Wolf ◽

Gary Hardiman ◽

...

Keyword(s):

Complex Traits ◽

Genetic Architecture ◽

Association Studies ◽

Genetic Relationships ◽

Joint Analysis ◽

Genome Wide Association Studies ◽

Risk Variants ◽

Novel Approach ◽

Medical Benefits ◽

Rigorous Framework

Genome-wide association studies (GWAS) have identified tens of thousands of genetic variants associated with hundreds of phenotypes and diseases, which have provided clinical and medical benefits to patients with novel biomarkers and therapeutic targets. Recently, there has been accumulating evidence suggesting that different complex traits share a common risk basis, namely, pleiotropy. Previously, a statistical method, namely, GPA (Genetic analysis incorporating Pleiotropy and Annotation), was developed to improve identification of risk variants and to investigate pleiotropic structure through a joint analysis of multiple GWAS datasets. While GPA provides a statistically rigorous framework to evaluate pleiotropy between phenotypes, it is still not trivial to investigate genetic relationships among a large number of phenotypes using the GPA framework. In order to address this challenge, in this paper, we propose a novel approach, GPA-MDS, to visualize genetic relationships among phenotypes using the GPA algorithm and multidimensional scaling (MDS). This tool will help researchers to investigate common etiology among diseases, which can potentially lead to development of common treatments across diseases. We evaluate the proposed GPA-MDS framework using a simulation study and apply it to jointly analyze GWAS datasets examining 18 unique phenotypes, which helps reveal the shared genetic architecture of these phenotypes.
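
The MDS step can be illustrated with classical (Torgerson) multidimensional scaling, which embeds items in a low-dimensional space from a pairwise distance matrix. This is a generic sketch using NumPy, not the GPA-MDS code: how GPA output is turned into distances between phenotypes is not stated in the abstract, so the distance matrix is simply taken as given, and the function name is hypothetical.

```python
import numpy as np


def classical_mds(dist, k=2):
    """Classical MDS: embed n items in k dimensions from an n x n distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances to obtain a Gram (inner-product) matrix.
    centring = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centring @ (d ** 2) @ centring
    # eigh returns eigenvalues in ascending order; keep the k largest.
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:k]
    # Clip small negative eigenvalues that arise from non-Euclidean input or rounding.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale
```

Each row of the result is one item's coordinates; plotting the first two columns gives the kind of map in which nearby points indicate closely related phenotypes. When the input distances are exactly Euclidean, the embedding reproduces them up to rotation and reflection.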

SMARCA2 common variant association and rare variant excess in Schizophrenia patients from an Algerian Trio Cohort

European Psychiatry ◽

10.1016/s0924-9338(11)73051-6 ◽

2011 ◽

Vol 26 (S2) ◽

pp. 1346-1346

Author(s):

D. Benmessaoud ◽

A.-M. Lepagnol-Bestel ◽

M. Delepine ◽

J. Hager ◽

J.-M. Moalic ◽

...

Keyword(s):

Rare Variants ◽

Association Studies ◽

Common Variant ◽

Genome Wide Association Studies ◽

Common Variants ◽

Fisher Test ◽

Coding Regions ◽

Genome Wide ◽

Whole Exome ◽

Positive Evolution

Genome-wide association studies (GWAS) of schizophrenia (SZ) patients have identified common variants in ten genes, including SMARCA2 (Koga et al., HMG, 2009). We found that the SZ-GWAS genes are part of an interacting network centered on SMARCA2 (Loe-Mie et al., HMG, 2010). Furthermore, SMARCA2 was found disrupted in SZ (Walsh et al., Science, 2008). SMARCA2 encodes the ATPase (BRM) of the SWI/SNF chromatin remodeling complex that is at the interface of genome and environmental adaptation. Taking advantage of an Algerian trio cohort of one hundred SZ patients (Benmessaoud et al., BMC Psychiatry, 2008), we replicated the association of SNP rs2296212, located in exon 33, which was already shown to be associated in the Koga study and results in a D1546E amino acid change in the SMARCA2 protein. We studied SMARCA2 codons and found that exon 33 displays a signature of positive evolution in the primate lineage. Our working hypothesis is that coding regions displaying positive selection are targets of novel rare variants. To address this question, we sequenced two exons displaying positive evolution and one exon without evidence of positive evolution. We found (i) that rare variants are significantly in excess in SZ patients compared to their parents (p = 0.038, Fisher test) and (ii) a higher proportion of rare variants in the primate-accelerated exons compared with the non-evolutionary exon in SZ patients (p = 0.032, Fisher test). SMARCA2 exon sequencing and whole exome sequencing of patients harboring the common variant SNP rs2296212 are in progress. Altogether, these results are expected to give new insights into the genetic architecture of SZ.
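
For readers unfamiliar with the Fisher test cited above, the sketch below computes a two-sided Fisher exact p-value for a 2 × 2 table using only the standard library. The abstract does not report the underlying counts, so any table passed to this function is the reader's own; the function name and argument layout are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2 x 2 table [[a, b], [c, d]].

    With row and column totals fixed, the top-left cell follows a
    hypergeometric distribution. The p-value sums the probabilities of
    all tables that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so ties with the observed table are included.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )
```

In the setting of the abstract, the rows could be patients and parents and the columns carriers and non-carriers of a rare variant, but that layout is an assumption.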

Sequencing of over 100,000 individuals identifies multiple genes and rare variants associated with Crohn's disease susceptibility

10.1101/2021.06.15.21258641 ◽

2021 ◽

Author(s):

Aleksejs Sazonovs ◽

Christine R Stevens ◽

Guhan R Venkataraman ◽

Kai Yuan ◽

Brandon Avila ◽

...

Keyword(s):

Rare Variants ◽

Disease Risk ◽

Sequence Data ◽

Association Studies ◽

Genome Wide Association Studies ◽

Crohn's Disease ◽

Biological Targets ◽

Genome Wide ◽

Coding Variants ◽

First Time

Genome-wide association studies (GWAS) have identified hundreds of loci associated with Crohn's disease (CD); however, as with all complex diseases, deriving pathogenic mechanisms from these non-coding GWAS discoveries has been challenging. To complement GWAS and better define actionable biological targets, we analysed sequence data from more than 30,000 CD cases and 80,000 population controls. We observe rare coding variants in established CD susceptibility genes as well as ten genes where coding variation directly implicates the gene in disease risk for the first time.

Inferring relevant tissues and cell types for complex traits in genome-wide association studies

10.1101/2021.06.09.447805 ◽

2021 ◽

Author(s):

Rujin Wang ◽

Danyu Lin ◽

Yuchao Jiang

Keyword(s):

Single Cell ◽

Complex Traits ◽

Association Studies ◽

Cell Types ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Cell Type ◽

Disease Etiology ◽

Genome Wide ◽

Cell Type Specific

More than a decade of genome-wide association studies (GWASs) have identified genetic risk variants that are significantly associated with complex traits. Emerging evidence suggests that the function of trait-associated variants likely acts in a tissue- or cell-type-specific fashion. Yet, it remains challenging to prioritize trait-relevant tissues or cell types to elucidate disease etiology. Here, we present EPIC (cEll tyPe enrIChment), a statistical framework that relates large-scale GWAS summary statistics to cell-type-specific omics measurements from single-cell sequencing. We derive powerful gene-level test statistics for common and rare variants, separately and jointly, and adopt generalized least squares to prioritize trait-relevant tissues or cell types while accounting for the correlation structures both within and between genes. Using enrichment of loci associated with four lipid traits in the liver and enrichment of loci associated with three neurological disorders in the brain as ground truths, we show that EPIC outperforms existing methods. We extend our framework to single-cell transcriptomic data and identify cell types underlying type 2 diabetes and schizophrenia. The enrichment is replicated using independent GWAS and single-cell datasets and further validated using PubMed search and existing bulk case-control testing results.
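
Generalized least squares, mentioned above as the way correlation among genes is handled, can be illustrated with the textbook estimator β = (XᵀΣ⁻¹X)⁻¹XᵀΣ⁻¹y. The sketch below uses NumPy and is not EPIC's implementation: the design matrix, response and covariance matrix are generic placeholders, and the function name is hypothetical.

```python
import numpy as np


def gls_estimate(X, y, sigma):
    """Generalized least squares coefficients for y = X @ beta + e, Cov(e) = sigma.

    sigma must be a symmetric positive-definite covariance matrix.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Solve linear systems instead of forming the explicit inverse of sigma.
    sigma_inv_X = np.linalg.solve(sigma, X)
    sigma_inv_y = np.linalg.solve(sigma, y)
    return np.linalg.solve(X.T @ sigma_inv_X, X.T @ sigma_inv_y)
```

With `sigma` set to the identity matrix this reduces to ordinary least squares; a non-diagonal `sigma` down-weights information that is shared between correlated observations.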
