A Whole-Genome Association Approach for Large-scaled Inter-species Trait

Mapping Intimacies ◽

10.1101/454363 ◽

2018 ◽

Author(s):

Qi Wu ◽

Huizhong Fan ◽

Lei Chen ◽

Yibo Hu ◽

Fuwen Wei

Keyword(s):

Chromosome Number ◽

Genetic Variants ◽

Complex Traits ◽

Large Scale ◽

Association Studies ◽

Species Traits ◽

Genome Wide Association Studies ◽

Dna Double Strand Breaks ◽

Common Genetic Variants ◽

Genome Association

Abstract Genome-wide association studies (GWAS) have provided an avenue for associating common genetic variants with complex traits. However, because they use SNPs as genetic markers, GWAS have been confined to detecting the genetic basis of traits within species, not of large-scale inter-species traits. Here, we propose a practical statistical approach that uses k-mer frequencies as genetic markers to associate genetic variants with large-scale inter-species traits. We applied this approach to the trait of chromosome number in 96 mammalian proteomes and prioritized 130 genes, including TP53 and BAD, of which 6 were candidate genes. These genes were shown to be associated with the cellular response to DNA double-strand breaks caused by chromosome fission/fusion. Our study provides a new, effective genomic strategy for performing association studies of large-scale inter-species traits, using chromosome number as a case. We hope this approach can support the exploration of a broad range of traits.
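As a rough illustration of what using k-mer frequencies as markers could look like, the sketch below counts overlapping k-mers in a single protein sequence and normalizes the counts to relative frequencies. The function name `kmer_frequencies`, the choice of k = 3, and the toy sequence are assumptions made for illustration; the abstract does not specify the k-mer length or the normalization the authors used.

```python
from collections import Counter


def kmer_frequencies(sequence, k=3):
    """Return the relative frequency of each overlapping k-mer in a sequence.

    Hypothetical helper: the abstract does not describe the authors' exact
    counting or normalization scheme.
    """
    sequence = sequence.upper()
    total = len(sequence) - k + 1  # number of overlapping windows of length k
    if total <= 0:
        return {}
    counts = Counter(sequence[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}


# Toy protein fragment (not real data): the 3-mer "MKV" occurs twice
# among five overlapping windows.
freqs = kmer_frequencies("MKVLMKV", k=3)
```

Computed per species, such frequency vectors could then be tested for association with a species-level trait such as chromosome number.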

circVAR database: genome-wide archive of genetic variants for human circular RNAs

10.21203/rs.3.rs-48904/v2 ◽

2020 ◽

Author(s):

Min Zhao ◽

Hong Qu

Keyword(s):

Genetic Variants ◽

Complex Traits ◽

Large Scale ◽

Rna Binding ◽

Rna Binding Proteins ◽

Association Studies ◽

Chromosome 17 ◽

Circular Rnas ◽

Genome Wide Association Studies ◽

Genome Wide

Abstract Background: Circular RNAs (circRNAs) play important roles in regulating gene expression through binding miRNAs and RNA binding proteins. Genetic variation of circRNAs may affect complex traits/diseases by changing their binding efficiency to target miRNAs and proteins. There is a growing demand for investigations of the functions of genetic changes using large-scale experimental evidence. However, there is no online genetic resource for circRNA genes. Results: We performed extensive genetic annotation of 295,526 circRNAs integrated from circBase, circNet and circRNAdb. All pre-computed genetic variants were presented at our online resource, circVAR, with data browsing and search functionality. We explored the chromosome-based distribution of circRNAs and their associated variants. We found that, based on mapping to the 1000 Genomes and ClinVAR databases, chromosome 17 has a relatively large number of circRNAs and associated common and health-related genetic variants. Following the annotation of genome wide association studies (GWAS)-based circRNA variants, we found many non-coding variants within circRNAs, suggesting novel mechanisms for common diseases reported from GWAS studies. For cancer-based somatic variants, we found that chromosome 7 has many highly complex mutations that have been overlooked in previous research. Conclusion: We used the circVAR database to collect SNPs and small insertions and deletions (INDELs) in putative circRNA regions and to identify their potential phenotypic information. To provide a reusable resource for the circRNA research community, we have published all the pre-computed genetic data concerning circRNAs and associated genes together with data query and browsing functions at http://soft.bioinfo-minzhao.org/circvar.

circVAR Database: Genome-Wide Archive of Genetic Variants for Human Circular RNAs

10.21203/rs.3.rs-48904/v1 ◽

2020 ◽

Author(s):

Min Zhao ◽

Hong Qu

Keyword(s):

Genetic Variants ◽

Complex Traits ◽

Large Scale ◽

Rna Binding ◽

Rna Binding Proteins ◽

Association Studies ◽

Chromosome 17 ◽

Circular Rnas ◽

Genome Wide Association Studies ◽

Genome Wide

Abstract Background: Circular RNAs (circRNAs) play important roles in regulating gene expression through binding miRNAs and RNA binding proteins. Genetic variation of circRNAs may affect complex traits/diseases by changing their binding efficiency to target miRNAs and proteins. There is a growing demand for investigations of the functions of genetic changes using large-scale experimental evidence. However, there is no online genetic resource for circRNA genes. Results: We performed extensive genetic annotation of 295,526 circRNAs integrated from circBase, circNet and circRNAdb. All pre-computed genetic variants were presented at our online resource, circVAR, with data browsing and search functionality. We explored the chromosome-based distribution of circRNAs and their associated variants. We found that, based on mapping to the 1000 Genomes and ClinVAR databases, chromosome 17 has a relatively large number of circRNAs and associated common and health-related genetic variants. Following the annotation of genome wide association studies (GWAS)-based circRNA variants, we found many non-coding variants within circRNAs, suggesting novel mechanisms for common diseases reported from GWAS studies. For cancer-based somatic variants, we found that chromosome 7 has many highly complex mutations that have been overlooked in previous research. Conclusion: We used the circVAR database to collect SNPs and small insertions and deletions (INDELs) in putative circRNA regions and to identify their potential phenotypic information. To provide a reusable resource for the circRNA research community, we have published all the pre-computed genetic data concerning circRNAs and associated genes together with data query and browsing functions at http://soft.bioinfo-minzhao.org/circvar.

The contribution of rare whole genome sequencing variants to plasma protein levels and to the missing heritability

10.21203/rs.3.rs-625433/v1 ◽

2021 ◽

Author(s):

Marcin Kierczak ◽

Nima Rafati ◽

Julia Höglund ◽

Hadrien Gourle ◽

Daniel Schmitz ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Genetic Variants ◽

Complex Traits ◽

Rare Variants ◽

Association Studies ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Missing Heritability ◽

Common Genetic Variants

Abstract Despite the success in identifying effects of common genetic variants, using genome-wide association studies (GWAS), much of the genetic contribution to complex traits remains unexplained. Here, we analysed high coverage whole-genome sequencing (WGS) data, to evaluate the contribution of rare genetic variants to 414 plasma proteins. The frequency distribution of genetic variants was skewed towards the rare spectrum, and damaging variants were more often rare. However, only 2.24% of the heritability was estimated to be explained by rare variants. A gene-based approach, developed to also capture the effect of rare variants, identified associations for 249 of the proteins, which was 25% more as compared to a GWAS. Out of those, 24 associations were driven by rare variants, clearly highlighting the capacity of aggregated tests and WGS data. We conclude that, while many rare variants have considerable phenotypic effects, their contribution to the missing heritability is limited by their low frequencies.

Influence of genetic variants on gene expression in human pancreatic islets – implications for type 2 diabetes

10.1101/655670 ◽

2019 ◽

Cited By ~ 9

Author(s):

Ana Viñuela ◽

Arushi Varshney ◽

Martijn van de Bunt ◽

Rashmi B. Prasad ◽

Olof Asplund ◽

...

Keyword(s):

Type 2 Diabetes ◽

Genetic Variants ◽

Complex Traits ◽

Large Scale ◽

Association Studies ◽

Cell Types ◽

Genome Wide Association Studies ◽

Selective Enrichment ◽

Gwas Signal

Abstract Most signals detected by genome-wide association studies map to non-coding sequence and their tissue-specific effects influence transcriptional regulation. However, many key tissues and cell-types required for appropriate functional inference are absent from large-scale resources such as ENCODE and GTEx. We explored the relationship between genetic variants influencing predisposition to type 2 diabetes (T2D) and related glycemic traits, and human pancreatic islet transcription using RNA-Seq and genotyping data from 420 islet donors. We find: (a) eQTLs have a variable replication rate across the 44 GTEx tissues (<73%), indicating that our study captured islet-specific cis-eQTL signals; (b) islet eQTL signals show marked overlap with islet epigenome annotation, though eQTL effect size is reduced in the stretch enhancers most strongly implicated in GWAS signal location; (c) selective enrichment of islet eQTL overlap with the subset of T2D variants implicated in islet dysfunction; and (d) colocalization between islet eQTLs and variants influencing T2D or related glycemic traits, delivering candidate effector transcripts at 23 loci, including DGKB and TCF7L2. Our findings illustrate the advantages of performing functional and regulatory studies in tissues of greatest disease-relevance while expanding our mechanistic insights into complex traits association loci activity with an expanded list of putative transcripts implicated in T2D development.

Large-scale genetic investigation reveals genetic liability to multiple complex traits influencing a higher risk of ADHD

Scientific Reports ◽

10.1038/s41598-021-01517-7 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Luis M. García-Marín ◽

Adrián I. Campos ◽

Gabriel Cuéllar-Partida ◽

Sarah E. Medland ◽

Scott H. Kollins ◽

...

Keyword(s):

Genetic Variants ◽

Complex Traits ◽

Large Scale ◽

Association Studies ◽

Neurodevelopmental Disorder ◽

Deficiency Anemia ◽

Chronic Obstructive ◽

Genome Wide Association Studies ◽

Obstructive Pulmonary Disease ◽

Genetic Liability

Abstract Attention Deficit-Hyperactivity Disorder (ADHD) is a complex psychiatric and neurodevelopmental disorder that develops during childhood and spans into adulthood. ADHD’s aetiology is complex, and evidence about its cause and risk factors is limited. We leveraged genetic data from genome-wide association studies (GWAS) and performed latent causal variable analyses using a hypothesis-free approach to infer causal associations between 1387 complex traits and ADHD. We identified 37 inferred potential causal associations with ADHD risk. Our results reveal that genetic variants associated with iron deficiency anemia (ICD10), obesity, type 2 diabetes, synovitis and tenosynovitis (ICD10), polyarthritis (ICD10), neck or shoulder pain, and substance use in adults display partial genetic causality on ADHD risk in children. Genetic variants associated with ADHD have a partial genetic causality increasing the risk for chronic obstructive pulmonary disease and carpal tunnel syndrome. Protective factors for ADHD risk included genetic variants associated with the likelihood of participating in socially supportive and interactive activities. Our results show that genetic liability to multiple complex traits influences a higher risk for ADHD, highlighting the potential role of cardiometabolic phenotypes and physical pain in ADHD’s aetiology. These findings have the potential to inform future clinical studies and development of interventions.

Genetic Overlap between General Cognitive Function and Schizophrenia: A Review of Cognitive GWASs

International Journal of Molecular Sciences ◽

10.3390/ijms19123822 ◽

2018 ◽

Vol 19 (12) ◽

pp. 3822 ◽

Cited By ~ 16

Author(s):

Kazutaka Ohi ◽

Chika Sumiyoshi ◽

Haruo Fujino ◽

Yuka Yasuda ◽

Hidenaga Yamamori ◽

...

Keyword(s):

Cognitive Function ◽

Genetic Variants ◽

Large Scale ◽

Molecular Mechanisms ◽

Association Studies ◽

Cognitive Impairments ◽

Treatment Strategies ◽

Genome Wide Association Studies ◽

Common Genetic Variants ◽

Health Related

General cognitive (intelligence) function is substantially heritable, and is a major determinant of economic and health-related life outcomes. Cognitive impairments and intelligence decline are core features of schizophrenia which are evident before the onset of the illness. Genetic overlaps between cognitive impairments and the vulnerability for the illness have been suggested. Here, we review the literature on recent large-scale genome-wide association studies (GWASs) of general cognitive function and correlations between cognitive function and genetic susceptibility to schizophrenia. In the last decade, large-scale GWASs (n > 30,000) of general cognitive function and schizophrenia have demonstrated that substantial proportions of the heritability of the cognitive function and schizophrenia are explained by a polygenic component consisting of many common genetic variants with small effects. To date, GWASs have identified more than 100 loci linked to general cognitive function and 108 loci linked to schizophrenia. These genetic variants are mostly intronic or intergenic. Genes identified around these genetic variants are densely expressed in brain tissues. Schizophrenia-related genetic risks are consistently correlated with lower general cognitive function (rg = −0.20) and higher educational attainment (rg = 0.08). Cognitive functions are associated with many of the socioeconomic and health-related outcomes. Current treatment strategies largely fail to improve cognitive impairments of schizophrenia. Therefore, further study is needed to understand the molecular mechanisms underlying both cognition and schizophrenia.

circVAR database: genome-wide archive of genetic variants for human circular RNAs

BMC Genomics ◽

10.1186/s12864-020-07172-y ◽

2020 ◽

Vol 21 (1) ◽

Cited By ~ 1

Author(s):

Min Zhao ◽

Hong Qu

Keyword(s):

Genetic Variants ◽

Complex Traits ◽

Large Scale ◽

Rna Binding ◽

Rna Binding Proteins ◽

Association Studies ◽

Chromosome 17 ◽

Circular Rnas ◽

Genome Wide Association Studies ◽

Genome Wide

Abstract Background: Circular RNAs (circRNAs) play important roles in regulating gene expression through binding miRNAs and RNA binding proteins. Genetic variation of circRNAs may affect complex traits/diseases by changing their binding efficiency to target miRNAs and proteins. There is a growing demand for investigations of the functions of genetic changes using large-scale experimental evidence. However, there is no online genetic resource for circRNA genes. Results: We performed extensive genetic annotation of 295,526 circRNAs integrated from circBase, circNet and circRNAdb. All pre-computed genetic variants were presented at our online resource, circVAR, with data browsing and search functionality. We explored the chromosome-based distribution of circRNAs and their associated variants. We found that, based on mapping to the 1000 Genomes and ClinVAR databases, chromosome 17 has a relatively large number of circRNAs and associated common and health-related genetic variants. Following the annotation of genome wide association studies (GWAS)-based circRNA variants, we found many non-coding variants within circRNAs, suggesting novel mechanisms for common diseases reported from GWAS studies. For cancer-based somatic variants, we found that chromosome 7 has many highly complex mutations that have been overlooked in previous research. Conclusion: We used the circVAR database to collect SNPs and small insertions and deletions (INDELs) in putative circRNA regions and to identify their potential phenotypic information. To provide a reusable resource for the circRNA research community, we have published all the pre-computed genetic data concerning circRNAs and associated genes together with data query and browsing functions at http://soft.bioinfo-minzhao.org/circvar.

Better estimation of SNP heritability from summary statistics provides a new understanding of the genetic architecture of complex traits

10.1101/284976 ◽

2018 ◽

Cited By ~ 6

Author(s):

Doug Speed ◽

David J Balding

Keyword(s):

Complex Traits ◽

Genetic Architecture ◽

Large Scale ◽

Association Studies ◽

Genome Wide Association Studies ◽

Summary Statistics ◽

Confounding Bias ◽

Conserved Regions ◽

Genome Wide ◽

Variation Explained

LD Score Regression (LDSC) has been widely applied to the results of genome-wide association studies. However, its estimates of SNP heritability are derived from an unrealistic model in which each SNP is expected to contribute equal heritability. As a consequence, LDSC tends to over-estimate confounding bias, under-estimate the total phenotypic variation explained by SNPs, and provide misleading estimates of the heritability enrichment of SNP categories. Therefore, we present SumHer, software for estimating SNP heritability from summary statistics using more realistic heritability models. After demonstrating its superiority over LDSC, we apply SumHer to the results of 24 large-scale association studies (average sample size 121 000). First we show that these studies have tended to substantially over-correct for confounding, and as a result the number of genome-wide significant loci has been under-reported by about 20%. Next we estimate enrichment for 24 categories of SNPs defined by functional annotations. A previous study using LDSC reported that conserved regions were 13-fold enriched, and found a further twelve categories with above 2-fold enrichment. By contrast, our analysis using SumHer finds that conserved regions are only 1.6-fold (SD 0.06) enriched, and that no category has enrichment above 1.7-fold. SumHer provides an improved understanding of the genetic architecture of complex traits, which enables more efficient analysis of future genetic data.

A transcriptome-wide Mendelian randomization study to uncover tissue-dependent regulatory mechanisms across the human phenome

10.1101/563379 ◽

2019 ◽

Cited By ~ 2

Author(s):

Tom G Richardson ◽

Gibran Hemani ◽

Tom R Gaunt ◽

Caroline L Relton ◽

George Davey Smith

Keyword(s):

Gene Expression ◽

Genetic Variants ◽

Complex Traits ◽

Mendelian Randomization ◽

Drug Repositioning ◽

Association Studies ◽

Thyroid Tissue ◽

Genome Wide Association Studies ◽

Tissue Specific ◽

Genome Wide

Abstract Background: Developing insight into tissue-specific transcriptional mechanisms can help improve our understanding of how genetic variants exert their effects on complex traits and disease. By applying the principles of Mendelian randomization, we have undertaken a systematic analysis to evaluate transcriptome-wide associations between gene expression across 48 different tissue types and 395 complex traits. Results: Overall, we identified 100,025 gene-trait associations based on conventional genome-wide corrections (P < 5 × 10⁻⁸) that also provided evidence of genetic colocalization. These results indicated that genetic variants which influence gene expression levels in multiple tissues are more likely to influence multiple complex traits. We identified many examples of tissue-specific effects, such as genetically-predicted TPO, NR3C2 and SPATA13 expression only associating with thyroid disease in thyroid tissue. Additionally, FBN2 expression was associated with both cardiovascular and lung function traits, but only when analysed in heart and lung tissue respectively. We also demonstrate that conducting phenome-wide evaluations of our results can help flag adverse on-target side effects for therapeutic intervention, as well as propose drug repositioning opportunities. Moreover, we find that exploring the tissue-dependency of associations identified by genome-wide association studies (GWAS) can help elucidate the causal genes and tissues responsible for effects, as well as uncover putative novel associations. Conclusions: The atlas of tissue-dependent associations we have constructed should prove extremely valuable to future studies investigating the genetic determinants of complex disease. The follow-up analyses we have performed in this study are merely a guide for future research. Conducting similar evaluations can be undertaken systematically at http://mrcieu.mrsoftware.org/Tissue_MR_atlas/.

Abstract 367: Extreme High-Density Lipoprotein Cholesterol Genetics: An Assortment of Large and Small Polygenic Effects

Arteriosclerosis Thrombosis and Vascular Biology ◽

10.1161/atvb.37.suppl_1.367 ◽

2017 ◽

Vol 37 (suppl_1) ◽

Author(s):

Jacqueline S Dron ◽

Jian Wang ◽

Cécile Low-Kam ◽

Sumeet A Khetarpal ◽

John F Robinson ◽

...

Keyword(s):

Large Scale ◽

Genetic Basis ◽

Rare Variants ◽

Association Studies ◽

Density Lipoprotein ◽

Copy Number Variations ◽

Genome Wide Association Studies ◽

Common Variants ◽

Targeted Next Generation Sequencing ◽

Common Genetic Variants

Rationale: Although HDL-C levels are known to have a complex genetic basis, most studies have focused solely on identifying rare variants with large phenotypic effects to explain extreme HDL-C phenotypes. Objective: Here we concurrently evaluate the contribution of both rare and common genetic variants, as well as large-scale copy number variations (CNVs), towards extreme HDL-C concentrations. Methods: In clinically ascertained patients with low (N = 136) and high (N = 119) HDL-C profiles, we applied our targeted next-generation sequencing panel (LipidSeq™) to sequence genes involved in HDL metabolism, which were subsequently screened for rare variants and CNVs. We also developed a novel polygenic trait score (PTS) to assess patients’ genetic accumulations of common variants that have been shown by genome-wide association studies to associate primarily with HDL-C levels. Two additional cohorts of patients with extremely low and high HDL-C (total N = 1,746 and N = 1,139, respectively) were used for PTS validation. Results: In the discovery cohort, 32.4% of low HDL-C patients carried rare variants or CNVs in primary (ABCA1, APOA1, LCAT) and secondary (LPL, LMF1, GPD1, APOE) HDL-C–altering genes. Additionally, 13.4% of high HDL-C patients carried rare variants or CNVs in primary (SCARB1, CETP, LIPC, LIPG) and secondary (APOC3, ANGPTL4) HDL-C–altering genes. For polygenic effects, patients with abnormal HDL-C profiles but without rare variants or CNVs were ~2-fold more likely to have an extreme PTS compared to normolipidemic individuals, indicating an increased frequency of common HDL-C–associated variants in these patients. Similar results in the two validation cohorts demonstrate that this novel PTS successfully quantifies common variant accumulation, further characterizing the polygenic basis for extreme HDL-C phenotypes. Conclusions: Patients with extreme HDL-C levels have various combinations of rare variants, common variants, or CNVs driving their phenotypes. Fully characterizing the genetic basis of HDL-C levels must extend to encompass multiple types of genetic determinants—not just rare variants—to further our understanding of this complex, controversial quantitative trait.
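The abstract does not state how the polygenic trait score is computed. A common construction for scores of this kind is a weighted sum of trait-associated allele counts, sketched below under that assumption. The function name `polygenic_trait_score`, the variant identifiers, and the weights are all hypothetical and are not taken from the study.

```python
def polygenic_trait_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies per variant).

    Assumed construction for illustration only; the abstract does not
    give the authors' scoring formula or variant weights.
    """
    missing = set(weights) - set(dosages)
    if missing:
        # Refuse to score silently when a genotype is unavailable.
        raise KeyError(f"missing genotypes for: {sorted(missing)}")
    return sum(weights[variant] * dosages[variant] for variant in weights)


# Hypothetical per-variant effect weights and one patient's allele counts.
weights = {"variant_a": 0.5, "variant_b": -0.25, "variant_c": 1.0}
dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_trait_score(dosages, weights)  # 0.5*2 - 0.25*1 + 1.0*0
```

A patient's score would then be compared against the score distribution in a reference group to decide whether it is extreme.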