A Whole-Genome Association Approach for Large-scaled Inter-species Trait

2018 ◽  
Author(s):  
Qi Wu ◽  
Huizhong Fan ◽  
Lei Chen ◽  
Yibo Hu ◽  
Fuwen Wei

Abstract Genome-wide association studies (GWAS) have provided an avenue for associating common genetic variants with complex traits. However, because GWAS uses SNPs as genetic markers, it has been confined to detecting the genetic basis of traits within species, not large-scale inter-species traits. Here, we propose a practical statistical approach that uses k-mer frequencies as genetic markers to associate genetic variants with large-scale inter-species traits. We applied this approach to the trait of chromosome number in 96 mammalian proteomes and prioritized 130 genes, including TP53 and BAD, of which 6 were candidate genes. These genes were shown to be associated with the cellular response to DNA double-strand breaks caused by chromosome fission/fusion. Our study provides a new, effective genomic strategy for association studies of large-scale inter-species traits, using chromosome number as a case. We hope this approach can support the exploration of a broad range of traits.
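The idea of using k-mer frequencies as markers can be illustrated with a minimal sketch. It is not the authors' statistical procedure, which the abstract does not describe: the function names, the default k = 3, and the use of a plain Pearson correlation are all assumptions made for illustration. A real cross-species analysis would also need to account for shared ancestry between species.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(sequence, k=3):
    """Relative frequency of every overlapping k-mer in one sequence."""
    windows = len(sequence) - k + 1
    if k <= 0 or windows <= 0:
        return {}
    counts = Counter(sequence[i:i + k] for i in range(windows))
    return {kmer: n / windows for kmer, n in counts.items()}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def kmer_trait_correlation(sequences, trait_values, kmer):
    """Correlate one k-mer's per-species frequency with a per-species trait.

    `sequences` holds one (hypothetical) protein sequence per species and
    `trait_values` the matching trait, e.g. chromosome number.
    """
    k = len(kmer)
    freqs = [kmer_frequencies(seq, k).get(kmer, 0.0) for seq in sequences]
    return pearson(freqs, trait_values)
```

In this sketch each species contributes one frequency per k-mer, so the marker is defined even when the species share no alignable SNP positions.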

2020 ◽  
Author(s):  
Min Zhao ◽  
Hong Qu

Abstract Background: Circular RNAs (circRNAs) play important roles in regulating gene expression through binding miRNAs and RNA binding proteins. Genetic variation of circRNAs may affect complex traits/diseases by changing their binding efficiency to target miRNAs and proteins. There is a growing demand for investigations of the functions of genetic changes using large-scale experimental evidence. However, there is no online genetic resource for circRNA genes. Results: We performed extensive genetic annotation of 295,526 circRNAs integrated from circBase, circNet and circRNAdb. All pre-computed genetic variants were presented at our online resource, circVAR, with data browsing and search functionality. We explored the chromosome-based distribution of circRNAs and their associated variants. We found that, based on mapping to the 1000 Genomes and ClinVAR databases, chromosome 17 has a relatively large number of circRNAs and associated common and health-related genetic variants. Following the annotation of genome wide association studies (GWAS)-based circRNA variants, we found many non-coding variants within circRNAs, suggesting novel mechanisms for common diseases reported from GWAS studies. For cancer-based somatic variants, we found that chromosome 7 has many highly complex mutations that have been overlooked in previous research. Conclusion: We used the circVAR database to collect SNPs and small insertions and deletions (INDELs) in putative circRNA regions and to identify their potential phenotypic information. To provide a reusable resource for the circRNA research community, we have published all the pre-computed genetic data concerning circRNAs and associated genes together with data query and browsing functions at http://soft.bioinfo-minzhao.org/circvar .


2021 ◽  
Author(s):  
Marcin Kierczak ◽  
Nima Rafati ◽  
Julia Höglund ◽  
Hadrien Gourle ◽  
Daniel Schmitz ◽  
...  

Abstract Despite the success in identifying effects of common genetic variants, using genome-wide association studies (GWAS), much of the genetic contribution to complex traits remains unexplained. Here, we analysed high coverage whole-genome sequencing (WGS) data, to evaluate the contribution of rare genetic variants to 414 plasma proteins. The frequency distribution of genetic variants was skewed towards the rare spectrum, and damaging variants were more often rare. However, only 2.24% of the heritability was estimated to be explained by rare variants. A gene-based approach, developed to also capture the effect of rare variants, identified associations for 249 of the proteins, which was 25% more as compared to a GWAS. Out of those, 24 associations were driven by rare variants, clearly highlighting the capacity of aggregated tests and WGS data. We conclude that, while many rare variants have considerable phenotypic effects, their contribution to the missing heritability is limited by their low frequencies.


2019 ◽  
Author(s):  
Ana Viñuela ◽  
Arushi Varshney ◽  
Martijn van de Bunt ◽  
Rashmi B. Prasad ◽  
Olof Asplund ◽  
...  

Abstract Most signals detected by genome-wide association studies map to non-coding sequence and their tissue-specific effects influence transcriptional regulation. However, many key tissues and cell-types required for appropriate functional inference are absent from large-scale resources such as ENCODE and GTEx. We explored the relationship between genetic variants influencing predisposition to type 2 diabetes (T2D) and related glycemic traits, and human pancreatic islet transcription using RNA-Seq and genotyping data from 420 islet donors. We find: (a) eQTLs have a variable replication rate across the 44 GTEx tissues (<73%), indicating that our study captured islet-specific cis-eQTL signals; (b) islet eQTL signals show marked overlap with islet epigenome annotation, though eQTL effect size is reduced in the stretch enhancers most strongly implicated in GWAS signal location; (c) selective enrichment of islet eQTL overlap with the subset of T2D variants implicated in islet dysfunction; and (d) colocalization between islet eQTLs and variants influencing T2D or related glycemic traits, delivering candidate effector transcripts at 23 loci, including DGKB and TCF7L2. Our findings illustrate the advantages of performing functional and regulatory studies in the tissues of greatest disease relevance, and they expand mechanistic insight into the activity of complex-trait association loci with a longer list of putative transcripts implicated in T2D development.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Luis M. García-Marín ◽  
Adrián I. Campos ◽  
Gabriel Cuéllar-Partida ◽  
Sarah E. Medland ◽  
Scott H. Kollins ◽  
...  

Abstract Attention Deficit-Hyperactivity Disorder (ADHD) is a complex psychiatric and neurodevelopmental disorder that develops during childhood and spans into adulthood. ADHD’s aetiology is complex, and evidence about its cause and risk factors is limited. We leveraged genetic data from genome-wide association studies (GWAS) and performed latent causal variable analyses using a hypothesis-free approach to infer causal associations between 1387 complex traits and ADHD. We identified 37 inferred potential causal associations with ADHD risk. Our results reveal that genetic variants associated with iron deficiency anemia (ICD10), obesity, type 2 diabetes, synovitis and tenosynovitis (ICD10), polyarthritis (ICD10), neck or shoulder pain, and substance use in adults display partial genetic causality on ADHD risk in children. Genetic variants associated with ADHD have a partial genetic causality increasing the risk for chronic obstructive pulmonary disease and carpal tunnel syndrome. Protective factors for ADHD risk included genetic variants associated with the likelihood of participating in socially supportive and interactive activities. Our results show that genetic liability to multiple complex traits influences a higher risk for ADHD, highlighting the potential role of cardiometabolic phenotypes and physical pain in ADHD’s aetiology. These findings have the potential to inform future clinical studies and development of interventions.


2018 ◽  
Vol 19 (12) ◽  
pp. 3822 ◽  
Author(s):  
Kazutaka Ohi ◽  
Chika Sumiyoshi ◽  
Haruo Fujino ◽  
Yuka Yasuda ◽  
Hidenaga Yamamori ◽  
...  

General cognitive (intelligence) function is substantially heritable, and is a major determinant of economic and health-related life outcomes. Cognitive impairments and intelligence decline are core features of schizophrenia which are evident before the onset of the illness. Genetic overlaps between cognitive impairments and the vulnerability for the illness have been suggested. Here, we review the literature on recent large-scale genome-wide association studies (GWASs) of general cognitive function and correlations between cognitive function and genetic susceptibility to schizophrenia. In the last decade, large-scale GWASs (n > 30,000) of general cognitive function and schizophrenia have demonstrated that substantial proportions of the heritability of the cognitive function and schizophrenia are explained by a polygenic component consisting of many common genetic variants with small effects. To date, GWASs have identified more than 100 loci linked to general cognitive function and 108 loci linked to schizophrenia. These genetic variants are mostly intronic or intergenic. Genes identified around these genetic variants are densely expressed in brain tissues. Schizophrenia-related genetic risks are consistently correlated with lower general cognitive function (rg = −0.20) and higher educational attainment (rg = 0.08). Cognitive functions are associated with many of the socioeconomic and health-related outcomes. Current treatment strategies largely fail to improve cognitive impairments of schizophrenia. Therefore, further study is needed to understand the molecular mechanisms underlying both cognition and schizophrenia.


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Min Zhao ◽  
Hong Qu

Abstract Background: Circular RNAs (circRNAs) play important roles in regulating gene expression through binding miRNAs and RNA binding proteins. Genetic variation of circRNAs may affect complex traits/diseases by changing their binding efficiency to target miRNAs and proteins. There is a growing demand for investigations of the functions of genetic changes using large-scale experimental evidence. However, there is no online genetic resource for circRNA genes. Results: We performed extensive genetic annotation of 295,526 circRNAs integrated from circBase, circNet and circRNAdb. All pre-computed genetic variants were presented at our online resource, circVAR, with data browsing and search functionality. We explored the chromosome-based distribution of circRNAs and their associated variants. We found that, based on mapping to the 1000 Genomes and ClinVAR databases, chromosome 17 has a relatively large number of circRNAs and associated common and health-related genetic variants. Following the annotation of genome wide association studies (GWAS)-based circRNA variants, we found many non-coding variants within circRNAs, suggesting novel mechanisms for common diseases reported from GWAS studies. For cancer-based somatic variants, we found that chromosome 7 has many highly complex mutations that have been overlooked in previous research. Conclusion: We used the circVAR database to collect SNPs and small insertions and deletions (INDELs) in putative circRNA regions and to identify their potential phenotypic information. To provide a reusable resource for the circRNA research community, we have published all the pre-computed genetic data concerning circRNAs and associated genes together with data query and browsing functions at http://soft.bioinfo-minzhao.org/circvar.


2018 ◽  
Author(s):  
Doug Speed ◽  
David J Balding

LD Score Regression (LDSC) has been widely applied to the results of genome-wide association studies. However, its estimates of SNP heritability are derived from an unrealistic model in which each SNP is expected to contribute equal heritability. As a consequence, LDSC tends to over-estimate confounding bias, under-estimate the total phenotypic variation explained by SNPs, and provide misleading estimates of the heritability enrichment of SNP categories. Therefore, we present SumHer, software for estimating SNP heritability from summary statistics using more realistic heritability models. After demonstrating its superiority over LDSC, we apply SumHer to the results of 24 large-scale association studies (average sample size 121 000). First we show that these studies have tended to substantially over-correct for confounding, and as a result the number of genome-wide significant loci has been under-reported by about 20%. Next we estimate enrichment for 24 categories of SNPs defined by functional annotations. A previous study using LDSC reported that conserved regions were 13-fold enriched, and found a further twelve categories with above 2-fold enrichment. By contrast, our analysis using SumHer finds that conserved regions are only 1.6-fold (SD 0.06) enriched, and that no category has enrichment above 1.7-fold. SumHer provides an improved understanding of the genetic architecture of complex traits, which enables more efficient analysis of future genetic data.
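For orientation, the equal-contribution assumption criticised above corresponds to the LD score regression equation as it is commonly stated. The symbols below are not defined in the abstract and are introduced here only for illustration: N is the sample size, M the number of SNPs, h² the SNP heritability, a the confounding term, and ℓ_j the LD score of SNP j.

```latex
\mathbb{E}\left[\chi_j^2\right] = 1 + N a + \frac{N h^2}{M}\,\ell_j,
\qquad
\ell_j = \sum_{k} r_{jk}^2
```

Here the slope N h² / M spreads heritability evenly over all M SNPs, which is the assumption that a more general heritability model would relax, and the intercept 1 + N a is what gets read as confounding.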


2019 ◽  
Author(s):  
Tom G Richardson ◽  
Gibran Hemani ◽  
Tom R Gaunt ◽  
Caroline L Relton ◽  
George Davey Smith

Abstract Background: Developing insight into tissue-specific transcriptional mechanisms can help improve our understanding of how genetic variants exert their effects on complex traits and disease. By applying the principles of Mendelian randomization, we have undertaken a systematic analysis to evaluate transcriptome-wide associations between gene expression across 48 different tissue types and 395 complex traits. Results: Overall, we identified 100,025 gene-trait associations based on conventional genome-wide corrections (P < 5 × 10⁻⁸) that also provided evidence of genetic colocalization. These results indicated that genetic variants which influence gene expression levels in multiple tissues are more likely to influence multiple complex traits. We identified many examples of tissue-specific effects, such as genetically-predicted TPO, NR3C2 and SPATA13 expression only associating with thyroid disease in thyroid tissue. Additionally, FBN2 expression was associated with both cardiovascular and lung function traits, but only when analysed in heart and lung tissue respectively. We also demonstrate that conducting phenome-wide evaluations of our results can help flag adverse on-target side effects for therapeutic intervention, as well as propose drug repositioning opportunities. Moreover, we find that exploring the tissue-dependency of associations identified by genome-wide association studies (GWAS) can help elucidate the causal genes and tissues responsible for effects, as well as uncover putative novel associations. Conclusions: The atlas of tissue-dependent associations we have constructed should prove extremely valuable to future studies investigating the genetic determinants of complex disease. The follow-up analyses we have performed in this study are merely a guide for future research. Similar evaluations can be undertaken systematically at http://mrcieu.mrsoftware.org/Tissue_MR_atlas/.
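As a rough illustration of the Mendelian randomization principle mentioned above, the sketch below computes a single-variant Wald ratio: the variant's effect on the trait divided by its effect on gene expression. The abstract does not state which estimator the authors used, so this choice, the function name, and the first-order standard error are assumptions.

```python
def wald_ratio(beta_outcome, beta_exposure, se_outcome):
    """Single-instrument Wald ratio estimate and approximate standard error.

    beta_exposure: variant effect on the exposure (e.g. gene expression).
    beta_outcome:  variant effect on the outcome trait.
    se_outcome:    standard error of beta_outcome.
    The standard error is a first-order approximation that ignores
    uncertainty in beta_exposure.
    """
    if beta_exposure == 0:
        raise ValueError("the instrument must have a non-zero effect on the exposure")
    estimate = beta_outcome / beta_exposure
    std_error = se_outcome / abs(beta_exposure)
    return estimate, std_error
```

For example, `wald_ratio(0.1, 0.5, 0.02)` gives an estimated effect of about 0.2 trait units per unit of expression, with a standard error of about 0.04.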


2017 ◽  
Vol 37 (suppl_1) ◽  
Author(s):  
Jacqueline S Dron ◽  
Jian Wang ◽  
Cécile Low-Kam ◽  
Sumeet A Khetarpal ◽  
John F Robinson ◽  
...  

Rationale: Although HDL-C levels are known to have a complex genetic basis, most studies have focused solely on identifying rare variants with large phenotypic effects to explain extreme HDL-C phenotypes. Objective: Here we concurrently evaluate the contribution of both rare and common genetic variants, as well as large-scale copy number variations (CNVs), towards extreme HDL-C concentrations. Methods: In clinically ascertained patients with low (N = 136) and high (N = 119) HDL-C profiles, we applied our targeted next-generation sequencing panel (LipidSeq™) to sequence genes involved in HDL metabolism, which were subsequently screened for rare variants and CNVs. We also developed a novel polygenic trait score (PTS) to assess patients’ genetic accumulations of common variants that have been shown by genome-wide association studies to associate primarily with HDL-C levels. Two additional cohorts of patients with extremely low and high HDL-C (total N = 1,746 and N = 1,139, respectively) were used for PTS validation. Results: In the discovery cohort, 32.4% of low HDL-C patients carried rare variants or CNVs in primary (ABCA1, APOA1, LCAT) and secondary (LPL, LMF1, GPD1, APOE) HDL-C–altering genes. Additionally, 13.4% of high HDL-C patients carried rare variants or CNVs in primary (SCARB1, CETP, LIPC, LIPG) and secondary (APOC3, ANGPTL4) HDL-C–altering genes. For polygenic effects, patients with abnormal HDL-C profiles but without rare variants or CNVs were ~2-fold more likely to have an extreme PTS compared to normolipidemic individuals, indicating an increased frequency of common HDL-C–associated variants in these patients. Similar results in the two validation cohorts demonstrate that this novel PTS successfully quantifies common variant accumulation, further characterizing the polygenic basis for extreme HDL-C phenotypes. Conclusions: Patients with extreme HDL-C levels have various combinations of rare variants, common variants, or CNVs driving their phenotypes. Fully characterizing the genetic basis of HDL-C levels must extend to encompass multiple types of genetic determinants, not just rare variants, to further our understanding of this complex, controversial quantitative trait.
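A polygenic score of the kind described above is usually a weighted sum of a patient's allele counts at trait-associated variants. The sketch below shows that general form only. The abstract does not give the exact PTS construction, so the variant identifiers, weights, and the handling of missing genotypes are hypothetical.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages over the scored variants.

    dosages: variant id -> number of effect alleles carried (0, 1 or 2).
    weights: variant id -> per-allele effect size, e.g. from a GWAS.
    Variants missing from `dosages` are counted as 0 here, which is a
    simplifying assumption rather than a recommended imputation strategy.
    """
    return sum(weight * dosages.get(variant, 0) for variant, weight in weights.items())


# Hypothetical variants and weights, for illustration only.
example_weights = {"variant_a": 0.5, "variant_b": -0.25}
example_patient = {"variant_a": 2, "variant_b": 1}
example_score = polygenic_score(example_patient, example_weights)
```

Scores computed this way can be compared against the distribution in a reference group to flag patients with an unusually high or low burden of common variants.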

