A resource of variant effect predictions of single nucleotide variants in model organisms

2018 ◽  
Vol 14 (12) ◽  
Author(s):  
Omar Wagih ◽  
Marco Galardini ◽  
Bede P Busby ◽  
Danish Memon ◽  
Athanasios Typas ◽  
...  
2018 ◽  
Author(s):  
Omar Wagih ◽  
Bede Busby ◽  
Marco Galardini ◽  
Danish Memon ◽  
Athanasios Typas ◽  
...  

The effect of single nucleotide variants (SNVs) in coding and non-coding regions is of great interest in genetics. Although many computational methods aim to elucidate the effects of SNVs on cellular mechanisms, it is not straightforward to comprehensively cover different molecular effects. To address this, we compiled and benchmarked sequence- and structure-based variant effect predictors and analyzed the impact of nearly all possible amino acid and nucleotide variants in the reference genomes of H. sapiens, S. cerevisiae and E. coli. Studied mechanisms include protein stability, interaction interfaces, post-translational modifications and transcription factor binding sites. We apply this resource to the study of natural and disease coding variants. We also show how variant effects can be aggregated to generate protein complex burden scores that uncover protein complex to phenotype associations based on a set of newly generated growth profiles of 93 sequenced S. cerevisiae strains in 43 conditions. This resource is available through mutfunc, a tool by which users can query precomputed predictions by providing amino acid or nucleotide-level variants.


Genes ◽  
2019 ◽  
Vol 10 (9) ◽  
pp. 671 ◽  
Author(s):  
Pucker ◽  
Rückert ◽  
Stracke ◽  
Viehöver ◽  
Kalinowski ◽  
...  

Arabidopsis thaliana is one of the best studied plant model organisms. Besides cultivation in greenhouses, cells of this plant can also be propagated in suspension cell culture. At7 is one such cell line that was established about 25 years ago. Here, we report the sequencing and the analysis of the At7 genome. Large scale duplications and deletions compared to the Columbia-0 (Col-0) reference sequence were detected. The number of deletions exceeds the number of insertions, thus indicating that a haploid genome size reduction is ongoing. Patterns of small sequence variants differ from the ones observed between A. thaliana accessions, e.g., the number of single nucleotide variants matches the number of insertions/deletions. RNA-Seq analysis reveals that disrupted alleles are less frequent in the transcriptome than the native ones.


Genes ◽  
2020 ◽  
Vol 11 (9) ◽  
pp. 1076
Author(s):  
Victor Jaravine ◽  
James Balmford ◽  
Patrick Metzger ◽  
Melanie Boerries ◽  
Harald Binder ◽  
...  

A novel approach is developed to address the challenge of annotating with phenotypic effects those exome variants for which relevant empirical data are lacking or minimal. The predictive annotation method is implemented as a stacked ensemble of supervised base-learners, including distributed random forest and gradient boosting machines. Ensemble models were trained and cross-validated on evidence-based categorical variant effect annotations from the ClinVar database, and were applied to 84 million non-synonymous single nucleotide variants (SNVs). The consensus model combined 39 functional mutation impacts, a cross-species conservation score, and a gene indispensability score. The indispensability score, accounting for differences in variant pathogenicities including in essential and mutation-tolerant genes, considerably improved the predictions. The consensus combination is consistent with as many input scores as possible while minimizing false predictions. The input scores are ranked based on their ability to predict effects. The score rankings and categorical phenotypic variant effect predictions are intended for direct use in clinical and biological applications to prioritize human exome variants and mutations.


2019 ◽  
Author(s):  
Boas Pucker ◽  
Christian Rückert ◽  
Ralf Stracke ◽  
Prisca Viehöver ◽  
Jörn Kalinowski ◽  
...  

Arabidopsis thaliana is one of the best studied plant model organisms. Besides cultivation in greenhouses, cells of this plant can also be propagated in suspension cell culture. At7 is one such cell line that was established about 25 years ago. Here, we report the sequencing and the analysis of the At7 genome. Large scale duplications and deletions compared to the Col-0 reference sequence were detected. The number of deletions exceeds the number of insertions, thus indicating that a haploid genome size reduction is ongoing. Patterns of small sequence variants differ from the ones observed between A. thaliana accessions, e.g., the number of single nucleotide variants matches the number of insertions/deletions. RNA-Seq analysis reveals that disrupted alleles are less frequent in the transcriptome than the native ones.


2014 ◽  
Author(s):  
Maximilian Press ◽  
Keisha D. Carlson ◽  
Christine Queitsch

Short tandem repeat (STR) variation has been proposed as a major explanatory factor in the heritability of complex traits in humans and model organisms. However, we still struggle to incorporate STR variation into genotype-phenotype maps. Here, we review the promise of STRs in contributing to complex trait heritability, and highlight the challenges that STRs pose due to their repetitive nature. We argue that STR variants are more likely than single nucleotide variants to have epistatic interactions, reiterate the need for targeted assays to accurately genotype STRs, and call for more appropriate statistical methods in detecting STR-phenotype associations. Lastly, somatic STR variation within individuals may serve as a read-out of disease susceptibility, and is thus potentially a valuable covariate for future association studies.


2020 ◽  
Vol 21 (11) ◽  
pp. 1068-1077
Author(s):  
Xiaochao Sun ◽  
Bin Yang ◽  
Qunye Zhang

Many studies have shown that the spatial distribution of genes within a single chromosome exhibits distinct patterns. However, little is known about the characteristics of the inter-chromosomal distribution of genes (including protein-coding genes, processed transcripts and pseudogenes) in different genomes. In this study, we explored these issues using the available genomic data of both human and model organisms. We also analyzed the distribution pattern of protein-coding genes that have been associated with 14 common diseases, and of the insertion/deletion mutations and single nucleotide polymorphisms detected by whole genome sequencing in an acute promyelocytic leukemia patient. We obtained the following novel findings. Firstly, the inter-chromosomal distribution of genes displays a nonstochastic pattern and the gene densities in different chromosomes are heterogeneous. This kind of heterogeneity is observed in genomes of both lower and higher species. Secondly, protein-coding genes involved in certain biological processes tend to be enriched in one or a few chromosomes. Our findings add new insights into the spatial distribution of genomic and disease-related genes across chromosomes. These results could be useful in improving the efficiency of disease-associated gene screening studies by targeting specific chromosomes.
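As a rough illustration of what "heterogeneous gene densities" means in practice, the sketch below compares observed gene counts per chromosome with the counts expected if genes were spread in proportion to chromosome length, and sums the deviations into a chi-square statistic. The abstract does not say which statistical test the authors used, so the chi-square goodness-of-fit approach, the function name `gene_density_chi_square`, and the chromosome names and numbers are all assumptions made for illustration only.

```python
def gene_density_chi_square(gene_counts, chrom_lengths_mb):
    """Chi-square statistic for gene counts vs. a length-proportional expectation.

    gene_counts: dict mapping chromosome name -> observed number of genes
    chrom_lengths_mb: dict mapping chromosome name -> length in megabases
    """
    total_genes = sum(gene_counts.values())
    total_length = sum(chrom_lengths_mb[c] for c in gene_counts)
    chi2 = 0.0
    for chrom, observed in gene_counts.items():
        # Expected count if gene density were identical on every chromosome
        expected = total_genes * chrom_lengths_mb[chrom] / total_length
        chi2 += (observed - expected) ** 2 / expected
    return chi2


# Hypothetical toy genome (not data from the study): two chromosomes of
# equal length with very different gene counts.
counts = {"chrA": 100, "chrB": 300}
lengths = {"chrA": 50.0, "chrB": 50.0}
print(gene_density_chi_square(counts, lengths))  # 100.0
```

A large statistic relative to a chi-square distribution with (number of chromosomes - 1) degrees of freedom would suggest that gene density is not uniform across chromosomes; a real analysis would also need to handle gene-type categories and multiple testing.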


Author(s):  
Renata Parissi Buainain ◽  
Matheus Negri Boschiero ◽  
Bruno Camporeze ◽  
Paulo Henrique Pires de Aguiar ◽  
Fernando Augusto Lima Marson ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
pp. 33
Author(s):  
Nayoung Han ◽  
Jung Mi Oh ◽  
In-Wha Kim

For predicting phenotypes and executing precision medicine, combined analysis of single nucleotide variant (SNV) genotypes and copy number variations (CNVs) is required. The aim of this study was to discover SNVs or common CNVs and examine the combined frequencies of SNVs and CNVs in pharmacogenes using the Korean Genome and Epidemiology Study (KoGES), a consortium project. The genotypes (N = 72,299) and CNV data (N = 1000) were provided by the Korean National Institute of Health, Korea Centers for Disease Control and Prevention. The allele frequencies of SNVs, CNVs, and combined SNVs with CNVs were calculated and haplotype analysis was performed. CYP2D6 rs1065852 (c.100C>T, p.P34S) was the most common variant allele (48.23%). A total of 8454 haplotype blocks in 18 pharmacogenes were estimated. DMD ranked the highest in frequency for gene gain (64.52%), while TPMT ranked the highest in frequency for gene loss (51.80%). Copy number gain of CYP4F2 was observed in 22 subjects; 13 of those subjects were carriers with CYP4F2*3 gain. In the case of TPMT, approximately one-half of the participants (N = 308) had loss of the TPMT*1*1 diplotype. The frequencies of SNVs and CNVs in pharmacogenes were determined using the Korean cohort-based genome-wide association study.
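The allele frequencies mentioned above follow from genotype counts by a standard calculation for a biallelic variant in diploid subjects: each heterozygote contributes one variant allele and each variant homozygote contributes two, out of two alleles per subject. The sketch below shows that arithmetic; the function name `allele_frequency` and the counts are hypothetical and are not taken from the KoGES data.

```python
def allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Variant allele frequency for a biallelic site in diploid subjects."""
    n_subjects = n_hom_ref + n_het + n_hom_alt
    if n_subjects == 0:
        raise ValueError("no genotyped subjects")
    # Heterozygotes carry one variant allele, variant homozygotes carry two
    variant_alleles = n_het + 2 * n_hom_alt
    return variant_alleles / (2 * n_subjects)


# Hypothetical genotype counts for illustration only
freq = allele_frequency(n_hom_ref=30, n_het=50, n_hom_alt=20)
print(f"{freq:.2%}")  # 45.00%
```

This covers SNV genotypes only; combining SNVs with copy number gain or loss, as the study does, would require counting alleles per gene copy rather than assuming two copies per subject.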


Author(s):  
Pauline Arnaud ◽  
Hélène Morel ◽  
Olivier Milleron ◽  
Laurent Gouya ◽  
Christine Francannet ◽  
...  

Purpose: Individuals with mosaic pathogenic variants in the FBN1 gene are mainly described in the course of familial screening. In the literature, almost all these mosaic individuals are asymptomatic. In this study, we report the experience of our team on more than 5,000 Marfan syndrome (MFS) probands. Methods: Next-generation sequencing (NGS) capture technology allowed us to identify five cases of MFS probands who harbored a mosaic pathogenic variant in the FBN1 gene. Results: These five sporadic mosaic probands displayed classical features usually seen in Marfan syndrome. Combined with the results of the literature, these rare findings concerned both single-nucleotide variants and copy-number variations. Conclusion: This underestimated finding should not be overlooked in the molecular diagnosis of MFS patients and warrants an adaptation of the parameters used in bioinformatics analyses. The five present cases of symptomatic MFS probands harboring a mosaic FBN1 pathogenic variant reinforce the fact that apparently asymptomatic mosaic parents should have a complete clinical examination and a regular cardiovascular follow-up. We advise that individuals with a typical MFS for whom no single-nucleotide pathogenic variant or exon deletion/duplication was identified should be tested by NGS capture panel with an adapted variant calling analysis.

