Global Analysis of Human mRNA Folding Disruptions in Synonymous Variants Demonstrates Significant Population Constraint

2019 ◽  
Author(s):  
Jeffrey B.S. Gaither ◽  
Grant E. Lammi ◽  
James L. Li ◽  
David M. Gordon ◽  
Harkness C. Kuck ◽  
...  

Abstract Background In most organisms the structure of an mRNA molecule is crucial in determining speed of translation, half-life, splicing propensities and final protein configuration. Synonymous variants which distort this wildtype mRNA structure may be pathogenic as a consequence. However, current clinical guidelines classify synonymous or “silent” single nucleotide variants (sSNVs) as largely benign unless a role in RNA splicing can be demonstrated. Results We developed novel software to conduct a global transcriptome study in which RNA folding statistics were computed for 469 million SNVs in 45,800 transcripts using an Apache Spark implementation of ViennaRNA in the cloud. Focusing our analysis on the subset of 17.9 million sSNVs, we discover that variants predicted to disrupt mRNA structure have lower rates of incidence in the human population. Given that the community lacks tools to evaluate the potential pathogenic impact of sSNVs, we introduce a “Structural Predictivity Index” (SPI) to quantify this constraint due to mRNA structure. Conclusions Our findings support the hypothesis that sSNVs may play a role in genetic disorders due to their effects on mRNA structure. Our RNA-folding scores provide a means of gauging the structural constraint operating on any sSNV in the human genome. Given that the majority of patients with rare or as yet to be diagnosed disease lack a molecular diagnosis, these scores have the potential to enable discovery of novel genetic etiologies. Our RNA Stability Pipeline as well as ViennaRNA structural metrics and SPI scores for all human synonymous variants can be downloaded from GitHub at https://github.com/nch-igm/rna-stability.

GigaScience ◽  
2021 ◽  
Vol 10 (4) ◽  
Author(s):  
Jeffrey B S Gaither ◽  
Grant E Lammi ◽  
James L Li ◽  
David M Gordon ◽  
Harkness C Kuck ◽  
...  

Abstract Background The role of synonymous single-nucleotide variants in human health and disease is poorly understood, yet evidence suggests that this class of “silent” genetic variation plays multiple regulatory roles in both transcription and translation. One mechanism by which synonymous codons direct and modulate the translational process is through alteration of the elaborate structure formed by single-stranded mRNA molecules. While tools to computationally predict the effect of non-synonymous variants on protein structure are plentiful, analogous tools to systematically assess how synonymous variants might disrupt mRNA structure are lacking. Results We developed novel software using a parallel processing framework for large-scale generation of secondary RNA structures and folding statistics for the transcriptome of any species. Focusing our analysis on the human transcriptome, we calculated 5 billion RNA-folding statistics for 469 million single-nucleotide variants in 45,800 transcripts. By considering the impact of all possible synonymous variants globally, we discover that synonymous variants predicted to disrupt mRNA structure have significantly lower rates of incidence in the human population. Conclusions These findings support the hypothesis that synonymous variants may play a role in genetic disorders due to their effects on mRNA structure. To evaluate the potential pathogenic impact of synonymous variants, we provide RNA stability, edge distance, and diversity metrics for every nucleotide in the human transcriptome and introduce a “Structural Predictivity Index” (SPI) to quantify structural constraint operating on any synonymous variant. Because no single RNA-folding metric can capture the diversity of mechanisms by which a variant could alter secondary mRNA structure, we generated a SUmmarized RNA Folding (SURF) metric to provide a single measurement to predict the impact of secondary structure altering variants in human genetic studies.
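The abstract above refers to "all possible synonymous variants" of a transcript. As a rough illustration of what that set looks like, the following minimal Python sketch enumerates every single-nucleotide change in a coding sequence that leaves the encoded amino acid unchanged under the standard genetic code. The function name `synonymous_snvs` and the 0-based position convention are assumptions made for this example. It is not the authors' pipeline, and it does not compute any RNA-folding statistic.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_snvs(cds):
    """Return (position, ref, alt) for every synonymous SNV in a coding sequence.

    Hypothetical helper for illustration: positions are 0-based offsets into
    the DNA coding sequence, which must start in frame and contain only A/C/G/T.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    variants = []
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3]
        ref_aa = CODON_TABLE[codon]
        for offset in range(3):
            for alt in BASES:
                if alt == codon[offset]:
                    continue  # not a variant
                mutant = codon[:offset] + alt + codon[offset + 1:]
                # Keep the change only if the amino acid is preserved.
                if CODON_TABLE[mutant] == ref_aa:
                    variants.append((start + offset, codon[offset], alt))
    return variants


if __name__ == "__main__":
    for pos, ref, alt in synonymous_snvs("ATGGCTCTG"):
        print(pos, ref, alt)
```

In a study of the kind described, each variant produced this way would then be scored by comparing predicted structures of the wildtype and mutant transcripts; that step depends on an RNA-folding library and is omitted here.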


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Hai Lin ◽  
Katherine A. Hargreaves ◽  
Rudong Li ◽  
Jill L. Reiter ◽  
Yue Wang ◽  
...  

Abstract Single nucleotide variants (SNVs) in intronic regions have yet to be systematically investigated for their disease-causing potential. Using known pathogenic and neutral intronic SNVs (iSNVs) as training data, we develop the RegSNPs-intron algorithm based on a random forest classifier that integrates RNA splicing, protein structure, and evolutionary conservation features. RegSNPs-intron showed excellent performance in evaluating the pathogenic impacts of iSNVs. Using a high-throughput functional reporter assay called ASSET-seq (ASsay for Splicing using ExonTrap and sequencing), we evaluate the impact of RegSNPs-intron predictions on splicing outcome. Together, RegSNPs-intron and ASSET-seq enable effective prioritization of iSNVs for disease pathogenesis.


2017 ◽  
Author(s):  
Michael A. Lodato ◽  
Rachel E. Rodin ◽  
Craig L. Bohrson ◽  
Michael E. Coulter ◽  
Alison R. Barton ◽  
...  

Summary It has long been hypothesized that aging and neurodegeneration are associated with somatic mutation in neurons; however, methodological hurdles have prevented testing this hypothesis directly. We used single-cell whole-genome sequencing to perform genome-wide somatic single-nucleotide variant (sSNV) identification on DNA from 161 single neurons from the prefrontal cortex and hippocampus of fifteen normal individuals (aged 4 months to 82 years) as well as nine individuals affected by early-onset neurodegeneration due to genetic disorders of DNA repair (Cockayne syndrome and Xeroderma pigmentosum). sSNVs increased approximately linearly with age in both areas (with a higher rate in hippocampus) and were more abundant in neurodegenerative disease. The accumulation of somatic mutations with age, which we term genosenium, shows age-related, region-related, and disease-related molecular signatures, and may be important in other human age-associated conditions. One-Sentence Summary Somatic single-nucleotide variants accumulate in human neurons in aging with regional specificity and in progeroid diseases.


Author(s):  
Sathyavikasini K ◽  
Vijaya M S

Muscular dystrophy is a rare genetic disorder of the muscular system that deteriorates the skeletal muscles and hinders locomotion. Genetic disorders such as muscular dystrophy are identified based on mutations in the gene sequence. A new model is proposed for classifying the disease accurately using gene sequences, mutated by adopting positional cloning on the reference cDNA sequence. The features of mutated gene sequences for missense, nonsense and silent mutations aim to distinguish the type of disease, and the classifiers are trained with commonly used supervised pattern learning techniques. 10-fold cross-validation results show that the decision tree algorithm attained the best accuracy of 100%. In summary, this study provides an automatic model to classify muscular dystrophy and sheds new light on predicting the genetic disorder from gene-based features through a pattern recognition model.


Cells ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 2409
Author(s):  
Riuko Ohashi ◽  
Hajime Umezu ◽  
Ayako Sato ◽  
Tatsuya Abé ◽  
Shuhei Kondo ◽  
...  

Ribosomal RNA (rRNA), the most abundant non-coding RNA species, is a major component of the ribosome. Impaired ribosome biogenesis causes the dysfunction of protein synthesis and diseases called “ribosomopathies,” including genetic disorders with cancer risk. However, the potential role of rRNA gene (rDNA) alterations in cancer is unknown. We investigated germline and somatic single-nucleotide variants (SNVs) in the rDNA promoter region (positions −248 to +100, relative to the transcription start site) in 82 lung adenocarcinomas (LUAC). Twenty-nine tumors (35.4%) carried germline SNVs, and eight tumors (9.8%) harbored somatic SNVs. Interestingly, the presence of germline SNVs between positions +1 and +100 (n = 12; 14.6%) was associated with significantly shorter recurrence-free survival (RFS) and overall survival (OS) by univariate analysis (p < 0.05, respectively), and was an independent prognostic factor for RFS and OS by multivariate analysis. LUAC cell line PC9, carrying rDNA promoter SNV at position +49, showed significantly higher ribosome biogenesis than H1650 cells without SNV. Upon nucleolar stress induced by actinomycin D, PC9 retained significantly higher ribosome biogenesis than H1650. These results highlight the possible functional role of SNVs at specific sites of the rDNA promoter region in ribosome biogenesis, the progression of LUAC, and their potential prognostic value.


2019 ◽  
Author(s):  
Hai Lin ◽  
Katherine A. Hargreaves ◽  
Rudong Li ◽  
Jill L. Reiter ◽  
Matthew Mort ◽  
...  

Abstract A large number of single nucleotide variants (SNVs) in the human genome are known to be responsible for inherited disease. An even larger number of SNVs, particularly those located in introns, have yet to be investigated for their pathogenic potential. Using known pathogenic and neutral intronic SNVs (iSNVs), we developed the regSNPs-intron algorithm based on a random forest classifier that integrates RNA splicing, protein structure and evolutionary conservation features. regSNPs-intron showed high accuracy in computing disease-causing probabilities of iSNVs. Using a high-throughput functional reporter assay called ASSET-seq (ASsay for Splicing using ExonTrap and sequencing), we validated regSNPs-intron predictions by measuring the impact of iSNVs on splicing outcome. Together, regSNPs-intron and ASSET-seq enable effective prioritization of iSNVs for disease pathogenesis. regSNPs-intron is available at https://regsnps-intron.ccbb.iupui.edu.


Author(s):  
Shengyu Xie ◽  
Xiangyou Leng ◽  
Dachang Tao ◽  
Yangwei Zhang ◽  
Zhaokun Wang ◽  
...  

Author(s):  
Renata Parissi Buainain ◽  
Matheus Negri Boschiero ◽  
Bruno Camporeze ◽  
Paulo Henrique Pires de Aguiar ◽  
Fernando Augusto Lima Marson ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
pp. 33
Author(s):  
Nayoung Han ◽  
Jung Mi Oh ◽  
In-Wha Kim

For predicting phenotypes and executing precision medicine, combined analysis of single nucleotide variant (SNV) genotyping with copy number variations (CNVs) is required. The aim of this study was to discover SNVs or common CNVs and examine the combined frequencies of SNVs and CNVs in pharmacogenes using the Korean Genome and Epidemiology Study (KoGES), a consortium project. The genotypes (N = 72,299) and CNV data (N = 1000) were provided by the Korean National Institute of Health, Korea Centers for Disease Control and Prevention. The allele frequencies of SNVs, CNVs, and combined SNVs with CNVs were calculated and haplotype analysis was performed. CYP2D6 rs1065852 (c.100C>T, p.P34S) was the most common variant allele (48.23%). A total of 8454 haplotype blocks in 18 pharmacogenes were estimated. DMD ranked the highest in frequency for gene gain (64.52%), while TPMT ranked the highest in frequency for gene loss (51.80%). Copy number gain of CYP4F2 was observed in 22 subjects; 13 of those subjects were carriers with CYP4F2*3 gain. In the case of TPMT, approximately one-half of the participants (N = 308) had loss of the TPMT*1*1 diplotype. The frequencies of SNVs and CNVs in pharmacogenes were determined using the Korean cohort-based genome-wide association study.

