Global Analysis of Human mRNA Folding Disruptions in Synonymous Variants Demonstrates Significant Population Constraint

Mapping Intimacies ◽

10.1101/712679 ◽

2019 ◽

Author(s):

Jeffrey B.S. Gaither ◽

Grant E. Lammi ◽

James L. Li ◽

David M. Gordon ◽

Harkness C. Kuck ◽

...

Keyword(s):

Rna Splicing ◽

Rna Folding ◽

Global Analysis ◽

Genetic Disorders ◽

Rna Stability ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Mrna Structure ◽

Structural Metrics ◽

Protein Configuration

Abstract. Background: In most organisms the structure of an mRNA molecule is crucial in determining speed of translation, half-life, splicing propensities and final protein configuration. Synonymous variants that distort this wild-type mRNA structure may be pathogenic as a consequence. However, current clinical guidelines classify synonymous or “silent” single nucleotide variants (sSNVs) as largely benign unless a role in RNA splicing can be demonstrated. Results: We developed novel software to conduct a global transcriptome study in which RNA folding statistics were computed for 469 million SNVs in 45,800 transcripts using an Apache Spark implementation of ViennaRNA in the cloud. Focusing our analysis on the subset of 17.9 million sSNVs, we discover that variants predicted to disrupt mRNA structure have lower rates of incidence in the human population. Given that the community lacks tools to evaluate the potential pathogenic impact of sSNVs, we introduce a “Structural Predictivity Index” (SPI) to quantify this constraint due to mRNA structure. Conclusions: Our findings support the hypothesis that sSNVs may play a role in genetic disorders due to their effects on mRNA structure. Our RNA-folding scores provide a means of gauging the structural constraint operating on any sSNV in the human genome. Given that the majority of patients with rare or as-yet-undiagnosed disease lack a molecular diagnosis, these scores have the potential to enable discovery of novel genetic etiologies. Our RNA Stability Pipeline, as well as ViennaRNA structural metrics and SPI scores for all human synonymous variants, can be downloaded from GitHub: https://github.com/nch-igm/rna-stability.
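As a rough illustration of what "synonymous SNV" means in the abstract above, the following sketch enumerates every single-base substitution in a coding sequence that leaves the encoded amino acid unchanged. It is not the authors' pipeline: the function name `synonymous_snvs` and the example sequences are hypothetical, only the standard genetic code is assumed, and the RNA-folding step (ViennaRNA on Apache Spark) is not reproduced here.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... to match product(BASES, repeat=3).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_snvs(cds):
    """Yield (position, ref, alt) for each single-base change that keeps the amino acid.

    `cds` is a DNA coding sequence whose length is a multiple of 3;
    positions are 0-based offsets into `cds`.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3]
        for offset in range(3):
            for alt in BASES:
                if alt == codon[offset]:
                    continue  # not a variant
                mutant = codon[:offset] + alt + codon[offset + 1:]
                if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                    yield (start + offset, codon[offset], alt)


# Hypothetical toy input: ATG (Met) has no synonymous changes,
# GCT (Ala) tolerates any change at its third base.
variants = list(synonymous_snvs("ATGGCT"))
```

In a study like the one described, each such variant would then be passed to an RNA-folding tool to compare wild-type and mutant structure; that step is deliberately left out of this sketch.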

Synonymous variants that disrupt messenger RNA structure are significantly constrained in the human population

GigaScience ◽

10.1093/gigascience/giab023 ◽

2021 ◽

Vol 10 (4) ◽

Author(s):

Jeffrey B S Gaither ◽

Grant E Lammi ◽

James L Li ◽

David M Gordon ◽

Harkness C Kuck ◽

...

Keyword(s):

Messenger Rna ◽

Human Population ◽

Large Scale ◽

Rna Folding ◽

Rna Stability ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Human Transcriptome ◽

Mrna Structure ◽

The Impact

Abstract. Background: The role of synonymous single-nucleotide variants in human health and disease is poorly understood, yet evidence suggests that this class of “silent” genetic variation plays multiple regulatory roles in both transcription and translation. One mechanism by which synonymous codons direct and modulate the translational process is through alteration of the elaborate structure formed by single-stranded mRNA molecules. While tools to computationally predict the effect of non-synonymous variants on protein structure are plentiful, analogous tools to systematically assess how synonymous variants might disrupt mRNA structure are lacking. Results: We developed novel software using a parallel processing framework for large-scale generation of secondary RNA structures and folding statistics for the transcriptome of any species. Focusing our analysis on the human transcriptome, we calculated 5 billion RNA-folding statistics for 469 million single-nucleotide variants in 45,800 transcripts. By considering the impact of all possible synonymous variants globally, we discover that synonymous variants predicted to disrupt mRNA structure have significantly lower rates of incidence in the human population. Conclusions: These findings support the hypothesis that synonymous variants may play a role in genetic disorders due to their effects on mRNA structure. To evaluate the potential pathogenic impact of synonymous variants, we provide RNA stability, edge distance, and diversity metrics for every nucleotide in the human transcriptome and introduce a “Structural Predictivity Index” (SPI) to quantify structural constraint operating on any synonymous variant. Because no single RNA-folding metric can capture the diversity of mechanisms by which a variant could alter secondary mRNA structure, we generated a SUmmarized RNA Folding (SURF) metric to provide a single measurement to predict the impact of secondary-structure-altering variants in human genetic studies.

RegSNPs-intron: a computational framework for predicting pathogenic impact of intronic single nucleotide variants

Genome Biology ◽

10.1186/s13059-019-1847-4 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 5

Author(s):

Hai Lin ◽

Katherine A. Hargreaves ◽

Rudong Li ◽

Jill L. Reiter ◽

Yue Wang ◽

...

Keyword(s):

Rna Splicing ◽

Evolutionary Conservation ◽

Random Forest Classifier ◽

Training Data ◽

Reporter Assay ◽

Single Nucleotide Variants ◽

Excellent Performance ◽

Computational Framework ◽

Single Nucleotide ◽

The Impact

Abstract: Single nucleotide variants (SNVs) in intronic regions have yet to be systematically investigated for their disease-causing potential. Using known pathogenic and neutral intronic SNVs (iSNVs) as training data, we develop the RegSNPs-intron algorithm based on a random forest classifier that integrates RNA splicing, protein structure, and evolutionary conservation features. RegSNPs-intron showed excellent performance in evaluating the pathogenic impacts of iSNVs. Using a high-throughput functional reporter assay called ASSET-seq (ASsay for Splicing using ExonTrap and sequencing), we evaluate the impact of RegSNPs-intron predictions on splicing outcome. Together, RegSNPs-intron and ASSET-seq enable effective prioritization of iSNVs for disease pathogenesis.

Aging and neurodegeneration are associated with increased mutations in single human neurons

10.1101/221960 ◽

2017 ◽

Cited By ~ 4

Author(s):

Michael A. Lodato ◽

Rachel E. Rodin ◽

Craig L. Bohrson ◽

Michael E. Coulter ◽

Alison R. Barton ◽

...

Keyword(s):

Somatic Mutations ◽

Genetic Disorders ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Age Related ◽

Genome Wide ◽

Normal Individuals ◽

Human Neurons ◽

Regional Specificity ◽

Associated Conditions

Summary: It has long been hypothesized that aging and neurodegeneration are associated with somatic mutation in neurons; however, methodological hurdles have prevented testing this hypothesis directly. We used single-cell whole-genome sequencing to perform genome-wide somatic single-nucleotide variant (sSNV) identification on DNA from 161 single neurons from the prefrontal cortex and hippocampus of fifteen normal individuals (aged 4 months to 82 years) as well as nine individuals affected by early-onset neurodegeneration due to genetic disorders of DNA repair (Cockayne syndrome and Xeroderma pigmentosum). sSNVs increased approximately linearly with age in both areas (with a higher rate in hippocampus) and were more abundant in neurodegenerative disease. The accumulation of somatic mutations with age, which we term genosenium, shows age-related, region-related, and disease-related molecular signatures, and may be important in other human age-associated conditions. One-Sentence Summary: Somatic single-nucleotide variants accumulate in human neurons in aging with regional specificity and in progeroid diseases.

Identification of Rare Genetic Disorder from Single Nucleotide Variants Using Supervised Learning Technique

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v6.i4.pp174-184 ◽

2017 ◽

Vol 6 (4) ◽

pp. 174

Author(s):

Sathyavikasini K ◽

Vijaya M S

Keyword(s):

Muscular Dystrophy ◽

Positional Cloning ◽

Genetic Disorders ◽

Genetic Disorder ◽

Gene Sequences ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Rare Genetic Disorder ◽

Learning Techniques ◽

Fold Cross Validation

Muscular dystrophy is a rare genetic disorder of the muscular system that deteriorates the skeletal muscles and hinders locomotion. Genetic disorders such as muscular dystrophy are identified based on mutations in the gene sequence. A new model is proposed for classifying the disease accurately using gene sequences that are mutated by adopting positional cloning on the reference cDNA sequence. Features of the mutated gene sequences for missense, nonsense and silent mutations are used to distinguish the type of disease, and the classifiers are trained with commonly used supervised pattern learning techniques. 10-fold cross-validation results show that the decision tree algorithm attained the best accuracy, 100%. In summary, this study provides an automatic model to classify muscular dystrophy and sheds new light on predicting the genetic disorder from gene-based features through a pattern recognition model.
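For readers unfamiliar with the evaluation scheme named in the abstract above, here is a minimal sketch of how k-fold cross-validation partitions a dataset. It is an illustration only, not the authors' code: the function name `k_fold_indices` and the sample count are hypothetical, and the sketch splits indices in order without the shuffling or stratification that a real evaluation would normally add.

```python
def k_fold_indices(n_samples, k=10):
    """Return a list of (train_indices, test_indices) pairs for k-fold cross-validation.

    Every sample index appears in exactly one test fold; fold sizes differ
    by at most one when n_samples is not divisible by k.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


# Hypothetical dataset of 25 sequences split into 10 folds.
folds = k_fold_indices(25, k=10)
```

A classifier would be trained on each `train` set and scored on the matching `test` set, and the reported accuracy would be the average over the ten folds.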

Frequent Germline and Somatic Single Nucleotide Variants in the Promoter Region of the Ribosomal RNA Gene in Japanese Lung Adenocarcinoma Patients

Cells ◽

10.3390/cells9112409 ◽

2020 ◽

Vol 9 (11) ◽

pp. 2409

Author(s):

Riuko Ohashi ◽

Hajime Umezu ◽

Ayako Sato ◽

Tatsuya Abé ◽

Shuhei Kondo ◽

...

Keyword(s):

Ribosomal Rna ◽

Promoter Region ◽

Ribosome Biogenesis ◽

Genetic Disorders ◽

Univariate Analysis ◽

Rrna Gene ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Lung Adenocarcinomas

Ribosomal RNA (rRNA), the most abundant non-coding RNA species, is a major component of the ribosome. Impaired ribosome biogenesis causes the dysfunction of protein synthesis and diseases called “ribosomopathies,” including genetic disorders with cancer risk. However, the potential role of rRNA gene (rDNA) alterations in cancer is unknown. We investigated germline and somatic single-nucleotide variants (SNVs) in the rDNA promoter region (positions −248 to +100, relative to the transcription start site) in 82 lung adenocarcinomas (LUAC). Twenty-nine tumors (35.4%) carried germline SNVs, and eight tumors (9.8%) harbored somatic SNVs. Interestingly, the presence of germline SNVs between positions +1 and +100 (n = 12; 14.6%) was associated with significantly shorter recurrence-free survival (RFS) and overall survival (OS) by univariate analysis (p < 0.05, respectively), and was an independent prognostic factor for RFS and OS by multivariate analysis. LUAC cell line PC9, carrying rDNA promoter SNV at position +49, showed significantly higher ribosome biogenesis than H1650 cells without SNV. Upon nucleolar stress induced by actinomycin D, PC9 retained significantly higher ribosome biogenesis than H1650. These results highlight the possible functional role of SNVs at specific sites of the rDNA promoter region in ribosome biogenesis, the progression of LUAC, and their potential prognostic value.

RegSNPs-Intron: A computational framework for prioritizing Intronic Single Nucleotide Variants in Human Genetic Disease

10.1101/515171 ◽

2019 ◽

Cited By ~ 1

Author(s):

Hai Lin ◽

Katherine A. Hargreaves ◽

Rudong Li ◽

Jill L. Reiter ◽

Matthew Mort ◽

...

Keyword(s):

Rna Splicing ◽

Evolutionary Conservation ◽

Reporter Assay ◽

Inherited Disease ◽

Single Nucleotide Variants ◽

Human Genetic Disease ◽

Computational Framework ◽

Pathogenic Potential ◽

Single Nucleotide ◽

The Impact

Abstract: A large number of single nucleotide variants (SNVs) in the human genome are known to be responsible for inherited disease. An even larger number of SNVs, particularly those located in introns, have yet to be investigated for their pathogenic potential. Using known pathogenic and neutral intronic SNVs (iSNVs), we developed the regSNPs-intron algorithm based on a random forest classifier that integrates RNA splicing, protein structure and evolutionary conservation features. regSNPs-intron showed high accuracy in computing disease-causing probabilities of iSNVs. Using a high-throughput functional reporter assay called ASSET-seq (ASsay for Splicing using ExonTrap and sequencing), we validated regSNPs-intron predictions by measuring the impact of iSNVs on splicing outcome. Together, regSNPs-intron and ASSET-seq enable effective prioritization of iSNVs for disease pathogenesis. regSNPs-intron is available at https://regsnps-intron.ccbb.iupui.edu.

Identification of novel single-nucleotide variants altering RNA splicing of PKD1 and PKD2

Journal of Human Genetics ◽

10.1038/s10038-021-00959-1 ◽

2021 ◽

Author(s):

Shengyu Xie ◽

Xiangyou Leng ◽

Dachang Tao ◽

Yangwei Zhang ◽

Zhaokun Wang ◽

...

Keyword(s):

Rna Splicing ◽

Single Nucleotide Variants ◽

Single Nucleotide

Faculty Opinions recommendation of Phylogenetic and physicochemical analyses enhance the classification of rare nonsynonymous single nucleotide variants in type 1 and 2 long-QT syndrome.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.717960422.793463950 ◽

2012 ◽

Author(s):

Jeffrey Noebels ◽

Tara Klassen

Keyword(s):

Long Qt Syndrome ◽

Single Nucleotide Variants ◽

Long Qt ◽

Single Nucleotide ◽

Qt Syndrome

Single-Nucleotide Variants in microRNAs Sequences or in their Target Genes Might Influence the Risk of Epilepsy: A Review

Cellular and Molecular Neurobiology ◽

10.1007/s10571-021-01058-7 ◽

2021 ◽

Author(s):

Renata Parissi Buainain ◽

Matheus Negri Boschiero ◽

Bruno Camporeze ◽

Paulo Henrique Pires de Aguiar ◽

Fernando Augusto Lima Marson ◽

...

Keyword(s):

Target Genes ◽

Single Nucleotide Variants ◽

Single Nucleotide

Combination of Genome-Wide Polymorphisms and Copy Number Variations of Pharmacogenes in Koreans

Journal of Personalized Medicine ◽

10.3390/jpm11010033 ◽

2021 ◽

Vol 11 (1) ◽

pp. 33

Author(s):

Nayoung Han ◽

Jung Mi Oh ◽

In-Wha Kim

Keyword(s):

Copy Number ◽

Genome Wide Association Study ◽

Copy Number Gain ◽

Copy Number Variations ◽

Gene Gain ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Haplotype Blocks ◽

Genome Wide ◽

Control And Prevention

For predicting phenotypes and executing precision medicine, combination analysis of single nucleotide variants (SNVs) genotyping with copy number variations (CNVs) is required. The aim of this study was to discover SNVs or common copy CNVs and examine the combined frequencies of SNVs and CNVs in pharmacogenes using the Korean genome and epidemiology study (KoGES), a consortium project. The genotypes (N = 72,299) and CNV data (N = 1000) were provided by the Korean National Institute of Health, Korea Centers for Disease Control and Prevention. The allele frequencies of SNVs, CNVs, and combined SNVs with CNVs were calculated and haplotype analysis was performed. CYP2D6 rs1065852 (c.100C>T, p.P34S) was the most common variant allele (48.23%). A total of 8454 haplotype blocks in 18 pharmacogenes were estimated. DMD ranked the highest in frequency for gene gain (64.52%), while TPMT ranked the highest in frequency for gene loss (51.80%). Copy number gain of CYP4F2 was observed in 22 subjects; 13 of those subjects were carriers with CYP4F2*3 gain. In the case of TPMT, approximately one-half of the participants (N = 308) had loss of the TPMT*1*1 diplotype. The frequencies of SNVs and CNVs in pharmacogenes were determined using the Korean cohort-based genome-wide association study.
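The allele frequencies mentioned in the abstract above follow from simple genotype counting. The sketch below shows that arithmetic for a biallelic variant at a diploid autosomal locus; the function name `alt_allele_frequency` and the counts are hypothetical and are not taken from the KoGES data.

```python
def alt_allele_frequency(hom_ref, het, hom_alt):
    """Frequency of the alternate allele at a diploid, biallelic locus.

    Each individual carries two alleles: heterozygotes contribute one
    alternate allele and alternate homozygotes contribute two.
    """
    n_individuals = hom_ref + het + hom_alt
    if n_individuals == 0:
        raise ValueError("at least one genotyped individual is required")
    return (het + 2 * hom_alt) / (2 * n_individuals)


# Hypothetical genotype counts for illustration only.
freq = alt_allele_frequency(hom_ref=25, het=50, hom_alt=25)
```

With these toy counts, 100 alternate alleles out of 200 total give a frequency of 0.5.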
