Exponentially few RNA structures are designable

Mapping Intimacies ◽

10.1101/652313 ◽

2019 ◽

Author(s):

Hua-Ting Yao ◽

Mireille Regnier ◽

Cedric Chauve ◽

Yann Ponty

Keyword(s):

Secondary Structure ◽

Secondary Structures ◽

Rna Structures ◽

Folding Model ◽

Rna Sequences ◽

Rna Sequence ◽

Energy Models ◽

Rna Design ◽

Additional Constraints ◽

Alternative Structure

Abstract: The problem of RNA design attempts to construct RNA sequences that perform a predefined biological function, identified by several additional constraints. One of the foremost objectives of RNA design is that the designed RNA sequence should adopt a predefined target secondary structure preferentially to any alternative structure, according to a given metric and folding model. It was observed in several works that some secondary structures are undesignable, i.e. no RNA sequence can fold into the target structure while satisfying some criterion measuring how preferential this folding is compared to alternative conformations. In this paper, we show that the proportion of designable secondary structures decreases exponentially with the size of the target secondary structure, for various popular combinations of energy models and design objectives. This exponential decay is, at least in part, due to the existence of undesignable motifs, which can be generically constructed and jointly analyzed to yield asymptotic upper bounds on the number of designable structures.
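
The notion of designability can be illustrated on a toy model. The sketch below is not the authors' method: it assumes a simple base-pair-maximization (Nussinov-style) energy model, Watson-Crick plus wobble pairs, and a hypothetical minimum hairpin size `theta`. Under those assumptions a structure counts as designable if some sequence has it as its unique optimum. The function names are illustrative only.

```python
from functools import lru_cache
from itertools import product

# Assumed pairing rules: Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def parse_pairs(structure):
    """Return the set of (i, j) base pairs of a dot-bracket string."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def optimum_and_count(seq, theta=1):
    """Maximum number of base pairs and the number of structures reaching it."""

    @lru_cache(maxsize=None)
    def solve(i, j):
        if i >= j:
            return 0, 1  # empty interval or single base: one structure, no pair
        best, count = solve(i + 1, j)  # case 1: position i is unpaired
        for k in range(i + theta + 1, j + 1):  # case 2: i pairs with k
            if (seq[i], seq[k]) in PAIRS:
                inner_best, inner_count = solve(i + 1, k - 1)
                outer_best, outer_count = solve(k + 1, j)
                score = 1 + inner_best + outer_best
                ways = inner_count * outer_count
                if score > best:
                    best, count = score, ways
                elif score == best:
                    count += ways
        return best, count

    return solve(0, len(seq) - 1)


def is_unique_optimum(seq, structure, theta=1):
    """True if `structure` is the only optimal structure of `seq` in the toy model."""
    pairs = parse_pairs(structure)
    for i, j in pairs:
        if (seq[i], seq[j]) not in PAIRS or j - i <= theta:
            return False
    best, count = optimum_and_count(seq, theta)
    return len(pairs) == best and count == 1


def design(structure, theta=1):
    """Brute-force search for a sequence designing `structure`; None if none exists."""
    for letters in product("ACGU", repeat=len(structure)):
        seq = "".join(letters)
        if is_unique_optimum(seq, structure, theta):
            return seq
    return None
```

The exhaustive search costs 4^n sequence evaluations, so it is only usable for very short targets; it is meant to make the definition concrete rather than to reproduce the asymptotic analysis described in the abstract.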

The evolution of 5S RNA secondary structures

Canadian Journal of Biochemistry ◽

10.1139/o78-068 ◽

1978 ◽

Vol 56 (6) ◽

pp. 440-443 ◽

Cited By ~ 16

Author(s):

David Sankoff ◽

Anne-Marie Morin ◽

R. J. Cedergren

Keyword(s):

Free Energy ◽

Secondary Structure ◽

Base Pair ◽

Resistance To Change ◽

Strong Support ◽

Secondary Structures ◽

Free Energy Calculations ◽

5S Rna ◽

Rna Sequences ◽

Energy Calculations

We have applied the Pipas–McMahon algorithm, which is based on free energy calculations, to the search for a 5S RNA base-pair structure common to all known sequences. We find that a 'Y'-shaped model is consistently among the structures having the lowest free energy using 5S RNA sequences from either eukaryotic or prokaryotic sources. Comparison of this 'Y' structure with recently proposed models shows these models to be remarkably similar, and the minor differences are explicable based on the technique used to obtain the model. That prokaryotic and eukaryotic 5S RNA can adopt a similar secondary structure is strong support for its resistance to change during evolution.

RNAfamProb Plus NeoFold: Estimations of Posterior Probabilities on RNA Structural Alignment and RNA Secondary Structures with Incorporating Homologous-RNA Sequences

10.1101/812891 ◽

2019 ◽

Author(s):

Masaki Tagashira ◽

Kiyoshi Asai

Keyword(s):

Secondary Structure ◽

Sequence Alignment ◽

Structural Alignment ◽

Secondary Structures ◽

Simultaneous Optimization ◽

Supplementary Information ◽

Sequence Alignments ◽

Rna Sequences ◽

Link Type ◽

Rna Structural Alignment

Abstract: Motivation: Simultaneous optimization of the sequence alignment and secondary structures among RNAs, known as structural alignment, is needed to compare functional ncRNAs more appropriately than sequence alignment alone. Pseudo-probabilities on structural alignment, given RNA sequences, are desirable for more accurate secondary structures, sequence alignments, consensus secondary structures, and structural alignments. However, no algorithm has been proposed for these pseudo-probabilities. Results: We invented RNAfamProb, an algorithm for estimating these pseudo-probabilities. We applied these pseudo-probabilities to two biological problems: visualization and maximum-expected-accuracy secondary-structure estimation. The RNAfamProb program, an implementation of this algorithm, together with the NeoFold program, a maximum-expected-accuracy secondary-structure program that uses these pseudo-probabilities, demonstrated better prediction accuracy than three state-of-the-art maximum-expected-accuracy secondary-structure programs, while requiring far longer running times, as expected given the intrinsically higher complexity of structural alignment compared with independent secondary structure prediction and sequence alignment. Both the RNAfamProb and NeoFold programs produce more accurate estimates when homologous RNA sequences are incorporated. Availability: The source code of the two programs is available at “https://github.com/heartsh/rnafamprob” and “https://github.com/heartsh/neofold”. Contact: “[email protected]” and “[email protected]”. Supplementary information: Supplementary data are available at Bioinformatics online.

Mathematical and Biological Modelling of RNA Secondary Structure and Its Effects on Gene Expression

Computational and Mathematical Methods in Medicine ◽

10.1080/10273660600906416 ◽

2006 ◽

Vol 7 (1) ◽

pp. 37-43 ◽

Cited By ~ 2

Author(s):

T. A. Hughes ◽

J. N. McElwaine

Keyword(s):

Gene Expression ◽

Free Energy ◽

Secondary Structure ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Secondary Structures ◽

Minimum Free Energy ◽

Messenger Rnas ◽

Rna Sequences ◽

Translational Machinery

Secondary structures within the 5′ untranslated regions of messenger RNAs can have profound effects on the efficiency of translation of their messages and thereby on gene expression. Consequently they can act as important regulatory motifs in both physiological and pathological settings. Current approaches to predicting the secondary structure of these RNA sequences find the structure with the global-minimum free energy. However, since RNA folds progressively from the 5′ end when synthesised or released from the translational machinery, this may not be the most probable structure. We discuss secondary structure prediction based on local-minimisation of free energy with thermodynamic fluctuations as nucleotides are added to the 3′ end and show that these can result in different secondary structures. We also discuss approaches for studying the extent of the translational inhibition specified by structures within the 5′ untranslated region.

Deep neural networks for inferring binding sites of RNA-binding proteins by using distributed representations of RNA primary sequence and secondary structure

BMC Genomics ◽

10.1186/s12864-020-07239-w ◽

2020 ◽

Vol 21 (S13) ◽

Author(s):

Lei Deng ◽

Youzhi Liu ◽

Yechuan Shi ◽

Wenhao Zhang ◽

Chun Yang ◽

...

Keyword(s):

Neural Networks ◽

Secondary Structure ◽

Binding Sites ◽

Large Scale ◽

Binding Proteins ◽

Rna Binding ◽

Rna Binding Proteins ◽

Secondary Structures ◽

Rna Sequences ◽

Distributed Representations

Abstract: Background: RNA binding proteins (RBPs) play a vital role in post-transcriptional processes in all eukaryotes, such as splicing regulation, mRNA transport, and modulation of mRNA translation and decay. The identification of RBP binding sites is a crucial step in understanding the biological mechanism of post-transcriptional gene regulation. However, the determination of RBP binding sites on a large scale is a challenging task due to the high cost of biochemical assays. A number of studies have exploited machine learning methods to predict binding sites. In particular, deep learning is increasingly used in the bioinformatics field by virtue of its ability to learn generalized representations from DNA and protein sequences. Results: In this paper, we implemented a novel deep neural network model, DeepRKE, which combines primary RNA sequence and secondary structure information to effectively predict RBP binding sites. Specifically, we used a word embedding algorithm to extract features of RNA sequences and secondary structures, i.e., distributed representations of k-mer sequences rather than traditional one-hot encoding. The distributed representations are taken as input to convolutional neural networks (CNN) and bidirectional long short-term memory networks (BiLSTM) to identify RBP binding sites. Our results show that DeepRKE outperforms existing counterpart methods on two large-scale benchmark datasets. Conclusions: Our extensive experimental results show that DeepRKE is an efficacious tool for predicting RBP binding sites. The distributed representations of RNA sequences and secondary structures can effectively detect the latent relationship and similarity between k-mers, and thus improve the predictive performance. The source code of DeepRKE is available at https://github.com/youzhiliu/DeepRKE/.
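
The k-mer step mentioned in the abstract can be pictured with a minimal sketch. It assumes overlapping k-mers with a configurable stride, which is a common way to turn a sequence into "words" for an embedding model; the actual DeepRKE preprocessing may differ, and the function name `kmers` is illustrative only.

```python
def kmers(sequence, k=3, stride=1):
    """Split a sequence into overlapping k-mers (the 'words' fed to an embedding)."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    # Stop at the last window that still fits entirely inside the sequence.
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


# Each k-mer would then be mapped to a dense vector instead of a one-hot column.
tokens = kmers("AUGGCU", k=3)
```

The same tokenization can be applied to a secondary-structure annotation string of equal length, so that sequence and structure yield parallel token streams.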

Placozoa: at least two

Biologia ◽

10.2478/s11756-007-0143-z ◽

2007 ◽

Vol 62 (6) ◽

Cited By ~ 9

Author(s):

Matthias Wolf ◽

Christian Selig ◽

Tobias Müller ◽

Nicole Philippi ◽

Thomas Dandekar ◽

...

Keyword(s):

Secondary Structure ◽

Internal Transcribed Spacer ◽

Distinct Species ◽

Secondary Structures ◽

Its2 Sequence ◽

Structure Alignment ◽

Sequence Structure ◽

Rna Sequence ◽

Internal Transcribed Spacer 2 ◽

Compensatory Base Changes

Abstract: It was shown that compensatory base changes (CBCs) in internal transcribed spacer 2 (ITS2) sequence-structure alignments can be used for distinguishing species. Using the ITS2 Database in combination with 4SALE, a tool for synchronous RNA sequence and secondary structure alignment and editing, in this study we present an in-depth CBC analysis for placozoan ITS2 sequences and their respective secondary structures. This analysis indicates at least two distinct species in Trichoplax (Placozoa), supporting a recently suggested hypothesis that Placozoa is “no longer a phylum of one”.
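
As a rough illustration of what a CBC count involves, the sketch below assumes two gap-free aligned sequences that share one set of paired positions, and counts a CBC wherever both nucleotides of a pair differ between the sequences while pairing is retained in each. It is not the 4SALE implementation; the pairing rules and the name `count_cbcs` are assumptions made for the example.

```python
# Assumed pairing rules: Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def count_cbcs(seq_a, seq_b, paired_positions):
    """Count paired sites where both bases changed but pairing is preserved."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    total = 0
    for i, j in paired_positions:
        pair_a = (seq_a[i], seq_a[j])
        pair_b = (seq_b[i], seq_b[j])
        both_pair = pair_a in PAIRS and pair_b in PAIRS
        both_changed = pair_a[0] != pair_b[0] and pair_a[1] != pair_b[1]
        if both_pair and both_changed:
            total += 1
    return total
```

A site where only one side changes (for example G-C against G-U) is not counted here; such cases are usually treated separately as hemi-CBCs.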

On the Loop Homology of a Certain Complex of RNA Structures

Mathematics ◽

10.3390/math9151749 ◽

2021 ◽

Vol 9 (15) ◽

pp. 1749

Author(s):

Thomas J. X. Li ◽

Christian M. Reidys

Keyword(s):

Euler Characteristic ◽

Homology Group ◽

Secondary Structures ◽

Rna Structures ◽

Sequence Structure ◽

Evolutionary Transitions ◽

Partial Matching ◽

Rna Sequence ◽

Homology Groups ◽

Topological Framework

In this paper, we establish a topological framework of τ-structures to quantify the evolutionary transitions between two RNA sequence–structure pairs. τ-structures developed here consist of a pair of RNA secondary structures together with a non-crossing partial matching between the two backbones. The loop complex of a τ-structure captures the intersections of loops in both secondary structures. We compute the loop homology of τ-structures. We show that only the zeroth, first and second homology groups are free. In particular, we prove that the rank of the second homology group equals the number γ of certain arc-components in a τ-structure and that the rank of the first homology is given by γ−χ+1, where χ is the Euler characteristic of the loop complex.
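
The stated rank of the first homology group is consistent with the Euler–Poincaré formula. The short derivation below is a reading of the abstract rather than the paper's proof: it assumes that the loop complex is connected, so that the zeroth homology group has rank 1, and that all homology groups above degree two vanish. The symbol $b_k$ for the rank of the $k$-th homology group is notation introduced here.

```latex
\begin{align*}
\chi &= b_0 - b_1 + b_2
      && \text{(Euler--Poincar\'e, assuming } b_k = 0 \text{ for } k \ge 3\text{)} \\
     &= 1 - b_1 + \gamma
      && \text{(assuming } b_0 = 1 \text{, and using } b_2 = \gamma\text{)} \\
\Longrightarrow\quad b_1 &= \gamma - \chi + 1 .
\end{align*}
```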

In Vivo and In Vitro Genome-Wide Profiling of RNA Secondary Structures Reveals Key Regulatory Features in Plasmodium falciparum

Frontiers in Cellular and Infection Microbiology ◽

10.3389/fcimb.2021.673966 ◽

2021 ◽

Vol 11 ◽

Author(s):

Yanwei Qi ◽

Yuhong Zhang ◽

Guixing Zheng ◽

Bingxia Chen ◽

Mengxin Zhang ◽

...

Keyword(s):

Secondary Structure ◽

Rna Structure ◽

Rna Secondary Structure ◽

Secondary Structures ◽

Rna Structures ◽

Trophozoite Stage ◽

Rna Secondary Structures ◽

Biological Programme

It is widely accepted that the structure of RNA plays important roles in a number of biological processes, such as polyadenylation, splicing, and catalytic functions. Dynamic changes in RNA structure are able to regulate the gene expression programme and can be used as a highly specific and subtle mechanism for governing cellular processes. However, the nature of most RNA secondary structures in Plasmodium falciparum has not been determined. To investigate the genome-wide RNA secondary structural features at single-nucleotide resolution in P. falciparum, we applied a novel high-throughput method utilizing the chemical modification of RNA structures to characterize these structures. Structural data from parasites are in close agreement with the known 18S ribosomal RNA secondary structures of P. falciparum and can help to predict the in vivo RNA secondary structure of a total of 3,396 transcripts in the ring-stage and trophozoite-stage developmental cycles. By parallel analysis of RNA structures in vivo and in vitro during the Plasmodium parasite ring-stage and trophozoite-stage intraerythrocytic developmental cycles, we identified some key regulatory features. Recent studies have established that the RNA structure is a ubiquitous and fundamental regulator of gene expression. Our study indicates that there is a critical connection between RNA secondary structure and mRNA abundance during the complex biological programme of P. falciparum. This work presents a useful framework and important results, which may facilitate further research investigating the interactions between RNA secondary structure and the complex biological programme in P. falciparum. The RNA secondary structure characterized in this study has potential applications and important implications regarding the identification of RNA structural elements, which are important for parasite infection and elucidating host-parasite interactions and parasites in the environment.

Comparative analysis of coronavirus genomic RNA structure reveals conservation in SARS-like coronaviruses

10.1101/2020.06.15.153197 ◽

2020 ◽

Cited By ~ 10

Author(s):

Wes Sanders ◽

Ethan J. Fritch ◽

Emily A. Madden ◽

Rachel L. Graham ◽

Heather A. Vincent ◽

...

Keyword(s):

Secondary Structure ◽

Rna Structure ◽

Rna Secondary Structure ◽

Molecular Mechanisms ◽

Selective Pressure ◽

Secondary Structures ◽

Rna Structures ◽

Rna Secondary Structures ◽

The Past ◽

Novel Strategy

Abstract: Coronaviruses, including SARS-CoV-2, the etiological agent of COVID-19 disease, have caused multiple epidemic and pandemic outbreaks in the past 20 years [1–3]. With no vaccines, and only recently developed antiviral therapeutics, we are ill equipped to handle coronavirus outbreaks [4]. A better understanding of the molecular mechanisms that regulate coronavirus replication and pathogenesis is needed to guide the development of new antiviral therapeutics and vaccines. RNA secondary structures play critical roles in multiple aspects of coronavirus replication, but the extent and conservation of RNA secondary structure across coronavirus genomes is unknown [5]. Here, we define highly structured RNA regions throughout the MERS-CoV, SARS-CoV, and SARS-CoV-2 genomes. We find that highly stable RNA structures are pervasive throughout coronavirus genomes, and are conserved between the SARS-like CoV. Our data suggest that selective pressure helps preserve RNA secondary structure in coronavirus genomes, indicating that these structures may play important roles in virus replication and pathogenesis. Thus, disruption of conserved RNA secondary structures could be a novel strategy for the generation of attenuated SARS-CoV-2 vaccines for use against the current COVID-19 pandemic.

Predicting pseudoknotted structures across two RNA sequences

Bioinformatics ◽

10.1093/bioinformatics/bts575 ◽

2012 ◽

Vol 28 (23) ◽

pp. 3058-3065 ◽

Cited By ~ 4

Author(s):

Jana Sperschneider ◽

Amitava Datta ◽

Michael J. Wise

Keyword(s):

Secondary Structure ◽

Rna Structure ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Prediction Method ◽

Supplementary Information ◽

Rna Structures ◽

Rna Sequences ◽

Test Set ◽

Comparative Structure

Abstract: Motivation: Laboratory RNA structure determination is demanding and costly, and thus computational structure prediction is an important task. Single-sequence methods for RNA secondary structure prediction are limited by the accuracy of the underlying folding model; if a structure is supported by a family of evolutionarily related sequences, one can be more confident that the prediction is accurate. RNA pseudoknots are functional elements, which have highly conserved structures. However, few comparative structure prediction methods can handle pseudoknots due to the computational complexity. Results: A comparative pseudoknot prediction method called DotKnot-PW is introduced, based on structural comparison of secondary structure elements and H-type pseudoknot candidates. DotKnot-PW outperforms other methods from the literature on a hand-curated test set of RNA structures with experimental support. Availability: DotKnot-PW and the RNA structure test set are available at the web site http://dotknot.csse.uwa.edu.au/pw. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
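
What distinguishes a pseudoknot from an ordinary secondary structure is the presence of crossing base pairs. The sketch below checks for that condition on a list of (i, j) pairs; it is a generic illustration, not part of DotKnot-PW, and the function name `has_crossing_pairs` is made up for the example.

```python
from itertools import combinations


def has_crossing_pairs(pairs):
    """True if two base pairs (i, j) and (k, l) interleave as i < k < j < l."""
    # Normalize each pair so the smaller index comes first, then sort by it.
    ordered = sorted((min(a, b), max(a, b)) for a, b in pairs)
    for (i, j), (k, l) in combinations(ordered, 2):
        if i < k < j < l:
            return True
    return False
```

The pairwise scan is quadratic in the number of base pairs, which is adequate for illustration; nested pairs such as (0, 5) and (1, 4) pass, while interleaved pairs such as (0, 5) and (3, 8) are flagged.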

SHAPE MATTERS: EFFECT OF POINT MUTATIONS ON RNA SECONDARY STRUCTURE

Advances in Complex Systems ◽

10.1142/s021952591250052x ◽

2013 ◽

Vol 16 (01) ◽

pp. 1250052 ◽

Cited By ~ 5

Author(s):

SUSANNA C. MANRUBIA ◽

RAFAEL SANJUÁN

Keyword(s):

Secondary Structure ◽

Probability Distributions ◽

Point Mutations ◽

Secondary Structures ◽

Minimum Free Energy ◽

Suitable Model ◽

Rna Sequences ◽

Hepatitis Δ Virus ◽

The One ◽

The Relationship

A suitable model for exploring the properties of genotype-phenotype landscapes is the relationship between RNA sequences and their corresponding minimum free energy secondary structures. Relevant issues related to molecular evolvability and robustness to mutations have been studied in this framework. Here, we analyze the one-mutant neighborhood of the predicted secondary structure of 46 different RNAs, including tRNAs, viroids, larger molecules such as Hepatitis-δ virus, and several random sequences. The probability distribution of the effect of point mutations in linear structural motifs of the secondary structure is well fit by Pareto or Lognormal probability distribution functions, independent of the origin of the RNA molecule. This extends previous results to the case of natural sequences of diverse origins. We introduce a new indicator of robustness, the average weighted length of linear motifs (AwL), and demonstrate that it correlates with the average effect of point mutations in RNA secondary structures. Structures with a high AwL value display the highest structural robustness and cluster in particular regions of sequence space.
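
The one-mutant neighborhood referred to above is the set of all sequences that differ from a given sequence at exactly one position. A minimal sketch of its enumeration follows, assuming the four-letter RNA alphabet; the folding step that would be applied to each neighbor is left out, and the function name is illustrative only.

```python
ALPHABET = "ACGU"


def one_mutant_neighbors(sequence):
    """Yield every sequence that differs from `sequence` by one point mutation."""
    for position, original in enumerate(sequence):
        for base in ALPHABET:
            if base != original:
                yield sequence[:position] + base + sequence[position + 1:]


# A sequence of length n over ACGU has 3 * n one-mutant neighbors.
neighbors = list(one_mutant_neighbors("AUG"))
```

In a study of this kind, each neighbor would be folded with a structure prediction tool and compared with the structure of the original sequence to measure the effect of the mutation.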
