Exponentially few RNA structures are designable

2019 ◽  
Author(s):  
Hua-Ting Yao ◽  
Mireille Regnier ◽  
Cedric Chauve ◽  
Yann Ponty

ABSTRACT The problem of RNA design attempts to construct RNA sequences that perform a predefined biological function, identified by several additional constraints. One of the foremost objectives of RNA design is that the designed RNA sequence should adopt a predefined target secondary structure preferentially to any alternative structure, according to a given metric and folding model. It was observed in several works that some secondary structures are undesignable, i.e. no RNA sequence can fold into the target structure while satisfying some criterion measuring how preferential this folding is compared to alternative conformations. In this paper, we show that the proportion of designable secondary structures decreases exponentially with the size of the target secondary structure, for various popular combinations of energy models and design objectives. This exponential decay is, at least in part, due to the existence of undesignable motifs, which can be generically constructed and jointly analyzed to yield asymptotic upper bounds on the number of designable structures.
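
The notion of designability described above can be made concrete with a small brute-force sketch. It rests on assumptions that are not taken from the paper: the folding model is plain base-pair maximization with Watson-Crick pairs only, the design criterion is "the target is the unique optimal structure", and the names `optimum_and_count` and `find_design` are hypothetical. It is only practical for very short structures.

```python
from functools import lru_cache
from itertools import product

# Assumption: Watson-Crick pairs only; wobble G-U pairs are left out for simplicity.
WATSON_CRICK = (("A", "U"), ("C", "G"), ("G", "C"), ("U", "A"))


def parse_pairs(structure):
    """Return the sorted base pairs (i, j) of a well-formed dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            pairs.append((stack.pop(), pos))
    return sorted(pairs)


def optimum_and_count(seq, min_loop=0):
    """Maximum number of base pairs for seq, and how many structures reach it."""
    allowed = set(WATSON_CRICK)

    @lru_cache(maxsize=None)
    def solve(i, j):
        if i > j:
            return 0, 1  # empty interval: exactly one (empty) structure
        best, count = solve(i + 1, j)  # case 1: position i stays unpaired
        for k in range(i + 1 + min_loop, j + 1):  # case 2: i pairs with k
            if (seq[i], seq[k]) in allowed:
                inner, inner_ways = solve(i + 1, k - 1)
                outer, outer_ways = solve(k + 1, j)
                score, ways = 1 + inner + outer, inner_ways * outer_ways
                if score > best:
                    best, count = score, ways
                elif score == best:
                    count += ways
        return best, count

    return solve(0, len(seq) - 1)


def find_design(structure, min_loop=0):
    """Return a sequence whose unique optimum is `structure`, or None if none exists."""
    pairs = parse_pairs(structure)
    if any(j - i <= min_loop for i, j in pairs):
        return None  # the target itself violates the minimum loop size
    paired = {pos for pair in pairs for pos in pair}
    free = [pos for pos in range(len(structure)) if pos not in paired]
    # Only sequences compatible with the target need to be enumerated.
    for pair_letters in product(WATSON_CRICK, repeat=len(pairs)):
        for free_letters in product("ACGU", repeat=len(free)):
            seq = [""] * len(structure)
            for (i, j), (x, y) in zip(pairs, pair_letters):
                seq[i], seq[j] = x, y
            for pos, letter in zip(free, free_letters):
                seq[pos] = letter
            seq = "".join(seq)
            best, count = optimum_and_count(seq, min_loop)
            if best == len(pairs) and count == 1:
                return seq
    return None
```

In this toy model, `find_design("()()()()()")` returns `None`: with only four ordered Watson-Crick pair types, two of the five adjacent pairs must carry the same letters, and re-pairing their outer and inner positions gives a second structure with as many pairs. This is one small instance of an undesignable motif under the stated assumptions, not a reproduction of the paper's constructions.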

1978 ◽  
Vol 56 (6) ◽  
pp. 440-443 ◽  
Author(s):  
David Sankoff ◽  
Anne-Marie Morin ◽  
R. J. Cedergren

We have applied the Pipas–McMahon algorithm based on free energy calculations to the search for a 5S RNA base-pair structure common to all known sequences. We find that a 'Y'-shaped model is consistently among the structures having the lowest free energy using 5S RNA sequences from either eukaryotic or prokaryotic sources. Comparison of this 'Y' structure with models which have recently been proposed shows these models to be remarkably similar, and the minor differences are explicable based on the technique used to obtain the model. That prokaryotic and eukaryotic 5S RNA can adopt a similar secondary structure is strong support for its resistance to change during evolution.


2019 ◽  
Author(s):  
Masaki Tagashira ◽  
Kiyoshi Asai

Abstract Motivation Structural alignment, the simultaneous optimization of sequence alignment and secondary structures among RNAs, is required for a more appropriate comparison of functional ncRNAs than sequence alignment alone. Pseudo-probabilities given RNA sequences on structural alignment have been desired for more accurate secondary structures, sequence alignments, consensus secondary structures, and structural alignments. However, no algorithms have been proposed for these pseudo-probabilities. Results We developed RNAfamProb, an algorithm for estimating these pseudo-probabilities. We applied these pseudo-probabilities to two biological problems: visualization and maximum-expected-accuracy secondary-structure estimation. The RNAfamProb program, an implementation of this algorithm, together with the NeoFold program, a maximum-expected-accuracy secondary-structure program that uses these pseudo-probabilities, demonstrated better prediction accuracy than three state-of-the-art maximum-expected-accuracy secondary-structure programs, while requiring far longer running time than these three programs, as expected given the intrinsically higher complexity of structural alignment compared with independent secondary-structure prediction and sequence alignment. Both RNAfamProb and NeoFold produce more accurate estimates when homologous RNA sequences are incorporated. Availability The source code of the two programs is available at “https://github.com/heartsh/rnafamprob” and “https://github.com/heartsh/neofold”. Contact “[email protected]” and “[email protected]”. Supplementary information Supplementary data are available at Bioinformatics online.


2006 ◽  
Vol 7 (1) ◽  
pp. 37-43 ◽  
Author(s):  
T. A. Hughes ◽  
J. N. McElwaine

Secondary structures within the 5′ untranslated regions of messenger RNAs can have profound effects on the efficiency of translation of their messages and thereby on gene expression. Consequently they can act as important regulatory motifs in both physiological and pathological settings. Current approaches to predicting the secondary structure of these RNA sequences find the structure with the global-minimum free energy. However, since RNA folds progressively from the 5′ end when synthesised or released from the translational machinery, this may not be the most probable structure. We discuss secondary structure prediction based on local-minimisation of free energy with thermodynamic fluctuations as nucleotides are added to the 3′ end and show that these can result in different secondary structures. We also discuss approaches for studying the extent of the translational inhibition specified by structures within the 5′ untranslated region.


BMC Genomics ◽  
2020 ◽  
Vol 21 (S13) ◽  
Author(s):  
Lei Deng ◽  
Youzhi Liu ◽  
Yechuan Shi ◽  
Wenhao Zhang ◽  
Chun Yang ◽  
...  

Abstract Background RNA binding proteins (RBPs) play a vital role in post-transcriptional processes in all eukaryotes, such as splicing regulation, mRNA transport, and modulation of mRNA translation and decay. The identification of RBP binding sites is a crucial step in understanding the biological mechanism of post-transcriptional gene regulation. However, the determination of RBP binding sites on a large scale is a challenging task due to the high cost of biochemical assays. Quite a number of studies have exploited machine learning methods to predict binding sites. In particular, deep learning is increasingly used in the bioinformatics field by virtue of its ability to learn generalized representations from DNA and protein sequences. Results In this paper, we implemented a novel deep neural network model, DeepRKE, which combines primary RNA sequence and secondary structure information to effectively predict RBP binding sites. Specifically, we used a word embedding algorithm to extract features of RNA sequences and secondary structures, i.e., distributed representations of k-mer sequences rather than traditional one-hot encoding. The distributed representations are taken as input to convolutional neural networks (CNN) and bidirectional long short-term memory networks (BiLSTM) to identify RBP binding sites. Our results show that DeepRKE outperforms existing counterpart methods on two large-scale benchmark datasets. Conclusions Our extensive experimental results show that DeepRKE is an efficacious tool for predicting RBP binding sites. The distributed representations of RNA sequences and secondary structures can effectively detect the latent relationship and similarity between k-mers, and thus improve the predictive performance. The source code of DeepRKE is available at https://github.com/youzhiliu/DeepRKE/.


Biologia ◽  
2007 ◽  
Vol 62 (6) ◽  
Author(s):  
Matthias Wolf ◽  
Christian Selig ◽  
Tobias Müller ◽  
Nicole Philippi ◽  
Thomas Dandekar ◽  
...  

Abstract It was shown that compensatory base changes (CBCs) in internal transcribed spacer 2 (ITS2) sequence-structure alignments can be used for distinguishing species. Using the ITS2 Database in combination with 4SALE, a tool for synchronous RNA sequence and secondary structure alignment and editing, we present in this study an in-depth CBC analysis for placozoan ITS2 sequences and their respective secondary structures. This analysis indicates at least two distinct species in Trichoplax (Placozoa), supporting a recently suggested hypothesis that Placozoa is “no longer a phylum of one”.
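
A compensatory base change can be illustrated with a minimal sketch. It assumes two sequences that are already aligned to the same length and share one consensus structure in dot-bracket notation, and it treats G-U as a valid pair. The function `find_cbcs` and the toy sequences are hypothetical; this is not the procedure implemented in 4SALE.

```python
# Assumption: canonical pairs plus the G-U wobble pair count as "paired".
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def parse_pairs(structure):
    """Return the sorted base pairs (i, j) of a well-formed dot-bracket string."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            pairs.append((stack.pop(), pos))
    return sorted(pairs)


def find_cbcs(seq_a, seq_b, structure):
    """List paired sites where both bases differ but pairing is kept in both sequences."""
    if not (len(seq_a) == len(seq_b) == len(structure)):
        raise ValueError("sequences and structure must have the same aligned length")
    cbcs = []
    for i, j in parse_pairs(structure):
        pair_a = (seq_a[i], seq_a[j])
        pair_b = (seq_b[i], seq_b[j])
        both_paired = pair_a in VALID_PAIRS and pair_b in VALID_PAIRS
        both_changed = pair_a[0] != pair_b[0] and pair_a[1] != pair_b[1]
        if both_paired and both_changed:
            cbcs.append((i, j))
    return cbcs


# Toy example: the inner pair changes from G-C to A-U, the outer pair is unchanged.
example = find_cbcs("GGAAACC", "GAAAAUC", "((...))")
```

Sites where only one side of the pair changes are not counted here; gap characters simply fail the pairing check.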


Mathematics ◽  
2021 ◽  
Vol 9 (15) ◽  
pp. 1749
Author(s):  
Thomas J. X. Li ◽  
Christian M. Reidys

In this paper, we establish a topological framework of τ-structures to quantify the evolutionary transitions between two RNA sequence–structure pairs. τ-structures developed here consist of a pair of RNA secondary structures together with a non-crossing partial matching between the two backbones. The loop complex of a τ-structure captures the intersections of loops in both secondary structures. We compute the loop homology of τ-structures. We show that only the zeroth, first and second homology groups are free. In particular, we prove that the rank of the second homology group equals the number γ of certain arc-components in a τ-structure and that the rank of the first homology is given by γ−χ+1, where χ is the Euler characteristic of the loop complex.
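
The formula for the rank of the first homology group is consistent with the Euler–Poincaré relation, as the following sketch shows. It assumes that the loop complex is connected, so that the zeroth homology group has rank 1, and that all homology above degree two vanishes; the abstract does not state either assumption explicitly.

```latex
\begin{align*}
\chi &= \sum_{k \ge 0} (-1)^k \operatorname{rank} H_k
      = \operatorname{rank} H_0 - \operatorname{rank} H_1 + \operatorname{rank} H_2 \\
     &= 1 - \operatorname{rank} H_1 + \gamma
\quad\Longrightarrow\quad
\operatorname{rank} H_1 = \gamma - \chi + 1 .
\end{align*}
```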


Author(s):  
Yanwei Qi ◽  
Yuhong Zhang ◽  
Guixing Zheng ◽  
Bingxia Chen ◽  
Mengxin Zhang ◽  
...  

It is widely accepted that the structure of RNA plays important roles in a number of biological processes, such as polyadenylation, splicing, and catalytic functions. Dynamic changes in RNA structure are able to regulate the gene expression programme and can be used as a highly specific and subtle mechanism for governing cellular processes. However, the nature of most RNA secondary structures in Plasmodium falciparum has not been determined. To investigate the genome-wide RNA secondary structural features at single-nucleotide resolution in P. falciparum, we applied a novel high-throughput method utilizing the chemical modification of RNA structures to characterize these structures. Structural data from parasites are in close agreement with the known 18S ribosomal RNA secondary structures of P. falciparum and can help to predict the in vivo RNA secondary structure of a total of 3,396 transcripts in the ring-stage and trophozoite-stage developmental cycles. By parallel analysis of RNA structures in vivo and in vitro during the Plasmodium parasite ring-stage and trophozoite-stage intraerythrocytic developmental cycles, we identified some key regulatory features. Recent studies have established that the RNA structure is a ubiquitous and fundamental regulator of gene expression. Our study indicates that there is a critical connection between RNA secondary structure and mRNA abundance during the complex biological programme of P. falciparum. This work presents a useful framework and important results, which may facilitate further research investigating the interactions between RNA secondary structure and the complex biological programme in P. falciparum. The RNA secondary structure characterized in this study has potential applications and important implications regarding the identification of RNA structural elements, which are important for parasite infection and elucidating host-parasite interactions and parasites in the environment.


Author(s):  
Wes Sanders ◽  
Ethan J. Fritch ◽  
Emily A. Madden ◽  
Rachel L. Graham ◽  
Heather A. Vincent ◽  
...  

Abstract Coronaviruses, including SARS-CoV-2, the etiological agent of COVID-19 disease, have caused multiple epidemic and pandemic outbreaks in the past 20 years [1–3]. With no vaccines, and only recently developed antiviral therapeutics, we are ill equipped to handle coronavirus outbreaks [4]. A better understanding of the molecular mechanisms that regulate coronavirus replication and pathogenesis is needed to guide the development of new antiviral therapeutics and vaccines. RNA secondary structures play critical roles in multiple aspects of coronavirus replication, but the extent and conservation of RNA secondary structure across coronavirus genomes is unknown [5]. Here, we define highly structured RNA regions throughout the MERS-CoV, SARS-CoV, and SARS-CoV-2 genomes. We find that highly stable RNA structures are pervasive throughout coronavirus genomes, and are conserved between the SARS-like CoV. Our data suggest that selective pressure helps preserve RNA secondary structure in coronavirus genomes, indicating that these structures may play important roles in virus replication and pathogenesis. Thus, disruption of conserved RNA secondary structures could be a novel strategy for the generation of attenuated SARS-CoV-2 vaccines for use against the current COVID-19 pandemic.


2012 ◽  
Vol 28 (23) ◽  
pp. 3058-3065 ◽  
Author(s):  
Jana Sperschneider ◽  
Amitava Datta ◽  
Michael J. Wise

Abstract Motivation Laboratory RNA structure determination is demanding and costly, and thus computational structure prediction is an important task. Single-sequence methods for RNA secondary structure prediction are limited by the accuracy of the underlying folding model; if a structure is supported by a family of evolutionarily related sequences, one can be more confident that the prediction is accurate. RNA pseudoknots are functional elements, which have highly conserved structures. However, few comparative structure prediction methods can handle pseudoknots due to the computational complexity. Results A comparative pseudoknot prediction method called DotKnot-PW is introduced based on structural comparison of secondary structure elements and H-type pseudoknot candidates. DotKnot-PW outperforms other methods from the literature on a hand-curated test set of RNA structures with experimental support. Availability DotKnot-PW and the RNA structure test set are available at the web site http://dotknot.csse.uwa.edu.au/pw. Contact [email protected] Supplementary information Supplementary data are available at Bioinformatics online.


2013 ◽  
Vol 16 (01) ◽  
pp. 1250052 ◽  
Author(s):  
SUSANNA C. MANRUBIA ◽  
RAFAEL SANJUÁN

A suitable model to dive into the properties of genotype-phenotype landscapes is the relationship between RNA sequences and their corresponding minimum free energy secondary structures. Relevant issues related to molecular evolvability and robustness to mutations have been studied in this framework. Here, we analyze the one-mutant neighborhood of the predicted secondary structure of 46 different RNAs, including tRNAs, viroids, larger molecules such as Hepatitis-δ virus, and several random sequences. The probability distribution of the effect of point mutations in linear structural motifs of the secondary structure is well fit by Pareto or Lognormal probability distribution functions, independent of the origin of the RNA molecule. This extends previous results to the case of natural sequences of diverse origins. We introduce a new indicator of robustness, the average weighted length of linear motifs (AwL), and demonstrate that it correlates with the average effect of point mutations in RNA secondary structures. Structures with a high AwL value display the highest structural robustness and cluster in particular regions of sequence space.
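
The one-mutant neighborhood analysed above can be enumerated with a few lines of code. The sketch below is illustrative only: `one_mutant_neighbors` and `neutral_fraction` are hypothetical names, and `fold` stands for any user-supplied function that maps a sequence to a predicted structure (for instance a wrapper around a minimum free energy folding program); no particular predictor is assumed.

```python
def one_mutant_neighbors(seq, alphabet="ACGU"):
    """Yield every sequence that differs from seq by exactly one point mutation."""
    for pos, original in enumerate(seq):
        for letter in alphabet:
            if letter != original:
                yield seq[:pos] + letter + seq[pos + 1:]


def neutral_fraction(seq, fold):
    """Fraction of one-mutant neighbors whose predicted structure is unchanged.

    `fold` is a caller-supplied function; it is a placeholder for a real
    structure predictor and is not defined here.
    """
    reference = fold(seq)
    neighbors = list(one_mutant_neighbors(seq))
    if not neighbors:
        return 0.0
    unchanged = sum(1 for mutant in neighbors if fold(mutant) == reference)
    return unchanged / len(neighbors)


# Toy stand-in for a folding function, used only to exercise the code.
toy_fold = lambda s: "G" in s
toy_neighbors = list(one_mutant_neighbors("GCAU"))
toy_neutral = neutral_fraction("GA", toy_fold)
```

A sequence of length n has 3n one-mutant neighbors over a four-letter alphabet, so the cost of such an analysis is dominated by the 3n calls to the folding function.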

