bpRNA: Large-scale Automated Annotation and Analysis of RNA Secondary Structure

10.1101/271759 ◽

2018 ◽

Author(s):

Padideh Danaee ◽

Mason Rouches ◽

Michelle Wiley ◽

Dezhong Deng ◽

Liang Huang ◽

...

Keyword(s):

Secondary Structure ◽

Single Molecule ◽

Structure Prediction ◽

RNA Secondary Structure ◽

Large Scale ◽

Secondary Structure Prediction ◽

Sequence Data ◽

Secondary Structures ◽

Base Pairs ◽

Statistical Trends

While RNA secondary structure prediction from sequence data has made remarkable progress, there is a need for improved strategies for annotating the features of RNA secondary structures. Here we present bpRNA, a novel annotation tool capable of parsing RNA structures, including complex pseudoknot-containing RNAs, to yield an objective, precise, compact, unambiguous, easily interpretable description of all loops, stems, and pseudoknots, along with the positions, sequence, and flanking base pairs of each such structural feature. We also introduce several new informative representations of RNA structure types to improve structure visualization and interpretation. We have further used bpRNA to generate a web-accessible meta-database, “bpRNA-1m”, of over 100,000 single-molecule, known secondary structures; this is both more fully and accurately annotated and over 20 times larger than existing databases. We use a subset of the database with highly similar (≥90% identical) sequences filtered out to report on statistical trends in sequence, flanking base pairs, and length. Both the bpRNA method and the bpRNA-1m database will be valuable resources for specific analysis of individual RNA molecules and for large-scale analyses, such as those useful for updating RNA energy parameters for computational thermodynamic predictions, improving machine learning models for structure prediction, and benchmarking structure-prediction algorithms.
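The kind of parsing described in this abstract can be illustrated with a minimal sketch. It is not bpRNA's implementation: it assumes the structure is given in dot-bracket notation with extra bracket types for pseudoknots, and it uses a deliberately strict stem definition (uninterrupted stacked pairs only). The function names `parse_dot_bracket` and `group_stems` are hypothetical.

```python
def parse_dot_bracket(structure):
    """Return sorted 0-based (i, j) base pairs from a dot-bracket string.

    Assumption: '()' marks nested pairs and '[]', '{}', '<>' mark
    pseudoknotted pairs; '.' marks an unpaired position.
    """
    openers = {"(": ")", "[": "]", "{": "}", "<": ">"}
    closers = {close: open_ for open_, close in openers.items()}
    stacks = {open_: [] for open_ in openers}  # one stack per bracket type
    pairs = []
    for pos, ch in enumerate(structure):
        if ch in openers:
            stacks[ch].append(pos)
        elif ch in closers:
            stack = stacks[closers[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    for open_, stack in stacks.items():
        if stack:
            raise ValueError(f"unmatched {open_!r} at position {stack[-1]}")
    return sorted(pairs)


def group_stems(pairs):
    """Group directly stacked pairs (i, j), (i+1, j-1), ... into stems."""
    pair_set = set(pairs)
    stems = []
    for i, j in sorted(pairs):
        if (i - 1, j + 1) in pair_set:
            continue  # an outer pair stacks on this one, so it is not a stem start
        stem = [(i, j)]
        while (stem[-1][0] + 1, stem[-1][1] - 1) in pair_set:
            stem.append((stem[-1][0] + 1, stem[-1][1] - 1))
        stems.append(stem)
    return stems


if __name__ == "__main__":
    pairs = parse_dot_bracket("((..[[..))..]]")
    for stem in group_stems(pairs):
        print(stem)
```

Keeping one stack per bracket type is what lets crossing (pseudoknotted) pairs be read without confusing them with nested ones.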

RNA secondary structure prediction using deep learning with thermodynamic integration

10.1101/2020.08.10.244442 ◽

2020 ◽

Author(s):

Kengo Sato ◽

Manato Akiyama ◽

Yasubumi Sakakibara

Keyword(s):

Deep Learning ◽

Secondary Structure ◽

Structure Prediction ◽

RNA Secondary Structure ◽

Secondary Structure Prediction ◽

Secondary Structures ◽

Thermodynamic Integration ◽

RNA Secondary Structure Prediction ◽

RNA Secondary Structures ◽

Non-Coding RNAs

RNA secondary structure prediction is one of the key technologies for revealing the essential roles of functional non-coding RNAs. Although richly parameterized machine learning models have achieved extremely high prediction accuracy, the risk of overfitting for such models has been reported. In this work, we propose a new algorithm for predicting RNA secondary structures that uses deep learning with thermodynamic integration. As in our previous work, the folding scores, which are computed by a deep neural network, are integrated with traditional thermodynamic parameters to enable robust predictions. We also propose thermodynamic regularization for training our model without overfitting it to the training data. Our algorithm (MXfold2) achieved the most robust and accurate predictions in computational experiments designed for newly discovered non-coding RNAs, with significant 2–10% improvements in F-value over our previous algorithm (MXfold) and standard algorithms for predicting RNA secondary structures.
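The F-value reported in this abstract is the harmonic mean of positive predictive value and sensitivity over base pairs. A minimal sketch of that computation follows, assuming structures are represented as collections of (i, j) index pairs; the function name `f_value` and the exact-match criterion are assumptions of this sketch, not details taken from the paper.

```python
def f_value(predicted_pairs, reference_pairs):
    """Harmonic mean of PPV and sensitivity over base pairs.

    Assumption: a predicted pair counts as correct only if the same
    (i, j) pair appears in the reference structure.
    """
    predicted = set(predicted_pairs)
    reference = set(reference_pairs)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    ppv = true_positives / len(predicted)          # correct / predicted
    sensitivity = true_positives / len(reference)  # correct / reference
    return 2 * ppv * sensitivity / (ppv + sensitivity)


if __name__ == "__main__":
    predicted = [(0, 9), (1, 8), (2, 7)]
    reference = [(0, 9), (1, 8), (3, 6), (4, 5)]
    print(round(f_value(predicted, reference), 4))
```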

Deep Learning Method for RNA Secondary Structure Prediction with Pseudoknots Based on Large-Scale Data

Journal of Healthcare Engineering ◽

10.1155/2021/6699996 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Bowen Shen ◽

Hao Zhang ◽

Cong Li ◽

Tianheng Zhao ◽

Yuanning Liu

Keyword(s):

Deep Learning ◽

Secondary Structure ◽

Structure Prediction ◽

RNA Secondary Structure ◽

Large Scale ◽

Secondary Structure Prediction ◽

Learning Methods ◽

RNA Secondary Structure Prediction ◽

Large Scale Data ◽

Scale Data

Traditional machine learning methods are widely used in RNA secondary structure prediction and have achieved good results. However, with the emergence of large-scale data, deep learning methods have advantages over traditional machine learning methods. As the number of network layers increases in deep learning, problems such as a growing parameter count and overfitting often arise. We used two deep learning models, GoogLeNet and TCN, to predict RNA secondary structures. We improve the neural network models with respect to both network depth and width, which effectively improves computational efficiency while extracting more feature information. We process existing real RNA data, use the deep learning models to extract useful features from a large amount of RNA sequence and structure data, and then use the extracted features to predict each base’s pairing probability. The characteristics of RNA secondary structure and dynamic programming are then used to process the per-base predictions: the structure with the largest sum of base-pairing probabilities is taken as the optimal RNA secondary structure. We evaluated the GoogLeNet and TCN models on 5sRNA, tRNA, and tmRNA data and compared them with other standard prediction algorithms. The sensitivity and specificity of the GoogLeNet model on the 5sRNA and tRNA data sets are about 16% higher than the best results of the other algorithms, and on the tmRNA data set they are about 9% higher. Because the performance of deep learning algorithms is related to the size of the data set, the prediction accuracy of deep learning methods for RNA secondary structure should continue to improve as the scale of RNA data expands.
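The post-processing step in this abstract, choosing the structure with the largest sum of pairing probabilities by dynamic programming, can be sketched with a Nussinov-style recursion. This is an illustration under stated assumptions, not the authors' procedure: it assumes an n×n matrix `prob` of pairing probabilities (only entries with i < j are read), a hypothetical `min_loop` parameter for the minimum number of unpaired bases inside a hairpin, and it recovers nested structures only, so it does not handle the pseudoknots the paper targets.

```python
def max_probability_structure(prob, min_loop=3):
    """Return nested base pairs maximizing the sum of pairing probabilities."""
    n = len(prob)
    best = [[0.0] * n for _ in range(n)]  # best[i][j]: optimum for subsequence i..j

    def cell(i, j):
        # Empty or out-of-range intervals contribute nothing.
        return best[i][j] if 0 <= i <= j < n else 0.0

    def paired_score(i, k, j):
        # Score when i pairs with k, splitting i..j into an inside and an outside part.
        return prob[i][k] + cell(i + 1, k - 1) + cell(k + 1, j)

    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = cell(i + 1, j)  # case 1: base i stays unpaired
            for k in range(i + min_loop + 1, j + 1):  # case 2: i pairs with k
                score = max(score, paired_score(i, k, j))
            best[i][j] = score

    # Traceback: replay the same decisions to recover the chosen pairs.
    pairs = []
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == cell(i + 1, j):
            stack.append((i + 1, j))
            continue
        for k in range(i + min_loop + 1, j + 1):
            if best[i][j] == paired_score(i, k, j):
                pairs.append((i, k))
                stack.append((i + 1, k - 1))
                stack.append((k + 1, j))
                break
    return sorted(pairs)


if __name__ == "__main__":
    n = 6
    prob = [[0.0] * n for _ in range(n)]
    prob[0][5] = 0.9
    prob[1][4] = 0.8
    print(max_probability_structure(prob, min_loop=2))
```

The traceback compares floating-point values for exact equality, which is safe here only because it recomputes each candidate with the same expression used to fill the table.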

An efficient simulated annealing algorithm for the RNA secondary structure prediction with Pseudoknots

BMC Genomics ◽

10.1186/s12864-019-6300-2 ◽

2019 ◽

Vol 20 (S13) ◽

Cited By ~ 1

Author(s):

Zhang Kai ◽

Wang Yuting ◽

Lv Yulin ◽

Liu Jun ◽

He Juanjuan

Keyword(s):

Free Energy ◽

Secondary Structure ◽

Structure Prediction ◽

RNA Secondary Structure ◽

Simulated Annealing Algorithm ◽

Secondary Structure Prediction ◽

Stem Length ◽

Base Pairs ◽

RNA Secondary Structure Prediction ◽

Prediction Algorithms

Background: RNA pseudoknot structures play an important role in biological processes. However, existing RNA secondary structure prediction algorithms cannot predict pseudoknot structures efficiently. Although random matching can increase the number of base pairs, these non-consecutive base pairs do not contribute to reducing the free energy. Results: To improve the efficiency of the search procedure, our algorithm takes consecutive base pairs as its basic components. First, the algorithm calculates and archives all runs of consecutive base pairs in a triplet data structure whenever the number of consecutive base pairs is greater than a given minimum stem length. Second, the annealing schedule is adapted to select the optimal solution with minimum free energy. Finally, the proposed algorithm is evaluated on real instances from PseudoBase. Conclusion: The experimental results show competitive and often better performance compared with several state-of-the-art RNA structure prediction algorithms.
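The first step described in this abstract, archiving runs of consecutive base pairs as triplets, might look like the sketch below. It is an assumption-laden illustration rather than the authors' code: the triplet layout `(i, j, length)`, the allowed pairs (Watson-Crick plus G-U wobble), the `min_loop` constraint, and the use of "at least `min_stem`" as the threshold are all choices made here. The annealing search itself is not shown.

```python
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def find_stems(seq, min_stem=3, min_loop=3):
    """List maximal runs of consecutive base pairs as (i, j, length) triplets.

    (i, j) is the outermost pair of the run (0-based); the run continues
    inward as (i+1, j-1), (i+2, j-2), ... for `length` pairs in total.
    """
    n = len(seq)

    def can_pair(a, b):
        return (seq[a], seq[b]) in ALLOWED_PAIRS

    stems = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if not can_pair(i, j):
                continue
            # Skip pairs that merely extend a run starting further out.
            if i > 0 and j < n - 1 and can_pair(i - 1, j + 1):
                continue
            length = 0
            # Extend inward while the bases pair and the enclosed loop
            # keeps more than min_loop positions between them.
            while ((j - length) - (i + length) > min_loop
                   and can_pair(i + length, j - length)):
                length += 1
            if length >= min_stem:
                stems.append((i, j, length))
    return stems


if __name__ == "__main__":
    print(find_stems("GGGAAAACCC"))
```

Working with whole stems rather than single pairs shrinks the search space the annealing step has to explore, which is the efficiency argument the abstract makes.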

A Non-Parametric Bayesian Approach for Predicting RNA Secondary Structures

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720010004926 ◽

2010 ◽

Vol 08 (04) ◽

pp. 727-742 ◽

Cited By ~ 8

Author(s):

Kengo Sato ◽

Michiaki Hamada ◽

Toutai Mituyama ◽

Kiyoshi Asai ◽

Yasubumi Sakakibara

Keyword(s):

Secondary Structure ◽

Bayesian Approach ◽

Structure Prediction ◽

RNA Secondary Structure ◽

Secondary Structure Prediction ◽

Secondary Structures ◽

Generative Models ◽

RNA Secondary Structures ◽

Stochastic Context Free Grammars ◽

Non Parametric

Since many functional RNAs form stable secondary structures that are related to their functions, RNA secondary structure prediction is a crucial problem in bioinformatics. We propose a novel model for generating RNA secondary structures based on a non-parametric Bayesian approach, called hierarchical Dirichlet processes for stochastic context-free grammars (HDP-SCFGs). Here, non-parametric means that some meta-parameters, such as the number of non-terminal symbols and production rules, do not have to be fixed. Instead, their distributions are inferred so that they adapt (in the Bayesian sense) to the training sequences provided. The results of our RNA secondary structure predictions show that HDP-SCFGs are more accurate than the MFE-based and other generative models.

RNA secondary structure prediction using deep learning with thermodynamic integration

Nature Communications ◽

10.1038/s41467-021-21194-4 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Kengo Sato ◽

Manato Akiyama ◽

Yasubumi Sakakibara

Keyword(s):

Free Energy ◽

Deep Learning ◽

Secondary Structure ◽

Structure Prediction ◽

RNA Secondary Structure ◽

Nearest Neighbor ◽

Secondary Structure Prediction ◽

Secondary Structures ◽

RNA Secondary Structures ◽

Non-Coding RNAs

Accurate predictions of RNA secondary structures can help uncover the roles of functional non-coding RNAs. Although machine learning-based models have achieved high performance in terms of prediction accuracy, overfitting is a common risk for such highly parameterized models. Here we show that overfitting can be minimized when RNA folding scores learnt using a deep neural network are integrated with Turner’s nearest-neighbor free energy parameters. Training the model with thermodynamic regularization ensures that folding scores and the calculated free energy are as close as possible. In computational experiments designed for newly discovered non-coding RNAs, our algorithm (MXfold2) achieves the most robust and accurate predictions of RNA secondary structures without sacrificing computational efficiency compared to several other algorithms. The results suggest that integrating thermodynamic information could help improve the robustness of deep learning-based predictions of RNA secondary structure.

Faculty Opinions recommendation of COFOLD: an RNA secondary structure prediction method that takes co-transcriptional folding into account.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718010599.793476797 ◽

2013 ◽

Author(s):

Scott Silverman

Keyword(s):

Secondary Structure ◽

Structure Prediction ◽

RNA Secondary Structure ◽

Secondary Structure Prediction ◽

Prediction Method ◽

RNA Secondary Structure Prediction ◽

Structure Prediction Method ◽

Secondary Structure Prediction Method

A Discrete Hopfield Neural Network Based MIS Finding Algorithm for Stems Selecting and Its Application in RNA Secondary Structure Prediction

Chinese Journal of Computers ◽

10.3724/sp.j.1016.2008.00051 ◽

2009 ◽

Vol 31 (1) ◽

pp. 51-58

Author(s):

Qi LIU ◽

Yin ZHANG ◽

Xiu-Zi YE ◽

Rong-Dong YU

Keyword(s):

Neural Network ◽

Secondary Structure ◽

Structure Prediction ◽

RNA Secondary Structure ◽

Secondary Structure Prediction ◽

Hopfield Neural Network ◽

RNA Secondary Structure Prediction

A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more

RNA ◽

10.1261/rna.030049.111 ◽

2011 ◽

Vol 18 (2) ◽

pp. 193-212 ◽

Cited By ~ 50

Author(s):

E. Rivas ◽

R. Lang ◽

S. R. Eddy

Keyword(s):

Secondary Structure ◽

Structure Prediction ◽

RNA Secondary Structure ◽

Probabilistic Models ◽

Nearest Neighbor ◽

Secondary Structure Prediction ◽

RNA Secondary Structure Prediction

Evolutionary Algorithm for RNA Secondary Structure Prediction Based on Simulated SHAPE Data

PLoS ONE ◽

10.1371/journal.pone.0166965 ◽

2016 ◽

Vol 11 (11) ◽

pp. e0166965 ◽

Cited By ~ 4

Author(s):

Soheila Montaseri ◽

Mohammad Ganjtabesh ◽

Fatemeh Zare-Mirakabad

Keyword(s):

Secondary Structure ◽

Evolutionary Algorithm ◽

Structure Prediction ◽

RNA Secondary Structure ◽

Secondary Structure Prediction ◽

RNA Secondary Structure Prediction ◽

SHAPE Data

RNA Secondary Structure Prediction and Gene Regulation by Small RNAs

Frontiers in Computational and Systems Biology - Computational Biology ◽

10.1007/978-1-84996-196-7_2 ◽

2010 ◽

pp. 19-37

Author(s):

Ye Ding

Keyword(s):

Gene Regulation ◽

Secondary Structure ◽

Small RNAs ◽

Structure Prediction ◽

RNA Secondary Structure ◽

Secondary Structure Prediction ◽

RNA Secondary Structure Prediction
