Neural Networks Applied to Protein Structure

1991 ◽  
Author(s):  
Henrik Bohr ◽  
Jacob Bohr ◽  
Søren Brunak ◽  
Rodney M. J. Cotterill ◽  
Henrik Fredholm ◽  
...  

1993 ◽  
Vol 48 (2) ◽  
pp. 1502-1515 ◽  
Author(s):  
Teresa Head-Gordon ◽  
Frank H. Stillinger

2018 ◽  
Vol 35 (14) ◽  
pp. 2403-2410 ◽  
Author(s):  
Jack Hanson ◽  
Kuldip Paliwal ◽  
Thomas Litfin ◽  
Yuedong Yang ◽  
Yaoqi Zhou

Motivation: Sequence-based prediction of one-dimensional structural properties of proteins has been a long-standing subproblem of protein structure prediction. Recently, prediction accuracy has been significantly improved due to the rapid expansion of protein sequence and structure libraries and advances in deep learning techniques, such as residual convolutional networks (ResNets) and Long-Short-Term Memory Cells in Bidirectional Recurrent Neural Networks (LSTM-BRNNs). Here we leverage an ensemble of LSTM-BRNN and ResNet models, together with predicted residue-residue contact maps, to continue the push towards the attainable limit of prediction for 3- and 8-state secondary structure, backbone angles (θ, τ, ϕ and ψ), half-sphere exposure, contact numbers and solvent accessible surface area (ASA).

Results: The new method, named SPOT-1D, achieves similar, high performance on a large validation set and test set (≈1000 proteins in each set), suggesting robust performance for unseen data. For the large test set, it achieves 87% and 77% in 3- and 8-state secondary structure prediction and 0.82 and 0.86 in correlation coefficients between predicted and measured ASA and contact numbers, respectively. Comparison to current state-of-the-art techniques reveals substantial improvement in secondary structure and backbone angle prediction. In particular, 44% of 40-residue fragment structures constructed from predicted backbone Cα-based θ and τ angles are less than 6 Å root-mean-squared-distance from their native conformations, nearly 20% better than the next best. The method is expected to be useful for advancing protein structure and function prediction.

Availability and implementation: SPOT-1D and its data are available at: http://sparks-lab.org/.

Supplementary information: Supplementary data are available at Bioinformatics online.
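The 3- and 8-state accuracies quoted in this abstract are per-residue agreement scores, commonly called Q3 and Q8. The sketch below shows how such scores can be computed from label strings. It is not taken from SPOT-1D: the 8-to-3 state reduction shown is one common convention for DSSP labels (the abstract does not say which mapping the authors used), and the example sequences are made up for illustration.

```python
# Assumption: one widely used reduction of DSSP 8-state labels to 3 states.
# Other conventions exist (for example, mapping G and B to coil instead).
DSSP8_TO_3 = {
    "H": "H", "G": "H", "I": "H",             # helix types -> H
    "E": "E", "B": "E",                       # strand and bridge -> E
    "T": "C", "S": "C", "C": "C", "-": "C",   # turn, bend, coil -> C
}


def reduce_to_3_state(ss8: str) -> str:
    """Map an 8-state secondary structure string to 3 states."""
    return "".join(DSSP8_TO_3[label] for label in ss8)


def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted label equals the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("label strings must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 15-residue example, not real data.
observed8 = "CHHHHGGTEEEEBSC"
predicted8 = "CHHHHHHTEEEECSC"

q8 = per_residue_accuracy(predicted8, observed8)
q3 = per_residue_accuracy(reduce_to_3_state(predicted8),
                          reduce_to_3_state(observed8))
print(f"Q8 = {q8:.3f}, Q3 = {q3:.3f}")
```

Q3 is never lower than Q8 for the same prediction under a fixed reduction, because any residue that matches in 8 states also matches after merging states.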


Author(s):  
B. Biletskyy

Introduction. Determining the spatial structure of proteins is one of the most important unsolved problems facing humanity. Life on Earth is often described as protein-based, because protein molecules drive the life processes of living organisms. Proteins make up about 80% of the dry mass of the cell and coordinate the processes of metabolism. The function of a protein is defined by its spatial structure. The results of recent competitions among methods for determining protein structures have shown significant progress in this important area. One of the research groups presented the AlphaFold 2 method, whose accuracy reached that of experimental methods. Purpose of the article. The aim of the work is to consider and analyze the basic principles of the AlphaFold software package for determining the spatial structure of proteins. Results. We consider the main stages in the process of recognizing the structure of a protein using the AlphaFold software package. The stages and corresponding methods include: a search for homologous proteins based on multiple alignment methods, construction of a protein-specific differentiable potential using artificial neural networks, and energy optimization of the protein structure using gradient descent and limited sampling. We discuss how a combination of various bioinformatics techniques, powered by data from open data sources, can lead to significant improvements in the accuracy of protein structure prediction. Special attention is paid to the use of artificial neural networks for building the smooth protein-specific potential and to the subsequent energy minimization based on the constructed potential. Conclusions. The combination of a number of methods and the use of information from protein and genetic data banks allows significant progress to be made in solving the extremely important task of determining the structure of a protein. Keywords: protein spatial structure, Machine Learning, AlphaFold.
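The energy-minimization stage described in this abstract can be illustrated with a toy example. The sketch below is not AlphaFold code: it assumes a simple harmonic potential over pairwise distances as a stand-in for the network-derived potential, optimizes Cartesian coordinates directly, and uses made-up target distances, step size, and starting coordinates.

```python
import math


def potential_and_gradient(coords, targets):
    """Energy sum((d_ij - target_ij)^2) and its gradient per coordinate."""
    energy = 0.0
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for (i, j), target in targets.items():
        diff = [a - b for a, b in zip(coords[i], coords[j])]
        dist = math.sqrt(sum(c * c for c in diff))
        energy += (dist - target) ** 2
        if dist > 0.0:
            scale = 2.0 * (dist - target) / dist
            for k in range(3):
                grad[i][k] += scale * diff[k]
                grad[j][k] -= scale * diff[k]
    return energy, grad


def minimize(coords, targets, step=0.05, iterations=500):
    """Plain gradient descent with a fixed step size (hypothetical values)."""
    coords = [list(point) for point in coords]
    for _ in range(iterations):
        _, grad = potential_and_gradient(coords, targets)
        for point, g in zip(coords, grad):
            for k in range(3):
                point[k] -= step * g[k]
    return coords


# Hypothetical target distances (in angstroms) between three points.
targets = {(0, 1): 3.8, (1, 2): 3.8, (0, 2): 6.0}
start = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (2.0, 0.0, 0.3)]

final = minimize(start, targets)
start_energy, _ = potential_and_gradient(start, targets)
final_energy, _ = potential_and_gradient(final, targets)
print(f"energy before: {start_energy:.3f}, after: {final_energy:.2e}")
```

Because the potential is smooth in the coordinates, its gradient is available everywhere the points do not coincide, which is what makes gradient descent applicable in the first place.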


2021 ◽  
Author(s):  
Anastasiya V Kulikova ◽  
Daniel J Diaz ◽  
James M Loy ◽  
Andrew D Ellington ◽  
Claus O Wilke

The fundamental problem of protein biochemistry is to predict protein structure from amino acid sequence. The inverse problem, predicting either entire sequences or individual mutations that are consistent with a given protein structure, has received much less attention even though it has important applications in both protein engineering and evolutionary biology. Here, we ask whether 3D convolutional neural networks (3D CNNs) can learn the local fitness landscape of protein structure to reliably predict either the wild-type amino acid or the consensus in a multiple sequence alignment from the local structural context surrounding a site of interest. We find that the network can predict the wild type with good accuracy, and that network confidence is a reliable measure of whether a given prediction is likely to be correct. Predictions of the consensus are less accurate, and are primarily driven by whether or not the consensus matches the wild type. Our work suggests that high-confidence mis-predictions of the wild type may identify sites that are primed for mutation and are likely targets for protein engineering.
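The idea of using network confidence to flag candidate sites can be sketched as a simple post-processing step. The code below is an illustration only: it assumes the network outputs one probability per amino acid for each site, the 0.8 threshold is a hypothetical value, and the per-site probabilities are synthetic rather than real model output.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def top_prediction(probabilities):
    """Return (amino_acid, confidence) for the most probable residue."""
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return AMINO_ACIDS[index], probabilities[index]


def high_confidence_mispredictions(sites, threshold=0.8):
    """Sites where the top prediction differs from the wild type and the
    network confidence is at least `threshold` (a hypothetical cutoff)."""
    flagged = []
    for position, wild_type, probabilities in sites:
        predicted, confidence = top_prediction(probabilities)
        if predicted != wild_type and confidence >= threshold:
            flagged.append((position, wild_type, predicted, confidence))
    return flagged


def peaked(amino_acid, p):
    """Synthetic distribution: probability p on one residue, rest uniform."""
    rest = (1.0 - p) / (len(AMINO_ACIDS) - 1)
    return [p if aa == amino_acid else rest for aa in AMINO_ACIDS]


# Synthetic example: (position, wild-type residue, predicted distribution).
sites = [
    (10, "L", peaked("L", 0.90)),  # correct, confident
    (11, "G", peaked("A", 0.85)),  # wrong, confident -> flagged
    (12, "K", peaked("R", 0.40)),  # wrong, but low confidence
]

flagged = high_confidence_mispredictions(sites)
for position, wild_type, predicted, confidence in flagged:
    print(f"site {position}: wild type {wild_type}, "
          f"predicted {predicted} ({confidence:.2f})")
```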


2019 ◽  
Author(s):  
Tobias Sikosek

Abstract. Many applications in the biomedical domain involve the detailed molecular and functional characterization of macromolecules such as proteins. Where possible, this involves the knowledge of detailed 3D coordinates of every atom within a protein. At the same time, machine learning has become the basis of much innovation within this domain in recent years. There are, however, a few challenges in applying machine learning to 3D protein structures, such as variability in size and high dimensionality of the data. It would therefore be beneficial to be able to map every protein structure to a smaller fixed-dimensional representation that is directly learned from the structure without manual curation. In addition, it would be valuable for biomedical researchers if such approaches would require little method development and instead draw from cutting-edge research such as image classification via deep neural networks. Here, such an approach is outlined that first re-formats protein structures as 2D color images and then applies off-the-shelf neural networks for image classification. It is shown that such neural networks can be trained to effectively encode the CATH protein classification database and that feature vectors extracted from such networks, once trained, can be transferred to a completely new task that is likely to benefit from molecular protein information, namely that of small molecule binding.
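The abstract does not say how structures are rendered as images, so the following is only one plausible way to turn a variable-length structure into a fixed-size 2D array. It assumes a pairwise distance matrix as a single image channel, nearest-neighbor subsampling to a fixed size, and a 20 Å clipping distance. All of these are illustrative choices, not the author's method, and the coordinates are synthetic.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between 3D points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def to_fixed_size_image(matrix, size=4, max_distance=20.0):
    """Subsample a square matrix to size x size and scale it to 0..255.

    Assumptions: nearest-neighbor sampling and clipping at `max_distance`
    are hypothetical choices made for this sketch.
    """
    n = len(matrix)
    image = []
    for r in range(size):
        i = min(n - 1, r * n // size)
        row = []
        for c in range(size):
            j = min(n - 1, c * n // size)
            clipped = min(matrix[i][j], max_distance)
            row.append(round(255 * clipped / max_distance))
        image.append(row)
    return image


# Synthetic "structure": six points on a line, 3.8 angstroms apart.
coords = [(3.8 * i, 0.0, 0.0) for i in range(6)]
image = to_fixed_size_image(distance_matrix(coords), size=4)
for row in image:
    print(row)
```

Whatever the input length, the output has the same shape, which is the property that lets an off-the-shelf image classifier consume it.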

