Neural network-based prediction of mutation-induced protein stability changes in Staphylococcal nuclease at 20 residue positions

2005 ◽  
Vol 59 (2) ◽  
pp. 147-151 ◽  
Author(s):  
Christopher M. Frenz
2021 ◽  
Author(s):  
Yashas Samaga B L ◽  
Shampa Raghunathan ◽  
U. Deva Priyakumar

Engineering proteins to have desired properties by mutating amino acids at specific sites is commonplace. Such engineered proteins must be stable to function. Experimental methods for determining stability are too laborious to scan the protein sequence space thoroughly at the required throughput. To this end, many machine learning based methods have been developed to predict thermodynamic stability changes upon mutation. These methods have been evaluated for symmetric consistency by testing with hypothetical reverse mutations. In this work, we propose transitive data augmentation, the evaluation of transitive consistency, and a new machine learning based method, the first of its kind, that incorporates both symmetric and transitive properties into the architecture. Our method, called SCONES, is an interpretable neural network that estimates a residue's contribution towards protein stability (dG) in its local structural environment. The difference between the independently predicted contributions of the reference and mutant residues in a missense mutation is reported as ddG. We show that this self-consistent machine learning architecture is immune to many common biases in datasets, relies less on data than existing methods, and is robust to overfitting.
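
The self-consistency argument in this abstract can be illustrated with a minimal sketch. It is not the SCONES network: the per-residue contributions below are made-up numbers standing in for the network's output, the sign convention (mutant minus reference) is an assumption, and `transitive_augment` is only one plausible reading of "transitive data augmentation". All names are hypothetical.

```python
from itertools import permutations

# Hypothetical per-residue stability contributions (kcal/mol) at one site in a
# fixed structural environment. The values are invented for illustration.
contribution = {"ALA": -1.2, "GLY": -0.4, "VAL": -2.0}


def predict_ddg(reference, mutant, g=contribution):
    """ddG of reference -> mutant as a difference of independent contributions.

    The sign convention (mutant minus reference) is an assumption.
    """
    return g[mutant] - g[reference]


def is_self_consistent(residues, tol=1e-9):
    """Check antisymmetry and transitivity over every ordered triple."""
    for a, b, c in permutations(residues, 3):
        # Symmetric consistency: the reverse mutation has the opposite ddG.
        symmetric = abs(predict_ddg(a, b) + predict_ddg(b, a)) < tol
        # Transitive consistency: A->B followed by B->C equals A->C.
        chained = predict_ddg(a, b) + predict_ddg(b, c)
        transitive = abs(chained - predict_ddg(a, c)) < tol
        if not (symmetric and transitive):
            return False
    return True


def transitive_augment(measurements):
    """Derive A->C entries from measured A->B and B->C pairs at the same site."""
    derived = {}
    for (a, b), ddg_ab in measurements.items():
        for (b2, c), ddg_bc in measurements.items():
            if b == b2 and a != c and (a, c) not in measurements:
                derived[(a, c)] = ddg_ab + ddg_bc
    return derived
```

Because every prediction is a difference of two independently computed terms, both properties hold by construction for any choice of contributions, which is the sense in which such an architecture is self-consistent.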


1996 ◽  
Vol 5 (9) ◽  
pp. 1907-1916 ◽  
Author(s):  
Dagmar M. Truckses ◽  
Kenneth E. Prehoda ◽  
Stephen C. Miller ◽  
John L. Markley ◽  
John R. Somoza

Author(s):  
A.P. Hinck ◽  
W.F. Walkenhorst

The slow rates of peptide bond isomerization in imino acids and the substantial population of the cis peptide bond isomer in Xaa-Pro linkages in peptides were first recognized in NMR studies of proline-containing model compounds (Maia et al., 1971). The important role of this isomerization in protein stability and folding (reviewed by Kim and Baldwin, 1982, 1990; Schmid, 1993) was recognized several years later (Brandts et al., 1975), and the biological relevance of this process was substantiated by the discovery of a ubiquitous enzyme that catalyzes Xaa-Pro peptide bond isomerization (Fischer et al., 1984, 1989; Takahashi et al., 1989). The strict evolutionary conservation of some prolyl residues, and the observation that the kinetics of interconversion between alternative functional forms of some systems are consistent with the time scale of proline isomerization, suggest that proline isomerization may play a wide role in protein structure and function. Suggestive examples include the sodium pump of Escherichia coli, the disulfide isomerase/thioredoxin class of enzymes, concanavalin A, and bovine prothrombin fragment I (Brown et al., 1977; Marsh et al., 1979; Dunker, 1982; Brandl and Deber, 1986; Langsetmo et al., 1989). NMR spectroscopy is one of the most suitable tools for studying this isomerization reaction. The rates generally are slow on the time scale of NMR chemical shifts but, in favorable cases, are comparable to longitudinal relaxation rates, so that the isomerization process can be investigated by chemical exchange spectroscopy. NMR data obtained on calbindin D9k (Chazin et al., 1989), insulin (Higgins et al., 1988), and staphylococcal nuclease (nuclease), as discussed below, have shown that each exists in solution under native conditions as a mixture of slowly exchanging conformers.
The fact that dynamic molecular heterogeneity in nuclease was first observed in the laboratory of Oleg Jardetzky, as manifested by splitting of the histidyl 1H ε1 resonance from His46 in one-dimensional 1H NMR spectra recorded at 100 MHz (Markley et al., 1970), makes this topic particularly appropriate to a volume celebrating his scientific contributions.


2001 ◽  
Vol 261 (3) ◽  
pp. 599-609 ◽  
Author(s):  
Hueih Min Chen ◽  
Theodore J. Dimagno ◽  
Wei Wang ◽  
Eric Leung ◽  
Cheng-Hao Lee ◽  
...  

2016 ◽  
Author(s):  
Noah Fleming ◽  
Benjamin Kinsella ◽  
Christopher Ing

A large number of human diseases result from disruptions to protein structure and function caused by missense mutations. Computational methods are frequently employed to assist in the prediction of protein stability upon mutation. These methods utilize a combination of protein sequence data, protein structure data, empirical energy functions, and physicochemical properties of amino acids. In this work, we present the first use of dynamic protein structural features in order to improve stability predictions upon mutation. This is achieved through the use of a set of timeseries extracted from microsecond timescale atomistic molecular dynamics simulations of proteins. Standard machine learning algorithms using mean, variance, and histograms of these timeseries were found to be 60-70% accurate in stability classification based on experimental ΔΔG or protein-chaperone interaction measurements. A recurrent neural network with full treatment of timeseries data was found to be 80% accurate according to the F1 score. The performance of our models was found to be equal to or better than that of two recently developed machine learning methods for binary classification as well as two industry-standard stability prediction algorithms. In addition to classification, understanding the molecular basis of protein stability disruption due to disease-causing mutations is a significant challenge that impedes the development of drugs and therapies that may be used to treat genetic diseases. The use of dynamic structural features allows for novel insight into the molecular basis of protein disruption by mutation in a diverse set of soluble proteins. To assist in the interpretation of machine learning results, we present a technique for determining the importance of features to a recurrent neural network using Garson’s method.
We propose a novel extension of neural interpretation diagrams by implementing Garson’s method to scale each node in the neural interpretation diagram according to its relative importance to the network.
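
For readers unfamiliar with Garson's method, the following is a minimal sketch of its commonly stated form for a feed-forward network with one hidden layer and a single output. It is not the recurrent-network extension described in the abstract, whose details are not given here; the function name and weight layout are assumptions made for illustration.

```python
def garson_importance(input_hidden, hidden_output):
    """Relative importance of each input for a single-hidden-layer network.

    input_hidden[i][j]: weight from input i to hidden unit j.
    hidden_output[j]:   weight from hidden unit j to the single output.
    Returns one non-negative score per input; the scores sum to 1.
    """
    n_inputs = len(input_hidden)
    n_hidden = len(hidden_output)
    scores = [0.0] * n_inputs
    for j in range(n_hidden):
        # Total absolute weight flowing into hidden unit j.
        column_total = sum(abs(input_hidden[i][j]) for i in range(n_inputs))
        if column_total == 0.0:
            continue  # hidden unit j receives no signal from any input
        for i in range(n_inputs):
            # Input i's share of hidden unit j, scaled by that unit's
            # absolute weight to the output.
            share = abs(input_hidden[i][j]) / column_total
            scores[i] += share * abs(hidden_output[j])
    total = sum(scores)
    if total == 0.0:
        raise ValueError("all weights are zero; importance is undefined")
    return [s / total for s in scores]
```

Scores of this kind could then be used to scale the nodes of a neural interpretation diagram, as the abstract proposes. Note that the method uses absolute weights, so it reports the magnitude of an input's influence but not its direction.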

