Beyond stability constraints: a biophysical model of enzyme evolution with selection for stability and activity

2018 ◽  
Author(s):  
Julian Echave

Abstract: Proteins trace trajectories in sequence space as their amino acids become substituted by other amino acids. The number of substitutions per unit time, the rate of evolution, varies among sites because of biophysical constraints. Several properties that characterize sites’ local environments have been proposed as biophysical determinants of site-specific evolutionary rates. Thus, rate increases with increasing solvent exposure, increasing flexibility, and decreasing local packing density. For enzymes, rate increases also with increasing distance from the protein’s active residues, presumably due to functional constraints. The dependence of rates on solvent accessibility, packing density, and flexibility has been mechanistically explained in terms of selection for stability. However, as I show here, a stability-based model fails to reproduce the observed rate-distance dependence, overestimating rates close to the active residues and underestimating rates of distant sites. Here, I pose a new biophysical model of enzyme evolution with selection for stability and activity (MSA) and compare it with a stability-based counterpart (MS). Testing these models on a structurally and functionally diverse dataset of monomeric enzymes, I found that MSA fits observed rates better than MS for most proteins. While both models reproduce the observed dependence of rates on solvent accessibility, packing, and flexibility, MSA fits these dependencies somewhat better. Importantly, while MS fails to reproduce the dependence of rates on distance from the active residues, MSA accounts for the rate-distance dependence quantitatively. Thus, the variation of evolutionary rate among enzyme sites is mechanistically underpinned by natural selection for both stability and activity.
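The rate-distance dependence described above needs a per-site distance to the active residues. The following is a minimal sketch of one way such a distance could be computed, assuming each site is represented by a single 3D coordinate (for example a C-alpha position) and that "distance" means the distance to the closest active residue. The function name, the toy coordinates, and that definition are illustrative assumptions, not taken from the paper.

```python
import math

def distance_to_active_site(coords, active_indices):
    """For each site, return the distance to the closest active residue.

    coords: list of (x, y, z) tuples, one per site (hypothetical C-alpha positions).
    active_indices: indices into coords marking the active residues.
    """
    active = [coords[i] for i in active_indices]
    return [min(math.dist(c, a) for a in active) for c in coords]

# Toy coordinates in angstroms (made up for illustration only).
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 10.0)]
distances = distance_to_active_site(coords, active_indices=[0])
print(distances)
```

Site-specific rates could then be compared against these distances, for instance with a rank correlation, to check whether a model reproduces the observed trend.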

2021 ◽  
Vol 7 ◽  
Author(s):  
Castrense Savojardo ◽  
Matteo Manfredi ◽  
Pier Luigi Martelli ◽  
Rita Casadio

Solvent accessibility (SASA) is a key feature of proteins for determining their folding and stability. SASA is computed from protein structures with different algorithms, and from protein sequences with machine-learning-based approaches trained on solved structures. Here we ask to what extent the solvent exposure of residues can be associated with the pathogenicity of a variation. In this way, the SASA of the wild-type residue acquires a role in the functional annotation of protein single-residue variations (SRVs). By mapping variations on a curated database of human protein structures, we found that residues targeted by disease-related SRVs are less accessible to solvent than residues involved in polymorphisms. The disease association is not evenly distributed among the different residue types: SRVs targeting glycine, tryptophan, tyrosine, and cysteine are more frequently disease associated than others. For all residues, the proportion of disease-related SRVs largely increases when the wild-type residue is buried and decreases when it is exposed. The extent of the increase depends on the residue type. With the aid of an in-house predictor, based on a deep learning procedure and performing at the state of the art, we are able to confirm the above tendency by analyzing a large data set of residues subjected to variations and occurring in some 12,494 human protein sequences still lacking three-dimensional structure (derived from HUMSAVAR). Our data support the notion that surface accessible area is a distinguishing property of residues that undergo variation and that pathogenicity is more frequently associated with the buried state than with the exposed one.
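The buried-versus-exposed comparison can be illustrated with a small sketch. It assumes each variant carries a relative solvent accessibility (RSA) value between 0 and 1 for the wild-type residue, and it uses a 0.2 cutoff to separate buried from exposed. Both the cutoff and the toy records are assumptions made for illustration, not values or data from the study.

```python
def disease_fraction_by_burial(variants, rsa_threshold=0.2):
    """Return the fraction of disease-related variants among buried and exposed residues.

    variants: iterable of (rsa, is_disease) pairs.
    rsa_threshold: assumed cutoff; residues with rsa below it count as buried.
    """
    counts = {"buried": [0, 0], "exposed": [0, 0]}  # [disease count, total count]
    for rsa, is_disease in variants:
        key = "buried" if rsa < rsa_threshold else "exposed"
        counts[key][1] += 1
        if is_disease:
            counts[key][0] += 1
    # None marks a class with no variants, to avoid dividing by zero.
    return {k: (d / n if n else None) for k, (d, n) in counts.items()}

# Made-up records, for illustration only.
toy_variants = [
    (0.05, True), (0.10, True), (0.15, False),   # buried residues
    (0.60, False), (0.80, True), (0.90, False),  # exposed residues
]
fractions = disease_fraction_by_burial(toy_variants)
print(fractions)
```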


2014 ◽  
Vol 289 (44) ◽  
pp. 30221-30228 ◽  
Author(s):  
Shoshana D. Brown ◽  
Patricia C. Babbitt

Understanding how enzymes have evolved offers clues about their structure-function relationships and mechanisms. Here, we describe evolution of functionally diverse enzyme superfamilies, each representing a large set of sequences that evolved from a common ancestor and that retain conserved features of their structures and active sites. Using several examples, we describe the different structural strategies nature has used to evolve new reaction and substrate specificities in each unique superfamily. The results provide insight about enzyme evolution that is not easily obtained from studies of one or only a few enzymes.


Heterocycles ◽  
2006 ◽  
Vol 67 (1) ◽  
pp. 205 ◽  
Author(s):  
Stephen Hanessian ◽  
Clément Talbot ◽  
Marc Mauduit ◽  
Parthasarathy Saravanan ◽  
Jayapal Reddy Gone

2011 ◽  
Vol 09 (04) ◽  
pp. 489-502 ◽  
Author(s):  
TABREZ ANWAR SHAMIM MOHAMMAD ◽  
HAMPAPATHALU ADIMURTHY NAGARAJARAM

The knowledge collated from known protein structures has revealed that proteins usually fold into four structural classes: all-α, all-β, α/β and α + β. A number of methods have been proposed to predict a protein's structural class from its primary structure; however, it has been observed that these methods fail or perform poorly in the case of distantly related sequences. In this paper, we propose a new method for protein structural class prediction using a dataset of low-homology (twilight-zone) protein sequences. Since protein structural class prediction is a typical classification problem, we have developed a Support Vector Machine (SVM)-based method for protein structural class prediction that uses features derived from the predicted secondary structure and predicted burial information of amino acid residues. The examination of individual features as well as feature combinations revealed that the combination of secondary structural content and the secondary structural and solvent accessibility state frequencies of amino acids gave the best leave-one-out cross-validation accuracy of ~81%, which is comparable to the best accuracy reported in the literature so far.
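One of the feature types mentioned above, secondary structural content, can be sketched as the fraction of residues predicted in each state. The three-state alphabet (H for helix, E for strand, C for coil), the function name, and the example string are assumptions for illustration; the paper's exact feature definitions and the SVM training step are not reproduced here.

```python
def secondary_structure_content(ss):
    """Return the fraction of residues in each assumed state: H, E, and C.

    ss: predicted secondary structure as a string over the alphabet "HEC".
    """
    if not ss:
        raise ValueError("secondary structure string must not be empty")
    n = len(ss)
    return {state: ss.count(state) / n for state in "HEC"}

# Hypothetical prediction for a 10-residue fragment.
features = secondary_structure_content("HHHHEECCCC")
print(features)
```

A vector of such fractions, concatenated with per-residue state frequencies, is the kind of fixed-length input a classifier such as an SVM can consume.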


2021 ◽  
Vol 8 ◽  
Author(s):  
Noah B. Herrington ◽  
Glen E. Kellogg

Aspartic acid, glutamic acid and histidine are ionizable residues that occupy various protein environments and perform many different functions in structures. Their roles are tied to their acid/base equilibria, solvent exposure, and backbone conformations. We propose that the number of unique environments for ASP, GLU and HIS is quite limited. We generated maps of these residues' environments using a hydropathic scoring function to record the type and magnitude of interactions for each residue in a 2703-protein structural dataset. These maps are backbone-dependent and suggest the existence of new structural motifs for each residue type. Additionally, we developed an algorithm for tuning these maps to any pH, a potentially useful element for protein design and structure building. Here, we elucidate the complex interplay between secondary structure, relative solvent accessibility, and residue ionization states: the degree of protonation for ionizable residues increases with solvent accessibility, which in turn is notably dependent on backbone structure.
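As background for the acid/base equilibria mentioned above, the standard Henderson-Hasselbalch relation gives the protonated fraction of an isolated ionizable group at a given pH. The sketch below is not the authors' map-tuning algorithm. The pKa values are approximate textbook figures for free side chains, included as assumptions, and actual values shift with the protein environment.

```python
def protonated_fraction(ph, pka):
    """Fraction of a titratable group in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

# Approximate model pKa values (assumed; real values depend on the environment).
MODEL_PKA = {"ASP": 3.9, "GLU": 4.3, "HIS": 6.0}

for residue, pka in MODEL_PKA.items():
    print(residue, round(protonated_fraction(7.0, pka), 4))
```

At a pH equal to the pKa the group is half protonated, and each pH unit above the pKa lowers the protonated-to-deprotonated ratio tenfold.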


Biology ◽  
2020 ◽  
Vol 9 (4) ◽  
pp. 64 ◽  
Author(s):  
Akanksha Pandey ◽  
Edward L. Braun

Phylogenomics, the use of large datasets to examine phylogeny, has revolutionized the study of evolutionary relationships. However, genome-scale data have not been able to resolve all relationships in the tree of life; this could reflect, at least in part, the poor fit of the models used to analyze heterogeneous datasets. Some of the heterogeneity may reflect the different patterns of selection on proteins based on their structures. To test that hypothesis, we developed a pipeline to divide phylogenomic protein datasets into subsets based on secondary structure and relative solvent accessibility. We then tested whether amino acids in different structural environments had distinct signals for the topology of the deepest branches in the metazoan tree. We focused on a dataset that appeared to have a mixture of signals and we found that the most striking difference in phylogenetic signal reflected relative solvent accessibility. Analyses of exposed sites (residues located on the surface of proteins) yielded a tree that placed ctenophores sister to all other animals whereas sites buried inside proteins yielded a tree with a sponge+ctenophore clade. These differences in phylogenetic signal were not ameliorated when we conducted analyses using a set of maximum-likelihood profile mixture models. These models are very similar to the Bayesian CAT model, which has been used in many analyses of deep metazoan phylogeny. In contrast, analyses conducted after recoding amino acids to limit the impact of deviations from compositional stationarity increased the congruence in the estimates of phylogeny for exposed and buried sites; after recoding, amino acid trees estimated using the exposed and buried sites both supported placement of ctenophores sister to all other animals.
Although the central conclusion of our analyses is that sites in different structural environments yield distinct trees when analyzed using models of protein evolution, our amino acid recoding analyses also have implications for metazoan evolution. Specifically, our results add to the evidence that ctenophores are the sister group of all other animals and they further suggest that the placozoa+cnidaria clade found in some other studies deserves more attention. Taken as a whole, these results provide striking evidence that it is necessary to achieve a better understanding of the constraints due to protein structure to improve phylogenetic estimation.
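The two data-preparation ideas in this abstract, splitting alignment columns by solvent exposure and recoding amino acids into a reduced alphabet, can be sketched as follows. The 0.2 RSA cutoff, the toy alignment, and the choice of the Dayhoff six-state grouping are assumptions for illustration; the abstract does not state which cutoff or recoding scheme the authors used.

```python
# Dayhoff six-state grouping (a commonly used recoding scheme; assumed here).
DAYHOFF6 = {"AGPST": "0", "DENQ": "1", "HKR": "2", "ILMV": "3", "FWY": "4", "C": "5"}
RECODE = {aa: code for group, code in DAYHOFF6.items() for aa in group}

def recode(seq):
    """Map each amino acid to its group code; anything else becomes a gap."""
    return "".join(RECODE.get(aa, "-") for aa in seq.upper())

def split_by_exposure(alignment, rsa, threshold=0.2):
    """Split alignment columns into exposed and buried subsets.

    alignment: dict of taxon name -> aligned sequence (all of equal length).
    rsa: per-column relative solvent accessibility values (assumed known).
    threshold: assumed cutoff; columns with rsa below it count as buried.
    """
    exposed_cols = [i for i, v in enumerate(rsa) if v >= threshold]
    buried_cols = [i for i, v in enumerate(rsa) if v < threshold]

    def pick(cols):
        return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

    return pick(exposed_cols), pick(buried_cols)

# Made-up two-taxon alignment, for illustration only.
alignment = {"taxon_a": "ACDKW", "taxon_b": "GCEKF"}
rsa = [0.05, 0.10, 0.70, 0.90, 0.15]
exposed, buried = split_by_exposure(alignment, rsa)
print(exposed, buried, recode(alignment["taxon_a"]))
```

Each subset, recoded or not, could then be passed to a tree-inference program separately, so that the resulting topologies can be compared.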


2005 ◽  
Vol 351 (2-4) ◽  
pp. 439-447 ◽  
Author(s):  
Muyoung Heo ◽  
Mookyung Cheon ◽  
Eun-Joung Moon ◽  
Suhkmann Kim ◽  
Kwanghoon Chung ◽  
...  
