Beyond stability constraints: a biophysical model of enzyme evolution with selection for stability and activity

bioRxiv ◽

10.1101/399154 ◽

2018 ◽

Author(s):

Julian Echave

Keyword(s):

Amino Acids ◽

Packing Density ◽

Solvent Accessibility ◽

Enzyme Evolution ◽

Biophysical Model ◽

Solvent Exposure ◽

Local Environments ◽

Distance Dependence ◽

Selection For ◽

Functionally Diverse

Proteins trace trajectories in sequence space as their amino acids become substituted by other amino acids. The number of substitutions per unit time, the rate of evolution, varies among sites because of biophysical constraints. Several properties that characterize sites’ local environments have been proposed as biophysical determinants of site-specific evolutionary rates. Thus, rate increases with increasing solvent exposure, increasing flexibility, and decreasing local packing density. For enzymes, rate increases also with increasing distance from the protein’s active residues, presumably due to functional constraints. The dependence of rates on solvent accessibility, packing density, and flexibility has been mechanistically explained in terms of selection for stability. However, as I show here, a stability-based model fails to reproduce the observed rate-distance dependence, overestimating rates close to the active residues and underestimating rates of distant sites. Here, I pose a new biophysical model of enzyme evolution with selection for stability and activity (MSA) and compare it with a stability-based counterpart (MS). Testing these models on a structurally and functionally diverse dataset of monomeric enzymes, I found that MSA fits observed rates better than MS for most proteins. While both models reproduce the observed dependence of rates on solvent accessibility, packing, and flexibility, MSA fits these dependencies somewhat better. Importantly, while MS fails to reproduce the dependence of rates on distance from the active residues, MSA accounts for the rate-distance dependence quantitatively. Thus, the variation of evolutionary rate among enzyme sites is mechanistically underpinned by natural selection for both stability and activity.

Beyond Stability Constraints: A Biophysical Model of Enzyme Evolution with Selection on Stability and Activity

Molecular Biology and Evolution ◽

10.1093/molbev/msy244 ◽

2018 ◽

Vol 36 (3) ◽

pp. 613-620 ◽

Cited By ~ 10

Author(s):

Julian Echave

Keyword(s):

Enzyme Evolution ◽

Biophysical Model

Solvent Accessibility of Residues Undergoing Pathogenic Variations in Humans: From Protein Structures to Protein Sequences

Frontiers in Molecular Biosciences ◽

10.3389/fmolb.2020.626363 ◽

2021 ◽

Vol 7 ◽

Author(s):

Castrense Savojardo ◽

Matteo Manfredi ◽

Pier Luigi Martelli ◽

Rita Casadio

Keyword(s):

Solvent Accessibility ◽

Protein Structures ◽

Three Dimensional ◽

Protein Sequences ◽

Large Data ◽

Human Protein ◽

Dimensional Structure ◽

Wild Type ◽

Solvent Exposure ◽

Data Set

Solvent accessibility (SASA) is a key feature of proteins for determining their folding and stability. SASA is computed from protein structures with different algorithms, and from protein sequences with machine-learning-based approaches trained on solved structures. Here we ask to what extent the solvent exposure of residues can be associated with the pathogenicity of a variation. In this way, the SASA of the wild-type residue acquires a role in the functional annotation of protein single-residue variations (SRVs). By mapping variations on a curated database of human protein structures, we found that residues targeted by disease-related SRVs are less accessible to solvent than residues involved in polymorphisms. The disease association is not evenly distributed among the different residue types: SRVs targeting glycine, tryptophan, tyrosine, and cysteine are more frequently disease associated than others. For all residues, the proportion of disease-related SRVs largely increases when the wild-type residue is buried and decreases when it is exposed. The extent of the increase depends on the residue type. With the aid of an in-house predictor, based on a deep-learning procedure and performing at the state of the art, we confirm this tendency by analyzing a large data set of residues subject to variation and occurring in some 12,494 human protein sequences still lacking three-dimensional structure (derived from HUMSAVAR). Our data support the notion that accessible surface area is a distinguishing property of residues that undergo variation and that pathogenicity is more frequently associated with buried residues than with exposed ones.
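The buried-versus-exposed comparison in this abstract can be illustrated with a minimal sketch. It assumes a relative solvent accessibility (RSA) cutoff of 0.20 for calling a residue buried and uses invented toy records. Neither the cutoff nor the data comes from the paper, and the function name `disease_fraction_by_burial` is hypothetical.

```python
from collections import defaultdict

# Assumed cutoff: residues with RSA below this value are called "buried".
# The abstract does not state the threshold used by the authors.
RSA_BURIED_CUTOFF = 0.20


def disease_fraction_by_burial(variants, cutoff=RSA_BURIED_CUTOFF):
    """Return the fraction of disease-related variants per burial state.

    `variants` is an iterable of (rsa, is_disease) pairs, where `rsa` is the
    relative solvent accessibility of the wild-type residue (0.0 to 1.0).
    """
    counts = defaultdict(lambda: [0, 0])  # state -> [disease count, total count]
    for rsa, is_disease in variants:
        state = "buried" if rsa < cutoff else "exposed"
        counts[state][0] += int(is_disease)
        counts[state][1] += 1
    return {state: disease / total for state, (disease, total) in counts.items()}


# Toy records, invented purely for illustration.
toy_variants = [
    (0.05, True), (0.10, True), (0.15, False), (0.02, True),
    (0.45, False), (0.60, False), (0.80, True), (0.35, False),
]
fractions = disease_fraction_by_burial(toy_variants)
```

With real data, the same tally could be split further by wild-type residue type, since the abstract reports that the size of the effect depends on the residue.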

New Insights about Enzyme Evolution from Large Scale Studies of Sequence and Structure Relationships

Journal of Biological Chemistry ◽

10.1074/jbc.r114.569350 ◽

2014 ◽

Vol 289 (44) ◽

pp. 30221-30228 ◽

Cited By ~ 46

Author(s):

Shoshana D. Brown ◽

Patricia C. Babbitt

Keyword(s):

Structure Function ◽

Active Sites ◽

Large Scale ◽

Common Ancestor ◽

Enzyme Evolution ◽

Large Set ◽

Substrate Specificities ◽

Functionally Diverse

Understanding how enzymes have evolved offers clues about their structure-function relationships and mechanisms. Here, we describe evolution of functionally diverse enzyme superfamilies, each representing a large set of sequences that evolved from a common ancestor and that retain conserved features of their structures and active sites. Using several examples, we describe the different structural strategies nature has used to evolve new reaction and substrate specificities in each unique superfamily. The results provide insight about enzyme evolution that is not easily obtained from studies of one or only a few enzymes.

From Amino Acids to Polycyclic Heterocycles - Synthesis of Enantiopure, Functionally Diverse Isopavines and Dihydromethanodibenzoazocines

Heterocycles ◽

10.3987/com-05-s(t)13 ◽

2006 ◽

Vol 67 (1) ◽

pp. 205 ◽

Cited By ~ 5

Author(s):

Stephen Hanessian ◽

Clément Talbot ◽

Marc Mauduit ◽

Parthasarathy Saravanan ◽

Jayapal Reddy Gone

Keyword(s):

Amino Acids ◽

Functionally Diverse

SVM-BASED METHOD FOR PROTEIN STRUCTURAL CLASS PREDICTION USING SECONDARY STRUCTURAL CONTENT AND STRUCTURAL INFORMATION OF AMINO ACIDS

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720011005422 ◽

2011 ◽

Vol 09 (04) ◽

pp. 489-502 ◽

Cited By ~ 6

Author(s):

TABREZ ANWAR SHAMIM MOHAMMAD ◽

HAMPAPATHALU ADIMURTHY NAGARAJARAM

Keyword(s):

Amino Acids ◽

Structural Information ◽

Solvent Accessibility ◽

Protein Structures ◽

Classification Problem ◽

Support Vector ◽

Class Prediction ◽

Structural Class ◽

Protein Structural Class ◽

Structural Content

Knowledge collated from known protein structures has revealed that proteins usually fold into four structural classes: all-α, all-β, α/β and α + β. A number of methods have been proposed to predict a protein's structural class from its primary structure; however, these methods fail or perform poorly on distantly related sequences. In this paper, we propose a new method for protein structural class prediction using a dataset of low-homology (twilight-zone) protein sequences. Since protein structural class prediction is a typical classification problem, we have developed a Support Vector Machine (SVM)-based method that uses features derived from the predicted secondary structure and predicted burial information of amino acid residues. Examination of individual features as well as feature combinations revealed that the combination of secondary structural content with the secondary structural and solvent accessibility state frequencies of amino acids gave the best leave-one-out cross-validation accuracy, ~81%, which is comparable to the best accuracy reported in the literature so far.
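Leave-one-out cross-validation, the evaluation scheme named in this abstract, can be sketched as follows. This is only an illustration: a nearest-centroid classifier stands in for the SVM, the two features (helix and strand fractions) are a simplification of the feature set described, and the toy samples are invented. All function names are hypothetical.

```python
import math


def nearest_centroid_predict(train, query):
    """Predict the label whose class centroid is closest to `query`.

    This simple classifier is a stand-in for the SVM used in the paper.
    """
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}
    return min(centroids, key=lambda lab: math.dist(centroids[lab], query))


def leave_one_out_accuracy(samples):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        correct += nearest_centroid_predict(train, features) == label
    return correct / len(samples)


# Toy feature vectors (helix fraction, strand fraction), invented for illustration.
toy_samples = [
    ((0.70, 0.05), "all-alpha"), ((0.65, 0.10), "all-alpha"), ((0.75, 0.02), "all-alpha"),
    ((0.05, 0.60), "all-beta"), ((0.10, 0.55), "all-beta"), ((0.08, 0.65), "all-beta"),
]
accuracy = leave_one_out_accuracy(toy_samples)
```

Each sample is scored by a model that never saw it, which is why the scheme suits small datasets such as low-homology benchmark sets.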

Selection for transport competence of C-terminal polypeptides derived from Escherichia coli hemolysin: the shortest peptide capable of autonomous HlyB/HlyD-dependent secretion comprises the C-terminal 62 amino acids of HlyA

MGG Molecular & General Genetics ◽

10.1007/bf00279750 ◽

1994 ◽

Vol 245 (1) ◽

pp. 53-60 ◽

Cited By ~ 50

Author(s):

T. Jarchau ◽

T. Chakraborty ◽

F. Garcia ◽

W. Goebel

Keyword(s):

Escherichia Coli ◽

Amino Acids ◽

Selection For ◽

Escherichia Coli Hemolysin

3D Interaction Homology: Computational Titration of Aspartic Acid, Glutamic Acid and Histidine Can Create pH-Tunable Hydropathic Environment Maps

Frontiers in Molecular Biosciences ◽

10.3389/fmolb.2021.773385 ◽

2021 ◽

Vol 8 ◽

Author(s):

Noah B. Herrington ◽

Glen E. Kellogg

Keyword(s):

Glutamic Acid ◽

Aspartic Acid ◽

Protein Design ◽

Solvent Accessibility ◽

Scoring Function ◽

Relative Solvent Accessibility ◽

Solvent Exposure ◽

3D Interaction ◽

Ionizable Residues ◽

Environment Maps

Aspartic acid, glutamic acid and histidine are ionizable residues that occupy various protein environments and perform many different functions in structures. Their roles are tied to their acid/base equilibria, solvent exposure, and backbone conformations. We propose that the number of unique environments for ASP, GLU and HIS is quite limited. We generated maps of these residues' environments using a hydropathic scoring function to record the type and magnitude of interactions for each residue in a 2703-protein structural dataset. These maps are backbone-dependent and suggest the existence of new structural motifs for each residue type. Additionally, we developed an algorithm for tuning these maps to any pH, a potentially useful element for protein design and structure building. Here, we elucidate the complex interplay between secondary structure, relative solvent accessibility, and residue ionization states: the degree of protonation for ionizable residues increases with solvent accessibility, which in turn is notably dependent on backbone structure.

Phylogenetic Analyses of Sites in Different Protein Structural Environments Result in Distinct Placements of the Metazoan Root

Biology ◽

10.3390/biology9040064 ◽

2020 ◽

Vol 9 (4) ◽

pp. 64 ◽

Cited By ~ 6

Author(s):

Akanksha Pandey ◽

Edward L. Braun

Keyword(s):

Amino Acids ◽

Amino Acid ◽

Solvent Accessibility ◽

Phylogenetic Signal ◽

Phylogenetic Analyses ◽

Sister Group ◽

Striking Difference ◽

Relative Solvent Accessibility ◽

Protein Datasets ◽

The Impact

Phylogenomics, the use of large datasets to examine phylogeny, has revolutionized the study of evolutionary relationships. However, genome-scale data have not been able to resolve all relationships in the tree of life; this could reflect, at least in part, the poor fit of the models used to analyze heterogeneous datasets. Some of the heterogeneity may reflect the different patterns of selection on proteins based on their structures. To test that hypothesis, we developed a pipeline to divide phylogenomic protein datasets into subsets based on secondary structure and relative solvent accessibility. We then tested whether amino acids in different structural environments had distinct signals for the topology of the deepest branches in the metazoan tree. We focused on a dataset that appeared to have a mixture of signals and we found that the most striking difference in phylogenetic signal reflected relative solvent accessibility. Analyses of exposed sites (residues located on the surface of proteins) yielded a tree that placed ctenophores sister to all other animals, whereas sites buried inside proteins yielded a tree with a sponge+ctenophore clade. These differences in phylogenetic signal were not ameliorated when we conducted analyses using a set of maximum-likelihood profile mixture models. These models are very similar to the Bayesian CAT model, which has been used in many analyses of deep metazoan phylogeny. In contrast, analyses conducted after recoding amino acids to limit the impact of deviations from compositional stationarity increased the congruence in the estimates of phylogeny for exposed and buried sites; after recoding, the amino acid trees estimated using the exposed and buried sites both supported placement of ctenophores sister to all other animals.
Although the central conclusion of our analyses is that sites in different structural environments yield distinct trees when analyzed using models of protein evolution, our amino acid recoding analyses also have implications for metazoan evolution. Specifically, our results add to the evidence that ctenophores are the sister group of all other animals, and they further suggest that the placozoa+cnidaria clade found in some other studies deserves more attention. Taken as a whole, these results provide striking evidence that it is necessary to achieve a better understanding of the constraints due to protein structure to improve phylogenetic estimation.

Extension of the pairwise-contact energy parameters for proteins with the local environments of amino acids

Physica A Statistical Mechanics and its Applications ◽

10.1016/j.physa.2004.12.044 ◽

2005 ◽

Vol 351 (2-4) ◽

pp. 439-447 ◽

Cited By ~ 2

Author(s):

Muyoung Heo ◽

Mookyung Cheon ◽

Eun-Joung Moon ◽

Suhkmann Kim ◽

Kwanghoon Chung ◽

...

Keyword(s):

Amino Acids ◽

Energy Parameters ◽

Local Environments

Site-specific solvent exposure analysis of a membrane protein using unnatural amino acids and 19F nuclear magnetic resonance

Biochemical and Biophysical Research Communications ◽

10.1016/j.bbrc.2011.09.082 ◽

2011 ◽

Vol 414 (2) ◽

pp. 379-383 ◽

Cited By ~ 2

Author(s):

Pan Shi ◽

Dong Li ◽

Hongwei Chen ◽

Ying Xiong ◽

Changlin Tian

Keyword(s):

Amino Acids ◽

Nuclear Magnetic Resonance ◽

Magnetic Resonance ◽

Membrane Protein ◽

Unnatural Amino Acids ◽

Solvent Exposure ◽

Site Specific ◽

Exposure Analysis ◽

Nuclear Magnetic
