Site-specific amino-acid preferences are mostly conserved in two closely related protein homologs

2015 ◽  
Author(s):  
Michael B Doud ◽  
Orr Ashenberg ◽  
Jesse Bloom

Evolution drives changes in a protein's sequence over time. The extent to which these changes in sequence lead to shifts in the underlying preference for each amino acid at each site is an important question with implications for comparative sequence-analysis methods such as molecular phylogenetics. To quantify the extent to which site-specific amino-acid preferences shift during evolution, we performed deep mutational scanning on two homologs of human influenza nucleoprotein with 94% amino-acid identity. We found that only a modest fraction of sites exhibited shifts in amino-acid preferences that exceeded the noise in our experiments. Furthermore, even among sites that did exhibit detectable shifts, the magnitude tended to be small relative to differences between non-homologous proteins. Given the limited change in amino-acid preferences between these close homologs, we tested whether our measurements could inform site-specific substitution models that describe the evolution of nucleoproteins from more diverse influenza viruses. We found that site-specific evolutionary models informed by our experiments greatly outperformed non-site-specific alternatives in fitting phylogenies of nucleoproteins from human, swine, equine, and avian influenza. Combining the experimental data from both homologs improved phylogenetic fit, partly because measurements in multiple genetic contexts better captured the evolutionary average of the amino-acid preferences for sites with shifting preferences. Our results show that site-specific amino-acid preferences are sufficiently conserved that measuring mutational effects in one protein provides information that can improve quantitative evolutionary modeling of nearby homologs.
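One way to make a "shift in amino-acid preferences" concrete is to compare the two homologs' preference vectors at a site with a distance measure. The sketch below assumes the Jensen-Shannon distance, which the abstract does not name; the function names and the toy preference values are hypothetical and are not the authors' data or pipeline.

```python
import math

def normalize(prefs):
    """Scale non-negative preference values so they sum to 1."""
    total = sum(prefs.values())
    return {aa: value / total for aa, value in prefs.items()}

def js_distance(prefs_a, prefs_b):
    """Jensen-Shannon distance (log base 2) between two preference dicts.

    Returns 0 for identical preferences and 1 for non-overlapping ones.
    The choice of this metric is an assumption made for illustration.
    """
    keys = sorted(set(prefs_a) | set(prefs_b))
    p = normalize({k: prefs_a.get(k, 0.0) for k in keys})
    q = normalize({k: prefs_b.get(k, 0.0) for k in keys})
    # Mixture distribution halfway between p and q.
    m = {k: 0.5 * (p[k] + q[k]) for k in keys}

    def kl_to_mixture(dist):
        # Kullback-Leibler divergence from dist to the mixture m.
        return sum(dist[k] * math.log2(dist[k] / m[k]) for k in keys if dist[k] > 0)

    jsd = 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)
    return math.sqrt(max(jsd, 0.0))

# Hypothetical preferences for one site measured in each homolog.
site_in_homolog_1 = {"A": 0.70, "S": 0.20, "T": 0.10}
site_in_homolog_2 = {"A": 0.60, "S": 0.25, "T": 0.15}
shift = js_distance(site_in_homolog_1, site_in_homolog_2)
print(f"shift at this site: {shift:.3f}")
```

In practice, a per-site shift would only count as detectable if it exceeded the distance observed between replicate measurements of the same homolog, which is the "noise" the abstract refers to.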

2020 ◽  
Author(s):  
Mackenzie M. Johnson ◽  
Claus O. Wilke

Abstract In many applications of evolutionary inference, a model of protein evolution needs to be fitted to the amino acid variation at individual sites in a multiple sequence alignment. Most existing models fall into one of two extremes: Either they provide a coarse-grained description that lacks biophysical realism (e.g. dN/dS models), or they require a large number of parameters to be fitted (e.g. mutation–selection models). Here, we ask whether a middle ground is possible: Can we obtain a realistic description of site-specific amino acid frequencies while severely restricting the number of free parameters in the model? We show that a distribution with a single free parameter can accurately capture the variation in amino acid frequency at most sites in an alignment, as long as we are willing to restrict our analysis to predicting amino acid frequencies by rank rather than by amino acid identity. This result holds equally well both in alignments of empirical protein sequences and of sequences evolved under a biophysically realistic all-atom force field. Our analysis reveals a near universal shape of the frequency distributions of amino acids. This insight has the potential to lead to new models of evolution that have both increased realism and a limited number of free parameters.
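The idea of describing rank-ordered amino acid frequencies at a site with a single free parameter can be sketched as follows. The abstract does not state the functional form, so the exponential decay over ranks used here is an assumption, as are the function names, the grid-search fit, and the example counts.

```python
import math

def ranked_model(lam, n_states=20):
    """Assumed one-parameter model: frequency of the rank-k amino acid
    is proportional to exp(-lam * k), for k = 0 .. n_states - 1."""
    weights = [math.exp(-lam * rank) for rank in range(n_states)]
    total = sum(weights)
    return [w / total for w in weights]

def fit_lambda(counts, n_states=20, step=0.01, max_lam=5.0):
    """Fit lam by grid search, minimizing squared error against the
    observed frequencies sorted from most to least common."""
    ranked = sorted(counts, reverse=True)[:n_states]
    ranked += [0] * (n_states - len(ranked))  # unobserved amino acids
    total = sum(ranked)
    observed = [c / total for c in ranked]

    best_lam, best_sse = 0.0, float("inf")
    n_steps = int(round(max_lam / step))
    for i in range(n_steps + 1):
        lam = i * step
        predicted = ranked_model(lam, n_states)
        sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
        if sse < best_sse:
            best_lam, best_sse = lam, sse
    return best_lam

# Hypothetical amino acid counts in one alignment column (identity ignored).
column_counts = [52, 25, 12, 6, 3, 2]
print("fitted lambda:", fit_lambda(column_counts))
```

Sorting the counts before fitting is what restricts the model to predicting frequencies by rank rather than by amino acid identity; a larger fitted value corresponds to a more conserved site under this assumed form.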


2018 ◽  
Author(s):  
Sarah K Hilton ◽  
Jesse D Bloom

Molecular phylogenetics is often used to estimate the time since the divergence of modern gene sequences. For highly diverged sequences, such phylogenetic techniques sometimes estimate surprisingly recent divergence times. In the case of viruses, independent evidence indicates that the estimates of deep divergence times from molecular phylogenetics are sometimes too recent. This discrepancy is caused in part by inadequate models of purifying selection leading to branch-length underestimation. Here we examine the effect on branch-length estimation of using models that incorporate experimental measurements of purifying selection. We find that models informed by experimentally measured site-specific amino-acid preferences estimate longer deep branches on phylogenies of influenza virus hemagglutinin. This lengthening of branches is due to more realistic stationary states of the models, and is mostly independent of the branch-length extension from modeling site-to-site variation in amino-acid substitution rate. The branch-length extension from experimentally informed site-specific models is similar to that achieved by other approaches that allow the stationary state to vary across sites. However, the improvements from all of these site-specific but time-homogeneous and site-independent models are limited by the fact that a protein's amino-acid preferences gradually shift as it evolves. Overall, our work underscores the importance of modeling site-specific amino-acid preferences when estimating deep divergence times, but also shows the inherent limitations of approaches that fail to account for how these preferences shift over time.


1992 ◽  
Vol 62 (1) ◽  
pp. 77-78 ◽  
Author(s):  
D. Kosk-Kosicka ◽  
T. Bzdega ◽  
A. Wawrzynow ◽  
D.M. Watterson ◽  
T.J. Lukas

2004 ◽  
Vol 22 (3) ◽  
pp. 630-638 ◽  
Author(s):  
Markus Porto ◽  
H. Eduardo Roman ◽  
Michele Vendruscolo ◽  
Ugo Bastolla

2021 ◽  
Vol 3 (3) ◽  
Author(s):  
Tair Shauli ◽  
Nadav Brandes ◽  
Michal Linial

Abstract Human genetic variation in coding regions is fundamental to the study of protein structure and function. Most methods for interpreting missense variants consider substitution measures derived from homologous proteins across different species. In this study, we introduce human-specific amino acid (AA) substitution matrices that are based on genetic variations in the modern human population. We analyzed the frequencies of >4.8M single nucleotide variants (SNVs) at codon and AA resolution and compiled human-centric substitution matrices that are fundamentally different from classic cross-species matrices (e.g. BLOSUM, PAM). Our matrices are asymmetric, with some AA replacements showing significant directional preference. Moreover, these AA matrices are only partly predicted by nucleotide substitution rates. We further test the utility of our matrices in exposing functional signals of experimentally validated protein annotations. A significant reduction in AA transition frequencies was observed across nine post-translational modification (PTM) types and four ion-binding sites. Our results suggest a purifying selection signal in the human proteome across a diverse set of functional protein annotations and provide an empirical baseline for interpreting human genetic variation in coding regions.
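A directional (asymmetric) substitution matrix of the kind described above can be illustrated by counting reference-to-alternate amino acid changes and normalizing each row. This is a minimal sketch, assuming simple row normalization, which the abstract does not specify; the function name and the toy variant list are hypothetical.

```python
from collections import Counter, defaultdict

def substitution_matrix(variants):
    """Build a row-normalized matrix from (reference_aa, alternate_aa) pairs.

    matrix[ref][alt] is the fraction of observed changes away from `ref`
    that lead to `alt`. Synonymous pairs (ref == alt) are ignored.
    Row normalization is an assumption made for illustration.
    """
    counts = Counter((ref, alt) for ref, alt in variants if ref != alt)
    row_totals = Counter()
    for (ref, _alt), n in counts.items():
        row_totals[ref] += n

    matrix = defaultdict(dict)
    for (ref, alt), n in counts.items():
        matrix[ref][alt] = n / row_totals[ref]
    return {ref: dict(row) for ref, row in matrix.items()}

# Hypothetical missense variants as (reference, alternate) amino acids.
toy_variants = [("A", "V")] * 3 + [("V", "A")] + [("A", "T")]
matrix = substitution_matrix(toy_variants)
print("A -> V:", matrix["A"]["V"])
print("V -> A:", matrix["V"]["A"])
```

Because the counts are kept directional, the entry for A to V need not equal the entry for V to A, unlike the symmetric scores in cross-species matrices such as BLOSUM.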


2000 ◽  
Vol 46 (9) ◽  
pp. 1478-1486 ◽  
Author(s):  
Allan S Hoffman

Abstract Polymers that respond to small changes in environmental stimuli with large, sometimes discontinuous changes in their physical state or properties are often called “intelligent” or “smart” polymers. We have conjugated these polymers to different recognition proteins, including antibodies, protein A, streptavidin, and enzymes. These bioconjugates have been prepared by random polymer conjugation to lysine amino groups on the protein surface, and also by site-specific conjugation of the polymer to specific amino acid sites, such as cysteine sulfhydryl groups, that are genetically engineered into the known amino acid sequence of the protein. We have conjugated several different smart polymers to streptavidin, including temperature-, pH-, and light-sensitive polymers. The preparation of these conjugates and their many fascinating applications are reviewed here.

