Structural Phylogenetics with Confidence

2020 ◽  
Vol 37 (9) ◽  
pp. 2711-2726
Author(s):  
Ashar J Malik ◽  
Anthony M Poole ◽  
Jane R Allison

Abstract For evaluating the deepest evolutionary relationships among proteins, sequence similarity is too low for application of sequence-based homology search or phylogenetic methods. In such cases, comparison of protein structures, which are often better conserved than sequences, may provide an alternative means of uncovering deep evolutionary signal. Although major protein structure databases such as SCOP and CATH hierarchically group protein structures, they do not describe the specific evolutionary relationships within a hierarchical level. Structural phylogenies have the potential to fill this gap. However, it is difficult to assess evolutionary relationships derived from structural phylogenies without some means of assessing confidence in such trees. We therefore address two shortcomings in the application of structural data to deep phylogeny. First, we examine whether phylogenies derived from pairwise structural comparisons are sensitive to differences in protein length and shape. We find that structural phylogenetics is best employed where structures have very similar lengths, and that shape fluctuations generated during molecular dynamics simulations impact pairwise comparisons, but not so drastically as to eliminate evolutionary signal. Second, we address the absence of statistical support for structural phylogeny. We present a method for assessing confidence in a structural phylogeny using shape fluctuations generated via molecular dynamics or Monte Carlo simulations of proteins. Our approach will aid the evolutionary reconstruction of relationships across structurally defined protein superfamilies. With the Protein Data Bank now containing in excess of 158,000 entries (December 2019), we predict that structural phylogenetics will become a useful tool for ordering the protein universe.
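
As a rough illustration of the resampling idea in the abstract above, the following Python sketch draws one conformation per protein from a simulated ensemble, builds a tree from the resulting pairwise distances, and reports how often each clade recurs across replicates. Everything concrete here is an assumption made for illustration: the 2D feature vectors and Euclidean distance stand in for real structures and a structural comparison score, UPGMA stands in for whichever tree-building method is actually used, and the names `upgma_clades` and `clade_support` are hypothetical. It is not the authors' implementation.

```python
import itertools
import math
import random
from collections import Counter


def upgma_clades(names, dist):
    """Return the non-trivial clades (frozensets of names) of a UPGMA tree.

    dist maps frozenset({a, b}) to the distance between leaves a and b.
    """
    clusters = [frozenset([n]) for n in names]
    clades = set()

    def avg(c1, c2):
        # Average-linkage distance: mean over all leaf pairs across clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    # Stop at two clusters: the final merge would only yield the full leaf set.
    while len(clusters) > 2:
        c1, c2 = min(itertools.combinations(clusters, 2), key=lambda p: avg(*p))
        merged = c1 | c2
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
        clades.add(merged)
    return clades


def clade_support(ensembles, n_replicates=100, seed=0):
    """Fraction of replicate trees in which each clade appears."""
    rng = random.Random(seed)
    names = sorted(ensembles)
    counts = Counter()
    for _ in range(n_replicates):
        # One randomly chosen snapshot per protein defines this replicate.
        pick = {n: rng.choice(ensembles[n]) for n in names}
        dist = {
            frozenset((a, b)): math.dist(pick[a], pick[b])
            for a, b in itertools.combinations(names, 2)
        }
        counts.update(upgma_clades(names, dist))
    return {clade: c / n_replicates for clade, c in counts.items()}


# Toy ensembles (illustrative values only): three "snapshots" per protein,
# each reduced to a 2D feature vector. Real input would be MD or MC frames.
ensembles = {
    "A": [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2)],
    "B": [(1.0, 0.0), (1.1, -0.2), (0.9, 0.1)],
    "C": [(10.0, 0.0), (10.2, 0.1), (9.9, -0.1)],
    "D": [(11.0, 0.0), (10.8, 0.2), (11.1, 0.1)],
}

if __name__ == "__main__":
    for clade, support in clade_support(ensembles, n_replicates=50).items():
        print(sorted(clade), support)
```

The support values play a role loosely analogous to bootstrap proportions in sequence phylogenetics: a clade that survives most resampled conformations is less likely to be an artifact of one particular snapshot.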

2020 ◽  
Vol 21 (4) ◽  
pp. 1352 ◽  
Author(s):  
János András Mótyán ◽  
Márió Miczi ◽  
József Tőzsér

The life cycles of retroviruses rely on the limited proteolysis catalyzed by the viral protease. Numerous eukaryotic organisms also endogenously express such proteases, which originate from retrotransposons or retroviruses, including DNA damage-inducible 1 and 2 (Ddi1 and Ddi2, respectively) proteins. In this study, we performed a comparative analysis based on the structural data currently available in the Protein Data Bank (PDB) and Structural summaries of PDB entries (PDBsum) databases, with a special emphasis on the regions involved in dimerization of retroviral and retroviral-like Ddi proteases. In addition to Ddi1 and Ddi2, at least one member of all seven genera of the Retroviridae family was included in this comparison. We found that the studied retroviral and non-viral proteases show differences in the mode of dimerization and density of intermonomeric contacts, and that the distribution of the structural characteristics agrees with their evolutionary relationships. Multiple sequence and structure alignments revealed that the interactions between the subunits depend mainly on the overall organization of the dimer interface. We think that a better understanding of the general and specific features of proteases may support the characterization of retroviral-like proteases.


2020 ◽  
Vol 49 (D1) ◽  
pp. D452-D457
Author(s):  
Lisanna Paladin ◽  
Martina Bevilacqua ◽  
Sara Errigo ◽  
Damiano Piovesan ◽  
Ivan Mičetić ◽  
...  

Abstract The RepeatsDB database (URL: https://repeatsdb.org/) provides annotations and classification for protein tandem repeat structures from the Protein Data Bank (PDB). Protein tandem repeats are ubiquitous in all branches of the tree of life. The accumulation of solved repeat structures provides new possibilities for classification and detection, but also increases the need for annotation. Here we present RepeatsDB 3.0, which addresses these challenges and presents an extended classification scheme. The major conceptual change compared to the previous version is the hierarchical classification combining top levels based solely on structural similarity (Class > Topology > Fold) with two new levels (Clan > Family) requiring sequence similarity and describing repeat motifs in collaboration with Pfam. Data growth has been addressed with improved mechanisms for browsing the classification hierarchy. A new UniProt-centric view unifies the increasingly frequent annotation of structures from identical or similar sequences. This update of RepeatsDB aligns with our commitment to develop a resource that extracts, organizes and distributes specialized information on tandem repeat protein structures.


2014 ◽  
Vol 70 (9) ◽  
pp. 2344-2355 ◽  
Author(s):  
Ryan McGreevy ◽  
Abhishek Singharoy ◽  
Qufei Li ◽  
Jingfen Zhang ◽  
Dong Xu ◽  
...  

X-ray crystallography remains the most dominant method for solving atomic structures. However, for relatively large systems, the availability of only medium-to-low-resolution diffraction data often limits the determination of all-atom details. A new molecular dynamics flexible fitting (MDFF)-based approach, xMDFF, for determining structures from such low-resolution crystallographic data is reported. xMDFF employs a real-space refinement scheme that flexibly fits atomic models into an iteratively updating electron-density map. It addresses significant large-scale deformations of the initial model to fit the low-resolution density, as tested with synthetic low-resolution maps of D-ribose-binding protein. xMDFF has been successfully applied to re-refine six low-resolution protein structures of varying sizes that had already been submitted to the Protein Data Bank. Finally, via systematic refinement of a series of data from 3.6 to 7 Å resolution, xMDFF refinements together with electrophysiology experiments were used to validate the first all-atom structure of the voltage-sensing protein Ci-VSP.


1998 ◽  
Vol 54 (6) ◽  
pp. 1147-1154 ◽  
Author(s):  
Tim J. P. Hubbard ◽  
Bart Ailey ◽  
Steven E. Brenner ◽  
Alexey G. Murzin ◽  
Cyrus Chothia

The Structural Classification of Proteins (SCOP) database provides a detailed and comprehensive description of the relationships of all known protein structures. The classification is on hierarchical levels: the first two levels, family and superfamily, describe near and far evolutionary relationships; the third, fold, describes geometrical relationships. The distinction between evolutionary relationships and those that arise from the physics and chemistry of proteins is a feature that is unique to this database, so far. The database can be used as a source of data to calibrate sequence search algorithms and for the generation of population statistics on protein structures. The database and its associated files are freely accessible from a number of WWW sites mirrored from URL http://scop.mrc-lmb.cam.ac.uk/scop/.


2017 ◽  
Author(s):  
Yang Liu ◽  
Qing Ye ◽  
Liwei Wang ◽  
Jian Peng

Abstract Motivation: Understanding the relationship between protein structure and function is a fundamental problem in protein science. Given a protein of unknown function, fast identification of similar protein structures from the Protein Data Bank (PDB) is a critical step for inferring its biological function. Such structural neighbors can provide evolutionary insights into protein conformation, interfaces and binding sites that are not detectable from sequence similarity. However, the computational cost of performing pairwise structural alignment against all structures in PDB is prohibitively expensive. Alignment-free approaches have been introduced to enable fast but coarse comparisons by representing each protein as a vector of structure features or fingerprints and only computing similarity between vectors. As a notable example, FragBag represents each protein by a “bag of fragments”, which is a vector of frequencies of contiguous short backbone fragments from a predetermined library. Results: Here we present a new approach to learning effective structural motif presentations using deep learning. We develop DeepFold, a deep convolutional neural network model to extract structural motif features of a protein structure. Similar to FragBag, DeepFold represents each protein structure or fold using a vector of learned structural motif features. We demonstrate that DeepFold substantially outperforms FragBag on protein structural search on a non-redundant protein structure database and a set of newly released structures. Remarkably, DeepFold not only extracts meaningful backbone segments but also finds important long-range interacting motifs for structural comparison. We expect that DeepFold will provide new insights into the evolution and hierarchical organization of protein structural motifs. Availability: https://github.com/largelymfs/[email protected]
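
To make the "bag of fragments" representation described above more concrete, here is a minimal Python sketch of an alignment-free comparison: each protein is reduced to a vector of fragment frequencies, and two proteins are compared by the cosine of the angle between their vectors. It assumes that every backbone fragment has already been assigned the index of its closest library fragment; that assignment step is omitted. The function names, the library size of 4, and the toy fragment lists are hypothetical, and the cosine measure is one common choice rather than a claim about what FragBag or DeepFold uses.

```python
import math
from collections import Counter


def bag_of_fragments(fragment_ids, library_size):
    """Count how often each library fragment occurs along the backbone."""
    counts = Counter(fragment_ids)
    return [counts.get(i, 0) for i in range(library_size)]


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy fragment assignments (illustrative only), using a 4-fragment library.
protein_a = [0, 1, 1, 2]
protein_b = [1, 1, 0, 2]  # same fragments as protein_a, different order
protein_c = [3, 3, 3, 3]  # shares no fragments with protein_a

vec_a = bag_of_fragments(protein_a, library_size=4)
vec_b = bag_of_fragments(protein_b, library_size=4)
vec_c = bag_of_fragments(protein_c, library_size=4)

if __name__ == "__main__":
    print(cosine_similarity(vec_a, vec_b))
    print(cosine_similarity(vec_a, vec_c))
```

Because the vector ignores where each fragment sits in the chain, `protein_a` and `protein_b` are indistinguishable in this representation. That is the "fast but coarse" trade-off the abstract refers to.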


2020 ◽  
Author(s):  
Florencia Klein ◽  
Daniela Cáceres-Rojas ◽  
Monica Carrasco ◽  
Juan Carlos Tapia ◽  
Julio Caballero ◽  
...  

Although molecular dynamics simulations allow for the study of interactions among virtually all biomolecular entities, metal ions still pose significant challenges to achieve an accurate structural and dynamical description of many biological assemblies. This is particularly the case for coarse-grained (CG) models. Although the reduced computational cost of CG methods often makes them the technique of choice for the study of large biomolecular systems, the parameterization of metal ions is still very crude or simply not available for the vast majority of CG force fields. Here, we show that incorporating statistical data retrieved from the Protein Data Bank (PDB) to set specific Lennard-Jones interactions can produce structurally accurate CG molecular dynamics simulations. Using this simple approach, we provide a set of interaction parameters for Calcium, Magnesium, and Zinc ions, which cover more than 80% of the metal-bound structures reported on the PDB. Simulations performed using the SIRAH force field on several proteins and DNA systems show that using the present approach it is possible to obtain non-bonded interaction parameters that obviate the use of topological constraints.
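
One way to picture how distance statistics could inform a Lennard-Jones term is sketched below in Python. It relies only on the standard 12-6 form, U(r) = 4ε[(σ/r)^12 − (σ/r)^6], whose minimum lies at r = 2^(1/6)σ, so a preferred contact distance fixes σ. Taking the mean of the observed distances as that minimum is an assumption made here for illustration, as are the function names, the distance values, and the unit well depth. The abstract does not describe the authors' actual fitting procedure, and this sketch should not be read as reproducing it.

```python
import math


def lj_sigma_from_minimum(r_min):
    """Sigma of a 12-6 Lennard-Jones potential whose minimum lies at r_min."""
    return r_min / 2 ** (1 / 6)


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy at separation r."""
    x = (sigma / r) ** 6
    return 4 * epsilon * (x * x - x)


# Made-up ion-ligand distances in angstroms; NOT real PDB statistics.
observed = [2.3, 2.4, 2.5]

# Assumption: use the mean observed distance as the position of the minimum.
r_min = sum(observed) / len(observed)
sigma = lj_sigma_from_minimum(r_min)
epsilon = 1.0  # arbitrary well depth, chosen only for illustration

if __name__ == "__main__":
    print(sigma, lennard_jones(r_min, epsilon, sigma))
```

By construction the energy at `r_min` equals `-epsilon` and the energy at `sigma` is zero, which gives a quick sanity check on any parameters derived this way.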


2020 ◽  
Author(s):  
Lim Heo ◽  
Collin Arbour ◽  
Michael Feig

Protein structures provide valuable information for understanding biological processes. Protein structures can be determined by experimental methods such as X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, or cryogenic electron microscopy. As an alternative, in silico methods can be used to predict protein structures. Those methods utilize protein structure databases for structure prediction via template-based modeling or for training machine-learning models to generate predictions. Structure prediction for proteins distant from proteins with known structures often results in lower accuracy with respect to the true physiological structures. Physics-based protein model refinement methods can be applied to improve model accuracy in the predicted models. Refinement methods rely on conformational sampling around the predicted structures, and if structures closer to the native states are sampled, improvements in the model quality become possible. Molecular dynamics simulations have been especially successful for improving model qualities but although consistent refinement can be achieved, the improvements in model qualities are still moderate. To extend the refinement performance of a simulation-based protocol, we explored new schemes that focus on an optimized use of biasing functions and the application of increased simulation temperatures. In addition, we tested the use of alternative initial models so that the simulations can explore conformational space more broadly. Based on the insight of this analysis we are proposing a new refinement protocol that significantly outperformed previous state-of-the-art molecular dynamics simulation-based protocols in the benchmark tests described here.


Viruses ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 313
Author(s):  
Heli A. M. Mönttinen ◽  
Janne J. Ravantti ◽  
Minna M. Poranen

RNA viruses are the fastest evolving known biological entities. Consequently, the sequence similarity between homologous viral proteins disappears quickly, limiting the usability of traditional sequence-based phylogenetic methods in the reconstruction of relationships and evolutionary history among RNA viruses. Protein structures, however, typically evolve more slowly than sequences, and structural similarity can still be evident when no sequence similarity can be detected. Here, we used an automated structural comparison method, homologous structure finder, for comprehensive comparisons of viral RNA-dependent RNA polymerases (RdRps). We identified a common structural core of 231 residues for all the structurally characterized viral RdRps, covering segmented and non-segmented negative-sense, positive-sense, and double-stranded RNA viruses infecting both prokaryotic and eukaryotic hosts. The grouping and branching of the viral RdRps in the structure-based phylogenetic tree follow their functional differentiation. The RdRps using protein primer, RNA primer, or self-priming mechanisms have evolved independently of each other, and the RdRps cluster into two large branches based on the transcription mechanism used. The structure-based distance tree presented here follows the recently established RdRp-based RNA virus classification at genus, subfamily, family, order, class and subphylum ranks. However, the topology of our phylogenetic tree suggests an alternative phylum level organization.


2021 ◽  
Vol 22 (13) ◽  
pp. 6709
Author(s):  
Xiao-Xuan Shi ◽  
Peng-Ye Wang ◽  
Hong Chen ◽  
Ping Xie

The transition between strong and weak interactions of the kinesin head with the microtubule, which is regulated by the change of the nucleotide state of the head, is indispensable for the processive motion of the kinesin molecular motor on the microtubule. Here, using all-atom molecular dynamics simulations, the interactions between the kinesin head and tubulin are studied on the basis of the available high-resolution structural data. We found that the strong interaction can induce rapid large conformational changes of the tubulin, whereas the weak interaction cannot. Furthermore, we found that the large conformational changes of the tubulin have a significant effect on the interaction of the tubulin with the head in the weak-microtubule-binding ADP state. The calculated binding energy of the ADP-bound head to the tubulin with the large conformational changes is only about half that of the tubulin without the conformational changes.

