Analysis of the Torsion Angles between Helical Axes in Pairs of Helices in Protein Molecules

Д.А. Тихонов; D.A. Tikhonov

doi:10.17537/2017.12.398


Mathematical Biology and Bioinformatics ◽

10.17537/2017.12.398 ◽

2017 ◽

Vol 12 (2) ◽

pp. 398-410 ◽

Cited By ~ 6

Author(s):

Д.А. Тихонов ◽

D.A. Tikhonov

Keyword(s):

Protein Data Bank ◽

Data Bank ◽

Structural Features ◽

Torsion Angles ◽

Protein Molecules ◽

Helix Packing

In this study, the distribution of the torsion angles Ω between helical axes in pairs of connected helices found in known proteins was analyzed. The database of helical pairs was compiled from the Protein Data Bank according to rules suggested earlier. The database was analyzed in order to classify the pairs and to identify novel structural features of helix packing. It was subdivided into three subsets according to whether the helix projections cross on the parallel planes passing through the axes of the helices. Helical pairs without crossing projections are distributed over the whole range of angles Ω, although there are two maxima, at Ω = 0° and Ω = 180°. Most helical pairs in this subset are formed by α-helices and 3₁₀-helices. The distribution of all helical pairs with crossing helix projections has a maximum at 20° < Ω < 25°. In this subset, most helical pairs are formed by α-helices. The distribution of only α-helical pairs with crossing axis projections has three maxima, at –50° < Ω < –25°, 20° < Ω < 25°, and 70° < Ω < 110°.
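As an illustration of the quantity discussed in this abstract, the following minimal sketch computes a signed angle Ω between two helical axes, measured about their common perpendicular. It is not the author's procedure: the function name `interaxial_torsion`, the representation of each axis by two end points, and the sign convention (positive when the second axis appears rotated counterclockwise from the first, looking from the second axis toward the first) are all assumptions made for this example.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def interaxial_torsion(p1, p2, q1, q2):
    """Signed angle in degrees between axis p1->p2 and axis q1->q2.

    Hypothetical helper: the sign convention is an assumption of this
    sketch, not taken from the paper.
    """
    u = sub(p2, p1)              # direction of the first axis
    v = sub(q2, q1)              # direction of the second axis
    n = cross(u, v)              # along the common perpendicular
    sin_mag = math.sqrt(dot(n, n))
    if sin_mag < 1e-12:
        # Parallel or antiparallel axes: the sign is undefined.
        return 0.0 if dot(u, v) > 0 else 180.0
    # Orient the common perpendicular from the first axis to the second.
    sign = -1.0 if dot(n, sub(q1, p1)) < 0 else 1.0
    return math.degrees(math.atan2(sign * sin_mag, dot(u, v)))
```

With this convention, two perpendicular axes stacked one unit apart give +90° or -90° depending on which side the second axis lies, and parallel or antiparallel axes give 0° or 180°, the two values at which the abstract reports maxima for pairs without crossing projections.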


Conformation-dependent restraints for polynucleotides: the sugar moiety

Nucleic Acids Research ◽

10.1093/nar/gkz1122 ◽

2019 ◽

Vol 48 (2) ◽

pp. 962-973

Author(s):

Marcin Kowiel ◽

Dariusz Brzezinski ◽

Miroslaw Gilski ◽

Mariusz Jaskolski

Keyword(s):

Nucleic Acids ◽

Crystal Structures ◽

Protein Data Bank ◽

Data Bank ◽

Experimental Methods ◽

Side Chain ◽

Torsion Angles ◽

Sugar Moiety ◽

Structural Database ◽

Different Parts

Stereochemical restraints are commonly used to aid the refinement of macromolecular structures obtained by experimental methods at lower resolution. The standard restraint library for nucleic acids has not been updated for over two decades and needs revision. In this paper, geometrical restraints for nucleic acid sugars are derived using information from high-resolution crystal structures in the Cambridge Structural Database. In contrast to the existing restraints, this work shows that different parts of the sugar moiety form groups of covalent geometry dependent on various chemical and conformational factors, such as the type of ribose or the attached nucleobase, and ring puckering or rotamers of the glycosidic (χ) or side-chain (γ) torsion angles. Moreover, the geometry of the glycosidic link and the endocyclic ribose bond angles are functionally dependent on χ and sugar pucker amplitude (τm), respectively. The proposed restraints have been positively validated against data from the Nucleic Acid Database, compared with an ultrahigh-resolution Z-DNA structure in the Protein Data Bank, and tested by re-refining hundreds of crystal structures in the Protein Data Bank. The conformation-dependent sugar restraints presented in this work are publicly available in REFMAC, PHENIX and SHELXL format through a dedicated RestraintLib web server with an API function.


Lemon: a framework for rapidly mining structural information from the Protein Data Bank

Bioinformatics ◽

10.1093/bioinformatics/btz178 ◽

2019 ◽

Vol 35 (20) ◽

pp. 4165-4167 ◽

Cited By ~ 1

Author(s):

Jonathan Fine ◽

Gaurav Chopra

Keyword(s):

Protein Data Bank ◽

Structural Information ◽

Computational Cost ◽

Data Bank ◽

Structural Features ◽

Supplementary Information ◽

Develop Software ◽

Reading Text ◽

Essential Resource ◽

3D Descriptors

Motivation: The Protein Data Bank (PDB) currently holds over 140 000 biomolecular structures and continues to release new structures on a weekly basis. The PDB is an essential resource for the structural bioinformatics community to develop software that mines, uses, categorizes and analyzes such data. New computational biology methods are evaluated using custom benchmarking sets derived as subsets of 3D experimentally determined structures and structural features from the PDB. Currently, such benchmarking features are manually curated with custom scripts in a non-standardized manner that results in slow distribution and updates with new experimental structures. Finally, there is a scarcity of standardized tools to rapidly query 3D descriptors of the entire PDB. Results: Our solution is the Lemon framework, a C++11 library with Python bindings, which provides a consistent workflow methodology for selecting biomolecular interactions based on user criteria and computing desired 3D structural features. This framework can parse and characterize the entire PDB in <10 min on modern, multithreaded hardware. The speed in parsing is obtained by using the recently developed MacroMolecule Transmission Format to reduce the computational cost of reading text-based PDB files. The use of C++ lambda functions and Python bindings provides extensive flexibility for analysis and categorization of the PDB by allowing the user to write custom functions to suit their objective. We think Lemon will become a one-stop-shop to quickly mine the entire PDB to generate desired structural biology features. Availability and implementation: The Lemon software is available as a C++ header library along with a PyPI package and example functions at https://github.com/chopralab/lemon. Supplementary information: Supplementary data are available at Bioinformatics online.


Lemon: a framework for rapidly mining structural information from the Protein Data Bank

10.1101/379891 ◽

2018 ◽

Author(s):

Jonathan Fine ◽

Gaurav Chopra

Keyword(s):

Protein Data Bank ◽

Structural Information ◽

Computational Cost ◽

Data Bank ◽

Structural Features ◽

Develop Software ◽

Reading Text ◽

One Stop ◽

Essential Resource ◽

3D Descriptors

Motivation: The Protein Data Bank (PDB) currently holds over 140,000 biomolecular structures and continues to release new structures on a weekly basis. The PDB is an essential resource for the structural bioinformatics community to develop software that mines, uses, categorizes, and analyzes such data. New computational biology methods are evaluated using custom benchmarking sets derived as subsets of 3D experimentally determined structures and structural features from the PDB. Currently, such benchmarking features are manually curated with custom scripts in a non-standardized manner that results in slow distribution and updates with new experimental structures. Finally, there is a scarcity of standardized tools to rapidly query 3D descriptors of the entire PDB. Approach: Our solution is the Lemon framework, a C++11 library with Python bindings, which provides a consistent workflow methodology for selecting biomolecular interactions based on user criteria and computing desired 3D structural features. This framework can parse and characterize the entire PDB in less than ten minutes on modern, multithreaded hardware. The speed in parsing is obtained by using the recently developed MacroMolecule Transmission Format (MMTF) to reduce the computational cost of reading text-based PDB files. The use of C++ lambda functions and Python bindings provides extensive flexibility for analysis and categorization of the PDB by allowing the user to write custom functions to suit their objective. We think Lemon will become a one-stop-shop to quickly mine the entire PDB to generate desired structural biology features. The Lemon software is available as a C++ header library along with example functions at https://github.com/chopralab/lemon.


SCCJ Cafe –Season 4– Shape of Protein Molecules (1) "Protein Data Bank"

Journal of Computer Chemistry Japan ◽

10.2477/jccj.2014-0036 ◽

2014 ◽

Vol 13 (4) ◽

pp. A14-A17 ◽

Cited By ~ 1

Author(s):

Takahiro KUDOU

Keyword(s):

Protein Data Bank ◽

Data Bank ◽

Protein Molecules


A global Ramachandran score identifies protein structures with unlikely stereochemistry

10.1101/2020.03.26.010587 ◽

2020 ◽

Cited By ~ 1

Author(s):

Oleg V. Sobolev ◽

Pavel V. Afonine ◽

Nigel W. Moriarty ◽

Maarten L. Hekkelman ◽

Robbie P. Joosten ◽

...

Keyword(s):

Protein Data Bank ◽

Gold Standard ◽

Protein Structures ◽

Data Bank ◽

Z Score ◽

Torsion Angles ◽

Current Gold Standard ◽

Quality Metric ◽

Ramachandran Plots ◽

Experimental Structure

Summary: Ramachandran plots report the distribution of the (φ, ψ) torsion angles of the protein backbone and are one of the best quality metrics of experimental structure models. Typically, validation software reports the number of residues belonging to “outlier”, “allowed” and “favored” regions. While “zero unexplained outliers” can be considered the current “gold standard”, this can be misleading if deviations from expected distributions, even within the favored region, are not considered. We therefore revisited the Ramachandran Z-score (Rama-Z), a quality metric introduced more than two decades ago, but underutilized. We describe a re-implementation of the Rama-Z score in the Computational Crystallography Toolbox along with a new algorithm to estimate its uncertainty for individual models; final implementations are available both in Phenix and in PDB-REDO. We discuss the interpretation of the Rama-Z score and advocate including it in the validation reports provided by the Protein Data Bank. We also advocate reporting it alongside the outlier/allowed/favored counts in structural publications.


A library of coiled-coil domains: from regular bundles to peculiar twists

Bioinformatics ◽

10.1093/bioinformatics/btaa1041 ◽

2020 ◽

Author(s):

Krzysztof Szczepaniak ◽

Adriana Bukala ◽

Antonio Marinho da Silva Neto ◽

Jan Ludwiczak ◽

Stanislaw Dunin-Horkawicz

Keyword(s):

Protein Data Bank ◽

Conformational Changes ◽

Coiled Coil ◽

Data Bank ◽

Structural Features ◽

Coiled Coils ◽

Supplementary Information ◽

Numerical Representation ◽

Data Set ◽

Potential Applications

Motivation: Coiled coils are widespread protein domains involved in diverse processes ranging from providing structural rigidity to the transduction of conformational changes. They comprise two or more α-helices that are wound around each other to form a regular supercoiled bundle. Owing to this regularity, coiled-coil structures can be described with parametric equations, thus enabling the numerical representation of their properties, such as the degree and handedness of supercoiling, rotational state of the helices, and the offset between them. These descriptors are invaluable in understanding the function of coiled coils and designing new structures of this type. The existing tools for such calculations require manual preparation of input and are therefore not suitable for high-throughput analyses. Results: To address this problem, we developed SamCC-Turbo, software for fully automated, per-residue measurement of coiled coils. By surveying the Protein Data Bank with SamCC-Turbo, we generated a comprehensive atlas of ∼50,000 coiled-coil regions. This machine learning-ready data set features precise measurements and decomposes coiled-coil structures into fragments characterized by various degrees of supercoiling. The potential applications of SamCC-Turbo are exemplified by analyses in which we reveal general structural features of coiled coils involved in functions requiring conformational plasticity. Finally, we discuss further directions in the prediction and modeling of coiled coils. Availability: SamCC-Turbo is available as a web server (https://lbs.cent.uw.edu.pl/samcc_turbo) and as a Python library (https://github.com/labstructbioinf/samcc_turbo), whereas the results of the Protein Data Bank scan can be browsed and downloaded at https://lbs.cent.uw.edu.pl/ccdb. Supplementary information: Supplementary data are available at Bioinformatics online.


Statistical Analysis of the Internal Distances of Helical Pairs in Protein Molecules

Mathematical Biology and Bioinformatics ◽

10.17537/2016.11.170 ◽

2016 ◽

Vol 11 (2) ◽

pp. 170-190 ◽

Cited By ~ 10

Author(s):

Д.А. Тихонов ◽

D.A. Tikhonov

Keyword(s):

Regression Analysis ◽

Statistical Analysis ◽

Protein Data Bank ◽

Long Range ◽

Data Bank ◽

Interplanar Distance ◽

Minimal Distance ◽

Gamma Distributions ◽

Protein Molecules ◽

Plane Passing

A statistical analysis of interhelical distances in pairs of connected α-helices found in known proteins has been performed. A database of the pairs found in the Protein Data Bank was compiled according to defined rules. This set was subdivided into three subsets according to whether the helix projections cross on the parallel plane passing through the axis of the helix. It was shown that the distribution of distances between pairs of helices whose projections do not cross is more long-range than that of pairs whose projections overlap. The nature of the distributions was investigated using regression analysis. In particular, it is shown that the distributions of interhelical distances in the subset of helix pairs without intersections are gamma distributions. It is also shown that pairs with crossing projections have a smaller ratio of the minimal distance between the helical axes to the interplanar distance than pairs without crossing projections. It was concluded that helical pairs with crossing projections are additionally stabilized by internal interactions.
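The ratio mentioned in this abstract can be illustrated with a short sketch. It rests on assumptions that are not taken from the paper: each helical axis is treated as a finite straight segment between two end points, the interplanar distance is taken as the distance between the two parallel planes that each contain one axis, and the function names are hypothetical. The segment-to-segment distance uses the standard clamped closest-point method and assumes both segments have non-zero length.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def clamp01(x):
    return min(1.0, max(0.0, x))

def interplanar_distance(p1, p2, q1, q2):
    """Distance between the parallel planes that each contain one axis."""
    u, v, w = sub(p2, p1), sub(q2, q1), sub(q1, p1)
    n = cross(u, v)
    n_len = norm(n)
    if n_len < 1e-12:
        # Parallel axes: fall back to the distance between the two lines.
        return norm(cross(u, w)) / norm(u)
    return abs(dot(w, n)) / n_len

def minimal_axis_distance(p1, p2, q1, q2):
    """Shortest distance between segments p1-p2 and q1-q2 (non-degenerate)."""
    d1, d2, r = sub(p2, p1), sub(q2, q1), sub(p1, q1)
    a, e = dot(d1, d1), dot(d2, d2)
    b, c, f = dot(d1, d2), dot(d1, r), dot(d2, r)
    denom = a * e - b * b
    # Closest point on segment 1 to the infinite line through segment 2.
    s = clamp01((b * f - c * e) / denom) if denom > 1e-12 else 0.0
    t = (b * s + f) / e
    if t < 0.0:
        t, s = 0.0, clamp01(-c / a)
    elif t > 1.0:
        t, s = 1.0, clamp01((b - c) / a)
    c1 = tuple(p + s * d for p, d in zip(p1, d1))
    c2 = tuple(q + t * d for q, d in zip(q1, d2))
    return norm(sub(c1, c2))

def distance_ratio(p1, p2, q1, q2):
    """Minimal axis distance divided by the interplanar distance."""
    return (minimal_axis_distance(p1, p2, q1, q2)
            / interplanar_distance(p1, p2, q1, q2))
```

Under these assumptions the ratio equals 1 when the foot points of the common perpendicular fall inside both segments and exceeds 1 otherwise, which is consistent with the smaller ratio reported for pairs with crossing projections.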
