scholarly journals RCSB Protein Data Bank 1D tools and services

Author(s):  
Joan Segura ◽  
Yana Rose ◽  
John Westbrook ◽  
Stephen K Burley ◽  
Jose M Duarte

Abstract Motivation Interoperability between polymer sequences and structural data is essential for providing a complete picture of protein and gene features and helping to understand biomolecular function. Results Herein, we present two resources designed to improve interoperability between the RCSB Protein Data Bank, the NCBI and the UniProtKB data resources and visualize integrated data therefrom. The underlying tools provide a flexible means of mapping between the different coordinate spaces and an interactive tool allows convenient visualization of the 1-dimensional data over the web. Availabilityand implementation https://1d-coordinates.rcsb.org and https://rcsb.github.io/rcsb-saguaro. Supplementary information Supplementary data are available at Bioinformatics online.

Author(s):  
Konrad Diedrich ◽  
Joel Graef ◽  
Katrin Schöning-Stierand ◽  
Matthias Rarey

Abstract Summary The searching of user-defined 3D queries in molecular interfaces is a computationally challenging problem that is not satisfactorily solved so far. Most of the few existing tools focused on that purpose are desktop based and not openly available. Besides that, they show a lack of query versatility, search efficiency and user-friendliness. We address this issue with GeoMine, a publicly available web application that provides textual, numerical and geometrical search functionality for protein–ligand binding sites derived from structural data contained in the Protein Data Bank (PDB). The query generation is supported by a 3D representation of a start structure that provides interactively selectable elements like atoms, bonds and interactions. GeoMine gives full control over geometric variability in the query while performing a deterministic, precise search. Reasonably selective queries are processed on the entire set of protein–ligand complexes in the PDB within a few minutes. GeoMine offers an interactive and iterative search process of successive result analyses and query adaptations. From the numerous potential applications, we picked two from the field of side-effect analyze showcasing the usefulness of GeoMine. Availability and implementation GeoMine is part of the ProteinsPlus web application suite and freely available at https://proteins.plus. Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 36 (9) ◽  
pp. 2750-2754 ◽  
Author(s):  
Sam M Ireland ◽  
Andrew C R Martin

Abstract Summary Structural biology relies on specific file formats to convey information about macromolecular structures. Traditionally this has been the PDB format, but increasingly newer formats, such as PDBML, mmCIF and MMTF are being used. Here we present atomium, a modern, lightweight, Python library for parsing, manipulating and saving PDB, mmCIF and MMTF file formats. In addition, we provide a web service, pdb2json, which uses atomium to give a consistent JSON representation to the entire Protein Data Bank. Availability and implementation atomium is implemented in Python and its performance is equivalent to the existing library BioPython. However, it has significant advantages in features and API design. atomium is available from atomium.bioinf.org.uk and pdb2json can be accessed at pdb2json.bioinf.org.uk Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 35 (18) ◽  
pp. 3510-3511 ◽  
Author(s):  
Abhik Mukhopadhyay ◽  
Neera Borkakoti ◽  
Lukáš Pravda ◽  
Jonathan D Tyzack ◽  
Janet M Thornton ◽  
...  

Abstract Motivation Cofactors are essential for many enzyme reactions. The Protein Data Bank (PDB) contains >67 000 entries containing enzyme structures, many with bound cofactor or cofactor-like molecules. This work aims to identify and categorize these small molecules in the PDB and make it easier to find them. Results The Protein Data Bank in Europe (PDBe; pdbe.org) has implemented a pipeline to identify enzyme cofactor and cofactor-like molecules, which are now part of the PDBe weekly release process. Availability and implementation Information is made available on the individual PDBe entry pages at pdbe.org and programmatically through the PDBe REST API (pdbe.org/api). Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Keisuke Arikawa

On the basis of robot kinematics, we have thus far developed a method for predicting the motion of proteins from their 3D structural data given in the Protein Data Bank (PDB data). In this method, proteins are modeled as serial manipulators constrained by springs and the structural compliance properties of the models are evaluated. We focus on localized instead of whole structures of proteins. Employing the same model used in our method of motion prediction, the motion properties of the localized structures and the relation between the motion properties of localized and whole structures are analyzed. First, we present a method for graphically expressing the deformation of objects with a complex shape, such as proteins, by approximating the shape as a rectangular prism with a mesh on its surface. We then formulate a method for comparing the motion properties of localized structures cleaved from the whole structure and those remaining in it by expressing the motion of the latter using the decomposed motion modes of the former according to the structural compliance. Finally, we show a method for evaluating the effect of a localized structure on the motion properties of proteins by applying forces to localized structures. In the formulations, we demonstrate applications as illustrative examples using the PDB data of a real protein.


2011 ◽  
Vol 40 (D1) ◽  
pp. D453-D460 ◽  
Author(s):  
A. R. Kinjo ◽  
H. Suzuki ◽  
R. Yamashita ◽  
Y. Ikegawa ◽  
T. Kudou ◽  
...  

Author(s):  
Michele Berselli ◽  
Enrico Lavezzo ◽  
Stefano Toppo

Abstract Motivation G-quadruplexes (G4s) are non-canonical nucleic acid conformations that are widespread in all kingdoms of life and are emerging as important regulators both in RNA and DNA. Recently, two new higher-order architectures have been reported: adjacent interacting G4s, and G4s with stable long loops forming stem-loop structures. As there are no specialized tools to identify these conformations, we developed QPARSE. Results QPARSE can exhaustively search for degenerate potential quadruplex-forming sequences (PQSs) containing bulges and/or mismatches at genomic level, as well as either multimeric or long-looped PQS (MPQS and LLPQS respectively). While its assessment vs. known reference datasets is comparable with the state-of-the-art, what is more interesting is its performance in the identification of MPQS and LLPQS that present algorithms are not designed to search for. We report a comprehensive analysis of MPQS in human gene promoters and the analysis of LLPQS on three experimentally validated case studies from HIV-1, BCL2, and hTERT. Availability QPARSE is freely accessible on the web at http://www.medcomp.medicina.unipd.it/qparse/index or downloadable from github as a python 2.7 program https://github.com/B3rse/qparse Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 35 (20) ◽  
pp. 4165-4167 ◽  
Author(s):  
Jonathan Fine ◽  
Gaurav Chopra

Abstract Motivation The Protein Data Bank (PDB) currently holds over 140 000 biomolecular structures and continues to release new structures on a weekly basis. The PDB is an essential resource to the structural bioinformatics community to develop software that mine, use, categorize and analyze such data. New computational biology methods are evaluated using custom benchmarking sets derived as subsets of 3D experimentally determined structures and structural features from the PDB. Currently, such benchmarking features are manually curated with custom scripts in a non-standardized manner that results in slow distribution and updates with new experimental structures. Finally, there is a scarcity of standardized tools to rapidly query 3D descriptors of the entire PDB. Results Our solution is the Lemon framework, a C++11 library with Python bindings, which provides a consistent workflow methodology for selecting biomolecular interactions based on user criterion and computing desired 3D structural features. This framework can parse and characterize the entire PDB in <10 min on modern, multithreaded hardware. The speed in parsing is obtained by using the recently developed MacroMolecule Transmission Format to reduce the computational cost of reading text-based PDB files. The use of C++ lambda functions and Python bindings provide extensive flexibility for analysis and categorization of the PDB by allowing the user to write custom functions to suite their objective. We think Lemon will become a one-stop-shop to quickly mine the entire PDB to generate desired structural biology features. Availability and implementation The Lemon software is available as a C++ header library along with a PyPI package and example functions at https://github.com/chopralab/lemon. Supplementary information Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Glen van Ginkel ◽  
Lukáš Pravda ◽  
José M. Dana ◽  
Mihaly Varadi ◽  
Peter Keller ◽  
...  

Abstract Background Biomacromolecular structural data outgrew the legacy Protein Data Bank (PDB) format which the scientific community relied on for decades, yet the use of its successor PDBx/Macromolecular Crystallographic Information File format (PDBx/mmCIF) is still not widespread. Perhaps one of the reasons is the availability of easy to use tools that only support the legacy format, but also the inherent difficulties of processing mmCIF files correctly, given the number of edge cases that make efficient parsing problematic. Nevertheless, to fully exploit macromolecular structure data and their associated annotations such as multiscale structures from integrative/hybrid methods or large macromolecular complexes determined using traditional methods, it is necessary to fully adopt the new format as soon as possible. Results To this end, we developed PDBeCIF, an open-source Python project for manipulating mmCIF and CIF files. It is part of the official list of mmCIF parsers recorded by the wwPDB and is heavily employed in the processes of the Protein Data Bank in Europe. The package is freely available both from the PyPI repository (http://pypi.org/project/pdbecif) and from GitHub (https://github.com/pdbeurope/pdbecif) along with rich documentation and many ready-to-use examples. Conclusions PDBeCIF is an efficient and lightweight Python 2.6+/3+ package with no external dependencies. It can be readily integrated with 3rd party libraries as well as adopted for broad scientific analyses.


2012 ◽  
Vol 68 (4) ◽  
pp. 478-483 ◽  
Author(s):  
Swanand Gore ◽  
Sameer Velankar ◽  
Gerard J. Kleywegt

There is an increasing realisation that the quality of the biomacromolecular structures deposited in the Protein Data Bank (PDB) archive needs to be assessed critically using established and powerful validation methods. The Worldwide Protein Data Bank (wwPDB) organization has convened several Validation Task Forces (VTFs) to advise on the methods and standards that should be used to validate all of the entries already in the PDB as well as all structures that will be deposited in the future. The recommendations of the X-ray VTF are currently being implemented in a software pipeline. Here, ongoing work on this pipeline is briefly described as well as ways in which validation-related information could be presented to users of structural data.


Sign in / Sign up

Export Citation Format

Share Document