DeeplyTough: Learning Structural Comparison of Protein Binding Sites

Mapping Intimacies ◽

10.1101/600304 ◽

2019 ◽

Author(s):

Martin Simonovsky ◽

Joshua Meyers

Keyword(s):

Protein Binding ◽

Binding Sites ◽

Protein Function ◽

Large Scale ◽

Protein Structures ◽

Three Dimensional ◽

Protein Binding Sites ◽

Data Driven Approach ◽

Benchmark Datasets ◽

New Perspective

Abstract: Motivation: Protein binding site comparison (pocket matching) is of importance in drug discovery. Identification of similar binding sites can help guide efforts for hit finding, understanding polypharmacology and characterization of protein function. The design of pocket matching methods has traditionally involved much intuition, and has employed a broad variety of algorithms and representations of the input protein structures. We regard the high heterogeneity of past work and the recent availability of large-scale benchmarks as an indicator that a data-driven approach may provide a new perspective. Results: We propose DeeplyTough, a convolutional neural network that encodes a three-dimensional representation of protein binding sites into descriptor vectors that may be compared efficiently in an alignment-free manner by computing pairwise Euclidean distances. The network is trained with supervision: (i) to provide similar pockets with similar descriptors, (ii) to separate the descriptors of dissimilar pockets by a minimum margin, and (iii) to achieve robustness to nuisance variations. We evaluate our method using three large-scale benchmark datasets, on which it demonstrates excellent performance for held-out data coming from the training distribution and competitive performance when the trained network is required to generalize to datasets constructed independently. Availability: https://github.com/BenevolentAI/[email protected],[email protected]
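The alignment-free comparison and the margin-based objective described in the abstract can be illustrated with a minimal sketch. It assumes descriptors are plain lists of floats; the function names, the toy three-dimensional vectors, and the exact loss form (a standard contrastive loss) are illustrative assumptions, not the authors' implementation, and the robustness term (iii) is not modeled.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(a, b, similar, margin=1.0):
    """Margin-based contrastive loss for one pocket pair (assumed form).

    Similar pairs are pulled together (squared distance); dissimilar pairs
    are penalized only while they are closer than `margin`.
    """
    d = euclidean(a, b)
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2

def rank_by_similarity(query, library):
    """Return (name, distance) pairs sorted from most to least similar."""
    scored = [(name, euclidean(query, vec)) for name, vec in library.items()]
    return sorted(scored, key=lambda item: item[1])

# Toy descriptors; real ones would be produced by the trained network.
library = {
    "pocket_a": [0.0, 0.0, 0.0],
    "pocket_b": [3.0, 4.0, 0.0],
    "pocket_c": [0.1, 0.0, 0.0],
}
ranking = rank_by_similarity([0.0, 0.0, 0.0], library)
```

Because each pocket is reduced to a fixed-length vector once, comparing a query against a library costs one distance computation per entry, with no structural alignment step.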

Spatiotemporal identification of druggable binding sites using deep learning

10.1101/2020.02.20.952309 ◽

2020 ◽

Cited By ~ 1

Author(s):

Igor Kozlovskii ◽

Petr Popov

Keyword(s):

Protein Binding ◽

Binding Site ◽

Binding Sites ◽

Large Scale ◽

Specific Binding ◽

Three Dimensional ◽

Growth Factor Receptor ◽

Protein Binding Sites ◽

Binding Site Identification ◽

Scale Detection

Identification of novel protein binding sites expands the "druggable genome" and opens new opportunities for drug discovery. Generally, the presence or absence of a binding site depends on the three-dimensional conformation of a protein, making binding site identification resemble the object detection problem in computer vision. Here we introduce a computational approach for the large-scale detection of protein binding sites, named BiteNet, that considers protein conformations as 3D images, binding sites as the objects to detect on these images, and conformational ensembles of proteins as 3D videos to analyze. BiteNet is suitable for spatiotemporal detection of hard-to-spot allosteric binding sites, as we showed for a conformation-specific binding site of the epidermal growth factor receptor, an oligomer-specific binding site of the ion channel, and binding sites in G protein-coupled receptors. BiteNet outperforms state-of-the-art methods in terms of both accuracy and speed, taking about 1.5 minutes to analyze 1000 conformations of a protein with 2000 atoms. BiteNet is available at https://github.com/i-Molecule/bitenet.

Spatiotemporal identification of druggable binding sites using deep learning

Communications Biology ◽

10.1038/s42003-020-01350-0 ◽

2020 ◽

Vol 3 (1) ◽

Author(s):

Igor Kozlovskii ◽

Petr Popov

Keyword(s):

Protein Binding ◽

Binding Site ◽

Binding Sites ◽

Large Scale ◽

Specific Binding ◽

Three Dimensional ◽

Growth Factor Receptor ◽

Protein Binding Sites ◽

Binding Site Identification ◽

Scale Detection

Abstract: Identification of novel protein binding sites expands the druggable genome and opens new opportunities for drug discovery. Generally, the presence or absence of a binding site depends on the three-dimensional conformation of a protein, making binding site identification resemble the object detection problem in computer vision. Here we introduce a computational approach for the large-scale detection of protein binding sites, named BiteNet, that considers protein conformations as 3D images, binding sites as objects to detect on these images, and conformational ensembles of proteins as 3D videos to analyze. BiteNet is suitable for spatiotemporal detection of hard-to-spot allosteric binding sites, as we showed for a conformation-specific binding site of the epidermal growth factor receptor, an oligomer-specific binding site of the ion channel, and a binding site in a G protein-coupled receptor. BiteNet outperforms state-of-the-art methods in terms of both accuracy and speed, taking about 1.5 minutes to analyze 1000 conformations of a protein with ~2000 atoms.
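The "conformations as 3D images, ensembles as 3D videos" framing can be made concrete with a small sketch. It assumes each conformation is a list of (x, y, z) atom coordinates in angstroms and simply counts atoms per voxel; the function names, grid parameters, and single occupancy channel are hypothetical simplifications and do not reproduce BiteNet's actual featurization.

```python
import math

def voxelize(atoms, origin, voxel_size, shape):
    """Count atoms per voxel on a regular 3D grid (one "3D image").

    atoms: iterable of (x, y, z) coordinates.
    origin: (x, y, z) of the grid's minimum corner.
    shape: (nx, ny, nz) number of voxels along each axis.
    Atoms that fall outside the grid are ignored.
    """
    nx, ny, nz = shape
    grid = [[[0 for _ in range(nz)] for _ in range(ny)] for _ in range(nx)]
    for atom in atoms:
        idx = [math.floor((c - o) / voxel_size) for c, o in zip(atom, origin)]
        if all(0 <= i < n for i, n in zip(idx, shape)):
            grid[idx[0]][idx[1]][idx[2]] += 1
    return grid

def voxelize_ensemble(frames, origin, voxel_size, shape):
    """Voxelize every conformation in an ensemble (one "3D video")."""
    return [voxelize(frame, origin, voxel_size, shape) for frame in frames]

# Two toy conformations of a two-atom "protein".
frames = [
    [(0.5, 0.5, 0.5), (1.5, 0.5, 0.5)],
    [(0.5, 0.5, 0.5), (0.6, 0.4, 0.5)],
]
video = voxelize_ensemble(frames, origin=(0.0, 0.0, 0.0),
                          voxel_size=1.0, shape=(2, 2, 2))
```

A detector would then scan each frame for binding-site "objects" and track them across frames; that step is outside the scope of this sketch.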

The Overlap of Small Molecule and Protein Binding Sites within Families of Protein Structures

PLoS Computational Biology ◽

10.1371/journal.pcbi.1000668 ◽

2010 ◽

Vol 6 (2) ◽

pp. e1000668 ◽

Cited By ~ 38

Author(s):

Fred P. Davis ◽

Andrej Sali

Keyword(s):

Protein Binding ◽

Small Molecule ◽

Binding Sites ◽

Protein Structures ◽

Protein Binding Sites

SitePrint: Three-Dimensional Pharmacophore Descriptors Derived from Protein Binding Sites for Family Based Active Site Analysis, Classification, and Drug Design.

ChemInform ◽

10.1002/chin.200509237 ◽

2005 ◽

Vol 36 (9) ◽

Author(s):

James R. Arnold ◽

Keith W. Burdick ◽

Scott C.-H. Pegg ◽

Samuel Toba ◽

Michelle L. Lamb ◽

...

Keyword(s):

Drug Design ◽

Protein Binding ◽

Active Site ◽

Binding Sites ◽

Three Dimensional ◽

Protein Binding Sites ◽

Site Analysis ◽

Family Based

An efficient algorithm for matching protein binding sites for protein function prediction

Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine - BCB '11 ◽

10.1145/2147805.2147837 ◽

2011 ◽

Cited By ~ 1

Author(s):

Leif Ellingson ◽

Jinfeng Zhang

Keyword(s):

Protein Binding ◽

Binding Sites ◽

Protein Function ◽

Efficient Algorithm ◽

Protein Function Prediction ◽

Function Prediction ◽

Protein Binding Sites

A QUANTITATIVE ANALYSIS OF INTERFACIAL AMINO ACID CONSERVATION IN PROTEIN-PROTEIN HETERO COMPLEXES

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720005001429 ◽

2005 ◽

Vol 03 (05) ◽

pp. 1137-1150 ◽

Cited By ~ 10

Author(s):

BOOJALA V. B. REDDY ◽

YIANNIS N. KAZNESSIS

Keyword(s):

Amino Acid ◽

Protein Binding ◽

Binding Sites ◽

Protein Structures ◽

Complex Structures ◽

Protein Surface ◽

Protein Surfaces ◽

Protein Binding Sites ◽

Interaction Sites ◽

Conserved Residue

A long-standing question in molecular biology is whether interfaces of protein-protein complexes are more conserved than the rest of the protein surfaces. Although it has been reported that conservation can be used as an indicator for predicting interaction sites on proteins, there are recent reports stating that the interface regions are only slightly more conserved than the rest of the protein surfaces, with conservation signals not being statistically significant enough for predicting protein-protein binding sites. To address these conflicting reports, we have studied a set of 28 well-resolved hetero complex structures of proteins that consist of transient and non-transient complexes. The surface positions were classified into four conservation classes and the conservation index of the surface positions was quantitatively analyzed. The results indicate that the surface density of highly conserved positions is significantly higher in the protein-protein interface regions compared with the other regions of the protein surface. However, the average conservation index of the patches in the interface region is not significantly higher compared with other surface regions of the protein structures. This finding demonstrates that the number of conserved residue positions is a more appropriate indicator for predicting protein-protein binding sites than the average conservation index in the interacting region. We have further validated our findings on a set of 59 benchmark complex structures. Furthermore, an analysis of 19 complexes of antigen-antibody interactions shows that there is no conservation of amino acid positions in the interacting regions of these complexes, as expected, with the variable region of the immunoglobulins interacting mostly with the antigens. Interestingly, antigen interacting regions also have a higher number of non-conserved residue positions in the interacting region than the rest of the protein surface.
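The distinction between the two indicators (how many positions are highly conserved versus the average conservation index) can be shown with a toy calculation. This sketch assumes per-position conservation scores in [0, 1], a hypothetical threshold of 0.8 for "highly conserved", and uses the fraction of positions as a stand-in for surface density; none of these choices come from the paper.

```python
def summarize(scores, threshold=0.8):
    """Summarize a region by two indicators.

    fraction_highly_conserved: share of positions scoring >= threshold
    (an assumed proxy for the density of highly conserved positions).
    mean_conservation: plain average of the scores.
    """
    if not scores:
        raise ValueError("region has no surface positions")
    conserved = sum(1 for s in scores if s >= threshold)
    return {
        "fraction_highly_conserved": conserved / len(scores),
        "mean_conservation": sum(scores) / len(scores),
    }

def compare_regions(scores, interface_mask, threshold=0.8):
    """Split surface positions into interface and non-interface regions."""
    interface = [s for s, m in zip(scores, interface_mask) if m]
    rest = [s for s, m in zip(scores, interface_mask) if not m]
    return summarize(interface, threshold), summarize(rest, threshold)

# Toy surface: the interface mixes very conserved and variable positions,
# while the rest of the surface is uniformly moderate.
scores = [1.0, 1.0, 0.0, 0.0, 0.5, 0.5, 0.5, 0.5]
mask = [True, True, True, True, False, False, False, False]
interface_stats, rest_stats = compare_regions(scores, mask)
```

In this toy case both regions have the same mean conservation (0.5), yet half of the interface positions are highly conserved and none of the remaining surface positions are, which is the kind of pattern an average-based indicator would miss.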

Deep geometric representations for modeling effects of mutations on protein-protein binding affinity

PLoS Computational Biology ◽

10.1371/journal.pcbi.1009284 ◽

2021 ◽

Vol 17 (8) ◽

pp. e1009284

Author(s):

Xianggen Liu ◽

Yunan Luo ◽

Pengyong Li ◽

Sen Song ◽

Jian Peng

Keyword(s):

Protein Binding ◽

Binding Affinity ◽

Protein Design ◽

Protein Structures ◽

Three Dimensional ◽

Point Mutations ◽

Dimensional Structure ◽

Gradient Boosting ◽

Benchmark Datasets ◽

The Impact

Modeling the impact of amino acid mutations on protein-protein interaction plays a crucial role in protein engineering and drug design. In this study, we develop GeoPPI, a novel structure-based deep-learning framework to predict the change of binding affinity upon mutations. Based on the three-dimensional structure of a protein, GeoPPI first learns a geometric representation that encodes topology features of the protein structure via a self-supervised learning scheme. These representations are then used as features for training gradient-boosting trees to predict the changes of protein-protein binding affinity upon mutations. We find that GeoPPI is able to learn meaningful features that characterize interactions between atoms in protein structures. In addition, through extensive experiments, we show that GeoPPI achieves new state-of-the-art performance in predicting the binding affinity changes upon both single- and multi-point mutations on six benchmark datasets. Moreover, we show that GeoPPI can accurately estimate the difference of binding affinities between a few recently identified SARS-CoV-2 antibodies and the receptor-binding domain (RBD) of the S protein. These results demonstrate the potential of GeoPPI as a powerful and useful computational tool in protein design and engineering. Our code and datasets are available at: https://github.com/Liuxg16/GeoPPI.

CavSimBase: A Database for Large Scale Comparison of Protein Binding Sites

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2016.2520484 ◽

2016 ◽

Vol 28 (6) ◽

pp. 1423-1434 ◽

Cited By ~ 7

Author(s):

Matthias Leinweber ◽

Thomas Fober ◽

Marc Strickert ◽

Lars Baumgartner ◽

Gerhard Klebe ◽

...

Keyword(s):

Protein Binding ◽

Binding Sites ◽

Large Scale ◽

Protein Binding Sites

Prediction of protein binding sites in protein structures using hidden Markov support vector machine

BMC Bioinformatics ◽

10.1186/1471-2105-10-381 ◽

2009 ◽

Vol 10 (1) ◽

Cited By ~ 35

Author(s):

Bin Liu ◽

Xiaolong Wang ◽

Lei Lin ◽

Buzhou Tang ◽

Qiwen Dong ◽

...

Keyword(s):

Support Vector Machine ◽

Protein Binding ◽

Binding Sites ◽

Hidden Markov ◽

Protein Structures ◽

Support Vector ◽

Protein Binding Sites

SitePrint: Three-Dimensional Pharmacophore Descriptors Derived from Protein Binding Sites for Family Based Active Site Analysis, Classification, and Drug Design

Journal of Chemical Information and Computer Sciences ◽

10.1021/ci049814f ◽

2004 ◽

Vol 44 (6) ◽

pp. 2190-2198 ◽

Cited By ~ 9

Author(s):

James R. Arnold ◽

Keith W. Burdick ◽

Scott C.-H. Pegg ◽

Samuel Toba ◽

Michelle L. Lamb ◽

...

Keyword(s):

Drug Design ◽

Protein Binding ◽

Active Site ◽

Binding Sites ◽

Three Dimensional ◽

Protein Binding Sites ◽

Site Analysis ◽

Family Based
