Hotspot coevolution at protein-protein interfaces is a key identifier of native protein complexes

Mapping Intimacies ◽

10.1101/698233 ◽

2019 ◽

Cited By ~ 1

Author(s):

Sambit K. Mishra ◽

Sarah J. Cooper ◽

Jerry M. Parks ◽

Julie C. Mitchell

Keyword(s):

Protein Interactions ◽

Protein Complexes ◽

Fundamental Problem ◽

Scoring Function ◽

Binding Mode ◽

Protein Docking ◽

Evolutionary Information ◽

Protein Protein Interactions ◽

Novel Approach ◽

Protein Interfaces

Abstract: Protein-protein interactions play a key role in mediating numerous biological functions, with more than half the proteins in living organisms existing as either homo- or hetero-oligomeric assemblies. Protein subunits that form oligomers minimize the free energy of the complex, but exhaustive computational search-based docking methods have not comprehensively addressed the protein docking challenge of distinguishing a natively bound complex from non-native forms. In this study, we propose a scoring function, KFC-E, that accounts for both conservation and coevolution of putative binding hotspot residues at protein-protein interfaces. For a benchmark set of 53 bound complexes, KFC-E identifies a near-native binding mode as the top-scoring pose in 38% and in the top 5 in 55% of the complexes. For a set of 17 unbound complexes, KFC-E identifies a near-native pose in the top 10 ranked poses in more than 50% of the cases. By contrast, a scoring function that incorporates information on coevolution at predicted non-hotspots performs poorly. Our study highlights the importance of coevolution at hotspot residues in forming natively bound complexes and suggests a novel approach for coevolutionary scoring in protein docking.

Author Summary: A fundamental problem in biology is to distinguish between the native and non-native bound forms of protein-protein complexes. Experimental methods are often used to detect the native bound forms of proteins but are demanding in terms of time and resources. Computational approaches have proven to be a useful alternative; they sample the different binding configurations for a pair of interacting proteins and then use a heuristic or physical model to score them. In this study, we propose a new scoring approach, KFC-E, which focuses on the evolutionary contributions from a subset of key interface residues (hotspots) to identify native bound complexes. KFC-E capitalizes on the wealth of information in protein sequence databases by incorporating residue-level conservation and coevolution of putative binding hotspots. As hotspot residues mediate the binding energetics of protein-protein interactions, we hypothesize that the knowledge of putative hotspots coupled with their evolutionary information should be helpful in the identification of native bound protein-protein complexes.
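
The ranking figures in this abstract (top-scoring pose, top 5, top 10) are top-N success rates: the fraction of complexes for which at least one near-native pose appears among the N best-scored poses. The Python sketch below illustrates only that metric. The function name, the input format (one best-first list of booleans per complex), and the toy data are assumptions made for illustration; they are not taken from the KFC-E study.

```python
def top_n_success_rate(ranked_hits, n):
    """Fraction of complexes with at least one near-native pose in the top n.

    ranked_hits: one list of booleans per complex, ordered best score first;
    True marks a near-native pose (hypothetical input format).
    """
    if not ranked_hits:
        raise ValueError("need at least one complex")
    successes = sum(1 for poses in ranked_hits if any(poses[:n]))
    return successes / len(ranked_hits)


# Toy rankings for four complexes (made up for illustration).
toy = [
    [True, False, False],   # near-native pose ranked first
    [False, False, True],   # near-native pose ranked third
    [False, False, False],  # no near-native pose sampled
    [False, True, False],   # near-native pose ranked second
]
print(top_n_success_rate(toy, 1))  # 0.25
print(top_n_success_rate(toy, 3))  # 0.75
```

How "near-native" is decided (for example, an RMSD cutoff) is left outside the function on purpose, since the abstract does not state the criterion.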

Udock, the interactive docking entertainment system

Faraday Discussions ◽

10.1039/c3fd00147d ◽

2014 ◽

Vol 169 ◽

pp. 425-441 ◽

Cited By ~ 10

Author(s):

Guillaume Levieux ◽

Guillaume Tiger ◽

Stéphanie Mader ◽

Jean-François Zagury ◽

Stéphane Natkin ◽

...

Keyword(s):

Protein Interactions ◽

Protein Complexes ◽

Protein Structures ◽

Protein Docking ◽

Conformational Space ◽

Protein Protein Interactions ◽

Binding Interface ◽

Protein Interfaces ◽

Almost All ◽

Experimental Partner

Protein–protein interactions play a crucial role in biological processes. The goal of protein docking calculations is to predict, given two proteins of known structure, the associated conformation of the corresponding complex. Here, we present a new interactive protein docking system, Udock, that makes use of users' combined cognitive capabilities. In Udock, the users tackle simplified representations of protein structures and explore the conformational space of protein–protein interfaces using a gamified interactive docking system with on-the-fly scoring. We assumed that, if given appropriate tools, a naïve user's cognitive capabilities could provide relevant data for (1) the prediction of correct interfaces in binary protein complexes and (2) the identification of the experimental partner in interaction among a set of decoys. To explore this approach experimentally, we conducted a preliminary two-week-long playtest in which registered users could perform cross-docking on a dataset comprising 4 binary protein complexes. The users explored almost all the surface of the proteins available in the dataset but favored certain regions that seemed more attractive as potential docking spots. These favored regions were located inside or near the experimental binding interface for 5 out of the 8 proteins in the dataset. For most of them, the best scores were obtained with the experimental partner. The alpha version of Udock is freely accessible at http://udock.fr.

ATRIPPI: AN ATOM-RESIDUE PREFERENCE SCORING FUNCTION FOR PROTEIN–PROTEIN INTERACTIONS

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213010000169 ◽

2010 ◽

Vol 19 (03) ◽

pp. 251-266 ◽

Cited By ~ 1

Author(s):

KANG-PING LIU ◽

KAI-CHENG HSU ◽

JHANG-WEI HUANG ◽

LU-SHIAN CHANG ◽

JINN-MOON YANG

Keyword(s):

Protein Interactions ◽

Electrostatic Interactions ◽

Specific Interaction ◽

Scoring Function ◽

Protein Docking ◽

Protein Protein Interactions ◽

Disulfide Bonding ◽

Knowledge Based ◽

Aromatic Interactions ◽

Protein Interfaces

We present an ATRIPPI model for analyzing protein–protein interactions. This model consists of 167-atom-type and residue-specific interaction preferences with distance bins, derived from 641 co-crystallized protein–protein interfaces. The ATRIPPI model is able to capture the physical meaning of hydrogen bonding, disulfide bonding, electrostatic interactions, van der Waals and aromatic–aromatic interactions. We applied this model to identify the native states and near-native complex structures of 17 bound and 17 unbound complexes from thousands of decoy structures. On average, 77.5% (155) of the top-ranked 200 structures are close to the native structure. These results suggest that the ATRIPPI model retains the advantages of both atom–atom and residue–residue interactions and is a potential knowledge-based scoring function for protein–protein docking methods. We believe that our model is robust and provides biological meaning to support protein–protein interactions.
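
The abstract describes interaction preferences tabulated per distance bin but does not give the formula. As a rough illustration of how knowledge-based preferences of this general kind are often built, the Python sketch below computes a log-odds value per distance bin from observed and expected contact counts and sums the values for the contacts in a pose. The log-odds form, the pseudocount, the function names, and the counts are all assumptions; this is not the ATRIPPI scoring function.

```python
import math


def log_odds_preferences(observed, expected, pseudocount=1.0):
    """Per-bin preference ln((observed + p) / (expected + p)).

    Positive values mean the contact occurs more often than expected.
    The pseudocount p avoids taking the log of zero for empty bins.
    """
    if len(observed) != len(expected):
        raise ValueError("need one observed and one expected count per bin")
    return [
        math.log((obs + pseudocount) / (exp + pseudocount))
        for obs, exp in zip(observed, expected)
    ]


def score_contacts(preferences, contact_bins):
    """Sum the preferences of the distance bins hit by each contact."""
    return sum(preferences[b] for b in contact_bins)


# Made-up counts for one atom-type pair over three distance bins.
prefs = log_odds_preferences(observed=[30, 10, 5], expected=[10, 10, 20])
print([round(p, 3) for p in prefs])
# Two contacts in the closest bin and one in the middle bin.
print(score_contacts(prefs, [0, 0, 1]))
```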

Mass spectrometry-based cross-linking study shows that the Psb28 protein binds to cytochrome b559 in Photosystem II

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1620360114 ◽

2017 ◽

Vol 114 (9) ◽

pp. 2224-2229 ◽

Cited By ~ 18

Author(s):

Daniel A. Weisz ◽

Haijun Liu ◽

Hao Zhang ◽

Sundarapandian Thangapandian ◽

Emad Tajkhorshid ◽

...

Keyword(s):

Mass Spectrometry ◽

Photosystem Ii ◽

Protein Interactions ◽

Protein Complexes ◽

Protein Docking ◽

Cross Linking ◽

Protein Protein Interactions ◽

Cytochrome B559 ◽

Reaction Center Complex ◽

Assembly Intermediate

Photosystem II (PSII), a large pigment protein complex, undergoes rapid turnover under natural conditions. During assembly of PSII, oxidative damage to vulnerable assembly intermediate complexes must be prevented. Psb28, the only cytoplasmic extrinsic protein in PSII, protects the RC47 assembly intermediate of PSII and assists its efficient conversion into functional PSII. Its role is particularly important under stress conditions when PSII damage occurs frequently. Psb28 is not found, however, in any PSII crystal structure, and its structural location has remained unknown. In this study, we used chemical cross-linking combined with mass spectrometry to capture the transient interaction of Psb28 with PSII. We detected three cross-links between Psb28 and the α- and β-subunits of cytochrome b559, an essential component of the PSII reaction-center complex. These distance restraints enable us to position Psb28 on the cytosolic surface of PSII directly above cytochrome b559, in close proximity to the QB site. Protein–protein docking results also support Psb28 binding in this region. Determination of the Psb28 binding site and other biochemical evidence allow us to propose a mechanism by which Psb28 exerts its protective effect on the RC47 intermediate. This study also shows that isotope-encoded cross-linking with the “mass tags” selection criteria allows confident identification of more cross-linked peptides in PSII than has been previously reported. This approach thus holds promise to identify other transient protein–protein interactions in membrane protein complexes.

Protein-protein docking using learned three-dimensional representations

10.1101/738690 ◽

2019 ◽

Cited By ~ 1

Author(s):

Georgy Derevyanko ◽

Guillaume Lamoureux

Keyword(s):

Protein Interactions ◽

Network Architecture ◽

Protein Complexes ◽

Three Dimensional ◽

Spatial Arrangement ◽

Protein Docking ◽

Protein Protein Interactions ◽

Translational Invariance ◽

Shape Complementarity ◽

Spatial Features

Abstract: Protein-protein interactions are determined by a number of hard-to-capture features related to shape complementarity, electrostatics, and hydrophobicity. These features may be intrinsic to the protein or induced by the presence of a partner. A conventional approach to protein-protein docking consists in engineering a small number of spatial features for each protein, and in minimizing the sum of their correlations with respect to the spatial arrangement of the two proteins. To generalize this approach, we introduce a deep neural network architecture that transforms the raw atomic densities of each protein into complex three-dimensional representations. Each point in the volume containing the protein is described by 48 learned features, which are correlated and combined with the features of a second protein to produce a score dependent on the relative position and orientation of the two proteins. The architecture is based on multiple layers of SE(3)-equivariant convolutional neural networks, which provide built-in rotational and translational invariance of the score with respect to the structure of the complex. The model is trained end-to-end on a set of decoy conformations generated from 851 nonredundant protein-protein complexes and is tested on data from the Protein-Protein Docking Benchmark Version 4.0.
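
The conventional approach mentioned in this abstract, scoring a pose by correlating spatial feature grids of the two proteins and searching for the arrangement with the lowest score, can be sketched on a toy 2D grid. The Python example below scans translations only, uses a single hand-made feature per protein, and evaluates the correlation directly rather than with the FFT that practical grid-based docking programs use. The grids and function names are assumptions for illustration and have nothing to do with the learned 48-feature representation.

```python
def correlation_score(receptor, ligand, dx, dy):
    """Sum of receptor[x][y] * ligand[x - dx][y - dy] over overlapping cells."""
    total = 0.0
    for x, row in enumerate(receptor):
        for y, r_val in enumerate(row):
            lx, ly = x - dx, y - dy
            if 0 <= lx < len(ligand) and 0 <= ly < len(ligand[0]):
                total += r_val * ligand[lx][ly]
    return total


def best_translation(receptor, ligand):
    """Scan all translations; return (score, dx, dy) with the lowest score."""
    nx, ny = len(receptor), len(receptor[0])
    candidates = [
        (correlation_score(receptor, ligand, dx, dy), dx, dy)
        for dx in range(-nx + 1, nx)
        for dy in range(-ny + 1, ny)
    ]
    return min(candidates)


# Toy grids: a favorable (negative) site on the receptor at cell (2, 1)
# and a single ligand feature at cell (0, 0).
receptor = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, -1.0, 0.0]]
ligand = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(best_translation(receptor, ligand))  # (-1.0, 2, 1)
```

The best translation moves the ligand feature onto the favorable receptor site. A full docking search would repeat this scan over a set of ligand rotations.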

Accurate refinement of docked protein complexes using evolutionary information and deep learning

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720016420026 ◽

2016 ◽

Vol 14 (03) ◽

pp. 1642002 ◽

Cited By ~ 11

Author(s):

Bahar Akbal-Delibas ◽

Roshanak Farhoodi ◽

Marc Pomplun ◽

Nurit Haspel

Keyword(s):

Deep Learning ◽

Protein Complexes ◽

Scoring Function ◽

Protein Docking ◽

Training Data ◽

Evolutionary Information ◽

Native Structure ◽

Learning Network ◽

Small Set ◽

Deep Learning Network

One of the major challenges for protein docking methods is to accurately discriminate native-like structures from false positives. Docking methods are often inaccurate and the results have to be refined and re-ranked to obtain native-like complexes and remove outliers. In a previous work, we introduced AccuRefiner, a machine learning based tool for refining protein–protein complexes. Given a docked complex, the refinement tool produces a small set of refined versions of the input complex, with lower root-mean-square-deviation (RMSD) of atomic positions with respect to the native structure. The method employs a unique ranking tool that accurately predicts the RMSD of docked complexes with respect to the native structure. In this work, we use a deep learning network with a similar set of features and five layers. We show that a properly trained deep learning network can accurately predict the RMSD of a docked complex with 1.40 Å error margin on average, by approximating the complex relationship between a wide set of scoring function terms and the RMSD of a docked structure. The network was trained on 35000 unbound docking complexes generated by RosettaDock. We tested our method on 25 different putative docked complexes produced also by RosettaDock for five proteins that were not included in the training data. The results demonstrate that the high accuracy of the ranking tool enables AccuRefiner to consistently choose the refinement candidates with lower RMSD values compared to the coarsely docked input structures.
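
The quantity this network is trained to predict, the RMSD of a docked complex with respect to the native structure, can be computed directly when the native coordinates are known. The Python sketch below is a minimal version that assumes the two coordinate lists are already matched atom by atom and already superimposed; the function name and the coordinates are made up for illustration, and the abstract does not say which atoms or alignment the authors use.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


# Two atoms, each displaced by 1 Å along z relative to the native position.
docked = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(rmsd(docked, native))  # 1.0
```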

Pushing the accuracy limit of shape complementarity for protein-protein docking

BMC Bioinformatics ◽

10.1186/s12859-019-3270-y ◽

2019 ◽

Vol 20 (S25) ◽

Cited By ~ 8

Author(s):

Yumeng Yan ◽

Sheng-You Huang

Keyword(s):

Success Rate ◽

Protein Interactions ◽

Shape Representation ◽

Scoring Function ◽

Protein Docking ◽

Protein Protein Interactions ◽

Second Best ◽

Shape Complementarity ◽

Docking Program ◽

Docking Approach

Background: Protein-protein docking is a valuable computational approach for investigating protein-protein interactions. Shape complementarity is the most basic component of a scoring function and plays an important role in protein-protein docking. Despite significant progress, shape representation remains an open question in the development of protein-protein docking algorithms, especially for grid-based docking approaches. Results: We have proposed a new pairwise shape-based scoring function (LSC) for protein-protein docking which adopts an exponential form to take into account long-range interactions between protein atoms. The LSC scoring function was incorporated into our FFT-based docking program and evaluated for both bound and unbound docking on the protein docking benchmark 4.0. LSC achieved a significantly better performance than four other similar docking methods, ZDOCK 2.1, MolFit/G, GRAMM, and FTDock/G, in both success rate and number of hits. When considering the top 10 predictions, LSC obtained a success rate of 51.71% and 6.82% for bound and unbound docking, respectively, compared to 42.61% and 4.55% for the second-best program ZDOCK 2.1. LSC also yielded an average of 8.38 and 3.94 hits per complex in the top 1000 predictions for bound and unbound docking, respectively, followed by 6.38 and 2.96 hits for the second-best ZDOCK 2.1. Conclusions: The present LSC method will not only provide an initial-stage docking approach for post-docking processes but also have a general implementation for accurate representation of other energy terms on grids in protein-protein docking. The software has been implemented in our HDOCK web server at http://hdock.phys.hust.edu.cn/.

Conservation of coevolving protein interfaces bridges prokaryote–eukaryote homologies in the twilight zone

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1611861114 ◽

2016 ◽

Vol 113 (52) ◽

pp. 15018-15023 ◽

Cited By ~ 24

Author(s):

Juan Rodriguez-Rivas ◽

Simone Marsili ◽

David Juan ◽

Alfonso Valencia

Keyword(s):

Protein Interactions ◽

Protein Complexes ◽

Accurate Information ◽

Twilight Zone ◽

Sequence Information ◽

Protein Protein Interactions ◽

Sequence Alignments ◽

Multiple Sequence ◽

Protein Interfaces ◽

Recent Developments

Protein–protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein–protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein–protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein–protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.

InterPep2: global peptide–protein docking using interaction surface templates

Bioinformatics ◽

10.1093/bioinformatics/btaa005 ◽

2020 ◽

Vol 36 (8) ◽

pp. 2458-2465 ◽

Cited By ~ 2

Author(s):

Isak Johansson-Åkhe ◽

Claudio Mirabello ◽

Björn Wallner

Keyword(s):

Protein Interactions ◽

Protein Complexes ◽

Structural Features ◽

Protein Docking ◽

Supplementary Information ◽

Peptide Ligand ◽

Protein Protein Interactions ◽

Intrinsically Disordered ◽

Intrinsically Disordered Regions ◽

Improved Performance

Motivation: Interactions between proteins and peptides or peptide-like intrinsically disordered regions are involved in many important biological processes, such as gene expression and cell life-cycle regulation. Experimentally determining the structure of such interactions is time-consuming and difficult because of the inherent flexibility of the peptide ligand. Although several prediction methods exist, most are limited in performance or availability. Results: InterPep2 is a freely available method for predicting the structure of peptide–protein interactions. Improved performance is obtained by using templates from both peptide–protein and regular protein–protein interactions, and by a random forest trained to predict the DockQ-score for a given template using sequence and structural features. When tested on 252 bound peptide–protein complexes from structures deposited after the complexes used in the construction of the training and template sets of InterPep2, InterPep2-Refined correctly positioned 67 peptides within 4.0 Å LRMSD among the top 10, similar to another state-of-the-art template-based method, which positioned 54 peptides correctly. However, InterPep2 displays a superior ability to evaluate the quality of its own predictions. On a previously established set of 27 non-redundant unbound-to-bound peptide–protein complexes, InterPep2 performs on par with leading methods. The extended InterPep2-Refined protocol managed to correctly model 15 of these complexes within 4.0 Å LRMSD among the top 10, without using templates from homologs. In addition, combining the template-based predictions from InterPep2 with ab initio predictions from PIPER-FlexPepDock resulted in 22% more near-native predictions compared to the best single method (22 versus 18). Availability and implementation: The program is available from: http://wallnerlab.org/InterPep2. Supplementary information: Supplementary data are available at Bioinformatics online.

A knowledge–based scoring function to assess the stability of quaternary protein assemblies

10.1101/562520 ◽

2019 ◽

Cited By ~ 3

Author(s):

Abhilesh S. Dhawanjewar ◽

Ankit Roy ◽

M.S. Madhusudhan

Keyword(s):

Protein Interactions ◽

Binary Classification ◽

Scoring Function ◽

Protein Docking ◽

Scoring Functions ◽

Protein Protein Interactions ◽

Residue Contact ◽

Knowledge Based ◽

Cellular Biochemistry ◽

The Stability

Motivation: Elucidation of protein-protein interactions is a necessary step towards understanding the complete repertoire of cellular biochemistry. Given the enormity of the problem and the expense and limitations of experimental methods, it is imperative that this problem is tackled computationally. In silico predictions of protein interactions entail sampling different conformations of the purported complex and then scoring these to assess interaction viability. In this study we have devised a new scheme for scoring protein-protein interactions. Results: Our method, PIZSA (Protein Interaction Z Score Assessment), is a binary classification scheme for identification of stable protein quaternary assemblies (binders/non-binders) based on statistical potentials. The scoring scheme incorporates residue-residue contact preference on the interface with per residue-pair atomic contributions and accounts for clashes. PIZSA can accurately discriminate between native and non-native structural conformations from protein docking experiments and outperform other recently published scoring functions, demonstrated through testing on a benchmark set and the CAPRI Score_set. Though not explicitly trained for this purpose, PIZSA potentials can identify spurious interactions that are artefacts of the crystallization process. Availability: PIZSA is implemented as a web server at http://cospi.iiserpune.ac.in/pizsa/ [email protected]
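
The method name refers to a Z score, and the abstract describes a binary binder/non-binder decision, but neither the background distribution nor the decision threshold is given here. The Python sketch below shows the generic idea only: standardize a raw interface score against a set of background scores and compare the result with a cutoff. The function names, the threshold value, the convention that higher means more stable, and the numbers are all assumptions and should not be read as the PIZSA implementation.

```python
import statistics


def z_score(score, background_scores):
    """Standardize a raw score against a background distribution."""
    mean = statistics.mean(background_scores)
    spread = statistics.stdev(background_scores)  # sample standard deviation
    if spread == 0:
        raise ValueError("background scores must not all be identical")
    return (score - mean) / spread


def is_binder(score, background_scores, threshold=1.0):
    """Binary call: binder if the Z score reaches a hypothetical threshold."""
    return z_score(score, background_scores) >= threshold


# Made-up raw scores for illustration.
background = [1.0, 2.0, 3.0]
print(z_score(5.0, background))    # 3.0
print(is_binder(2.5, background))  # False
```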

FWHT-RF: A Novel Computational Approach to Predict Plant Protein-Protein Interactions via an Ensemble Learning Method

Scientific Programming ◽

10.1155/2021/1607946 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Jie Pan ◽

Li-Ping Li ◽

Chang-Qing Yu ◽

Zhu-Hong You ◽

Zhong-Hao Ren ◽

...

Keyword(s):

Protein Interactions ◽

Nearest Neighbor ◽

Protein Sequences ◽

Evolutionary Information ◽

Support Vector ◽

Protein Protein Interactions ◽

K Nearest Neighbor ◽

Novel Approach ◽

Knn Classifier ◽

Scoring Matrix

Protein-protein interactions (PPIs) in plants are crucial for understanding biological processes. Although high-throughput techniques have produced valuable information for identifying PPIs in plants, they are usually expensive, inefficient, and extremely time-consuming. Hence, there is an urgent need to develop novel computational methods to predict PPIs in plants. In this article, we propose a novel approach to predict PPIs in plants using only the information in protein sequences. Specifically, plant protein sequences are first converted into position-specific scoring matrices (PSSMs); then, the fast Walsh–Hadamard transform (FWHT) algorithm is used to extract feature vectors from the PSSMs to obtain evolutionary information about plant proteins. Lastly, a rotation forest (RF) classifier is trained for prediction and produces a series of evaluation results. We named this approach FWHT-RF because FWHT and RF are used for feature extraction and classification, respectively. When FWHT-RF was applied to three plant PPI datasets, Maize, Rice, and Arabidopsis thaliana (Arabidopsis), its average accuracies under 5-fold cross-validation reached 95.20%, 94.42%, and 83.85%, respectively. To further evaluate the predictive power of FWHT-RF, we compared it with the state-of-the-art support vector machine (SVM) and K-nearest neighbor (KNN) classifiers in different aspects. The experimental results demonstrated that FWHT-RF can be a useful supplementary method to predict potential PPIs in plants.
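
The feature-extraction step relies on the fast Walsh–Hadamard transform. A minimal Python sketch of the standard unnormalized transform (natural, or Hadamard, ordering) is shown below on a toy vector. How the authors arrange or pad PSSM values before transforming them is not described in this abstract, so the input here is a plain list whose length is a power of two, and the function name is an assumption.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order.

    The input length must be a power of two; runs in O(n log n) additions.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Butterfly step: combine entries that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


signal = [1, 0, 1, 0, 0, 1, 1, 0]
print(fwht(signal))  # [4, 2, 0, -2, 0, 2, 0, 2]
```

Applying the transform twice returns the input scaled by its length, which is a convenient sanity check for any implementation.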
