Deep learning reveals many more inter-protein residue-residue contacts than direct coupling analysis

Mapping Intimacies ◽

10.1101/240754 ◽

2017 ◽

Cited By ~ 4

Author(s):

Tian-ming Zhou ◽

Sheng Wang ◽

Jinbo Xu

Keyword(s):

Deep Learning ◽

Protein Interaction ◽

Interaction Network ◽

Protein Docking ◽

Residue Level ◽

Direct Coupling ◽

Coupling Analysis ◽

Multiple Sequence ◽

Contact Prediction ◽

Direct Coupling Analysis

Abstract: Intra-protein residue-level contact prediction has drawn a lot of attention in recent years and made very good progress, but far fewer methods are dedicated to inter-protein contact prediction, which is important for understanding how proteins interact at the structure and residue level. Direct coupling analysis (DCA) is popular for intra-protein contact prediction, but extending it to inter-protein contact prediction is challenging because it needs many interlogs (i.e., interacting homologs) to be effective, a requirement that is hard to meet, especially for a putative interacting protein pair in eukaryotes. We show that deep learning, even when trained only on intra-protein contact maps, works much better than DCA for inter-protein contact prediction. We also show that a phylogeny-based method can generate a better multiple sequence alignment for eukaryotes than existing genome-based methods and thus lead to better inter-protein contact prediction. Our method should be useful for protein docking, protein interaction prediction and protein interaction network construction.
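
To make the notion of an inter-protein contact map concrete, the following is a minimal sketch, not the authors' method. It assumes (this is not stated in the abstract) that the two proteins' alignments have been concatenated, and that some predictor has produced a square pair-score matrix over the combined length. The function name `top_inter_protein_pairs` and its parameters are hypothetical.

```python
import numpy as np

def top_inter_protein_pairs(scores, len_a, k=3):
    """Return the k highest-scoring (i, j, score) pairs with i in protein A
    and j in protein B.

    `scores` is assumed to be a square matrix over the concatenated sequence
    A + B; `len_a` is the length of protein A. Indices in the result are
    relative to each protein's own start.
    """
    scores = np.asarray(scores, dtype=float)
    # Off-diagonal block: rows are residues of A, columns are residues of B.
    block = scores[:len_a, len_a:]
    # Sort all block entries in descending order and keep the first k.
    flat_order = np.argsort(block, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(flat_order, block.shape)
    return [(int(i), int(j), float(block[i, j])) for i, j in zip(rows, cols)]
```

Only the off-diagonal block is ranked; the two diagonal blocks hold the intra-protein scores for A and B and are ignored here.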

Erratum: Three-body interactions improve contact prediction within direct-coupling analysis [Phys. Rev. E 96 , 052405 (2017)]

Physical Review E ◽

10.1103/physreve.104.019902 ◽

2021 ◽

Vol 104 (1) ◽

Author(s):

Michael Schmidt ◽

Kay Hamacher

Keyword(s):

Direct Coupling ◽

Coupling Analysis ◽

Contact Prediction ◽

Direct Coupling Analysis ◽

Three Body

Direct Coupling Analysis for Protein Contact Prediction

Methods in Molecular Biology - Protein Structure Prediction ◽

10.1007/978-1-4939-0366-5_5 ◽

2014 ◽

pp. 55-70 ◽

Cited By ~ 32

Author(s):

Faruck Morcos ◽

Terence Hwa ◽

José N. Onuchic ◽

Martin Weigt

Keyword(s):

Direct Coupling ◽

Coupling Analysis ◽

Contact Prediction ◽

Direct Coupling Analysis

Accurate contact-based modelling of repeat proteins predicts the structure of new repeats protein families

PLoS Computational Biology ◽

10.1371/journal.pcbi.1008798 ◽

2021 ◽

Vol 17 (4) ◽

pp. e1008798

Author(s):

Claudio Bassot ◽

Arne Elofsson

Keyword(s):

Deep Learning ◽

Protein Structure ◽

High Accuracy ◽

Unique Sequence ◽

Direct Coupling ◽

Protein Families ◽

Coupling Analysis ◽

Repeat Proteins ◽

Eukaryotic Proteomes ◽

Direct Coupling Analysis

Repeat proteins are abundant in eukaryotic proteomes. They are involved in many eukaryote-specific functions, including signalling. For many of these proteins, the structure is not known, as they are difficult to crystallise. Today, using direct coupling analysis and deep learning, it is often possible to predict a protein’s structure. However, the unique sequence features present in repeat proteins have made it challenging to use direct coupling analysis for predicting contacts. Here, we show that deep learning-based methods (trRosetta, DeepMetaPsicov (DMP) and PconsC4) overcome this problem and can predict intra- and inter-unit contacts in repeat proteins. In a benchmark dataset of 815 repeat proteins, about 90% can be correctly modelled. Further, among 48 PFAM families lacking a protein structure, we produce models of 41 families with estimated high accuracy.

FilterDCA: interpretable supervised contact prediction using inter-domain coevolution

10.1101/2019.12.24.887877 ◽

2019 ◽

Cited By ~ 1

Author(s):

Maureen Muscat ◽

Giancarlo Croce ◽

Edoardo Sarti ◽

Martin Weigt

Keyword(s):

Deep Learning ◽

De Novo ◽

Protein Complexes ◽

Protein Structures ◽

Direct Coupling ◽

Sequence Information ◽

Coupling Analysis ◽

Contact Patterns ◽

Direct Coupling Analysis ◽

Training Sets

Abstract: Predicting three-dimensional protein structure and assembling protein complexes using sequence information is among the most prominent tasks in computational biology. Recently, substantial progress has been made in the case of single proteins using a combination of unsupervised coevolutionary sequence analysis with structurally supervised deep learning. While reaching impressive accuracies in predicting residue-residue contacts, deep learning has a number of disadvantages. The need for large structural training sets limits its applicability to multi-protein complexes, and the deep architecture of convolutional neural networks makes them intrinsically hard to interpret. Here we introduce FilterDCA, a simpler supervised predictor for inter-domain and inter-protein contacts. It is based on the fact that contact maps of proteins show typical contact patterns, which result from secondary structure and are reflected by patterns in coevolutionary analysis. We explicitly integrate averaged contact patterns with coevolutionary scores derived by Direct Coupling Analysis, reaching results comparable to more complex deep-learning approaches, while remaining fully transparent and interpretable. The FilterDCA code is available at http://gitlab.lcqb.upmc.fr/muscat/FilterDCA.

Author summary: The de novo prediction of tertiary and quaternary protein structures has recently seen important advances, by combining unsupervised, purely sequence-based coevolutionary analyses with structure-based supervision using deep learning for contact-map prediction. While showing impressive performance, deep-learning methods require large training sets and pose severe obstacles to interpretability. Here we construct a simple, transparent and therefore fully interpretable inter-domain contact predictor, which uses the results of coevolutionary Direct Coupling Analysis in combination with explicitly constructed filters reflecting typical contact patterns in a training set of known protein structures, and which improves the accuracy of predicted contacts significantly. Our approach thereby sheds light on the question of how contact information is encoded in coevolutionary signals.
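
The idea of matching a typical contact pattern against a coevolutionary score map can be illustrated with a small sketch. This is not the FilterDCA implementation: the function name `filter_response`, the single square filter, and the plain dot-product response are simplifying assumptions made here for illustration only.

```python
import numpy as np

def filter_response(score_map, pattern):
    """Slide a square `pattern` over `score_map` and return, for every
    position (i, j), the dot product between the pattern and the window of
    scores centred on (i, j). Borders are zero-padded.
    """
    score_map = np.asarray(score_map, dtype=float)
    pattern = np.asarray(pattern, dtype=float)
    k = pattern.shape[0]
    if pattern.shape != (k, k) or k % 2 == 0:
        raise ValueError("pattern must be square with odd side length")
    half = k // 2
    # Zero-pad so that windows centred on border cells are well defined.
    padded = np.pad(score_map, half, mode="constant")
    out = np.zeros_like(score_map)
    n_rows, n_cols = score_map.shape
    for i in range(n_rows):
        for j in range(n_cols):
            window = padded[i:i + k, j:j + k]
            out[i, j] = np.sum(window * pattern)
    return out
```

For example, an anti-diagonal stripe of high scores responds strongly to an anti-diagonal pattern and only weakly to a diagonal one, which is the kind of pattern-specific signal such filters are meant to pick up.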

pydca v1.0: a comprehensive software for direct coupling analysis of RNA and protein sequences

Bioinformatics ◽

10.1093/bioinformatics/btz892 ◽

2019 ◽

Vol 36 (7) ◽

pp. 2264-2265 ◽

Cited By ~ 3

Author(s):

Mehari B Zerihun ◽

Fabrizio Pucci ◽

Emanuel K Peter ◽

Alexander Schug

Keyword(s):

Structure Prediction ◽

Sequence Data ◽

Direct Coupling ◽

Supplementary Information ◽

Spatial Proximity ◽

Homologous Proteins ◽

Coupling Analysis ◽

Multiple Sequence ◽

Wide Range ◽

Direct Coupling Analysis

Abstract. Motivation: The ongoing advances in sequencing technologies have provided a massive increase in the availability of sequence data. This made it possible to study the patterns of correlated substitution between residues in families of homologous proteins or RNAs and to retrieve structural and stability information. Direct coupling analysis (DCA) infers coevolutionary couplings between pairs of residues indicating their spatial proximity, making such information a valuable input for subsequent structure prediction. Results: Here, we present pydca, a standalone Python-based software package for the DCA of homologous protein and RNA families. It is based on two popular inverse statistical approaches, namely mean-field and pseudo-likelihood maximization, and is equipped with a series of functionalities that range from multiple sequence alignment trimming to contact map visualization. Thanks to its efficient implementation, features and user-friendly command line interface, pydca is a modular and easy-to-use tool that can be used by researchers with a wide range of backgrounds. Availability and implementation: pydca can be obtained from https://github.com/KIT-MBS/pydca or from the Python Package Index under the MIT License. Supplementary information: Supplementary data are available at Bioinformatics online.
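
As a rough illustration of the mean-field approach mentioned above, here is a minimal NumPy sketch. It does not use or reproduce the pydca API. The function name `mean_field_dca_scores` is hypothetical, the alignment is assumed to be already encoded as integers in `0..q-1`, and two steps commonly used in practice (sequence reweighting and a gauge change before scoring) are left out for brevity.

```python
import numpy as np

def mean_field_dca_scores(msa, q, pseudocount=0.5):
    """Score column pairs of an integer-encoded alignment by the Frobenius
    norm of mean-field couplings (negative inverse covariance matrix).

    msa: array of shape (n_seq, n_col) with entries in 0..q-1.
    Returns a symmetric (n_col, n_col) matrix with a zero diagonal.
    """
    msa = np.asarray(msa)
    n_seq, n_col = msa.shape
    s = q - 1  # the last state is dropped so the covariance is invertible
    one_hot = (msa[:, :, None] == np.arange(s)).astype(float)
    x = one_hot.reshape(n_seq, n_col * s)
    # Single-site and pair frequencies, regularised with a pseudocount.
    fi = (1 - pseudocount) * x.mean(axis=0) + pseudocount / q
    fij = (1 - pseudocount) * (x.T @ x) / n_seq + pseudocount / q**2
    # Within one column, two different states never co-occur.
    for i in range(n_col):
        block = slice(i * s, (i + 1) * s)
        fij[block, block] = np.diag(fi[block])
    cov = fij - np.outer(fi, fi)
    # Mean-field approximation: couplings are minus the inverse covariance.
    couplings = -np.linalg.inv(cov).reshape(n_col, s, n_col, s)
    scores = np.sqrt((couplings ** 2).sum(axis=(1, 3)))
    np.fill_diagonal(scores, 0.0)
    return scores
```

On a toy alignment in which two columns always carry the same symbol and the remaining columns are random, the covarying pair is expected to receive the highest score.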

Assessing the accuracy of direct-coupling analysis for RNA contact prediction

RNA ◽

10.1261/rna.074179.119 ◽

2020 ◽

Vol 26 (5) ◽

pp. 637-647 ◽

Cited By ~ 4

Author(s):

Francesca Cuturello ◽

Guido Tiana ◽

Giovanni Bussi

Keyword(s):

Direct Coupling ◽

Coupling Analysis ◽

Contact Prediction ◽

Direct Coupling Analysis

Liquid-theory analogy of direct-coupling analysis of multiple-sequence alignment and its implications for protein structure prediction

Biophysics and Physicobiology ◽

10.2142/biophysico.12.0_117 ◽

2015 ◽

Vol 12 (0) ◽

pp. 117-119 ◽

Cited By ~ 1

Author(s):

Akira R. Kinjo

Keyword(s):

Protein Structure ◽

Protein Structure Prediction ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Structure Prediction ◽

Direct Coupling ◽

Coupling Analysis ◽

Multiple Sequence ◽

Liquid Theory ◽

Direct Coupling Analysis

Faculty Opinions recommendation of Assessing the accuracy of direct-coupling analysis for RNA contact prediction.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.737468653.793577502 ◽

2020 ◽

Author(s):

Janusz Bujnicki ◽

Pritha Ghosh

Keyword(s):

Direct Coupling ◽

Coupling Analysis ◽

Contact Prediction ◽

Direct Coupling Analysis

PconsC4: fast, free, easy, and accurate contact predictions

10.1101/383133 ◽

2018 ◽

Cited By ~ 2

Author(s):

Mirco Michel ◽

David Menéndez Hurtado ◽

Arne Elofsson

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Prediction Methods ◽

Coupling Analysis ◽

Learning Methods ◽

Contact Prediction ◽

Residue Contact ◽

Direct Coupling Analysis ◽

Computationally Expensive ◽

Contact Predictions

Abstract. Motivation: Residue contact prediction was revolutionized recently by the introduction of direct coupling analysis (DCA). Further improvements, in particular for small families, have been obtained by the combination of DCA and deep learning methods. However, existing deep learning contact prediction methods often rely on a number of external programs and are therefore computationally expensive. Results: Here, we introduce a novel contact predictor, PconsC4, which performs on par with state-of-the-art methods. PconsC4 is heavily optimized, does not use any external programs and is therefore significantly faster and easier to use than other methods. Availability: PconsC4 is freely available under the GPL license from https://github.com/ElofssonLab/PconsC4. Installation is easy using the pip command and works on any system with Python 3.5 or later and a modern GCC.

Three-body interactions improve contact prediction within direct-coupling analysis

Physical Review E ◽

10.1103/physreve.96.052405 ◽

2017 ◽

Vol 96 (5) ◽

Cited By ~ 6

Author(s):

Michael Schmidt ◽

Kay Hamacher

Keyword(s):

Direct Coupling ◽

Coupling Analysis ◽

Contact Prediction ◽

Direct Coupling Analysis ◽

Three Body
