Algorithms for protein interaction networks

M. Lappe; L. Holm

doi:10.1042/bst0330530

Graph2GO: a multi-modal attributed network embedding method for inferring protein functions

GigaScience ◽

10.1093/gigascience/giaa081 ◽

2020 ◽

Vol 9 (8) ◽

Author(s):

Kunjie Fan ◽

Yuanfang Guan ◽

Yan Zhang

Keyword(s):

Biological Networks ◽

Protein Function ◽

Large Scale ◽

Sequence Similarity ◽

Functional Characterization ◽

Subcellular Location ◽

Representation Learning ◽

Interaction Networks ◽

Attributed Network ◽

Protein Functions

Abstract. Background: Identifying protein functions is important for many biological applications. Since experimental functional characterization of proteins is time-consuming and costly, accurate and efficient computational methods for predicting protein functions are in great demand for generating the testable hypotheses guiding large-scale experiments. Results: Here, we propose Graph2GO, a multi-modal graph-based representation learning model that can integrate heterogeneous information, including multiple types of interaction networks (sequence similarity network and protein-protein interaction network) and protein features (amino acid sequence, subcellular location, and protein domains) to predict protein functions on Gene Ontology. Comparing Graph2GO to BLAST, as a baseline model, and to two popular protein function prediction methods (Mashup and deepNF), we demonstrated that our model can achieve state-of-the-art performance. We show the robustness of our model by testing on multiple species. We also provide a web server supporting function query and downstream analysis on-the-fly. Conclusions: Graph2GO is the first model that has utilized attributed network representation learning methods to model both interaction networks and protein features for predicting protein functions, and achieved promising performance. Our model can be easily extended to include more protein features to further improve the performance. Besides, Graph2GO is also applicable to other application scenarios involving biological networks, and the learned latent representations can be used as feature inputs for machine learning tasks in various downstream analyses.


Analysis of Protein-Protein Interaction Networks through Computational Approaches

Protein and Peptide Letters ◽

10.2174/0929866526666191105142034 ◽

2020 ◽

Vol 27 (4) ◽

pp. 265-278 ◽

Cited By ~ 1

Author(s):

Ying Han ◽

Liang Cheng ◽

Weiju Sun

Keyword(s):

Protein Interaction ◽

Biological Networks ◽

Interaction Networks ◽

Computational Techniques ◽

Cellular Functions ◽

Protein Protein Interaction ◽

Comprehensive Information ◽

Protein Interaction Prediction ◽

Or Gene ◽

Experimental Findings

The interactions among proteins and genes are extremely important for cellular functions. Molecular interactions at the protein or gene level can be used to construct interaction networks in which the interacting species are categorized based on direct interactions or functional similarities. Compared with the limited experimental techniques, various computational tools make it possible to analyze, filter, and combine the interaction data to obtain comprehensive information about biological pathways. By efficiently integrating experimental findings on PPIs with computational techniques for prediction, researchers have been able to gain much valuable data on PPIs, including several advanced databases. Moreover, many useful tools and visualization programs enable researchers to establish, annotate, and analyze biological networks. Here we review and list the computational methods, databases, and tools for protein-protein interaction prediction.


Universal Screening Methods and Applications of ThermoFluor®

CrossRef Listing of Deleted DOIs ◽

10.1177/1087057106292746 ◽

2006 ◽

Vol 11 (7) ◽

pp. 854-863 ◽

Cited By ~ 124

Author(s):

Maxwell D. Cummings ◽

Michael A. Farnum ◽

Marina I. Nelen

Keyword(s):

Protein Interactions ◽

Protein Function ◽

Protein Unfolding ◽

Direct Detection ◽

Functional Characterization ◽

Screening Methods ◽

Protein Protein Interactions ◽

Protein Protein Interaction ◽

Bacterial Enzyme ◽

Research Problems

The genomics revolution has unveiled a wealth of poorly characterized proteins. Scientists are often able to produce milligram quantities of proteins for which function is unknown or hypothetical, based only on very distant sequence homology. Broadly applicable tools for functional characterization are essential to the illumination of these orphan proteins. An additional challenge is the direct detection of inhibitors of protein-protein interactions (and allosteric effectors). Both of these research problems are relevant to, among other things, the challenge of finding and validating new protein targets for drug action. Screening collections of small molecules has long been used in the pharmaceutical industry as one method of discovering drug leads. Screening in this context typically involves a function-based assay. Given a sufficient quantity of a protein of interest, significant effort may still be required for functional characterization, assay development, and assay configuration for screening. Increasingly, techniques are being reported that facilitate screening for specific ligands for a protein of unknown function. Such techniques also allow for function-independent screening with better characterized proteins. ThermoFluor®, a screening instrument based on monitoring ligand effects on temperature-dependent protein unfolding, can be applied when protein function is unknown. This technology has proven useful in the decryption of an essential bacterial enzyme and in the discovery of a series of inhibitors of a cancer-related protein-protein interaction. The authors review some of the tools relevant to these research problems in drug discovery and describe their experiences with two different proteins.


deepNF: Deep network fusion for protein function prediction

10.1101/223339 ◽

2017 ◽

Cited By ~ 2

Author(s):

Vladimir Gligorijević ◽

Meet Barot ◽

Richard Bonneau

Keyword(s):

Protein Function ◽

Large Scale ◽

Protein Function Prediction ◽

Predictive Performance ◽

Substantial Improvement ◽

Function Prediction ◽

Interaction Networks ◽

Highly Nonlinear ◽

High Level ◽

String Networks

Abstract: The prevalence of high-throughput experimental methods has resulted in an abundance of large-scale molecular and functional interaction networks. The connectivity of these networks provides a rich source of information for inferring functional annotations for genes and proteins. An important challenge has been to develop methods for combining these heterogeneous networks to extract useful protein feature representations for function prediction. Most of the existing approaches for network integration use shallow models that cannot capture complex and highly nonlinear network structures. Thus, we propose deepNF, a network fusion method based on Multimodal Deep Autoencoders to extract high-level features of proteins from multiple heterogeneous interaction networks. We apply this method to combine STRING networks to construct a common low-dimensional representation containing high-level protein features. We use separate layers for different network types in the early stages of the multimodal autoencoder, later connecting all the layers into a single bottleneck layer from which we extract features to predict protein function. We compare the cross-validation and temporal holdout predictive performance of our method with state-of-the-art methods, including the recently proposed method Mashup. Our results show that our method outperforms previous methods for both human and yeast STRING networks. We also show substantial improvement in the performance of our method in predicting GO terms of varying type and specificity. Availability: deepNF is freely available at: https://github.com/VGligorijevic/deepNF
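
The layer arrangement described in this abstract (one early layer per network type, then a single shared bottleneck) can be illustrated with a minimal forward-pass sketch. This is not the deepNF implementation: the function names, the layer sizes, the sigmoid activation, and the random untrained weights are all assumptions chosen only to show how the per-network representations are joined, and no decoder or training step is included.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def dense(inputs: Matrix, out_dim: int, rng: random.Random) -> Matrix:
    """Apply one randomly initialised sigmoid layer (untrained; illustration only)."""
    in_dim = len(inputs[0])
    weights = [[rng.gauss(0.0, 0.1) for _ in range(out_dim)] for _ in range(in_dim)]
    return [
        [
            1.0 / (1.0 + math.exp(-sum(x * weights[i][j] for i, x in enumerate(row))))
            for j in range(out_dim)
        ]
        for row in inputs
    ]


def bottleneck_features(
    networks: List[Matrix],
    hidden_dim: int = 3,      # hypothetical size of each per-network layer
    bottleneck_dim: int = 2,  # hypothetical size of the shared bottleneck
    seed: int = 0,
) -> Matrix:
    """Return one feature vector per protein from several adjacency matrices."""
    rng = random.Random(seed)
    # Early stage: a separate layer for each network type.
    per_network = [dense(adj, hidden_dim, rng) for adj in networks]
    # Later stage: concatenate each protein's hidden vectors across networks.
    n_proteins = len(networks[0])
    joined = [sum((h[p] for h in per_network), []) for p in range(n_proteins)]
    # Single shared bottleneck layer; its activations serve as protein features.
    return dense(joined, bottleneck_dim, rng)
```

Calling `bottleneck_features([net_a, net_b])` with two adjacency matrices over the same proteins returns one short vector per protein; in a trained model these vectors would be the inputs to a downstream function classifier.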


iProteinDB: an integrative database of Drosophila post-translational modifications

10.1101/386268 ◽

2018 ◽

Cited By ~ 2

Author(s):

Yanhui Hu ◽

Richelle Sopko ◽

Verena Chung ◽

Romain A. Studer ◽

Sean D. Landry ◽

...

Keyword(s):

Protein Interactions ◽

Protein Function ◽

Large Scale ◽

Model Organisms ◽

General Strategy ◽

Post Translational Modification ◽

Post Translational Modifications ◽

Functional Sites ◽

Evolutionarily Conserved ◽

And Function

Abstract: Post-translational modification (PTM) serves as a regulatory mechanism for protein function, influencing stability, protein interactions, activity and localization, and is critical in many signaling pathways. The best characterized PTM is phosphorylation, whereby a phosphate is added to an acceptor residue, commonly serine, threonine or tyrosine. As proteins are often phosphorylated at multiple sites, identifying those sites that are important for function is a challenging problem. Considering that many phosphorylation sites may be non-functional, prioritizing evolutionarily conserved phosphosites provides a general strategy to identify the putative functional sites with regard to regulation and function. To facilitate the identification of conserved phosphosites, we generated a large-scale phosphoproteomics dataset from Drosophila embryos collected from six closely related species. We built iProteinDB (https://www.flyrnai.org/tools/iproteindb/), a resource integrating these data with other high-throughput PTM datasets, including vertebrates, and manually curated information for Drosophila. At iProteinDB, scientists can view the PTM landscape for any Drosophila protein and identify predicted functional phosphosites based on a comparative analysis of data from closely related Drosophila species. Further, iProteinDB enables comparison of PTM data from Drosophila to that of orthologous proteins from other model organisms, including human, mouse, rat, Xenopus laevis, Danio rerio, and Caenorhabditis elegans.


Mining Protein Interactome Networks to Measure Interaction Reliability and Select Hub Proteins

Computational Knowledge Discovery for Bioinformatics Research ◽

10.4018/978-1-4666-1785-8.ch013 ◽

2013 ◽

pp. 222-238

Author(s):

Young-Rae Cho ◽

Aidong Zhang

Keyword(s):

Protein Interactions ◽

Large Scale ◽

Functional Characterization ◽

Flow Simulation ◽

Protein Protein Interactions ◽

Systematic Analysis ◽

Graph Theoretic ◽

Interactome Network ◽

Protein Interactome ◽

Hub Proteins

High-throughput techniques involve large-scale detection of protein-protein interactions. This interaction data set from the genome-scale perspective is structured into an interactome network. Since the interaction evidence represents functional linkage, various graph-theoretic computational approaches have been applied to the interactome networks for functional characterization. However, this data is generally unreliable, and typical genome-wide interactome networks have complex connectivity. In this paper, the authors explore systematic analysis of protein interactome networks, and propose a k-round signal flow simulation algorithm to measure interaction reliability from connection patterns of the interactome networks. This algorithm quantitatively characterizes functional links between proteins by simulating the propagation of information signals through complex connections. In this regard, the algorithm efficiently estimates the strength of alternative paths for each interaction. The authors also present an algorithm for mining the complex interactome network structure. The algorithm restructures the network by hierarchical ordering of nodes, and this restructuring process reveals hub proteins in the interactome networks. This paper demonstrates that two rounds of simulation accurately score interaction reliability in terms of ontological correlation and functional consistency. Finally, the authors validate that the selected structural hubs represent functional core proteins.


INGA: protein function prediction combining interaction networks, domain assignments and sequence similarity

Nucleic Acids Research ◽

10.1093/nar/gkv523 ◽

2015 ◽

Vol 43 (W1) ◽

pp. W134-W140 ◽

Cited By ~ 52

Author(s):

Damiano Piovesan ◽

Manuel Giollo ◽

Emanuela Leonardi ◽

Carlo Ferrari ◽

Silvio C.E. Tosatto

Keyword(s):

Protein Function ◽

Sequence Similarity ◽

Protein Function Prediction ◽

Function Prediction ◽

Interaction Networks


A look back at the quality of Protein Function Prediction tools in CAFA

10.7287/peerj.preprints.27161 ◽

2018 ◽

Author(s):

Morteza Pourreza Shahri ◽

Madhusudan Srinivasan ◽

Diane Bimczok ◽

Upulee Kanewala ◽

Indika Kahanda

Keyword(s):

Protein Function ◽

Large Scale ◽

Computational Models ◽

Protein Function Prediction ◽

Function Prediction ◽

Test Case ◽

Test Cases ◽

Metamorphic Testing ◽

Main Challenge ◽

Scale Experiment

The Critical Assessment of protein Function Annotation algorithms (CAFA) is a large-scale experiment for assessing the computational models for automated function prediction (AFP). The models presented in CAFA have shown excellent promise in terms of prediction accuracy, but quality assurance has received relatively little attention. The main challenge associated with conducting systematic testing on AFP software is the lack of a test oracle, which determines whether a test case passes or fails; unfortunately, the exact expected outcomes are not well defined for the AFP task. Thus, AFP tools face the oracle problem. Metamorphic testing (MT) is a technique used to test programs that face the oracle problem using metamorphic relations (MRs). An MR determines whether a test has passed or failed by specifying how the output should change according to a specific change made to the input. In this work, we use MT to test nine CAFA2 AFP tools by defining a set of MRs that apply input transformations at the protein level. According to our initial testing, we observe that several tools fail all the test cases and two tools pass all the test cases on different GO ontologies.
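
The idea of a metamorphic relation can be made concrete with a small sketch. The relation used here (reordering the input proteins must not change any protein's predicted terms) is an illustrative assumption and is not claimed to be one of the MRs defined in the paper; both predictors and the labels `TERM_A` and `TERM_B` are hypothetical stand-ins for a real AFP tool and its GO term output.

```python
from typing import Callable, Dict, List, Set

# A predictor maps a batch of protein sequences to a set of terms per sequence.
Predictor = Callable[[List[str]], Dict[str, Set[str]]]


def toy_predictor(sequences: List[str]) -> Dict[str, Set[str]]:
    """Hypothetical AFP stand-in: labels depend only on the sequence itself."""
    result: Dict[str, Set[str]] = {}
    for seq in sequences:
        terms: Set[str] = set()
        if "K" in seq:
            terms.add("TERM_A")  # placeholder label, not a real GO term
        if len(seq) > 5:
            terms.add("TERM_B")  # placeholder label, not a real GO term
        result[seq] = terms
    return result


def order_sensitive_predictor(sequences: List[str]) -> Dict[str, Set[str]]:
    """Deliberately faulty stand-in: only the first protein in the batch is labelled."""
    return {seq: ({"TERM_A"} if i == 0 else set()) for i, seq in enumerate(sequences)}


def mr_order_invariance(predict: Predictor, sequences: List[str]) -> bool:
    """Pass if reversing the input order leaves every protein's terms unchanged.

    No oracle is needed: the check compares two outputs of the same program
    rather than comparing an output with a known correct answer.
    """
    original = predict(sequences)
    transformed = predict(list(reversed(sequences)))
    return original == transformed
```

With the batch `["MKV", "ACDEFGH"]`, the first predictor satisfies the relation and the second violates it, even though neither has a known correct answer to compare against.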


Mining Protein Interactome Networks to Measure Interaction Reliability and Select Hub Proteins

International Journal of Knowledge Discovery in Bioinformatics ◽

10.4018/jkdb.2010070102 ◽

2010 ◽

Vol 1 (3) ◽

pp. 20-35

Author(s):

Young-Rae Cho ◽

Aidong Zhang

Keyword(s):

Protein Interactions ◽

Large Scale ◽

Functional Characterization ◽

Flow Simulation ◽

Data Set ◽

Systematic Analysis ◽

Core Proteins ◽

Interactome Network ◽

Protein Interactome ◽

Hub Proteins

High-throughput techniques involve large-scale detection of protein-protein interactions. This interaction data set from the genome-scale perspective is structured into an interactome network. Since the interaction evidence represents functional linkage, various graph-theoretic computational approaches have been applied to the interactome networks for functional characterization. However, this data is generally unreliable, and typical genome-wide interactome networks have complex connectivity. In this paper, the authors explore systematic analysis of protein interactome networks, and propose a k-round signal flow simulation algorithm to measure interaction reliability from connection patterns of the interactome networks. This algorithm quantitatively characterizes functional links between proteins by simulating the propagation of information signals through complex connections. In this regard, the algorithm efficiently estimates the strength of alternative paths for each interaction. The authors also present an algorithm for mining the complex interactome network structure. The algorithm restructures the network by hierarchical ordering of nodes, and this restructuring process reveals hub proteins in the interactome networks. This paper demonstrates that two rounds of simulation accurately score interaction reliability in terms of ontological correlation and functional consistency. Finally, the authors validate that the selected structural hubs represent functional core proteins.


Application of dynamic expansion tree for finding large network motifs in biological networks

PeerJ ◽

10.7717/peerj.6917 ◽

2019 ◽

Vol 7 ◽

pp. e6917 ◽

Cited By ~ 1

Author(s):

Sabyasachi Patra ◽

Anjali Mohapatra

Keyword(s):

Biological Networks ◽

Protein Function ◽

Large Scale ◽

Network Motif ◽

Graph Isomorphism ◽

Interaction Network ◽

Motif Finding ◽

Network Motifs ◽

Large Network ◽

Scalable Network

Network motifs play an important role in the structural analysis of biological networks. Identification of such network motifs leads to many important applications such as understanding the modularity and the large-scale structure of biological networks, classification of networks into super-families, and protein function annotation. However, identification of large network motifs is a challenging task as it involves the graph isomorphism problem. Although this problem has been studied extensively in the literature using different computational approaches, there is still considerable scope for improvement. Motivated by the challenges involved in this field, an efficient and scalable network motif finding algorithm using a dynamic expansion tree is proposed. The novelty of the proposed algorithm is that it avoids computationally expensive graph isomorphism tests and overcomes the space limitation of the static expansion tree (SET), which enables it to find large motifs. In this algorithm, the embeddings corresponding to a child node of the expansion tree are obtained from the embeddings of a parent node, either by adding a vertex or by adding an edge. This process does not involve any graph isomorphism check. The time complexities of vertex addition and edge addition are O(n) and O(1), respectively. The growth of a dynamic expansion tree (DET) depends on the availability of patterns in the target network. Pruning of branches in the DET significantly reduces the space requirement relative to the SET. The proposed algorithm has been tested on a protein–protein interaction network obtained from the MINT database. The proposed algorithm is able to identify large network motifs faster than most of the existing motif finding algorithms.
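
The step of deriving child embeddings from parent embeddings can be sketched as follows. This is a simplified illustration, not the authors' DET algorithm: the function names and the adjacency-set graph representation are assumptions, embeddings are stored as ordered tuples of target vertices, and no pruning or removal of symmetric duplicates is attempted.

```python
from typing import Dict, List, Set, Tuple

Graph = Dict[int, Set[int]]   # target network as adjacency sets
Embedding = Tuple[int, ...]   # position p holds the image of pattern vertex p


def add_vertex(graph: Graph, parents: List[Embedding], attach_to: int) -> List[Embedding]:
    """Child pattern = parent pattern plus a new vertex joined to `attach_to`.

    Each parent embedding is extended by every unused neighbour of the image
    of pattern vertex `attach_to`, so no isomorphism test is required.
    """
    children: List[Embedding] = []
    for emb in parents:
        for v in sorted(graph[emb[attach_to]]):
            if v not in emb:
                children.append(emb + (v,))
    return children


def add_edge(graph: Graph, parents: List[Embedding], i: int, j: int) -> List[Embedding]:
    """Child pattern = parent pattern plus an edge between pattern vertices i and j.

    A parent embedding survives only if the images of i and j are adjacent,
    which is a single set lookup per embedding.
    """
    return [emb for emb in parents if emb[j] in graph[emb[i]]]
```

Starting from all ordered single-edge embeddings of a target network, `add_vertex(graph, edges, attach_to=1)` yields the three-vertex path embeddings, and `add_edge(graph, paths, 0, 2)` keeps those that close into triangles.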
