A Topology-Based Metric for Measuring Term Similarity in the Gene Ontology

Advances in Bioinformatics ◽

10.1155/2012/975783 ◽

2012 ◽

Vol 2012 ◽

pp. 1-17 ◽

Cited By ~ 27

Author(s):

Gaston K. Mazandu ◽

Nicola J. Mulder

Keyword(s):

Gene Ontology ◽

Protein Function ◽

Protein Function Prediction ◽

Similarity Measures ◽

Biological Knowledge ◽

Online Tool ◽

Protein Protein Interaction ◽

Or Groups ◽

Term Similarity ◽

Go Terms

The wide coverage and biological relevance of the Gene Ontology (GO), confirmed through its successful use in protein function prediction, have led to the growth in its popularity. In order to exploit the extent of biological knowledge that GO offers in describing genes or groups of genes, there is a need for an efficient, scalable similarity measure for GO terms and GO-annotated proteins. While several GO similarity measures exist, none adequately addresses all issues surrounding the design and usage of the ontology. We introduce a new metric for measuring the distance between two GO terms using the intrinsic topology of the GO-DAG, thus enabling the measurement of functional similarities between proteins based on their GO annotations. We assess the performance of this metric using a ROC analysis on human protein-protein interaction datasets and correlation coefficient analysis on the selected set of protein pairs from the CESSM online tool. This metric achieves good performance compared to the existing annotation-based GO measures. We used this new metric to assess functional similarity between orthologues, and show that it is effective at determining whether orthologues are annotated with similar functions and identifying cases where annotation is inconsistent between orthologues.

CrowdGO: a wisdom of the crowd-based Gene Ontology annotation tool

10.1101/731596 ◽

2019 ◽

Cited By ~ 1

Author(s):

Maarten J.M.F. Reijnders

Keyword(s):

Gene Ontology ◽

Protein Function ◽

Directed Acyclic Graph ◽

Support Vector Machine Model ◽

Protein Function Prediction ◽

Support Vector ◽

Annotation Tool ◽

Machine Model ◽

Prediction Tools ◽

Go Terms

Motivation: Protein function prediction tools vary widely in their methodologies, resulting in different sets of GO terms being correctly predicted. Ideally, multiple tools are combined to achieve a higher recall of GO terms while increasing precision. Results: CrowdGO takes input predictions from any number of tools and combines them based on the Gene Ontology Directed Acyclic Graph. Using each GO term's information content, the semantic similarity between GO predictions of different tools, and a Support Vector Machine model, it achieves improved precision and recall compared to each of the tools separately (Figure 1). Availability: CrowdGO can be found at https://gitlab.com/mreijnders/CrowdGO

Protein function prediction from protein–protein interaction network using gene ontology based neighborhood analysis and physico-chemical features

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720018500257 ◽

2018 ◽

Vol 16 (06) ◽

pp. 1850025 ◽

Cited By ~ 5

Author(s):

Sovan Saha ◽

Abhimanyu Prasad ◽

Piyali Chatterjee ◽

Subhadip Basu ◽

Mita Nasipuri

Keyword(s):

Functional Groups ◽

Protein Function ◽

Functional Group ◽

Protein Function Prediction ◽

Interaction Network ◽

Function Prediction ◽

Protein Protein Interaction ◽

Physico Chemical ◽

Go Terms ◽

Protein Protein Interaction Network

Protein function prediction from Protein–Protein Interaction Networks (PPIN) and physico-chemical features using the Gene Ontology (GO) classification is very useful for assigning biological or biochemical functions to a protein. It also leads to the identification of significant proteins responsible for various diseases for which drugs are yet to be discovered. The prediction of GO functional terms from PPIN and sequence is therefore an important field of study. In this work, we propose a methodology, Multi Label Protein Function Prediction (ML_PFP), which is based on neighborhood analysis empowered with physico-chemical features of constituent amino acids to predict the functional group of an unannotated protein. A protein does not perform functions in isolation; rather, it performs functions in a group by interacting with others. A protein is thus involved in many functions or, in other words, may be associated with multiple functional groups, labels, or GO terms. Though the functional groups of known interacting partner proteins and their physico-chemical features provide useful information, assigning multiple labels to an unannotated protein is a very challenging task. Here, we use the Homo sapiens (human) PPIN as well as the Saccharomyces cerevisiae (yeast) PPIN, along with their GO terms, to predict functional groups or GO terms of unannotated proteins. The task is challenging because both the human and yeast protein datasets are voluminous and complex, and multi-label functional group assignment adds a new dimension to this challenge. Our algorithm achieves better performance in Cellular Function, Molecular Function and Biological Process for both the yeast and human networks when compared with existing state-of-the-art methodologies, as discussed in detail in the results section.

Protein function prediction with gene ontology: from traditional to deep learning models

PeerJ ◽

10.7717/peerj.12019 ◽

2021 ◽

Vol 9 ◽

pp. e12019

Author(s):

Thi Thuy Duong Vu ◽

Jaehee Jung

Keyword(s):

Gene Ontology ◽

Deep Learning ◽

Protein Function ◽

High Throughput Sequencing ◽

Protein Function Prediction ◽

Rapid Development ◽

Function Prediction ◽

Amino Acid Sequences ◽

Go Annotation ◽

Go Terms

Protein function prediction is a crucial part of genome annotation. Prediction methods have recently witnessed rapid development, owing to the emergence of high-throughput sequencing technologies. Among the available databases for identifying protein function terms, Gene Ontology (GO) is an important resource that describes the functional properties of proteins. Researchers are employing various approaches to efficiently predict GO terms. Meanwhile, deep learning, a fast-evolving discipline among data-driven approaches, exhibits impressive potential for assigning GO terms to amino acid sequences. Herein, we reviewed the currently available computational GO annotation methods for proteins, ranging from conventional to deep learning approaches. Further, we selected some suitable predictors from among the reviewed tools and conducted a mini comparison of their performance using a worldwide challenge dataset. Finally, we discussed the remaining major challenges in the field, and emphasized the future directions for protein function prediction with GO.

Protein Function Prediction by Clustering of Protein-Protein Interaction Network

Advances in Intelligent and Soft Computing - ICT Innovations 2011 ◽

10.1007/978-3-642-28664-3_4 ◽

2012 ◽

pp. 39-49 ◽

Cited By ~ 1

Author(s):

Ivana Cingovska ◽

Aleksandra Bogojeska ◽

Kire Trivodaliev ◽

Slobodan Kalajdziski

Keyword(s):

Protein Interaction ◽

Protein Interaction Network ◽

Protein Function ◽

Protein Function Prediction ◽

Interaction Network ◽

Function Prediction ◽

Protein Protein Interaction ◽

Protein Protein Interaction Network

Learning Kernel Matrix from Gene Ontology and Annotation Data for Protein Function Prediction

Advances in Neural Networks – ISNN 2009 - Lecture Notes in Computer Science ◽

10.1007/978-3-642-01513-7_76 ◽

2009 ◽

pp. 694-703

Author(s):

Yiming Chen ◽

Zhoujun Li ◽

Junwan Liu

Keyword(s):

Gene Ontology ◽

Protein Function ◽

Protein Function Prediction ◽

Function Prediction ◽

Kernel Matrix ◽

Annotation Data

Gene Ontology GAN (GOGAN): a novel architecture for protein function prediction

Soft Computing ◽

10.1007/s00500-021-06707-z ◽

2022 ◽

Author(s):

Musadaq Mansoor ◽

Mohammad Nauman ◽

Hafeez Ur Rehman ◽

Alfredo Benso

Keyword(s):

Gene Ontology ◽

Protein Function ◽

Protein Function Prediction ◽

Function Prediction

Protein Function Prediction Using Protein–Protein Interaction Networks

Protein Function Prediction for Omics Era ◽

10.1007/978-94-007-0881-5_13 ◽

2011 ◽

pp. 243-270

Author(s):

Hon Nian Chua ◽

Guimei Liu ◽

Limsoon Wong

Keyword(s):

Protein Interaction ◽

Protein Function ◽

Protein Function Prediction ◽

Function Prediction ◽

Protein Interaction Networks ◽

Interaction Networks ◽

Protein Protein Interaction ◽

Protein Protein Interaction Networks

Protein function prediction using neighbor relativity in protein–protein interaction network

Computational Biology and Chemistry ◽

10.1016/j.compbiolchem.2012.12.003 ◽

2013 ◽

Vol 43 ◽

pp. 11-16 ◽

Cited By ~ 16

Author(s):

Sobhan Moosavi ◽

Masoud Rahgozar ◽

Amir Rahimi

Keyword(s):

Protein Interaction ◽

Protein Interaction Network ◽

Protein Function ◽

Protein Function Prediction ◽

Interaction Network ◽

Function Prediction ◽

Protein Protein Interaction ◽

Protein Protein Interaction Network

A Bayesian approach to construct Context-Specific Gene Ontology: Application to protein function prediction

2016 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB) ◽

10.1109/cibcb.2016.7758127 ◽

2016 ◽

Cited By ~ 1

Author(s):

Hasna Njah ◽

Salma Jamoussi ◽

Walid Mahdi ◽

Mohamed Elati

Keyword(s):

Gene Ontology ◽

Bayesian Approach ◽

Protein Function ◽

Protein Function Prediction ◽

Function Prediction ◽

Specific Gene ◽

Ontology Application ◽

Context Specific

The effect of statistical normalisation on network propagation scores

10.1101/2020.01.20.911842 ◽

2020 ◽

Author(s):

Sergio Picart-Armada ◽

Wesley K. Thompson ◽

Alfonso Buil ◽

Alexandre Perera-Lluna

Keyword(s):

Protein Function ◽

Diffusion Processes ◽

Protein Function Prediction ◽

Interaction Network ◽

Mean Value ◽

Statistical Properties ◽

Label Propagation ◽

Protein Protein Interaction ◽

Module Discovery ◽

Permutation Analysis

Motivation: Network diffusion and label propagation are fundamental tools in computational biology, with applications like gene-disease association, protein function prediction and module discovery. More recently, several publications have introduced a permutation analysis after the propagation process, due to concerns that network topology can bias diffusion scores. This opens the question of the statistical properties and the presence of bias of such diffusion processes in each of their applications. In this work, we characterised some common null models behind the permutation analysis and the statistical properties of the diffusion scores. We benchmarked seven diffusion scores on three case studies: synthetic signals on a yeast interactome, simulated differential gene expression on a protein-protein interaction network and prospective gene set prediction on another interaction network. For clarity, all the datasets were based on binary labels, but we also present theoretical results for quantitative labels. Results: Diffusion scores starting from binary labels were affected by the label codification, and exhibited a problem-dependent topological bias that could be removed by the statistical normalisation. Parametric and non-parametric normalisation addressed both points by being codification-independent and by equalising the bias. We identified and quantified two sources of bias (mean value and variance) that yielded performance differences when normalising the scores. We provided closed formulae for both and showed how the null covariance is related to the spectral properties of the graph. Although none of the proposed scores systematically outperformed the others, normalisation was preferred when the sought positive labels were not aligned with the bias. We conclude that the decision on bias removal should be problem and data-driven, i.e. based on a quantitative analysis of the bias and its relation to the positive entities. Availability: The code is publicly available at https://github.com/b2slab/[email protected]
