Recent advances in network-based methods for disease gene prediction

Briefings in Bioinformatics ◽

10.1093/bib/bbaa303 ◽

2020 ◽

Author(s):

Sezin Kircali Ata ◽

Min Wu ◽

Yuan Fang ◽

Le Ou-Yang ◽

Chee Keong Kwoh ◽

...

Keyword(s):

Empirical Analysis ◽

Genome Wide Association Study ◽

Disease Gene ◽

Gene Prediction ◽

Representation Learning ◽

Graph Representation ◽

Molecular Networks ◽

Learning Methods ◽

Gene Association ◽

Disease Gene Prediction

Disease–gene association through genome-wide association study (GWAS) is an arduous task for researchers. Investigating single nucleotide polymorphisms that correlate with specific diseases needs statistical analysis of associations. Considering the huge number of possible mutations, in addition to its high cost, another important drawback of GWAS analysis is the large number of false positives. Thus, researchers search for more evidence to cross-check their results through different sources. To provide the researchers with alternative and complementary low-cost disease–gene association evidence, computational approaches come into play. Since molecular networks are able to capture complex interplay among molecules in diseases, they become one of the most extensively used data for disease–gene association prediction. In this survey, we aim to provide a comprehensive and up-to-date review of network-based methods for disease gene prediction. We also conduct an empirical analysis on 14 state-of-the-art methods. To summarize, we first elucidate the task definition for disease gene prediction. Secondly, we categorize existing network-based efforts into network diffusion methods, traditional machine learning methods with handcrafted graph features and graph representation learning methods. Thirdly, an empirical analysis is conducted to evaluate the performance of the selected methods across seven diseases. We also provide distinguishing findings about the discussed methods based on our empirical analysis. Finally, we highlight potential research directions for future studies on disease gene prediction.
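
The abstract above names network diffusion as one family of methods but does not describe a specific algorithm. As a rough illustration only, the sketch below uses random walk with restart, a commonly used diffusion scheme that is swapped in here as an assumption and is not taken from the survey. The toy network, the gene labels `G1` to `G5`, and the `restart` value are all hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-8, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adjacency: dict mapping each node to a list of its neighbours (undirected).
    seeds: set of known disease genes (hypothetical labels in this sketch).
    Assumes every node has at least one neighbour; a node without
    neighbours would simply drop the probability mass it holds.
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seeds, zero elsewhere.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart`, the walker jumps back to a seed.
        nxt = {n: restart * p0[n] for n in nodes}
        # Otherwise it moves to a uniformly chosen neighbour.
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: a simple chain G1 - G2 - G3 - G4 - G5.
toy_network = {
    "G1": ["G2"],
    "G2": ["G1", "G3"],
    "G3": ["G2", "G4"],
    "G4": ["G3", "G5"],
    "G5": ["G4"],
}
scores = random_walk_with_restart(toy_network, seeds={"G1"})
ranked_candidates = sorted(
    (g for g in scores if g != "G1"), key=scores.get, reverse=True
)
print(ranked_candidates)
```

In this toy chain, candidates closer to the seed receive higher scores, which is the intuition behind ranking unlabelled genes by diffusion from known disease genes.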


A Comparative Study of Classification-Based Machine Learning Methods for Novel Disease Gene Prediction

Advances in Intelligent Systems and Computing - Knowledge and Systems Engineering ◽

10.1007/978-3-319-11680-8_46 ◽

2015 ◽

pp. 577-588 ◽

Cited By ~ 9

Author(s):

Duc-Hau Le ◽

Nguyen Xuan Hoai ◽

Yung-Keun Kwon

Keyword(s):

Machine Learning ◽

Comparative Study ◽

Disease Gene ◽

Gene Prediction ◽

Learning Methods ◽

Disease Gene Prediction ◽

Machine Learning Methods


Review of computational approaches to Parkinson’s disease gene prediction

2020 3rd International Conference on Intelligent Sustainable Systems (ICISS) ◽

10.1109/iciss49785.2020.9315926 ◽

2020 ◽

Author(s):

RHEA MARY JOSI ◽

R. I. MINU

Keyword(s):

Parkinson's Disease ◽

Disease Gene ◽

Gene Prediction ◽

Computational Approaches ◽

Disease Gene Prediction


MOSES: A New Approach to Integrate Interactome Topology and Functional Features for Disease Gene Prediction

Genes ◽

10.3390/genes12111713 ◽

2021 ◽

Vol 12 (11) ◽

pp. 1713

Author(s):

Manuela Petti ◽

Lorenzo Farina ◽

Federico Francone ◽

Stefano Lucidi ◽

Amalia Macali ◽

...

Keyword(s):

Network Topology ◽

Disease Gene ◽

Gene Prediction ◽

Knowledge Bases ◽

Biological Knowledge ◽

Disease Genes ◽

Human Interactome ◽

Disease Gene Prediction ◽

Disease Associations ◽

Functional Features

Disease gene prediction is to date one of the main computational challenges of precision medicine. It is still uncertain if disease genes have unique functional properties that distinguish them from other non-disease genes or, from a network perspective, if they are located randomly in the interactome or show specific patterns in the network topology. In this study, we propose a new method for disease gene prediction based on the use of biological knowledge-bases (gene-disease associations, gene functional annotations, etc.) and interactome network topology. The proposed algorithm, called MOSES, is based on the definition of two somewhat opposing sets of genes, both disease-specific from different perspectives: warm seeds (i.e., disease genes obtained from databases) and cold seeds (genes far from the disease genes on the interactome and not involved in their biological functions). The application of MOSES to a set of 40 diseases showed that the suggested putative disease genes are significantly enriched in their reference disease. Reassuringly, known and predicted disease genes together tend to form a connected network module on the human interactome, mitigating the scattered distribution of disease genes, which is probably due to both the paucity of disease-gene associations and the incompleteness of the interactome.
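
To make the "far from the disease genes on the interactome" part of the cold-seed definition concrete, here is a minimal sketch that measures shortest-path distance from a set of warm seeds with a breadth-first search. It is not the MOSES algorithm: it covers only the topological criterion, ignores the functional-annotation criterion entirely, and the toy network, gene labels, and `min_distance` threshold are hypothetical.

```python
from collections import deque


def distance_to_nearest_seed(adjacency, warm_seeds):
    """Multi-source BFS: hop count from each reachable node to its closest warm seed.

    Nodes that cannot be reached from any seed are absent from the result.
    """
    dist = {s: 0 for s in warm_seeds}
    queue = deque(warm_seeds)
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Hypothetical toy interactome: a chain G1 - G2 - G3 - G4 - G5.
toy_interactome = {
    "G1": ["G2"],
    "G2": ["G1", "G3"],
    "G3": ["G2", "G4"],
    "G4": ["G3", "G5"],
    "G5": ["G4"],
}
warm_seeds = {"G1"}   # stand-in for known disease genes
min_distance = 3      # hypothetical cut-off for "far"

dist = distance_to_nearest_seed(toy_interactome, warm_seeds)
cold_candidates = sorted(g for g, d in dist.items() if d >= min_distance)
print(cold_candidates)
```

A fuller treatment would also have to filter these candidates by biological function, as the abstract describes, which this sketch does not attempt.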


Significance-based multi-scale method for network community detection and its application in disease-gene prediction

PLoS ONE ◽

10.1371/journal.pone.0227244 ◽

2020 ◽

Vol 15 (3) ◽

pp. e0227244

Author(s):

Ke Hu ◽

Ju Xiang ◽

Yun-Xia Yu ◽

Liang Tang ◽

Qin Xiang ◽

...

Keyword(s):

Community Detection ◽

Disease Gene ◽

Gene Prediction ◽

Disease Gene Prediction ◽

Multi Scale ◽

Scale Method ◽

Network Community


Disease gene prediction for molecularly uncharacterized diseases

PLoS Computational Biology ◽

10.1371/journal.pcbi.1007078 ◽

2019 ◽

Vol 15 (7) ◽

pp. e1007078 ◽

Cited By ~ 7

Author(s):

Juan J. Cáceres ◽

Alberto Paccanaro

Keyword(s):

Disease Gene ◽

Gene Prediction ◽

Disease Gene Prediction


Network-based methods for human disease gene prediction

Briefings in Functional Genomics ◽

10.1093/bfgp/elr024 ◽

2011 ◽

Vol 10 (5) ◽

pp. 280-293 ◽

Cited By ~ 144

Author(s):

X. Wang ◽

N. Gulbahce ◽

H. Yu

Keyword(s):

Human Disease ◽

Disease Gene ◽

Gene Prediction ◽

Disease Gene Prediction ◽

Human Disease Gene


Computational Approaches for Human Disease Gene Prediction and Ranking

Systems Analysis of Human Multigene Disorders - Advances in Experimental Medicine and Biology ◽

10.1007/978-1-4614-8778-4_4 ◽

2013 ◽

pp. 69-84 ◽

Cited By ~ 13

Author(s):

Cheng Zhu ◽

Chao Wu ◽

Bruce J. Aronow ◽

Anil G. Jegga

Keyword(s):

Human Disease ◽

Disease Gene ◽

Gene Prediction ◽

Computational Approaches ◽

Disease Gene Prediction ◽

Human Disease Gene


Ensemble disease gene prediction by clinical sample-based networks

BMC Bioinformatics ◽

10.1186/s12859-020-3346-8 ◽

2020 ◽

Vol 21 (S2) ◽

Cited By ~ 1

Author(s):

Ping Luo ◽

Li-Ping Tian ◽

Bolin Chen ◽

Qianghua Xiao ◽

Fang-Xiang Wu

Keyword(s):

Disease Gene ◽

Clinical Sample ◽

Gene Prediction ◽

Disease Gene Prediction


A feature-learning-based method for the disease-gene prediction problem

International Journal of Data Mining and Bioinformatics ◽

10.1504/ijdmb.2020.109502 ◽

2020 ◽

Vol 24 (1) ◽

pp. 16

Author(s):

Lorenzo Madeddu ◽

Giovanni Stilo ◽

Paola Velardi

Keyword(s):

Disease Gene ◽

Gene Prediction ◽

Feature Learning ◽

Prediction Problem ◽

Disease Gene Prediction


Unsupervised Structural Graph Node Representation Learning

10.18122/td/1754/boisestate ◽

2020 ◽

Author(s):

Mikel Joaristi

Keyword(s):

Real World ◽

State Of The Art ◽

Structural Information ◽

Representation Learning ◽

Graph Representation ◽

Learning Methods ◽

Structural Graph ◽

Connectivity Information ◽

Latent Space ◽

Previous State

Unsupervised Graph Representation Learning methods learn a numerical representation of the nodes in a graph. The generated representations encode meaningful information about the nodes' properties, making them a powerful tool for tasks in many areas of study, such as social sciences, biology or communication networks. These methods are particularly interesting because they facilitate the direct use of standard Machine Learning models on graphs. Graph representation learning methods can be divided into two main categories depending on the information they encode: methods preserving the nodes' connectivity information, and methods preserving the nodes' structural information. Connectivity-based methods focus on encoding relationships between nodes, with neighboring nodes being closer together in the resulting latent space. On the other hand, structure-based methods generate a latent space where nodes serving a similar structural function in the network are encoded close to each other, regardless of whether they are connected or even close to each other in the graph. While many works focus on preserving nodes' connectivity information, only a few study the problem of encoding nodes' structure, especially in an unsupervised way. In this dissertation, we demonstrate that properly encoding nodes' structural information is fundamental for many real-world applications, as it can be leveraged to successfully solve many tasks where connectivity-based methods fail. One concrete example is presented first. In this example, the task consists of detecting malicious entities in a real-world financial network. We show that connectivity information is not enough to solve this problem, and that leveraging structural information provides considerable performance improvements. This particular example pinpoints the need for further research in the area of structural graph representation learning, together with the limitations of the previous state-of-the-art.
We use the acquired knowledge as a starting point and inspiration for the research and development of three independent unsupervised structural graph representation learning methods: Structural Iterative Representation learning approach for Graph Nodes (SIR-GN), Structural Iterative Lexicographic Autoencoded Node Representation (SILA), and Sparse Structural Node Representation (SparseStruct). We show how each of our methods tackles specific limitations on the previous state-of-the-art on structural graph representation learning such as scalability, representation meaning, and lack of formal proof that guarantees the preservation of structural properties. We provide an extensive experimental section where we compare our three proposed methods to the current state-of-the-art on both connectivity-based and structure-based representation learning methods. Finally, in this dissertation, we look at extensions of the basic structural graph representation learning problem. We study the problem of temporal structural graph representation. We also provide a method for representation explainability.
