A graph auto-encoder model for miRNA-disease associations prediction

Author(s):  
Zhengwei Li ◽  
Jiashu Li ◽  
Ru Nie ◽  
Zhu-Hong You ◽  
Wenzheng Bao

Abstract Emerging evidence indicates that the abnormal expression of miRNAs involves in the evolution and progression of various human complex diseases. Identifying disease-related miRNAs as new biomarkers can promote the development of disease pathology and clinical medicine. However, designing biological experiments to validate disease-related miRNAs is usually time-consuming and expensive. Therefore, it is urgent to design effective computational methods for predicting potential miRNA-disease associations. Inspired by the great progress of graph neural networks in link prediction, we propose a novel graph auto-encoder model, named GAEMDA, to identify the potential miRNA-disease associations in an end-to-end manner. More specifically, the GAEMDA model applies a graph neural networks-based encoder, which contains aggregator function and multi-layer perceptron for aggregating nodes’ neighborhood information, to generate the low-dimensional embeddings of miRNA and disease nodes and realize the effective fusion of heterogeneous information. Then, the embeddings of miRNA and disease nodes are fed into a bilinear decoder to identify the potential links between miRNA and disease nodes. The experimental results indicate that GAEMDA achieves the average area under the curve of $93.56\pm 0.44\%$ under 5-fold cross-validation. Besides, we further carried out case studies on colon neoplasms, esophageal neoplasms and kidney neoplasms. As a result, 48 of the top 50 predicted miRNAs associated with these diseases are confirmed by the database of differentially expressed miRNAs in human cancers and microRNA deregulation in human disease database, respectively. The satisfactory prediction performance suggests that GAEMDA model could serve as a reliable tool to guide the following researches on the regulatory role of miRNAs. Besides, the source codes are available at https://github.com/chimianbuhetang/GAEMDA.

2020 ◽  
Vol 21 (S13) ◽  
Author(s):  
Renyi Zhou ◽  
Zhangli Lu ◽  
Huimin Luo ◽  
Ju Xiang ◽  
Min Zeng ◽  
...  

Abstract Background Drug discovery is known for the large amount of money and time it consumes and the high risk it takes. Drug repositioning has, therefore, become a popular approach to save time and cost by finding novel indications for approved drugs. In order to distinguish these novel indications accurately in a great many of latent associations between drugs and diseases, it is necessary to exploit abundant heterogeneous information about drugs and diseases. Results In this article, we propose a meta-path-based computational method called NEDD to predict novel associations between drugs and diseases using heterogeneous information. First, we construct a heterogeneous network as an undirected graph by integrating drug-drug similarity, disease-disease similarity, and known drug-disease associations. NEDD uses meta paths of different lengths to explicitly capture the indirect relationships, or high order proximity, within drugs and diseases, by which the low dimensional representation vectors of drugs and diseases are obtained. NEDD then uses a random forest classifier to predict novel associations between drugs and diseases. Conclusions The experiments on a gold standard dataset which contains 1933 validated drug–disease associations show that NEDD produces superior prediction results compared with the state-of-the-art approaches.


2022 ◽  
Vol 13 (1) ◽  
pp. 1-54
Author(s):  
Yu Zhou ◽  
Haixia Zheng ◽  
Xin Huang ◽  
Shufeng Hao ◽  
Dengao Li ◽  
...  

Graph neural networks provide a powerful toolkit for embedding real-world graphs into low-dimensional spaces according to specific tasks. Up to now, there have been several surveys on this topic. However, they usually lay emphasis on different angles so that the readers cannot see a panorama of the graph neural networks. This survey aims to overcome this limitation and provide a systematic and comprehensive review on the graph neural networks. First of all, we provide a novel taxonomy for the graph neural networks, and then refer to up to 327 relevant literatures to show the panorama of the graph neural networks. All of them are classified into the corresponding categories. In order to drive the graph neural networks into a new stage, we summarize four future research directions so as to overcome the challenges faced. It is expected that more and more scholars can understand and exploit the graph neural networks and use them in their research community.


2019 ◽  
Vol 20 (15) ◽  
pp. 3648 ◽  
Author(s):  
Xuan ◽  
Sun ◽  
Wang ◽  
Zhang ◽  
Pan

Identification of disease-associated miRNAs (disease miRNAs) are critical for understanding etiology and pathogenesis. Most previous methods focus on integrating similarities and associating information contained in heterogeneous miRNA-disease networks. However, these methods establish only shallow prediction models that fail to capture complex relationships among miRNA similarities, disease similarities, and miRNA-disease associations. We propose a prediction method on the basis of network representation learning and convolutional neural networks to predict disease miRNAs, called CNNMDA. CNNMDA deeply integrates the similarity information of miRNAs and diseases, miRNA-disease associations, and representations of miRNAs and diseases in low-dimensional feature space. The new framework based on deep learning was built to learn the original and global representation of a miRNA-disease pair. First, diverse biological premises about miRNAs and diseases were combined to construct the embedding layer in the left part of the framework, from a biological perspective. Second, the various connection edges in the miRNA-disease network, such as similarity and association connections, were dependent on each other. Therefore, it was necessary to learn the low-dimensional representations of the miRNA and disease nodes based on the entire network. The right part of the framework learnt the low-dimensional representation of each miRNA and disease node based on non-negative matrix factorization, and these representations were used to establish the corresponding embedding layer. Finally, the left and right embedding layers went through convolutional modules to deeply learn the complex and non-linear relationships among the similarities and associations between miRNAs and diseases. Experimental results based on cross validation indicated that CNNMDA yields superior performance compared to several state-of-the-art methods. Furthermore, case studies on lung, breast, and pancreatic neoplasms demonstrated the powerful ability of CNNMDA to discover potential disease miRNAs.


Genes ◽  
2019 ◽  
Vol 10 (9) ◽  
pp. 685 ◽  
Author(s):  
Xuan ◽  
Zhang ◽  
Zhang ◽  
Li ◽  
Zhao

Predicting the potential microRNA (miRNA) candidates associated with a disease helps in exploring the mechanisms of disease development. Most recent approaches have utilized heterogeneous information about miRNAs and diseases, including miRNA similarities, disease similarities, and miRNA-disease associations. However, these methods do not utilize the projections of miRNAs and diseases in a low-dimensional space. Thus, it is necessary to develop a method that can utilize the effective information in the low-dimensional space to predict potential disease-related miRNA candidates. We proposed a method based on non-negative matrix factorization, named DMAPred, to predict potential miRNA-disease associations. DMAPred exploits the similarities and associations of diseases and miRNAs, and it integrates local topological information of the miRNA network. The likelihood that a miRNA is associated with a disease also depends on their projections in low-dimensional space. Therefore, we project miRNAs and diseases into low-dimensional feature space to yield their low-dimensional and dense feature representations. Moreover, the sparse characteristic of miRNA-disease associations was introduced to make our predictive model more credible. DMAPred achieved superior performance for 15 well-characterized diseases with AUCs (area under the receiver operating characteristic curve) ranging from 0.860 to 0.973 and AUPRs (area under the precision-recall curve) ranging from 0.118 to 0.761. In addition, case studies on breast, prostatic, and lung neoplasms demonstrated the ability of DMAPred to discover potential disease-related miRNAs.


Author(s):  
Chengqian Lu ◽  
Min Zeng ◽  
Fang-Xiang Wu ◽  
Min Li ◽  
Jianxin Wang

Abstract Motivation Emerging studies indicate that circular RNAs (circRNAs) are widely involved in the progression of human diseases. Due to its special structure which is stable, circRNAs are promising diagnostic and prognostic biomarkers for diseases. However, the experimental verification of circRNA-disease associations is expensive and limited to small-scale. Effective computational methods for predicting potential circRNA-disease associations are regarded as a matter of urgency. Although several models have been proposed, over-reliance on known associations and the absence of characteristics of biological functions make precise predictions are still challenging. Results In this study, we propose a method for predicting CircRNA-Disease Associations based on Sequence and Ontology Representations, named CDASOR, with convolutional and recurrent neural networks. For sequences of circRNAs, we encode them with continuous k-mers, get low-dimensional vectors of k-mers, extract their local feature vectors with 1 D CNN and learn their long-term dependencies with bi-directional long short-term memory. For diseases, we serialize disease ontology into sentences containing the hierarchy of ontology, obtain low-dimensional vectors for disease ontology terms and get terms’ dependencies. Furthermore, we get association patterns of circRNAs and diseases from known circRNA-disease associations with neural networks. After the above steps, we get circRNAs’ and diseases’ high-level representations which are informative to improve the prediction. The experimental results show that CDASOR provides an accurate prediction. Importing the characteristics of biological functions, CDASOR achieves impressive predictions in the de novo test. In addition, 6 of the top-10 predicted results are verified by the published literature in the case studies. Availability The code of CDASOR is freely available at https://github.com/BioinformaticsCSU/CDASOR Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Author(s):  
Artur Schweidtmann ◽  
Jan Rittig ◽  
Andrea König ◽  
Martin Grohe ◽  
Alexander Mitsos ◽  
...  

<div>Prediction of combustion-related properties of (oxygenated) hydrocarbons is an important and challenging task for which quantitative structure-property relationship (QSPR) models are frequently employed. Recently, a machine learning method, graph neural networks (GNNs), has shown promising results for the prediction of structure-property relationships. GNNs utilize a graph representation of molecules, where atoms correspond to nodes and bonds to edges containing information about the molecular structure. More specifically, GNNs learn physico-chemical properties as a function of the molecular graph in a supervised learning setup using a backpropagation algorithm. This end-to-end learning approach eliminates the need for selection of molecular descriptors or structural groups, as it learns optimal fingerprints through graph convolutions and maps the fingerprints to the physico-chemical properties by deep learning. We develop GNN models for predicting three fuel ignition quality indicators, i.e., the derived cetane number (DCN), the research octane number (RON), and the motor octane number (MON), of oxygenated and non-oxygenated hydrocarbons. In light of limited experimental data in the order of hundreds, we propose a combination of multi-task learning, transfer learning, and ensemble learning. The results show competitive performance of the proposed GNN approach compared to state-of-the-art QSPR models making it a promising field for future research. The prediction tool is available via a web front-end at www.avt.rwth-aachen.de/gnn.</div>


2020 ◽  
Author(s):  
Zheng Lian ◽  
Jianhua Tao ◽  
Bin Liu ◽  
Jian Huang ◽  
Zhanlei Yang ◽  
...  

Sign in / Sign up

Export Citation Format

Share Document