Network-based prediction of drug–target interactions using an arbitrary-order proximity embedded deep forest

Xiangxiang Zeng; Siyi Zhu; Yuan Hou; Pengyue Zhang; Lang Li; Jing Li; L Frank Huang; Stephen J Lewis; Ruth Nussinov; Feixiong Cheng

doi:10.1093/bioinformatics/btaa010


Bioinformatics ◽

10.1093/bioinformatics/btaa010 ◽

2020 ◽

Vol 36 (9) ◽

pp. 2805-2812 ◽

Cited By ~ 16

Author(s):

Xiangxiang Zeng ◽

Siyi Zhu ◽

Yuan Hou ◽

Pengyue Zhang ◽

Lang Li ◽

...

Keyword(s):

Biological Networks ◽

Drug Target ◽

Large Scale ◽

Arbitrary Order ◽

Characteristic Curve ◽

External Validation ◽

Molecular Targets ◽

Drug Repurposing ◽

Supplementary Information ◽

Deep Forest

Abstract

Motivation: Systematic identification of molecular targets among known drugs plays an essential role in drug repurposing and understanding of their unexpected side effects. Computational approaches for prediction of drug–target interactions (DTIs) are highly desired in comparison to traditional experimental assays. Furthermore, recent advances of multiomics technologies and systems biology approaches have generated large-scale heterogeneous, biological networks, which offer unexpected opportunities for network-based identification of new molecular targets among known drugs.

Results: In this study, we present a network-based computational framework, termed AOPEDF, an arbitrary-order proximity embedded deep forest approach, for prediction of DTIs. AOPEDF learns a low-dimensional vector representation of features that preserve arbitrary-order proximity from a highly integrated, heterogeneous biological network connecting drugs, targets (proteins) and diseases. In total, we construct a heterogeneous network by uniquely integrating 15 networks covering chemical, genomic, phenotypic and network profiles among drugs, proteins/targets and diseases. Then, we build a cascade deep forest classifier to infer new DTIs. Via systematic performance evaluation, AOPEDF achieves high accuracy in identifying molecular targets among known drugs on two external validation sets collected from DrugCentral [area under the receiver operating characteristic curve (AUROC) = 0.868] and ChEMBL (AUROC = 0.768) databases, outperforming several state-of-the-art methods. In a case study, we showcase that multiple molecular targets predicted by AOPEDF are associated with mechanism-of-action of substance abuse disorder for several marketed drugs (such as aripiprazole, risperidone and haloperidol).

Availability and implementation: Source code and data can be downloaded from https://github.com/ChengF-Lab/AOPEDF.

Supplementary information: Supplementary data are available at Bioinformatics online.
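The AUROC values quoted in this abstract can be read as the probability that a randomly chosen true drug–target pair receives a higher score than a randomly chosen non-interacting pair. The following is a minimal sketch of that rank-based definition, not the evaluation code used by the authors; the function name `auroc` and the toy labels and scores are made up for illustration.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = known interaction, 0 = no known interaction.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
print(auroc(labels, scores))  # 8 of the 9 positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; large validation sets would normally use a sort-based implementation.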


NeoDTI: Neural integration of neighbor information from a heterogeneous network for discovering new drug-target interactions

10.1101/261396 ◽

2018 ◽

Cited By ~ 1

Author(s):

Fangping Wan ◽

Lixiang Hong ◽

An Xiao ◽

Tao Jiang ◽

Jianyang Zeng

Keyword(s):

Drug Development ◽

Biological Networks ◽

Heterogeneous Network ◽

Drug Target ◽

Large Scale ◽

Drug Repositioning ◽

Supplementary Information ◽

Functional Roles ◽

Aggregation Techniques ◽

Wide Range

Abstract

Motivation: Accurately predicting drug-target interactions (DTIs) in silico can guide the drug discovery process and thus facilitate drug development. Computational approaches for DTI prediction that adopt the systems biology perspective generally exploit the rationale that the properties of drugs and targets can be characterized by their functional roles in biological networks.

Results: Inspired by recent advances in information passing and aggregation techniques that generalize convolutional neural networks (CNNs) to mine large-scale graph data and greatly improve the performance of many network-related prediction tasks, we develop a new nonlinear end-to-end learning model, called NeoDTI, that integrates diverse information from heterogeneous network data and automatically learns topology-preserving representations of drugs and targets to facilitate DTI prediction. The substantial improvement in prediction performance over other state-of-the-art DTI prediction methods, as well as several novel predicted DTIs with supporting evidence from previous studies, demonstrates the superior predictive power of NeoDTI. In addition, NeoDTI is robust against a wide range of choices of hyperparameters and is ready to integrate more drug- and target-related information (e.g., compound-protein binding affinity data). All these results suggest that NeoDTI can offer a powerful and robust tool for drug development and drug repositioning.

Availability and implementation: The source code and data used in NeoDTI are available at: https://github.com/FangpingWan/[email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.


A Novel Method to Predict Drug-Target Interactions Based on Large-Scale Graph Representation Learning

Cancers ◽

10.3390/cancers13092111 ◽

2021 ◽

Vol 13 (9) ◽

pp. 2111

Author(s):

Bo-Wei Zhao ◽

Zhu-Hong You ◽

Lun Hu ◽

Zhen-Hao Guo ◽

Lei Wang ◽

...

Keyword(s):

Drug Target ◽

Large Scale ◽

Computational Models ◽

Structural Information ◽

Characteristic Curve ◽

Representation Learning ◽

Graph Representation ◽

Convolutional Network ◽

Novel Method

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with the time-consuming and labor-intensive in vivo experimental methods, computational models can provide high-quality DTI candidates quickly. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI can capture the local and global structural information of the graph. Specifically, the first-order neighbor information of nodes is aggregated by a graph convolutional network (GCN), while the high-order neighbor information of nodes is learned by the graph embedding method DeepWalk. Finally, the two kinds of features are fed into a random forest classifier to train and predict potential DTIs. The results show that our method obtained an area under the receiver operating characteristic curve (AUROC) of 0.9455 and an area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. Moreover, we compare the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. Moreover, the proposed model is expected to bring new inspiration and provide novel perspectives to relevant researchers.
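DeepWalk, the method named in this abstract for high-order neighbor information, starts by sampling truncated random walks from every node; the walks are then treated like sentences and passed to a skip-gram model to obtain embeddings. The sketch below covers only the walk-sampling step and is not taken from the LGDTI code; the function name `random_walks`, its parameters, and the toy graph are hypothetical.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Sample truncated uniform random walks starting from every node."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    nodes = sorted(adjacency)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical toy drug-target graph stored as adjacency lists.
graph = {
    "drugA": ["targetX", "targetY"],
    "drugB": ["targetY"],
    "targetX": ["drugA"],
    "targetY": ["drugA", "drugB"],
}
walks = random_walks(graph, walk_length=5, walks_per_node=2)
print(walks[0])
```

Nodes that co-occur within a short window of these walks end up with similar embeddings after the skip-gram step, which is how information beyond direct neighbors enters the features.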


SDTRLS: Predicting Drug-Target Interactions for Complex Diseases Based on Chemical Substructures

Complexity ◽

10.1155/2017/2713280 ◽

2017 ◽

Vol 2017 ◽

pp. 1-10 ◽

Cited By ~ 4

Author(s):

Cheng Yan ◽

Jianxin Wang ◽

Wei Lan ◽

Fang-Xiang Wu ◽

Yi Pan

Keyword(s):

Computational Methods ◽

Drug Target ◽

Large Scale ◽

High Efficiency ◽

Drug Repositioning ◽

External Validation ◽

Complex Diseases ◽

Biological Databases ◽

Biomolecular Networks ◽

New Chemical Entities

It is well known that drug discovery for complex diseases via biological experiments is a time-consuming and expensive process. Alternatively, computational methods provide a low-cost and high-efficiency way of predicting drug-target interactions (DTIs) from biomolecular networks. However, the current computational methods mainly deal with DTI predictions for known drugs; there are few methods for large-scale prediction for failed drugs and new chemical entities, which are currently stored in some biological databases and may be effective for diseases other than those they originally targeted. In this study, we propose a method (called SDTRLS) which predicts DTIs through the RLS-Kron model with chemical substructure similarity fusion and Gaussian Interaction Profile (GIP) kernels. SDTRLS can be an effective predictor for targets of old drugs, failed drugs, and new chemical entities from large-scale biomolecular network databases. Our computational experiments show that SDTRLS outperforms the state-of-the-art SDTNBI method; specifically, in the G protein-coupled receptors (GPCRs) external validation, the maximum and the average AUC values of SDTRLS are 0.842 and 0.826, respectively, which are superior to those of SDTNBI, which are 0.797 and 0.766, respectively. This study provides an important basis for new drug development and drug repositioning based on biomolecular networks.
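The abstract does not spell out the Gaussian Interaction Profile kernel, so the sketch below uses the commonly cited definition: each drug is described by its binary row of known interactions, and the similarity between two drugs is exp(-gamma * squared distance between their rows), with gamma set to a bandwidth divided by the mean squared norm of all rows. The function name `gip_kernel`, the `bandwidth` parameter, and the toy matrix are assumptions for illustration, not the SDTRLS implementation.

```python
import math


def gip_kernel(profiles, bandwidth=1.0):
    """Gaussian Interaction Profile kernel over rows of a binary interaction matrix."""
    n = len(profiles)
    # Normalise the bandwidth by the average squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in p) for p in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = bandwidth / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Hypothetical toy matrix: rows are drugs, columns are targets, 1 = known interaction.
interactions = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
K = gip_kernel(interactions)
print(K[0][1], K[0][2])
```

Drugs with identical interaction profiles get similarity 1, and the value decays as their profiles diverge. A target-side kernel follows by applying the same function to the columns of the matrix.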


An In Silico Model for Predicting Drug-Induced Hepatotoxicity

International Journal of Molecular Sciences ◽

10.3390/ijms20081897 ◽

2019 ◽

Vol 20 (8) ◽

pp. 1897 ◽

Cited By ~ 13

Author(s):

Shuaibing He ◽

Tianyuan Ye ◽

Ruiying Wang ◽

Chenyang Zhang ◽

Xuelian Zhang ◽

...

Keyword(s):

Large Scale ◽

Characteristic Curve ◽

External Validation ◽

Model Development ◽

Qsar Model ◽

Quantitative Structure Activity Relationship ◽

New Drugs ◽

Drug Induced ◽

Drug Candidates ◽

External Test

As one of the leading causes of drug failure in clinical trials, drug-induced liver injury (DILI) seriously impedes the development of new drugs. Assessing the DILI risk of drug candidates in advance has been considered an effective strategy to decrease the rate of attrition in drug discovery. Recently, there have been continuous attempts to predict DILI. However, predicting DILI successfully remains a huge challenge. There is an urgent need to develop a quantitative structure–activity relationship (QSAR) model for predicting DILI with satisfactory performance. In this work, we report a high-quality QSAR model for predicting the DILI risk of xenobiotics by incorporating the use of eight effective classifiers and molecular descriptors provided by Marvin. In model development, a large-scale and diverse dataset consisting of 1254 compounds for DILI was built through a comprehensive literature retrieval. The optimal model was attained by an ensemble method, averaging the probabilities from eight classifiers, with accuracy (ACC) of 0.783, sensitivity (SE) of 0.818, specificity (SP) of 0.748, and area under the receiver operating characteristic curve (AUC) of 0.859. For further validation, three external test sets and a large negative dataset were utilized. Consequently, both the internal and external validation indicated that our model outperformed prior studies significantly. Data provided by the current study will also be a valuable source for modeling/data mining in the future.
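The ensemble step described here, averaging the probabilities from several classifiers and then reporting ACC, SE and SP, can be sketched as follows. This is an illustration only: it uses three made-up classifiers instead of eight, a 0.5 decision threshold that the abstract does not state, and hypothetical function names.

```python
def average_probabilities(per_classifier_probs):
    """Average the predicted probabilities of several classifiers per sample."""
    n_models = len(per_classifier_probs)
    n_samples = len(per_classifier_probs[0])
    return [
        sum(model[i] for model in per_classifier_probs) / n_models
        for i in range(n_samples)
    ]


def acc_se_sp(labels, probs, threshold=0.5):
    """Accuracy, sensitivity and specificity at a given decision threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return (tp + tn) / len(labels), tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = hepatotoxic, 0 = non-hepatotoxic.
labels = [1, 1, 1, 0, 0]
model_probs = [
    [0.9, 0.6, 0.4, 0.2, 0.7],
    [0.8, 0.7, 0.3, 0.1, 0.6],
    [0.7, 0.8, 0.2, 0.3, 0.5],
]
ensemble = average_probabilities(model_probs)
acc, se, sp = acc_se_sp(labels, ensemble)
print(acc, se, sp)
```

Averaging probabilities rather than taking a majority vote keeps each classifier's confidence in the final score, so the ensemble output can still be ranked for an AUC calculation.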


Revealing new therapeutic opportunities through drug target prediction: a class imbalance-tolerant machine learning approach

Bioinformatics ◽

10.1093/bioinformatics/btaa495 ◽

2020 ◽

Vol 36 (16) ◽

pp. 4490-4497

Author(s):

Siqi Liang ◽

Haiyuan Yu

Keyword(s):

Machine Learning ◽

Drug Target ◽

Drug Targets ◽

Class Imbalance ◽

Target Prediction ◽

Drug Repurposing ◽

New Drugs ◽

Supplementary Information ◽

Training Scheme ◽

Drug Target Prediction

Abstract

Motivation: In silico drug target prediction provides valuable information for drug repurposing, understanding of side effects as well as expansion of the druggable genome. In particular, discovery of actionable drug targets is critical to developing targeted therapies for diseases.

Results: Here, we develop a robust method for drug target prediction by leveraging a class imbalance-tolerant machine learning framework with a novel training scheme. We incorporate novel features, including drug–gene phenotype similarity and gene expression profile similarity that capture information orthogonal to other features. We show that our classifier achieves robust performance and is able to predict gene targets for new drugs as well as drugs that potentially target unexplored genes. By providing newly predicted drug–target associations, we uncover novel opportunities of drug repurposing that may benefit cancer treatment through action on either known drug targets or currently undrugged genes.

Supplementary information: Supplementary data are available at Bioinformatics online.


Identification of Time-Invariant Biomarkers for Non-Genotoxic Hepatocarcinogen Assessment

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph17124298 ◽

2020 ◽

Vol 17 (12) ◽

pp. 4298 ◽

Cited By ~ 1

Author(s):

Shan-Han Huang ◽

Ying-Chi Lin ◽

Chun-Wei Tung

Keyword(s):

Large Scale ◽

Expression Profiles ◽

Characteristic Curve ◽

External Validation ◽

Enrichment Analysis ◽

Machine Learning Techniques ◽

Early Assessment ◽

Short Term ◽

Short Term Exposure ◽

Time Invariant

Non-genotoxic hepatocarcinogens (NGHCs) can only be confirmed by 2-year rodent studies. Toxicogenomics (TGx) approaches using gene expression profiles from short-term animal studies could enable early assessment of NGHCs. However, high variance in the modulation of the genes had been noted among exposure styles and datasets. Expanding from our previous strategy in identifying consensus biomarkers in multiple experiments, we aimed to identify time-invariant biomarkers for NGHCs in short-term exposure styles and validate their applicability to long-term exposure styles. In this study, nine time-invariant biomarkers, namely A2m, Akr7a3, Aqp7, Ca3, Cdc2a, Cdkn3, Cyp2c11, Ntf3, and Sds, were identified from four large-scale microarray datasets. Machine learning techniques were subsequently employed to assess the prediction performance of the biomarkers. The biomarker set along with the Random Forest models gave the highest median area under the receiver operating characteristic curve (AUC) of 0.824 and a low interquartile range (IQR) variance of 0.036 based on a leave-one-out cross-validation. The application of the models to the external validation datasets achieved high AUC values of greater than or equal to 0.857. Enrichment analysis of the biomarkers inferred the involvement of chronic inflammatory diseases such as liver cirrhosis, fibrosis, and hepatocellular carcinoma in NGHCs. The time-invariant biomarkers provided a robust alternative for NGHC prediction.


DTI-Voodoo: machine learning over interaction networks and ontology-based background knowledge predicts drug–target interactions

10.1101/2021.04.28.441733 ◽

2021 ◽

Author(s):

Tilman Hinnerichs ◽

Robert Hoehndorf

Keyword(s):

Drug Target ◽

Drug Targets ◽

Interaction Network ◽

Drug Repurposing ◽

Computational Method ◽

Interaction Networks ◽

Supplementary Information ◽

Prediction Methods ◽

Link Type ◽

Molecular Features

Abstract

Motivation: In silico drug–target interaction (DTI) prediction is important for drug discovery and drug repurposing. Approaches to predict DTIs can proceed indirectly, top-down, using phenotypic effects of drugs to identify potential drug targets, or they can be direct, bottom-up and use molecular information to directly predict binding potentials. Both approaches can be combined with information about interaction networks.

Results: We developed DTI-Voodoo as a computational method that combines molecular features and ontology-encoded phenotypic effects of drugs with protein–protein interaction networks, and uses a graph convolutional neural network to predict DTIs. We demonstrate that drug effect features can exploit information in the interaction network whereas molecular features do not. DTI-Voodoo is designed to predict candidate drugs for a given protein; we use this formulation to show that common DTI datasets contain intrinsic biases with major effects on performance evaluation and comparison of DTI prediction methods. Using a modified evaluation scheme, we demonstrate that DTI-Voodoo improves significantly over state-of-the-art DTI prediction methods.

Availability: DTI-Voodoo source code and data necessary to reproduce results are freely available at https://github.com/THinnerichs/DTI-VOODOO.

Supplementary information: Supplementary data are available at https://github.com/THinnerichs/DTI-VOODOO.


A new computational drug repurposing method using established disease–drug pair knowledge

Bioinformatics ◽

10.1093/bioinformatics/btz156 ◽

2019 ◽

Vol 35 (19) ◽

pp. 3672-3678 ◽

Cited By ~ 8

Author(s):

Nafiseh Saberian ◽

Azam Peyvandipour ◽

Michele Donato ◽

Sahar Ansari ◽

Sorin Draghici

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Large Scale ◽

Expression Profiles ◽

Drug Repurposing ◽

Supervised Machine Learning ◽

Supplementary Information ◽

Sources Of Information ◽

Approved Drugs ◽

Fda Approved Drugs

Abstract

Motivation: Drug repurposing is a potential alternative to the classical drug discovery pipeline. Repurposing involves finding novel indications for already approved drugs. In this work, we present a novel machine learning-based method for drug repurposing. This method explores the anti-similarity between drugs and a disease to uncover new uses for the drugs. More specifically, our proposed method takes into account three sources of information: (i) large-scale gene expression profiles corresponding to human cell lines treated with small molecules, (ii) the gene expression profile of a human disease and (iii) the known relationship between Food and Drug Administration (FDA)-approved drugs and diseases. Using these data, our proposed method learns a similarity metric through a supervised machine learning-based algorithm such that a disease and its associated FDA-approved drugs have a smaller distance than the other disease-drug pairs.

Results: We validated our framework by showing that the proposed method, which incorporates a distance metric learning technique, can retrieve FDA-approved drugs for their approved indications. Once validated, we used our approach to identify a few strong candidates for repurposing.

Availability and implementation: The R scripts are available on demand from the authors.

Supplementary information: Supplementary data are available at Bioinformatics online.


Large-scale analysis of human gene expression variability associates highly variable drug targets with lower drug effectiveness and safety

Bioinformatics ◽

10.1093/bioinformatics/btz023 ◽

2019 ◽

Vol 35 (17) ◽

pp. 3028-3037 ◽

Cited By ~ 8

Author(s):

Eyal Simonovsky ◽

Ronen Schuster ◽

Esti Yeger-Lotem

Keyword(s):

Drug Target ◽

Large Scale ◽

Target Genes ◽

Supplementary Information ◽

Protein Coding ◽

Expression Levels ◽

Expression Variability ◽

Protein Coding Genes ◽

Approved Drugs ◽

Variable Genes

Abstract

Motivation: The effectiveness of drugs tends to vary between patients. One of the well-known reasons for this phenomenon is genetic polymorphisms in drug target genes among patients. Here, we propose that differences in expression levels of drug target genes across individuals can also contribute to this phenomenon.

Results: To explore this hypothesis, we analyzed the expression variability of protein-coding genes, and particularly drug target genes, across individuals. For this, we developed a novel variability measure, termed local coefficient of variation (LCV), which ranks the expression variability of each gene relative to genes with similar expression levels. Unlike commonly used methods, LCV neutralizes expression-level biases without imposing any distribution over the variation and is robust to data incompleteness. Application of LCV to RNA-sequencing profiles of 19 human tissues and to target genes of 1076 approved drugs revealed that drug target genes were significantly more variable than protein-coding genes. Analysis of 113 drugs with available effectiveness scores showed that drugs targeting highly variable genes tended to be less effective in the population. Furthermore, comparison of approved drugs to drugs that were withdrawn from the market showed that withdrawn drugs targeted significantly more variable genes than approved drugs. Last, upon analyzing gender differences we found that the variability of drug target genes was similar between men and women. Altogether, our results suggest that expression variability of drug target genes could contribute to the variable responsiveness and effectiveness of drugs, and is worth considering during drug treatment and development.

Availability and implementation: LCV is available as a python script in GitHub (https://github.com/eyalsim/LCV).

Supplementary information: Supplementary data are available at Bioinformatics online.
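The abstract describes LCV only as ranking each gene's expression variability relative to genes with similar expression levels. One way to read that idea is sketched below, under loudly stated assumptions: genes are ordered by mean expression, each gene is compared with a fixed number of neighbors on either side in that ordering, and its score is the percentage of those neighbors with a lower coefficient of variation. The window scheme, the function name `local_cv` and the `half_window` parameter are guesses made for illustration and are not taken from the published script.

```python
def local_cv(means, sds, half_window=2):
    """Percentile rank of each gene's CV among genes with similar mean expression.

    Assumes at least two genes and strictly positive means.
    """
    cvs = [sd / mean for mean, sd in zip(means, sds)]
    # Order genes by mean expression so neighbors in the list are similar in level.
    order = sorted(range(len(means)), key=lambda g: means[g])
    scores = [0.0] * len(means)
    for rank, gene in enumerate(order):
        lo = max(0, rank - half_window)
        hi = min(len(order), rank + half_window + 1)
        neighbors = [order[r] for r in range(lo, hi) if r != rank]
        lower = sum(1 for g in neighbors if cvs[g] < cvs[gene])
        scores[gene] = 100.0 * lower / len(neighbors)
    return scores


# Hypothetical toy data: per-gene mean and standard deviation across individuals.
means = [1.0, 2.0, 3.0, 4.0, 5.0]
sds = [0.5, 0.2, 0.9, 0.4, 2.5]
print(local_cv(means, sds, half_window=1))
```

Because each gene is only compared with genes of similar expression level, a score like this does not depend on the overall relationship between mean expression and variance, which is the bias the abstract says LCV is designed to neutralize.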


Breaking the paradigm: Dr Insight empowers signature-free, enhanced drug repurposing

Bioinformatics ◽

10.1093/bioinformatics/btz006 ◽

2019 ◽

Vol 35 (16) ◽

pp. 2818-2826 ◽

Cited By ~ 9

Author(s):

Jinyan Chan ◽

Xuan Wang ◽

Jacob A Turner ◽

Nicole E Baldwin ◽

Jinghua Gu

Keyword(s):

Drug Target ◽

Drug Targets ◽

Effective Means ◽

Drug Repurposing ◽

Superior Performance ◽

Supplementary Information ◽

Breast Cancer Dataset ◽

Specific Drug ◽

Cancer Dataset ◽

Disease Specific

Abstract

Motivation: Transcriptome-based computational drug repurposing has attracted considerable interest by bringing about faster and more cost-effective drug discovery. Nevertheless, key limitations of the current drug connectivity-mapping paradigm have been long overlooked, including the lack of effective means to determine optimal query gene signatures.

Results: The novel approach Dr Insight implements a frame-breaking statistical model for the ‘hand-shake’ between disease and drug data. The genome-wide screening of concordantly expressed genes (CEGs) eliminates the need for subjective selection of query signatures, in addition to eliciting a better proxy for potential disease-specific drug targets. Extensive comparisons on simulated and real cancer datasets have validated the superior performance of Dr Insight over several popular drug-repurposing methods to detect known cancer drugs and drug–target interactions. A proof-of-concept trial using the TCGA breast cancer dataset demonstrates the application of Dr Insight for a comprehensive analysis, from redirection of drug therapies, to a systematic construction of disease-specific drug-target networks.

Availability and implementation: Dr Insight R package is available at https://cran.r-project.org/web/packages/DrInsight/index.html.

Supplementary information: Supplementary data are available at Bioinformatics online.
