Drug-Target Interaction prediction using Multi Graph Regularized Nuclear Norm Minimization
Abstract

The identification of interactions between drugs and target proteins is crucial in pharmaceutical sciences. The experimental validation of interactions in genomic drug discovery is laborious and expensive; hence, there is a need for efficient and accurate in-silico techniques that can predict potential drug-target interactions and thereby narrow down the search space for experimental verification.

In this work, we propose a new framework, Multi Graph Regularized Nuclear Norm Minimization, which predicts the interactions between drugs and proteins from three inputs: the known drug-target interaction network, similarities over drugs, and similarities over targets. The proposed method seeks a low-rank interaction matrix that is structured by the proximities of drugs and targets encoded by graphs. Previous works on Drug-Target Interaction (DTI) prediction have shown that incorporating drug and target similarities helps in learning the data manifold better by preserving the local geometries of the original data. However, there is no clear consensus on which kinds and which combination of similarities best assist the prediction task. Hence, we propose to use multiple drug-drug similarities and target-target similarities as multiple graph Laplacian regularization terms (over drugs and targets) to capture the proximities exhaustively.

Extensive cross-validation experiments on four benchmark datasets using standard evaluation metrics (AUPR and AUC) show that the proposed algorithm improves the predictive performance and outperforms recent state-of-the-art computational methods by a large margin.

Author summary

This work introduces a computational approach, Multi-Graph Regularized Nuclear Norm Minimization (MGRNNM), to predict potential interactions between drugs and targets. The novelty of MGRNNM lies in structuring drug-target interactions by multiple proximities of drugs and targets.
Previous works have applied graph regularization to matrix factorization and matrix completion algorithms in order to incorporate the standard chemical structure similarity between drugs and the genomic sequence similarity between target proteins. We add multiple drug-graph Laplacian and target-graph Laplacian regularization terms to the standard matrix completion framework to predict the missing values in the interaction matrix. The graph Laplacian terms are constructed from various kinds and combinations of similarities over drugs and targets (computed from the interaction matrix itself). In addition, we further improve the prediction accuracy by sparsifying the drug and target similarity matrices. For performance evaluation, we conducted extensive experiments on four benchmark datasets. The experimental results demonstrate that MGRNNM clearly outperforms recent state-of-the-art methods under three different cross-validation settings, in terms of the area under the ROC curve (AUC) and the area under the precision-recall curve (AUPR).
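To make the ingredients named above concrete, the following is a minimal Python sketch of how a similarity computed from the interaction matrix might be sparsified and turned into a graph Laplacian penalty. It is an illustration under stated assumptions, not the MGRNNM implementation: cosine similarity stands in for the similarity measures (which are not named here), nearest-neighbour thresholding stands in for the sparsification step, the unnormalized Laplacian D - W is used, and all function names are hypothetical. The nuclear norm minimization solver itself is not shown.

```python
import numpy as np

def cosine_similarity_rows(Y):
    """Cosine similarity between the rows of Y (assumed similarity measure)."""
    norms = np.linalg.norm(Y, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero rows
    Yn = Y / norms
    return Yn @ Yn.T

def knn_sparsify(S, k):
    """Keep each node's k largest similarities (assumed sparsification rule).

    The result is symmetrized: an edge survives if either endpoint selects it.
    """
    S = S.copy()
    np.fill_diagonal(S, 0.0)  # ignore self-similarity
    idx = np.argsort(-S, axis=1)[:, :k]
    mask = np.zeros(S.shape, dtype=bool)
    mask[np.arange(S.shape[0])[:, None], idx] = True
    mask = mask | mask.T
    return np.where(mask, S, 0.0)

def graph_laplacian(W):
    """Unnormalized graph Laplacian L = D - W of a symmetric weight matrix."""
    return np.diag(W.sum(axis=1)) - W

def multi_graph_penalty(X, drug_laplacians, target_laplacians):
    """Sum of Laplacian penalties over several drug and target graphs.

    X is a drugs-by-targets matrix; drug graphs act on its rows and
    target graphs act on its columns.
    """
    drug_term = sum(np.trace(X.T @ L @ X) for L in drug_laplacians)
    target_term = sum(np.trace(X @ L @ X.T) for L in target_laplacians)
    return drug_term + target_term

# Toy interaction matrix: 4 drugs (rows) by 4 targets (columns), made up for illustration.
Y = np.array([[1, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 1, 0, 1],
              [0, 1, 1, 1]], dtype=float)

W_d = knn_sparsify(cosine_similarity_rows(Y), k=2)    # drug graph
W_t = knn_sparsify(cosine_similarity_rows(Y.T), k=2)  # target graph
L_d = graph_laplacian(W_d)
L_t = graph_laplacian(W_t)
penalty = multi_graph_penalty(Y, [L_d], [L_t])
```

In this sketch each additional similarity would contribute one more Laplacian to the corresponding list, so the penalty grows as a plain sum of per-graph terms. A small penalty means that drugs (or targets) connected in a graph have similar interaction profiles in X.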