Predicting Drug–Target Interactions Based on the Ensemble Models of Multiple Feature Pairs

2021 ◽  
Vol 22 (12) ◽  
pp. 6598
Author(s):  
Cheng Wang ◽  
Jun Zhang ◽  
Peng Chen ◽  
Bing Wang

Background: The prediction of drug–target interactions (DTIs) is of great significance in drug development. Traditional experimental methods are time-consuming and expensive. Machine learning can reduce the cost of prediction but is limited by imbalanced datasets and by the problem of selecting essential features. Methods: A prediction method based on the Ensemble model of Multiple Feature Pairs (Ensemble-MFP) is introduced. First, three negative sets are generated according to the Euclidean distance of three feature pairs. Then, the negative samples of the validation set/test set are randomly selected from the union of the three negative sets in the validation set/test set. At the same time, the weighted ensemble model is optimized and applied to the test set. Results: The area under the receiver operating characteristic curve (AUC) exceeded 94.0% in three of the four sub-datasets of the gold standard datasets in the prediction of new drugs. The effectiveness of the proposed method is also shown by comparison with state-of-the-art methods and by demonstration of predicted drug–target pairs. Conclusion: Ensemble-MFP can weigh the existing feature pairs and has a good prediction effect for general prediction on new drugs.
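The abstract above does not say how the ensemble weights are optimized. As a rough illustration only, the sketch below combines the scores of several models with non-negative weights that sum to 1 and picks the weights by a coarse grid search on validation AUC. The function names `auc`, `ensemble`, and `best_weights`, the grid search itself, and the `steps` parameter are assumptions made for this example, not the authors' procedure.

```python
from itertools import product


def auc(labels, scores):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ensemble(score_lists, weights):
    """Weighted sum of the per-model scores for every sample."""
    return [sum(w * s for w, s in zip(weights, sample)) for sample in zip(*score_lists)]


def best_weights(labels, score_lists, steps=10):
    """Hypothetical optimizer: grid search over weights that sum to 1."""
    best_w, best_auc = None, -1.0
    for combo in product(range(steps + 1), repeat=len(score_lists)):
        if sum(combo) != steps:
            continue  # keep only weight vectors on the simplex
        weights = [c / steps for c in combo]
        value = auc(labels, ensemble(score_lists, weights))
        if value > best_auc:
            best_w, best_auc = weights, value
    return best_w, best_auc
```

Because the grid contains every single-model weight vector, the selected ensemble never scores below the best individual model on the validation data; whether that carries over to a test set is a separate question.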

2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Xiaonan Zhao ◽  
Zhenzi Bai ◽  
Chenghua Li ◽  
Chuanlun Sheng ◽  
Hongyan Li

Studies have demonstrated the prognosis potential of long noncoding RNAs (lncRNAs) for hepatocellular carcinoma (HCC), but specific lncRNAs for hepatitis B virus- (HBV-) related HCC have rarely been reported. This study was aimed at identifying a lncRNA prognostic signature for HBV-HCC and exploring their underlying functions. The sequencing dataset was collected from The Cancer Genome Atlas database as the training set, while the microarray dataset was obtained from the European Bioinformatics Institute database (E-TABM-36) as the validation set. Univariate and multivariate Cox regression analyses identified that eight lncRNAs (TSPEAR-AS1, LINC00511, LINC01136, MKLN1-AS, LINC00506, KRTAP5-AS1, ZNF252P-AS1, and THUMPD3-AS1) were significantly associated with overall survival (OS). These eight lncRNAs were used to construct a risk score model. The Kaplan-Meier survival curve results showed that this risk score can significantly differentiate the OS between the high-risk group and the low-risk group. Receiver operating characteristic curve analysis demonstrated that this risk score exhibited good prediction effectiveness (area under the curve AUC=0.990 for the training set; AUC=0.903 for the validation set). Furthermore, this lncRNA risk score was identified as an independent prognostic factor in the multivariate analysis after adjusting other clinical characteristics. The crucial coexpression (LINC00511-CABYR, THUMPD3-AS1-TRIP13, LINC01136-SFN, LINC00506-ANLN, and KRTAP5-AS1/TSPEAR-AS1/MKLN1-AS/ZNF252P-AS1-MC1R) or competing endogenous RNA (THUMPD3-AS1-hsa-miR-450a-TRIP13) interaction axes were identified to reveal the possible functions of lncRNAs. These genes were enriched into cell cycle-related biological processes or pathways. In conclusion, our study identified a novel eight-lncRNA prognosis signature for HBV-HCC patients and these lncRNAs may be potential therapeutic targets.
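A risk score of this kind is commonly a weighted sum of expression values, with the weights taken from the multivariate Cox model. The sketch below assumes that form and assumes a median cutoff for the high-risk/low-risk split; the abstract states neither detail. The coefficient values in the usage comment are invented for illustration and are not the published ones.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over the signature genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def stratify(patients, coefficients):
    """Score every patient and split at the median score (an assumed cutoff)."""
    scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: "high" if s > cutoff else "low" for pid, s in scores.items()}
    return scores, cutoff, groups


# Example call with made-up coefficients for two of the eight lncRNAs:
# coefficients = {"LINC00511": 0.5, "TSPEAR-AS1": -0.25}
# scores, cutoff, groups = stratify({"p1": {"LINC00511": 2.0, "TSPEAR-AS1": 4.0}}, coefficients)
```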


2020 ◽  
Vol 15 (7) ◽  
pp. 750-757
Author(s):  
Jihong Wang ◽  
Yue Shi ◽  
Xiaodan Wang ◽  
Huiyou Chang

Background: At present, using computational methods to predict drug-target interactions (DTIs) is a very important step in the discovery of new drugs and in drug repositioning. The potential DTIs identified by machine learning methods can provide guidance for biochemical or clinical experiments. Objective: The goal of this article is to apply the latest network representation learning methods to drug-target prediction, improve model prediction capabilities, and promote new drug development. Methods: We use the large-scale information network embedding (LINE) method to extract network topology features of drugs, targets, diseases, etc., integrate features obtained from heterogeneous networks, construct binary classification samples, and use the random forest (RF) method to predict DTIs. Results: The experiments in this paper compare the common classifiers RF, LR, and SVM, as well as the typical network representation learning methods LINE, Node2Vec, and DeepWalk. The combined method LINE-RF achieves the best results, reaching an AUC of 0.9349 and an AUPR of 0.9016. Conclusion: The LINE-based learning method can effectively learn hidden features of drugs, targets, diseases, and other entities from the network topology. Combining features learned from multiple networks can enhance expressive ability. RF is an effective supervised learning method. Therefore, the LINE-RF combination is a widely applicable method.
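The step "construct binary classification samples" can be pictured as follows: each drug-target pair becomes one feature vector, with known interactions labeled 1 and sampled unlabeled pairs labeled 0. This is a minimal sketch under stated assumptions: the embeddings are plain Python lists, pair features are formed by concatenation, and negatives are drawn 1:1 from unlabeled pairs. None of these choices is confirmed by the abstract, and `build_samples` is a hypothetical helper name.

```python
import random


def build_samples(drug_vecs, target_vecs, known_pairs, seed=0):
    """Return (feature_vector, label) tuples for known and sampled unlabeled pairs."""
    rng = random.Random(seed)
    known = set(known_pairs)
    # Unlabeled pairs are treated as negatives here; they are not confirmed non-interactions.
    candidates = [
        (d, t)
        for d in sorted(drug_vecs)
        for t in sorted(target_vecs)
        if (d, t) not in known
    ]
    negatives = rng.sample(candidates, min(len(known), len(candidates)))
    samples = []
    for d, t in sorted(known):
        samples.append((drug_vecs[d] + target_vecs[t], 1))  # list concatenation
    for d, t in negatives:
        samples.append((drug_vecs[d] + target_vecs[t], 0))
    return samples
```

The resulting list can be passed to any binary classifier; the abstract uses a random forest for that step.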


2020 ◽  
Vol 163 (6) ◽  
pp. 1156-1165
Author(s):  
Juan Xiao ◽  
Qiang Xiao ◽  
Wei Cong ◽  
Ting Li ◽  
Shouluan Ding ◽  
...  

Objective To develop an easy-to-use nomogram for discrimination of malignant thyroid nodules and to compare diagnostic efficiency with the Kwak and American College of Radiology (ACR) Thyroid Imaging, Reporting and Data System (TI-RADS). Study Design Retrospective diagnostic study. Setting The Second Hospital of Shandong University. Subjects and Methods From March 2017 to April 2019, 792 patients with 1940 thyroid nodules were included into the training set; from May 2019 to December 2019, 174 patients with 389 nodules were included into the validation set. Multivariable logistic regression model was used to develop a nomogram for discriminating malignant nodules. To compare the diagnostic performance of the nomogram with the Kwak and ACR TI-RADS, the area under the receiver operating characteristic curve, sensitivity, specificity, and positive and negative predictive values were calculated. Results The nomogram consisted of 7 factors: composition, orientation, echogenicity, border, margin, extrathyroidal extension, and calcification. In the training set, for all nodules, the area under the curve (AUC) for the nomogram was 0.844, which was higher than the Kwak TI-RADS (0.826, P = .008) and the ACR TI-RADS (0.810, P < .001). For the 822 nodules >1 cm, the AUC of the nomogram was 0.891, which was higher than the Kwak TI-RADS (0.852, P < .001) and the ACR TI-RADS (0.853, P < .001). In the validation set, the AUC of the nomogram was also higher than the Kwak and ACR TI-RADS ( P < .05), each in the whole series and separately for nodules >1 or ≤1 cm. Conclusions When compared with the Kwak and ACR TI-RADS, the nomogram had a better performance in discriminating malignant thyroid nodules.
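For reference, the four threshold-based measures named above follow directly from the confusion matrix. The sketch below shows the standard definitions; `diagnostic_metrics` is a name chosen for this example, with labels and predictions encoded as 1 (malignant) and 0 (benign).

```python
def diagnostic_metrics(labels, predictions):
    """Sensitivity, specificity, PPV, and NPV from binary labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)

    def ratio(numerator, denominator):
        # None marks an undefined metric (empty denominator).
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positives among all malignant
        "specificity": ratio(tn, tn + fp),  # true negatives among all benign
        "ppv": ratio(tp, tp + fp),          # malignant among predicted positive
        "npv": ratio(tn, tn + fn),          # benign among predicted negative
    }
```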


Cancers ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 913
Author(s):  
Johannes Fahrmann ◽  
Ehsan Irajizad ◽  
Makoto Kobayashi ◽  
Jody Vykoukal ◽  
Jennifer Dennison ◽  
...  

MYC is an oncogenic driver in the pathogenesis of ovarian cancer. We previously demonstrated that MYC regulates polyamine metabolism in triple-negative breast cancer (TNBC) and that a plasma polyamine signature is associated with TNBC development and progression. We hypothesized that a similar plasma polyamine signature may associate with ovarian cancer (OvCa) development. Using mass spectrometry, four polyamines were quantified in plasma from 116 OvCa cases and 143 controls (71 healthy controls + 72 subjects with benign pelvic masses) (Test Set). Findings were validated in an independent plasma set from 61 early-stage OvCa cases and 71 healthy controls (Validation Set). Complementarity of polyamines with CA125 was also evaluated. Receiver operating characteristic area under the curve (AUC) of individual polyamines for distinguishing cases from healthy controls ranged from 0.74–0.88. A polyamine signature consisting of diacetylspermine + N-(3-acetamidopropyl)pyrrolidin-2-one in combination with CA125 developed in the Test Set yielded improvement in sensitivity at >99% specificity relative to CA125 alone (73.7% vs 62.2%; McNemar exact test 2-sided P: 0.019) in the validation set and captured 30.4% of cases that were missed with CA125 alone. Our findings reveal a MYC-driven plasma polyamine signature associated with OvCa that complemented CA125 in detecting early-stage ovarian cancer.
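Reporting sensitivity at a fixed specificity means choosing the decision cutoff from the control scores first and only then counting detected cases. The sketch below illustrates that idea with a simple empirical cutoff; the function names, and the rule that a score strictly above the cutoff counts as positive, are assumptions for this example rather than the study's procedure.

```python
import math


def threshold_for_specificity(control_scores, target_specificity):
    """Smallest observed cutoff that leaves at least the target share of controls negative."""
    ordered = sorted(control_scores)
    needed = max(1, math.ceil(target_specificity * len(ordered)))
    return ordered[needed - 1]


def sensitivity_at_specificity(case_scores, control_scores, target_specificity):
    """Return (sensitivity, cutoff); a score strictly above the cutoff is called positive."""
    cutoff = threshold_for_specificity(control_scores, target_specificity)
    detected = sum(1 for s in case_scores if s > cutoff)
    return detected / len(case_scores), cutoff
```

With small control groups the achievable specificity moves in coarse steps, so very high targets such as 99% need a correspondingly large number of controls to be estimated reliably.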


Cancers ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 2111
Author(s):  
Bo-Wei Zhao ◽  
Zhu-Hong You ◽  
Lun Hu ◽  
Zhen-Hao Guo ◽  
Lei Wang ◽  
...  

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with the time-consuming and labor-intensive in vivo experimental methods, the computational models can provide high-quality DTI candidates in an instant. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI can capture the local and global structural information of the graph. Specifically, the first-order neighbor information of nodes can be aggregated by the graph convolutional network (GCN); on the other hand, the high-order neighbor information of nodes can be learned by the graph embedding method called DeepWalk. Finally, the two kinds of feature are fed into the random forest classifier to train and predict potential DTIs. The results show that our method obtained area under the receiver operating characteristic curve (AUROC) of 0.9455 and area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. Moreover, we compare the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. Moreover, the proposed model is expected to bring new inspiration and provide novel perspectives to relevant researchers.
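DeepWalk learns node embeddings by running truncated random walks over the graph and then training a skip-gram model on the resulting node sequences. The sketch below covers only the walk-generation half, on a graph stored as a dictionary of neighbor lists; the function name, its parameters, and the toy graph in the usage comment are assumptions for illustration and are not taken from LGDTI.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate truncated uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Toy bipartite drug-target graph (hypothetical node names):
# graph = {"D1": ["T1", "T2"], "D2": ["T2"], "T1": ["D1"], "T2": ["D1", "D2"]}
# walks = random_walks(graph, walk_length=5, walks_per_node=3)
```

The walks would then serve as "sentences" for a skip-gram model, which produces the embedding for each node.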


2021 ◽  
Author(s):  
Seiichiro Abe ◽  
Juntaro Matsuzaki ◽  
Kazuki Sudo ◽  
Ichiro Oda ◽  
Hitoshi Katai ◽  
...  

Abstract Background The aim of this study was to identify serum miRNAs that discriminate early gastric cancer (EGC) samples from non-cancer controls using a large cohort. Methods This retrospective case–control study included 1417 serum samples from patients with EGC (seen at the National Cancer Center Hospital in Tokyo between 2008 and 2012) and 1417 age- and gender-matched non-cancer controls. The samples were randomly assigned to discovery and validation sets and the miRNA expression profiles of whole serum samples were comprehensively evaluated using a highly sensitive DNA chip (3D-Gene®) designed to detect 2565 miRNA sequences. Diagnostic models were constructed using the levels of several miRNAs in the discovery set, and the diagnostic performance of the model was evaluated in the validation set. Results The discovery set consisted of 708 samples from EGC patients and 709 samples from non-cancer controls, and the validation set consisted of 709 samples from EGC patients and 708 samples from non-cancer controls. The diagnostic EGC index was constructed using four miRNAs (miR-4257, miR-6785-5p, miR-187-5p, and miR-5739). In the discovery set, a receiver operating characteristic curve analysis of the EGC index revealed that the area under the curve (AUC) was 0.996 with a sensitivity of 0.983 and a specificity of 0.977. In the validation set, the AUC for the EGC index was 0.998 with a sensitivity of 0.996 and a specificity of 0.953. Conclusions A novel combination of four serum miRNAs could be a useful non-invasive diagnostic biomarker to detect EGC with high accuracy. A multicenter prospective study is ongoing to confirm the present observations.


2012 ◽  
Vol 446-449 ◽  
pp. 1432-1436
Author(s):  
Suo Wang

To predict tunnel surrounding rock pressure, this paper puts forward a series of dynamic numerical simulation models of tunnel excavation. According to the change in rock damage during construction, the models dynamically adjust the mechanical material parameters of the surrounding rock, and thereby control and simulate the process of progressive tunnel damage. Based on the numerical simulation results, the paper analyzes the relationship between the rock parameters and the plastic strain and radial displacement. It then proposes a prediction method for tunnel surrounding rock pressure based on progressive damage theory and the characteristic curve method. Finally, it compares the pressure obtained from the numerical simulation models with the on-site data and shows that the prediction method has practical engineering value.


2021 ◽  
Vol 15 ◽  
pp. 117793222110091
Author(s):  
Badreddine Nouadi ◽  
Abdelkarim Ezaouine ◽  
Mariame El Messal ◽  
Mohamed Blaghen ◽  
Faiza Bennis ◽  
...  

The emerging pathogen SARS-CoV-2, which causes coronavirus disease 2019 (COVID-19), is a global public health challenge. To date, COVID-19 has affected more than 40 million people worldwide. The exploration and development of new bioactive compounds with cost-effective and specific anti-COVID-19 therapeutic power is the prime focus of current medical research. Thus, the molecular docking technique has become essential in the discovery and development of new drugs, to better understand drug-target interactions in their original environment. This work studies the binding affinity and the type of interactions, through molecular docking, between 54 compounds from Moroccan medicinal plants, dextran sulfate and heparin (compounds not derived from medicinal plants), and 3CLpro-SARS-CoV-2, ACE2, and the post-fusion core of the 2019-nCoV S2 subunit. The PDB files of the target proteins and prepared herbal compounds (ligands) were submitted for docking to AutoDock Vina using UCSF Chimera, which provides a list of potential complexes based on the criteria of shape complementarity of the natural compounds together with their binding affinities. The results of molecular docking revealed that Taxol, Rutin, Genkwanine, and Luteolin-glucoside have a high affinity for ACE2 and 3CLpro. Therefore, these natural compounds can have 2 effects at once, inhibiting 3CLpro and preventing recognition between the virus and ACE2. These compounds may have a potential therapeutic effect against SARS-CoV-2 and may therefore serve as natural anti-COVID-19 compounds.


2021 ◽  
Author(s):  
Euxu Xie ◽  
Xuelian Gu ◽  
Chen Ma ◽  
Li Guo ◽  
Man Li ◽  
...  

Abstract Objective To develop and validate a nomogram for predicting bladder calculi risk in patients with benign prostatic hyperplasia (BPH). Methods A total of 368 patients who underwent transurethral resection of the prostate (TURP) and had histologically proven BPH from January 2018 to January 2021 were retrospectively included. Eligible patients were randomly assigned to the training and validation datasets. Least absolute shrinkage and selection operator (LASSO) regression was used to select the optimal risk factors. A prediction model was established based on the selected characteristics. The performance of the nomogram was assessed by calibration plots and the area under the receiver operating characteristic curve (AUROC). Furthermore, decision curve analysis (DCA) was used to determine the net benefit rate of the nomogram. Results Among the 368 patients who met the inclusion criteria, older age, a history of diabetes and hyperuricemia, longer intravesical prostatic protrusion (IPP), and larger prostatic urethral angulation (PUA) were independent risk factors for bladder calculi in patients with BPH. These factors were used to develop a nomogram, which showed good discriminative ability in predicting the risk of bladder calculi, with AUROCs of 0.911 (95% CI: 0.876–0.945) in the training set and 0.884 (95% CI: 0.820–0.948) in the validation set. The calibration plot showed that the model had good calibration. Moreover, DCA indicated that the model had a good clinical benefit. Conclusion We developed and internally validated the first nomogram to date to help physicians assess the risk of bladder calculi in patients with BPH, which may help physicians improve individual interventions and make better clinical decisions.

