A Hadoop-Based Method to Predict Potential Effective Drug Combination

2014 ◽  
Vol 2014 ◽  
pp. 1-5 ◽  
Author(s):  
Yifan Sun ◽  
Yi Xiong ◽  
Qian Xu ◽  
Dongqing Wei

Combination drugs that act on multiple targets simultaneously are promising candidates for combating complex diseases because of their improved efficacy and reduced side effects. However, exhaustive screening of all possible drug combinations is extremely time-consuming and impractical. Here, we present a novel Hadoop-based approach to predicting drug combinations that takes advantage of the MapReduce programming model to improve the scalability of the prediction algorithm. By integrating the gene expression data of multiple drugs, we implemented data preprocessing together with support vector machine and naïve Bayesian classifiers on Hadoop for the prediction of drug combinations. The experimental results suggest that our Hadoop-based model achieves much higher efficiency in the big data processing steps with satisfactory performance. We believe that our proposed approach can help accelerate the prediction of potentially effective drugs as the number of combinations grows exponentially in the future. The source code and datasets are available upon request.
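As an illustration of how naïve Bayes training fits the MapReduce model mentioned in this abstract, the following single-process sketch emits counts in a map step and sums them in a reduce step. It is not the authors' Hadoop code: the toy records, the binarized features, and the names `map_record`, `reduce_counts`, `train` and `likelihood` are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical toy records: (label, binarized expression features).
# Label 1 = effective combination, 0 = not effective; values are made up.
RECORDS = [
    (1, (1, 0, 1)),
    (1, (1, 1, 1)),
    (0, (0, 0, 1)),
    (0, (0, 1, 0)),
]

def map_record(record):
    """Map step: emit one count per class and per (label, feature, value)."""
    label, features = record
    yield ("class", label), 1
    for index, value in enumerate(features):
        yield ("feature", label, index, value), 1

def reduce_counts(pairs):
    """Shuffle and reduce step: sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)

def train(records):
    """Apply the map step to every record, then reduce all emitted pairs."""
    return reduce_counts(chain.from_iterable(map_record(r) for r in records))

def likelihood(counts, label, index, value):
    """P(feature = value | label) with add-one smoothing for binary features."""
    numerator = counts.get(("feature", label, index, value), 0) + 1
    denominator = counts[("class", label)] + 2
    return numerator / denominator

counts = train(RECORDS)
```

Because each record is mapped independently and the reduce step only adds integers, the same two functions could in principle be distributed across many workers, which is the scalability property the abstract refers to.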

2005 ◽  
Vol 14 (05) ◽  
pp. 849-865 ◽  
Author(s):  
YING ZHAO ◽  
GEORGE KARYPIS

Contact map prediction is of great interest for its application in fold recognition and protein 3D structure determination. In this paper we present a contact-map prediction algorithm that employs Support Vector Machines as the machine learning tool and incorporates various features such as sequence profiles and their conservation, correlated mutation analysis based on various amino acid physicochemical properties, and secondary structure. In addition, we evaluated the effectiveness of the different features on contact map prediction for different fold classes. On average, our predictor achieved a prediction accuracy of 0.224, an improvement by a factor of 11.7 over a random predictor, which is better than the results of previously reported studies. Our study showed that predicted secondary structure features play an important role for proteins containing beta-structures. Models based on secondary structure features and models based on correlated mutation analysis features produce different sets of predictions. Our study also suggests that models learned separately for different protein fold families may achieve better performance than a unified model.
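The relation between the two reported numbers can be made concrete with a small sketch. It assumes a common convention that the abstract itself does not spell out: accuracy is the fraction of predicted residue pairs that are true contacts, and a random predictor's accuracy equals the fraction of all candidate pairs that are contacts. The function names and toy pairs are hypothetical.

```python
def accuracy(predicted_pairs, true_contacts):
    """Fraction of predicted residue pairs that are true contacts."""
    hits = sum(1 for pair in predicted_pairs if pair in true_contacts)
    return hits / len(predicted_pairs)

def random_accuracy(true_contacts, n_candidate_pairs):
    """Expected accuracy of picking candidate pairs uniformly at random."""
    return len(true_contacts) / n_candidate_pairs

def improvement_factor(model_accuracy, baseline_accuracy):
    """How many times more accurate the model is than the baseline."""
    return model_accuracy / baseline_accuracy

# Toy example: 100 candidate pairs, 5 true contacts, 4 predictions, 2 hits.
true_contacts = {(1, 9), (2, 12), (3, 15), (4, 20), (5, 30)}
predicted = [(1, 9), (2, 12), (6, 40), (7, 41)]
toy_factor = improvement_factor(
    accuracy(predicted, true_contacts),
    random_accuracy(true_contacts, 100),
)

# Under the same convention, the figures in the abstract (0.224 and 11.7)
# would imply a random-predictor accuracy of roughly 0.019.
implied_random = 0.224 / 11.7
```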


2010 ◽  
Vol 4 (2) ◽  
Author(s):  
Yun Park ◽  
Theoden Netoff ◽  
Keshab Parhi

A patient-specific seizure prediction algorithm is proposed that uses a classifier to differentiate pre-ictal from inter-ictal EEG signals. The spectral power of EEG processed in four different ways is used as features: raw, time-differential, space-differential, and time/space-differential EEG. The features are classified using cost-sensitive support vector machines with a double cross-validation methodology. The proposed algorithm has been applied to EEG recordings of 18 patients in the Freiburg EEG database, totaling 80 seizures and 437 h of inter-ictal recordings. Classification with the features obtained from time/space-differential ECoG achieves 86.25% sensitivity and 0.1281 false positives per hour in out-of-sample testing.
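The sketch below shows one plausible reading of the differential preprocessing and of the two reported metrics. The abstract does not define the differentials, so taking differences between consecutive samples (time) and between adjacent channels (space) is an assumption, as are all function names. The spectral power computation is omitted. The counts of 69 predicted seizures and 56 false alarms are not stated in the abstract; they are simply values that reproduce the reported 86.25% and 0.1281 per hour.

```python
def time_differential(signal):
    """Difference between consecutive samples of one channel."""
    return [later - earlier for earlier, later in zip(signal, signal[1:])]

def space_differential(channels):
    """Sample-wise difference between each pair of adjacent channels."""
    return [
        [b - a for a, b in zip(first, second)]
        for first, second in zip(channels, channels[1:])
    ]

def sensitivity(predicted_seizures, total_seizures):
    """Fraction of seizures that were correctly predicted."""
    return predicted_seizures / total_seizures

def false_positives_per_hour(false_alarms, interictal_hours):
    """False alarms divided by the duration of inter-ictal recording."""
    return false_alarms / interictal_hours

# Illustrative counts chosen to match the reported figures (assumed values).
reported_sensitivity = sensitivity(69, 80)              # 0.8625
reported_fp_rate = false_positives_per_hour(56, 437)    # about 0.1281
```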


Neurosurgery ◽  
2018 ◽  
Vol 65 (CN_suppl_1) ◽  
pp. 139-139
Author(s):  
Vibhor Krishna ◽  
Francesco Sammartino ◽  
Qinwan Rabbani ◽  
Barbara K Changizi ◽  
Punit Agrawal ◽  
...  

Abstract
INTRODUCTION Deep brain stimulation (DBS) titration is experience dependent and time consuming. It is expected to become more challenging with the wider use of directional DBS leads. Connectivity-based methods for stimulation titration are required. We hypothesized that stimulation parameters can be estimated based on the cortical connections of the DBS electrodes.
METHODS Twenty-four Parkinson's disease (PD) patients with subthalamic nucleus (STN) DBS were included. All patients had preoperative 3T diffusion imaging (60 directions) and 1-yr follow-up after DBS. We recorded parameters associated with stimulation-induced acute clinical effects (ACE) during DBS programming. We classified them into improvement (rigidity, bradykinesia, and tremor) and side effects (paresthesia, motor contractions, visual disturbances). Using probabilistic tractography, we identified the cortical voxels uniquely associated with each ACE category. A prediction algorithm, based on support vector machines (SVM) with repeated cross-validation, was trained on the unique features of cortical connectivity. This algorithm was then used to estimate the optimal contact and stimulation amplitude combination for each DBS contact in both hemispheres. A blinded comparison with actual stimulation parameters was done using sensitivity analysis. We also tested the classifier on another independent cohort of 14 PD patients with STN DBS.
RESULTS Clusters in premotor and SMA (area 6) were significantly associated with therapeutic stimulation. At 1 yr, 42 of the 47 stimulation electrodes were accurately estimated as “efficacious”, with the therapeutic window calculated to be ≥ 3 V in 31 (66%) and between 2 and 2.9 V in 11 (24%). The SVM algorithm had excellent sensitivity (area under curve = 0.8506, 95% confidence interval 0.7026-0.9987). Its sensitivity was maintained in the validation cohort.
CONCLUSION The optimal stimulation settings after DBS can be estimated from the pattern of cortical connections of each electrode. Prospective validation in a larger cohort may help test the prediction accuracy of this approach.
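For readers unfamiliar with the area-under-curve figure quoted in the results, the following sketch computes it from classifier scores using the pairwise (Mann-Whitney) formulation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. The labels and scores are invented for the example and do not come from the study; `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up example: 1 = efficacious setting, 0 = not efficacious.
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.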


2020 ◽  
Author(s):  
Lewis Mervin ◽  
Avid M. Afzal ◽  
Ola Engkvist ◽  
Andreas Bender

In the context of bioactivity prediction, the question of how to calibrate a score produced by a machine learning method into a reliable probability of binding to a protein target has not yet been satisfactorily addressed. In this study, we compared the performance of three such methods, namely Platt Scaling, Isotonic Regression and Venn-ABERS, in calibrating the prediction scores of Naïve Bayes, Support Vector Machine and Random Forest models for ligand-target prediction, using bioactivity data available at AstraZeneca (40 million compound-target data points across 2112 targets). Performance was assessed using Stratified Shuffle Split (SSS) and Leave 20% of Scaffolds Out (L20SO) validation.
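Of the three calibration methods named above, isotonic regression is the simplest to sketch. The example below fits it with the pool-adjacent-violators algorithm: scores are sorted, and neighbouring groups are merged whenever the observed activity rate would otherwise decrease as the score increases. This is a generic illustration, not the study's implementation; the function name, the toy scores and the assumption of distinct score values are all choices made for the example.

```python
def isotonic_calibrate(scores, labels):
    """Fit isotonic regression with pool-adjacent-violators.

    Returns the scores in ascending order together with a
    non-decreasing calibrated probability for each of them.
    Assumes score values are distinct (ties are not pooled).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    blocks = []  # each block holds [sum of labels, number of points]
    for i in order:
        blocks.append([float(labels[i]), 1])
        # Merge backwards while the previous block's mean exceeds this one's.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            label_sum, count = blocks.pop()
            blocks[-1][0] += label_sum
            blocks[-1][1] += count
    calibrated = []
    for label_sum, count in blocks:
        calibrated.extend([label_sum / count] * count)
    return [scores[i] for i in order], calibrated

# Made-up raw classifier scores and binary activity labels.
raw_scores = [0.6, 0.1, 0.3, 0.2, 0.5, 0.4]
activity = [1, 0, 0, 1, 1, 1]
sorted_scores, probabilities = isotonic_calibrate(raw_scores, activity)
```

Here the active compound at score 0.2 and the inactive one at 0.3 are pooled into a single group with a calibrated probability of 0.5, which keeps the mapping from score to probability monotone.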


2020 ◽  
Vol 4 (2) ◽  
pp. 329-335
Author(s):  
Rusydi Umar ◽  
Imam Riadi ◽  
Purwono

Most startup failures in Indonesia are caused by teams that are not solid and competent. Programmers are an integral part of a startup team. The growth of social media makes it usable as a strategic tool for recruiting the best programmer candidates for a company. This tool takes the form of an automatic system that classifies the social media posts of prospective programmers. The classification results are expected to predict the performance pattern of each candidate, labeled as good or bad performance. To obtain an effective tool, the classification method with the best accuracy must be chosen, which requires a comparison of several methods. This study compares the Support Vector Machines (SVM), Random Forest (RF) and Stochastic Gradient Descent (SGD) classification methods. With 10-fold cross-validation (k = 10), accuracy reaches 81.3% for SVM, 74.4% for RF, and 80.1% for SGD, so the SVM method is chosen as the model for classifying programmer performance from social media activity.
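The k = 10 cross-validation procedure used for this comparison can be sketched as follows. The abstract does not describe the study's features or tooling, so the example uses a trivial majority-class classifier as a stand-in for SVM, RF or SGD, contiguous folds without shuffling or stratification, and invented data. All function names are hypothetical.

```python
from collections import Counter

def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_val_accuracy(fit, predict, samples, labels, k=10):
    """Mean accuracy over k folds, training on the other k-1 folds each time."""
    accuracies = []
    for test_indices in k_fold_indices(len(samples), k):
        held_out = set(test_indices)
        train_indices = [i for i in range(len(samples)) if i not in held_out]
        model = fit(
            [samples[i] for i in train_indices],
            [labels[i] for i in train_indices],
        )
        hits = sum(predict(model, samples[i]) == labels[i] for i in test_indices)
        accuracies.append(hits / len(test_indices))
    return sum(accuracies) / k

def fit_majority(samples, labels):
    """Stand-in classifier: remember the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]

def predict_majority(model, sample):
    """Always predict the remembered majority label."""
    return model

# Made-up data: 20 posts, 16 labeled good (1) and 4 labeled bad (0).
posts = list(range(20))
performance = [0, 1, 1, 1, 1] * 4
baseline_accuracy = cross_val_accuracy(
    fit_majority, predict_majority, posts, performance, k=10
)
```

Any classifier exposing the same `fit` and `predict` pair could be passed to `cross_val_accuracy`, which is how several methods can be compared on identical folds.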

