Deep neural network affinity model for BACE inhibitors in D3R Grand Challenge 4

Mapping Intimacies ◽

10.1101/680306 ◽

2019 ◽

Author(s):

Bo Wang ◽

Ho-Leung Ng

Keyword(s):

Neural Network ◽

Machine Learning ◽

Crystal Structures ◽

Deep Neural Network ◽

Macrocyclic Ligands ◽

Support Vector ◽

Grand Challenge ◽

Scoring Functions ◽

Learning Models ◽

Machine Learning Models

AbstractDrug Design Data Resource (D3R) Grand Challenge 4 (GC4) offered a unique opportunity for designing and testing novel methodology for accurate docking and affinity prediction of ligands in an open and blinded manner. We participated in the beta-secretase 1 (BACE) Subchallenge which is comprised of cross-docking and redocking of 20 macrocyclic ligands to BACE and predicting binding affinity for 154 macrocyclic ligands. For this challenge, we developed machine learning models trained specifically on BACE. We developed a deep neural network (DNN) model that used a combination of both structure and ligand-based features that outperformed simpler machine learning models. According to the results released by D3R, we achieved a Spearman’s rank correlation coefficient of 0.43(7) for predicting the affinity of 154 ligands. We describe the formulation of our machine learning strategy in detail. We compared the performance of DNN with linear regression, random forest, and support vector machines using ligand-based, structure-based, and combining both ligand and structure-based features. We compared different structures for our DNN and found that performance was highly dependent on fine optimization of the L2 regularization hyperparameter, alpha. We also developed a novel metric of ligand three-dimensional similarity inspired by crystallographic difference density maps to match ligands without crystal structures to similar ligands with known crystal structures. This report demonstrates that detailed parameterization, careful data training and implementation, and extensive feature analysis are necessary to obtain strong performance with more complex machine learning methods. Post hoc analysis shows that scoring functions based only on ligand features are competitive with those also using structural features. Our DNN approach tied for fifth in predicting BACE-ligand binding affinities.

Download Full-text

Predicting S&P 500 Market Price by Deep Neural Network and Enemble Model

E3S Web of Conferences ◽

10.1051/e3sconf/202021402040 ◽

2020 ◽

Vol 214 ◽

pp. 02040

Author(s):

Feiyu Wang

Keyword(s):

Neural Network ◽

Machine Learning ◽

Support Vector Machine ◽

Linear Regression ◽

Deep Neural Network ◽

Market Price ◽

Support Vector ◽

Learning Models ◽

Conventional Machine ◽

Machine Learning Models

The method to predict the movement of stock market has appealed to scientists for decades. In this article, we use three different models to tackle that problem. In particular, we propose a Deep Neural Network (DNN) to predict the intraday direction of SP500 index and compare the DNN with two conventional machine learning models, i.e. linear regression, support vector machine. We demonstrate that DNN is able to predict SP500 index with relatively highest accuracy.

Download Full-text

First-Break Picking Classification Models Using Recurrent Neural Network

10.2118/204862-ms ◽

2021 ◽

Author(s):

Mohammed Ayub ◽

SanLinn Kaka

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Neural Network ◽

Contextual Information ◽

Classification Model ◽

Superior Performance ◽

Learning Models ◽

Neural Network Models ◽

Minimum Number ◽

Machine Learning Models

Abstract Manual first-break picking from a large volume of seismic data is extremely tedious and costly. Deployment of machine learning models makes the process fast and cost effective. However, these machine learning models require high representative and effective features for accurate automatic picking. Therefore, First- Break (FB) picking classification model that uses effective minimum number of features and promises performance efficiency is proposed. The variants of Recurrent Neural Networks (RNNs) such as Long ShortTerm Memory (LSTM) and Gated Recurrent Unit (GRU) can retain contextual information from long previous time steps. We deploy this advantage for FB picking as seismic traces are amplitude values of vibration along the time-axis. We use behavioral fluctuation of amplitude as input features for LSTM and GRU. The models are trained on noisy data and tested for generalization on original traces not seen during the training and validation process. In order to analyze the real-time suitability, the performance is benchmarked using accuracy, F1-measure and three other established metrics. We have trained two RNN models and two deep Neural Network models for FB classification using only amplitude values as features. Both LSTM and GRU have the accuracy and F1-measure with a score of 94.20%. With the same features, Convolutional Neural Network (CNN) has an accuracy of 93.58% and F1-score of 93.63%. Again, Deep Neural Network (DNN) model has scores of 92.83% and 92.59% as accuracy and F1-measure, respectively. From the pexperiment results, we see significant superior performance of LSTM and GRU to CNN and DNN when used the same features. For robustness of LSTM and GRU models, the performance is compared with DNN model that is trained using nine features derived from seismic traces and observed that the performance superiority of RNN models. Therefore, it is safe to conclude that RNN models (LSTM and GRU) are capable of classifying the FB events efficiently even by using a minimum number of features that are not computationally expensive. The novelty of our work is the capability of automatic FB classification with the RNN models that incorporate contextual behavioral information without the need for sophisticated feature extraction or engineering techniques that in turn can help in reducing the cost and fostering classification model robust and faster.

Download Full-text

Fraudulent Face Image Detection

ITM Web of Conferences ◽

10.1051/itmconf/20203203005 ◽

2020 ◽

Vol 32 ◽

pp. 03005

Author(s):

Rahul Awhad ◽

Saurabh Jayswal ◽

Adesh More ◽

Jyoti Kundale

Keyword(s):

Neural Network ◽

Machine Learning ◽

Face Image ◽

Support Vector ◽

Learning Models ◽

Image Detection ◽

Software Applications ◽

Support Vector Classifier ◽

The Face ◽

Machine Learning Models

Due to the growing advancements in technology, many software applications are being developed to modify and edit images. Such software can be used to alter images. Nowadays, an altered image is so realistic that it becomes too difficult for a person to identify whether the image is fake or real. Such software applications can be used to alter the image of a person’s face also. So, it becomes very difficult to identify whether the image of the face is real or not. Our proposed system is used to identify whether the image of a face is fake or real. The proposed system makes use of machine learning. The system makes use of a convolution neural network and support vector classifier. Both these machine learning models are trained using real as well as fake images. Both these trained models will take an image as an input and will determine whether the image is fake or real.

Download Full-text

Machine learning model for predicting the optimal depth of tracheal tube insertion in pediatric patients: A retrospective cohort study

PLoS ONE ◽

10.1371/journal.pone.0257069 ◽

2021 ◽

Vol 16 (9) ◽

pp. e0257069

Author(s):

Jae-Geum Shim ◽

Kyoung-Ho Ryu ◽

Sung Hyun Lee ◽

Eun-Ah Cho ◽

Sungho Lee ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Artificial Neural Network ◽

Support Vector Machine ◽

Random Forest ◽

Tracheal Tube ◽

Pediatric Patients ◽

Support Vector ◽

Learning Models ◽

Machine Learning Models

Objective To construct a prediction model for optimal tracheal tube depth in pediatric patients using machine learning. Methods Pediatric patients aged <7 years who received post-operative ventilation after undergoing surgery between January 2015 and December 2018 were investigated in this retrospective study. The optimal location of the tracheal tube was defined as the median of the distance between the upper margin of the first thoracic(T1) vertebral body and the lower margin of the third thoracic(T3) vertebral body. We applied four machine learning models: random forest, elastic net, support vector machine, and artificial neural network and compared their prediction accuracy to three formula-based methods, which were based on age, height, and tracheal tube internal diameter(ID). Results For each method, the percentage with optimal tracheal tube depth predictions in the test set was calculated as follows: 79.0 (95% confidence interval [CI], 73.5 to 83.6) for random forest, 77.4 (95% CI, 71.8 to 82.2; P = 0.719) for elastic net, 77.0 (95% CI, 71.4 to 81.8; P = 0.486) for support vector machine, 76.6 (95% CI, 71.0 to 81.5; P = 1.0) for artificial neural network, 66.9 (95% CI, 60.9 to 72.5; P < 0.001) for the age-based formula, 58.5 (95% CI, 52.3 to 64.4; P< 0.001) for the tube ID-based formula, and 44.4 (95% CI, 38.3 to 50.6; P < 0.001) for the height-based formula. Conclusions In this study, the machine learning models predicted the optimal tracheal tube tip location for pediatric patients more accurately than the formula-based methods. Machine learning models using biometric variables may help clinicians make decisions regarding optimal tracheal tube depth in pediatric patients.

Download Full-text

Predicting mortality in SARS-COV-2 (COVID-19) positive patients in the inpatient setting using a Novel Deep Neural Network

10.1101/2020.12.13.20247254 ◽

2020 ◽

Author(s):

Maleeha Naseem ◽

Hajra Arshad ◽

Syeda Amrah Hashimi ◽

Furqan Irfan ◽

Fahad Shabbir Ahmed

Keyword(s):

Neural Network ◽

Machine Learning ◽

Risk Factors ◽

Deep Neural Network ◽

Multivariate Analyses ◽

High Accuracy ◽

Inpatient Setting ◽

Learning Models ◽

Rt Pcr ◽

Machine Learning Models

ABSTRACTBackgroundThe second wave of COVID-19 pandemic is anticipated to be worse than the initial one and will strain the healthcare systems even more during the winter months. Our aim was to develop a machine learning-based model to predict mortality using the deep learning Neo-V framework. We hypothesized this novel machine learning approach could be applied to COVID-19 patients to predict mortality successfully with high accuracy.MethodsThe current Deep-Neo-V model is built on our previously statistically rigorous machine learning framework [Fahad-Liaqat-Ahmad Intensive Machine (FLAIM) framework] that evaluated statistically significant risk factors, generated new combined variables and then supply these risk factors to deep neural network to predict mortality in RT-PCR positive COVID-19 patients in the inpatient setting. We analyzed adult patients (≥18 years) admitted to the Aga Khan University Hospital, Pakistan with a working diagnosis of COVID-19 infection (n=1228). We excluded patients that were negative on COVID-19 on RT-PCR, had incomplete or missing health records. The first phase selection of risk factor was done using Cox-regression univariate and multivariate analyses. In the second phase, we generated new variables and tested those statistically significant for mortality and in the third and final phase we applied deep neural networks and other traditional machine learning models like Decision Tree Model, k-nearest neighbor models and others.ResultsA total of 1228 cases were diagnosed as COVID-19 infection, we excluded 14 patients after the exclusion criteria and (n=)1214 patients were analyzed. We observed that several clinical and laboratory-based variables were statistically significant for both univariate and multivariate analyses while others were not. With most significant being septic shock (hazard ratio [HR], 4.30; 95% confidence interval [CI], 2.91-6.37), supportive treatment (HR, 3.51; 95% CI, 2.01-6.14), abnormal international normalized ratio (INR) (HR, 3.24; 95% CI, 2.28-4.63), admission to the intensive care unit (ICU) (HR, 3.24; 95% CI, 2.22-4.74), treatment with invasive ventilation (HR, 3.21; 95% CI, 2.15-4.79) and laboratory lymphocytic derangement (HR, 2.79; 95% CI, 1.6-4.86). Machine learning results showed our DNN (Neo-V) model outperformed all conventional machine learning models with test set accuracy of 99.53%, sensitivity of 89.87%, and specificity of 95.63%; positive predictive value, 50.00%; negative predictive value, 91.05%; and area under the curve of the receiver-operator curve of 88.5.ConclusionOur novel Deep-Neo-V model outperformed all other machine learning models. The model is easy to implement, user friendly and with high accuracy.

Download Full-text

Early Warning System for Online STEM Learning—A Slimmer Approach Using Recurrent Neural Networks

Sustainability ◽

10.3390/su132212461 ◽

2021 ◽

Vol 13 (22) ◽

pp. 12461

Author(s):

Chih-Chang Yu ◽

Yufeng (Leon) Wu

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

At Risk ◽

At Risk Students ◽

Training Data ◽

Support Vector ◽

Learning Models ◽

Conventional Machine ◽

Machine Learning Models

While the use of deep neural networks is popular for predicting students’ learning outcomes, convolutional neural network (CNN)-based methods are used more often. Such methods require numerous features, training data, or multiple models to achieve week-by-week predictions. However, many current learning management systems (LMSs) operated by colleges cannot provide adequate information. To make the system more feasible, this article proposes a recurrent neural network (RNN)-based framework to identify at-risk students who might fail the course using only a few common learning features. RNN-based methods can be more effective than CNN-based methods in identifying at-risk students due to their ability to memorize time-series features. The data used in this study were collected from an online course that teaches artificial intelligence (AI) at a university in northern Taiwan. Common features, such as the number of logins, number of posts and number of homework assignments submitted, are considered to train the model. This study compares the prediction results of the RNN model with the following conventional machine learning models: logistic regression, support vector machines, decision trees and random forests. This work also compares the performance of the RNN model with two neural network-based models: the multi-layer perceptron (MLP) and a CNN-based model. The experimental results demonstrate that the RNN model used in this study is better than conventional machine learning models and the MLP in terms of F-score, while achieving similar performance to the CNN-based model with fewer parameters. Our study shows that the designed RNN model can identify at-risk students once one-third of the semester has passed. Some future directions are also discussed.

Download Full-text

Prediction of cancer incidence rates for the European continent using machine learning models

Health Informatics Journal ◽

10.1177/1460458220983878 ◽

2021 ◽

Vol 27 (1) ◽

pp. 146045822098387

Author(s):

Boran Sekeroglu ◽

Kubra Tuncal

Keyword(s):

Neural Network ◽

Colorectal Cancer ◽

Machine Learning ◽

Linear Regression ◽

Support Vector Regression ◽

Incidence Rates ◽

Support Vector ◽

Learning Models ◽

European Continent ◽

Machine Learning Models

Cancer is one of the most important and common public health problems on Earth that can occur in many different types. Treatments and precautions are aimed at minimizing the deaths caused by cancer; however, incidence rates continue to rise. Thus, it is important to analyze and estimate incidence rates to support the determination of more effective precautions. In this research, 2018 Cancer Datasheet of World Health Organization (WHO), is used and all countries on the European Continent are considered to analyze and predict the incidence rates until 2020, for Lung cancer, Breast cancer, Colorectal cancer, Prostate cancer and All types of cancer, which have highest incidence and mortality rates. Each cancer type is trained by six machine learning models namely, Linear Regression, Support Vector Regression, Decision Tree, Long-Short Term Memory neural network, Backpropagation neural network, and Radial Basis Function neural network according to gender types separately. Linear regression and support vector regression outperformed the other models with the [Formula: see text] scores 0.99 and 0.98, respectively, in initial experiments, and then used for prediction of incidence rates of the considered cancer types. The ML models estimated that the maximum rise of incidence rates would be in colorectal cancer for females by 6%.

Download Full-text

The Impact of CrystallographicData for the Development of Machine Learning Models to Predict Protein-Ligand Binding Affinity

Current Medicinal Chemistry ◽

10.2174/0929867328666210210121320 ◽

2021 ◽

Vol 28 ◽

Author(s):

Martina Veit-Acosta ◽

Walter Filgueira de Azevedo Junior

Keyword(s):

Machine Learning ◽

Experimental Data ◽

Ligand Binding ◽

Crystal Structures ◽

Binding Affinity ◽

Thermodynamic Data ◽

Ligand Docking ◽

Scoring Functions ◽

Learning Models ◽

Machine Learning Models

Background: One of the main challenges in the early stages of drug discovery is the computational assessment of protein-ligand binding affinity. Machine learning techniques can contribute to predicting this type of interaction. We may apply these techniques following two approaches. First, using the experimental structures for which affinity data is available. Second, using protein-ligand docking simulations. Objective: In this review, we describe recently published machine learning models based on crystal structures for which binding affinity and thermodynamic data are available. Method: We used experimental structures available at the protein data bank and binding affinity and thermodynamic data accessed at BindingDB, Binding MOAD, and PDBbind. We reviewed machine learning models to predict binding created using open source programs such as SAnDReS and Taba. Results: Analysis of machine learning models trained against datasets composed of crystal structure complexes indicated the high predictive performance of these models compared with classical scoring functions. Conclusion: The rapid increase in the number of crystal structures of protein-ligand complexes created a favorable scenario for developing machine learning models to predict binding affinity. These models rely on experimental data from two sources, the structural and the affinity data. The combination of experimental data generates computational models that outperform classical scoring functions.

Download Full-text

Predicting Depression Using Social Media Posts

Journal of Soft Computing and Data Mining ◽

10.30880/jscdm.2021.02.02.004 ◽

2021 ◽

Vol 2 (2) ◽

Author(s):

Fahem Abu Bakar ◽

◽

Nazri Mohd Nawi ◽

Abdulkareem A. Hezam ◽

◽

...

Keyword(s):

Neural Network ◽

Mental Health ◽

Machine Learning ◽

Social Media ◽

Support Vector Machine ◽

Neural Network Model ◽

Support Vector ◽

Learning Models ◽

The Neural Network ◽

Machine Learning Models

The use of Social Network Sites (SNS) is on the rise these days, particularly among the younger generations. Users can communicate their interests, feelings, and everyday routines thanks to the availability of social media sites. Many studies show that properly utilizing user-generated content (UGC) can aid in determining people's mental health status. The use of the UGC could aid in the prediction of mental health, particularly depression, where it is a significant medical condition that impairs one's ability to work, learn, eat, sleep, and enjoy life. However, all information about a person's mood and negativism can be gathered from their SNS user profile. Therefore, this study utilizes SNS as a data source by using machine learning models to screen and identify users in categorizing users based on their mental health. The performance of three machine learning models is evaluated to classify the UGC: Decision Forest, Neural Network, and Support Vector Machine (SVM). The results show that the accuracy and recall result of the Neural Network model is the same as the Support Vector Machine (SVM) model, which is 78.27% and 0.042, but Neural Network performs better in the average precision value. This proves that the Neural Network model is the best model for making predictions to determine the level of depression by using social media posts.

Download Full-text

Evaluation of Rainfall Erosivity Factor Estimation Using Machine and Deep Learning Models

Water ◽

10.3390/w13030382 ◽

2021 ◽

Vol 13 (3) ◽

pp. 382

Author(s):

Jimin Lee ◽

Seoro Lee ◽

Jiyeong Hong ◽

Dongjun Lee ◽

Joo Hyun Bae ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Soil Loss ◽

Deep Neural Network ◽

Rainfall Erosivity ◽

Learning Models ◽

R Factor ◽

Target Values ◽

R Factors ◽

Machine Learning Models

Rainfall erosivity factor (R-factor) is one of the Universal Soil Loss Equation (USLE) input parameters that account for impacts of rainfall intensity in estimating soil loss. Although many studies have calculated the R-factor using various empirical methods or the USLE method, these methods are time-consuming and require specialized knowledge for the user. The purpose of this study is to develop machine learning models to predict the R-factor faster and more accurately than the previous methods. For this, this study calculated R-factor using 1-min interval rainfall data for improved accuracy of the target value. First, the monthly R-factors were calculated using the USLE calculation method to identify the characteristics of monthly rainfall-runoff induced erosion. In turn, machine learning models were developed to predict the R-factor using the monthly R-factors calculated at 50 sites in Korea as target values. The machine learning algorithms used for this study were Decision Tree, K-Nearest Neighbors, Multilayer Perceptron, Random Forest, Gradient Boosting, eXtreme Gradient Boost, and Deep Neural Network. As a result of the validation with 20% randomly selected data, the Deep Neural Network (DNN), among seven models, showed the greatest prediction accuracy results. The DNN developed in this study was tested for six sites in Korea to demonstrate trained model performance with Nash–Sutcliffe Efficiency (NSE) and the coefficient of determination (R2) of 0.87. This means that our findings show that DNN can be efficiently used to estimate monthly R-factor at the desired site with much less effort and time with total monthly precipitation, maximum daily precipitation, and maximum hourly precipitation data. It will be used not only to calculate soil erosion risk but also to establish soil conservation plans and identify areas at risk of soil disasters by calculating rainfall erosivity factors.

Download Full-text