Artificial Intelligence Algorithms for Discovering New Active Compounds Targeting TRPA1 Pain Receptors

Abstract Background: Dipeptidyl peptidase-4 (DPP-4) inhibitors are becoming essential drugs in the treatment of type 2 diabetes mellitus, but some classes of these drugs have side effects that range from joint pain, which can become severe, to pancreatitis. These side effects are thought to be related to inhibition of the enzymes DPP-8 and DPP-9. Objective: This study aims to find DPP-4 inhibitor hit compounds that are selective over the DPP-8 and DPP-9 enzymes. With a virtual screening workflow built on an artificial intelligence (AI)-based Quantitative Structure-Activity Relationship (QSAR) method, millions of molecules from a database can be screened against the DPP-4 enzyme target faster than with other screening methods. Results: Five regression and four classification machine learning algorithms were used to build virtual screening workflows. The algorithm selected for the regression QSAR model was support vector regression, with an R²pred of 0.78; the algorithm selected for the classification QSAR model was random forest, with 92.21% accuracy. Virtual screening of more than 10 million molecules from the database yielded 2,716 hit compounds with pIC50 above 7.5. Molecular docking of several potential hit compounds to the DPP-4, DPP-8 and DPP-9 enzymes identified hit compound CH0002, which has high inhibitory potential against DPP-4 and low inhibition of DPP-8 and DPP-9. Conclusion: This research produced DPP-4 inhibitor hit compounds that are potent against DPP-4 and selective over DPP-8 and DPP-9, so they can be developed further in DPP-4 inhibitor discovery. The resulting virtual screening workflow can be applied to the discovery of hit compounds for other targets. Keywords: Artificial Intelligence; DPP-4; KNIME; Machine Learning; QSAR; Virtual Screening
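The two figures quoted in this abstract, a pIC50 cutoff of 7.5 and a predictive R² of 0.78, can be made concrete with a small sketch. The Python below is illustrative only and is not the authors' KNIME workflow: the function names, the nanomolar input unit and the example molecule names are assumptions, and the R² formula is one common definition, since the abstract does not state which variant was used.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    return -math.log10(ic50_nm * 1e-9)


def r2_pred(y_true, y_pred):
    """Coefficient of determination on a held-out set.

    Assumption: the mean of the held-out values is used in the denominator.
    Some QSAR papers use the training-set mean instead.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def select_hits(predicted_pic50, threshold=7.5):
    """Return names of molecules whose predicted pIC50 exceeds the threshold."""
    return [name for name, value in predicted_pic50.items() if value > threshold]


# Hypothetical predictions for three made-up molecules.
predictions = {"mol_a": 8.1, "mol_b": 7.5, "mol_c": 6.0}
hits = select_hits(predictions)
```

A pIC50 of 7.5 corresponds to an IC50 of roughly 32 nM, so the cutoff keeps only compounds predicted to be potent in the low-nanomolar range.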


Virtual Screening of DPP-4 Inhibitors Using QSAR-Based Artificial Intelligence and Molecular Docking of Hit Compounds to DPP-8 and DPP-9 Enzymes

10.21203/rs.2.22282/v2; 2020

Author(s): Oky Hermansyah; Alhadi Bustamam; Arry Yanuar

Keyword(s): Artificial Intelligence; Machine Learning; Molecular Docking; Side Effects; Virtual Screening; Learning Algorithms; QSAR Model; Machine Learning Algorithms; Support Vector; Screening Methods


Performance Analysis of Different Machine Learning Algorithms for Identifying and Classifying the Failures of Traction Motors

Journal of Physics Conference Series; 10.1088/1742-6596/2095/1/012058; 2021; Vol 2095 (1); pp. 012058

Author(s): Xiaoyu Xian; Haichuan Tang; Yin Tian; Qi Liu; Yuming Fan

Keyword(s): Machine Learning; Learning Algorithms; Machine Learning Algorithms; Supervised Machine Learning; Support Vector; Random Forest Classification; Machine Learning Classification; Fault Type; Traction Motors; Forest Classification

Abstract This paper addresses electric motor fault diagnosis using supervised machine learning classification. A total of 15 distinct fault types are classified, and multi-label strategies are used to classify concurrent faults. We explored, developed, and compared the performance of different types of binary (fault/non-fault), multi-class (fault type) and multi-label (single fault versus combination fault) classifiers. To evaluate the effectiveness of fault identification and classification, we used different supervised machine learning methods, including random forest classification, support vector machines and neural network classification. Through experiments, we compared these methods over four classification regimes and summarized the most suitable machine learning algorithms for different aspects of health diagnosis in traction motors.
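The three labelling schemes named in this abstract (binary, multi-class, multi-label) differ mainly in how the target is encoded. A minimal Python sketch of the encodings follows. The fault names are hypothetical placeholders, since the abstract does not list the 15 fault types, and the helper functions are illustrative rather than taken from the paper.

```python
# Hypothetical placeholder names for the 15 fault types.
FAULT_TYPES = [f"fault_{i:02d}" for i in range(1, 16)]


def binary_label(faults):
    """Binary target: 1 if any fault is present, 0 for a healthy motor."""
    return int(len(faults) > 0)


def multiclass_label(faults):
    """Multi-class target: 0 for healthy, 1..15 for a single fault type.

    Concurrent faults cannot be expressed in this scheme, so they raise.
    """
    if not faults:
        return 0
    if len(faults) > 1:
        raise ValueError("multi-class encoding allows at most one fault")
    return FAULT_TYPES.index(next(iter(faults))) + 1


def multilabel_vector(faults):
    """Multi-label target: one 0/1 indicator per fault type."""
    return [int(name in faults) for name in FAULT_TYPES]


# A motor with two concurrent (hypothetical) faults.
concurrent = {"fault_01", "fault_15"}
vector = multilabel_vector(concurrent)
```

Only the multi-label vector can represent the concurrent-fault case, which is why that scheme is needed in addition to the multi-class one.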


A Very Large-Scale Bioactivity Comparison of Deep Learning and Multiple Machine Learning Algorithms for Drug Discovery

10.26434/chemrxiv.12781241; 2020

Author(s): Thomas R. Lane; Daniel H. Foil; Eni Minerali; Fabio Urbina; Kimberley M. Zorn; ...

Keyword(s): Machine Learning; Neural Networks; Deep Learning; Drug Discovery; Deep Neural Networks; Learning Algorithms; Machine Learning Algorithms; Support Vector; Learning Methods; Machine Learning Methods

Machine learning methods are attracting considerable attention from the pharmaceutical industry for use in drug discovery and applications beyond. In recent studies we have applied multiple machine learning algorithms and modeling metrics, and in some cases compared molecular descriptors, to build models for individual targets or properties on a relatively small scale. Several research groups have used large numbers of datasets from public databases such as ChEMBL to evaluate machine learning methods of interest to them. The largest of these studies used on the order of 1400 datasets. We have now extracted well over 5000 datasets from ChEMBL for use with the ECFP6 fingerprint and for comparison of our proprietary software Assay Central™ with random forest, k-nearest neighbors, support vector classification, naïve Bayesian, AdaBoosted decision trees, and deep neural networks (3 levels). Model performance was assessed using an array of five-fold cross-validation metrics including area under the curve, F1 score, Cohen's kappa and Matthews correlation coefficient. Based on ranked normalized scores for the metrics or datasets, all methods appeared comparable, while the distance from the top indicated that Assay Central™ and support vector classification were comparable. Unlike prior studies, which have placed considerable emphasis on deep neural networks (deep learning), no advantage was seen in this case, where minimal tuning was performed for any of the methods. If anything, Assay Central™ may have been at a slight advantage, as the activity cutoff for each of the over 5000 datasets, representing over 570,000 unique compounds, was based on Assay Central™ performance, but support vector classification seems to be a strong competitor. We also apply Assay Central™ to prospective predictions for PXR and hERG to further validate these models. This work currently appears to be the largest comparison of machine learning algorithms to date. Future studies will likely evaluate additional databases, descriptors and algorithms, as well as further refine methods for evaluating and comparing models.
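Three of the metrics listed in this abstract (F1 score, Cohen's kappa, Matthews correlation coefficient) can be computed from the four cells of a binary confusion matrix. The sketch below uses the standard textbook formulas in plain Python. The function name and the zero-division fallbacks are assumptions made for illustration and do not reflect how Assay Central computes its scores.

```python
import math


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return F1, Cohen's kappa and MCC for one binary confusion matrix."""
    n = tp + fp + fn + tn

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected) if expected != 1 else 0.0

    # Matthews correlation coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"f1": f1, "kappa": kappa, "mcc": mcc}


# Made-up counts for a balanced 100-compound fold.
scores = binary_metrics(tp=40, fp=10, fn=10, tn=40)
```

Kappa and MCC both correct for chance agreement, which matters when active compounds are rare and raw accuracy would look deceptively high.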


An analysis of PCOS disease prediction model using machine learning classification algorithms

Recent Patents on Engineering; 10.2174/1872212115999201224130204; 2020; Vol 15

Author(s): Shivani Aggarwal; Kavita Pandey

Keyword(s): Machine Learning; Insulin Resistance; Feature Selection; Learning Algorithms; Machine Learning Algorithms; Support Vector; Classification Algorithms; Metabolic Abnormalities; Related Disorder; Machine Learning Classification

Background: Polycystic ovary syndrome (PCOS) affects up to 18% of women of reproductive age and is the most commonly occurring hormone-related disorder. Symptoms of PCOS include irregular periods, increased facial and body hair growth, weight gain, darkening of the skin, diabetes and trouble conceiving (infertility). Patients with PCOS have also been found to have a range of metabolic abnormalities, which increase the risk of insulin resistance, type 2 diabetes and impaired glucose tolerance (a sign of prediabetes). Family members of women with PCOS are also at higher risk of developing the same metabolic abnormalities. Obesity and overweight status contribute to insulin resistance in PCOS. Objective: Several new technologies are now available to diagnose PCOS, among them machine learning algorithms, which learn from past data to produce reliable and repeatable decisions when they are exposed to new data. In this article, machine learning algorithms are used to identify the features that are important for diagnosing PCOS. Methods: Several classification algorithms, namely support vector machine (SVM), logistic regression, gradient boosting, random forest, decision tree and k-nearest neighbor (KNN), are applied to well-organized test datasets to classify large numbers of records. Initially, a dataset of 541 instances and 41 attributes was used to apply the prediction models, and manual feature selection was performed on it. Results: After feature selection, a set of 12 attributes was identified that plays a crucial role in diagnosing PCOS. Conclusion: Several research efforts are under way on diagnosing PCOS, but the relevant features have not yet been identified.


A Comparative Study on Machine Learning Algorithms for Smart Manufacturing: Tool Wear Prediction Using Random Forests

Journal of Manufacturing Science and Engineering; 10.1115/1.4036350; 2017; Vol 139 (7); Cited By ~ 116

Author(s): Dazhong Wu; Connor Jennings; Janis Terpenny; Robert X. Gao; Soundar Kumara

Keyword(s): Artificial Intelligence; Machine Learning; Tool Wear; Random Forests; Manufacturing Systems; Learning Algorithms; Machine Learning Algorithms; Data Driven; Smart Manufacturing; Tool Wear Prediction

Manufacturers have faced an increasing need for models that predict mechanical failures and the remaining useful life (RUL) of manufacturing systems or components. Classical model-based or physics-based prognostics often require an in-depth physical understanding of the system of interest to develop closed-form mathematical models. However, prior knowledge of system behavior is not always available, especially for complex manufacturing systems and processes. To complement model-based prognostics, data-driven methods have been increasingly applied to machinery prognostics and maintenance management, transforming legacy manufacturing systems into smart manufacturing systems with artificial intelligence. While previous research has demonstrated the effectiveness of data-driven methods, most of these prognostic methods are based on classical machine learning techniques, such as artificial neural networks (ANNs) and support vector regression (SVR). With the rapid advancement in artificial intelligence, various machine learning algorithms have been developed and widely applied in many engineering fields. The objective of this research is to introduce a random forests (RFs)-based prognostic method for tool wear prediction and to compare the performance of RFs with feed-forward back propagation (FFBP) ANNs and SVR. Specifically, the performance of FFBP ANNs, SVR, and RFs is compared using experimental data collected from 315 milling tests. Experimental results have shown that RFs can generate more accurate predictions than FFBP ANNs with a single hidden layer and SVR.


Comparative study of support vector machines and random forests machine learning algorithms on credit operation

Software Practice and Experience; 10.1002/spe.2842; 2020; Cited By ~ 1

Author(s): Germanno Teles; Joel J. P. C. Rodrigues; Ricardo A. L. Rabêlo; Sergei A. Kozlov

Keyword(s): Machine Learning; Support Vector Machines; Comparative Study; Random Forests; Learning Algorithms; Machine Learning Algorithms; Support Vector; Vector Machines


Detection of Diabetes By Machine Learning Technique

International Journal of Innovative Technology and Exploring Engineering - Special Issue; 10.35940/ijitee.f4501.059720; 2020; Vol 9 (7); pp. 247-251

Keyword(s): Machine Learning; Blood Sugar; Learning Algorithms; Research Work; Machine Learning Algorithms; Support Vector; Machine Learning Classification; Learning Technique; Important Health; Prediction Of Diabetes

Diabetes is a major health problem that has reached alarming levels; today approximately half a billion people worldwide are living with diabetes. Diabetes is a condition that impairs the body's ability to process glucose in the blood, otherwise known as blood sugar. It is a metabolic disease that causes high blood sugar. The hormone insulin moves sugar from the blood into the cells to be stored or used for energy. With diabetes, the body either does not make sufficient insulin or cannot efficiently use the insulin it does make. The aim of this research is to design a method or prototype that can detect or predict diabetes in patients with high precision. To that end, four machine learning classification algorithms, namely decision tree, support vector machine, naïve Bayes and k-NN, are used to predict diabetes. Two databases are used for experimentation. The first was collected from a hospital and covers 82 patients; the second is the readily available Pima Indian Diabetes database. The performance of the algorithms is evaluated with measures including precision, recall, F-measure and accuracy. The objective is to study the accuracy of the different machine learning algorithms and thereby identify a set of suitable algorithms for diabetes prediction in further research.


The upside of uncertainty: Identification of lithology contact zones from airborne geophysics and satellite data using random forests and support vector machines

Geophysics; 10.1190/geo2012-0411.1; 2013; Vol 78 (3); pp. WB113-WB126; Cited By ~ 43

Author(s): Matthew J. Cracknell; Anya M. Reading

Keyword(s): Machine Learning; Support Vector Machines; Random Forests; Supervised Classification; Learning Algorithms; Machine Learning Algorithms; Geophysical Data; Support Vector; Vector Machines; Spatially Varying

Inductive machine learning algorithms attempt to recognize patterns in, and generalize from, empirical data. They provide a practical means of predicting lithology, or other spatially varying physical features, from multidimensional geophysical data sets. For this reason, machine learning approaches are increasing in popularity for geophysical data inference. A key motivation for their use is the ease with which uncertainty measures can be estimated for nonprobabilistic algorithms. We have compared and evaluated the abilities of two nonprobabilistic machine learning algorithms, random forests (RF) and support vector machines (SVM), to recognize ambiguous supervised classification predictions using uncertainty calculated from estimates of class membership probabilities. We formulated a method to establish optimal uncertainty threshold values to identify and isolate the maximum number of incorrect predictions while preserving most of the correct classifications. This is illustrated using a case example of the supervised classification of surface lithologies in a folded, structurally complex, metamorphic terrain. We found that (1) the use of optimal uncertainty thresholds significantly improves overall classification accuracy of RF predictions, but not those of SVM, by eliminating the maximum number of incorrectly classified samples while preserving the maximum number of correctly classified samples; (2) RF, unlike SVM, was able to exploit dependencies and structures contained within spatially varying input data; and (3) high RF prediction uncertainty is spatially coincident with transitions in lithology and associated contact zones, and regions of intense deformation. Uncertainty has its upside in the identification of areas of key geologic interest and has wide application across the geosciences, where transition zones are important classes in their own right. The techniques used in this study are of practical value in prioritizing subsequent geologic field activities, which, with the aid of this analysis, may be focused on key lithology contacts and problematic localities.
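The idea of flagging ambiguous predictions from class membership probabilities can be sketched in a few lines of Python. Normalized Shannon entropy is used here as the uncertainty measure, which is an assumption: the abstract says only that uncertainty is calculated from class membership probabilities and does not name the measure. The function names, the example probabilities and the threshold of 0.8 are likewise hypothetical.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of a probability vector, scaled to the range [0, 1].

    0 means the classifier is certain; 1 means all classes are equally likely.
    """
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(probs))


def split_by_uncertainty(prob_rows, threshold):
    """Return (confident, ambiguous) lists of sample indices.

    A sample is ambiguous when its uncertainty exceeds the threshold.
    """
    confident, ambiguous = [], []
    for index, probs in enumerate(prob_rows):
        if normalized_entropy(probs) > threshold:
            ambiguous.append(index)
        else:
            confident.append(index)
    return confident, ambiguous


# Hypothetical class-membership probabilities for two samples, three classes.
rows = [
    [0.90, 0.05, 0.05],  # clear prediction
    [0.40, 0.30, 0.30],  # near a class boundary
]
confident, ambiguous = split_by_uncertainty(rows, threshold=0.8)
```

In a mapping context, the ambiguous indices are the samples that would be set aside and, per the abstract's finding for RF, are expected to cluster along lithology contacts.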


Predicting Mortality Risk in Patients with COVID-19 Using Artificial Intelligence to Help Medical Decision-Making

10.1101/2020.03.30.20047308; 2020; Cited By ~ 14

Author(s): Mohammad Pourhomayoun; Mahdi Shakibi

Keyword(s): Artificial Intelligence; Machine Learning; Mortality Rate; Mortality Risk; Confusion Matrix; Learning Algorithms; Machine Learning Algorithms; Medical Decision; Support Vector; Depth Analysis

Abstract In the wake of COVID-19 disease, caused by the SARS-CoV-2 virus, we designed and developed a predictive model based on Artificial Intelligence (AI) and Machine Learning algorithms to determine the health risk and predict the mortality risk of patients with COVID-19. In this study, we used documented data of 117,000 patients worldwide with laboratory-confirmed COVID-19. This study proposes an AI model to help hospitals and medical facilities decide who needs attention first and who has higher priority for hospitalization, triage patients when the system is overwhelmed by overcrowding, and eliminate delays in providing the necessary care. The results demonstrate 93% overall accuracy in predicting the mortality rate. We used several machine learning algorithms, including Support Vector Machine (SVM), Artificial Neural Networks, Random Forest, Decision Tree, Logistic Regression, and K-Nearest Neighbor (KNN), to predict the mortality rate in patients with COVID-19. In this study, the most alarming symptoms and features were also identified. Finally, we used a separate dataset of COVID-19 patients to evaluate the accuracy of our developed model, and used a confusion matrix to make an in-depth analysis of our classifiers and calculate the sensitivity and specificity of our model.
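The final evaluation step described here, deriving sensitivity and specificity from a confusion matrix, can be illustrated with a short Python sketch. The labels below are made up for demonstration and are not the study's data; the function names and the choice of 1 as the positive (deceased) class are assumptions.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Made-up outcomes for eight patients (1 = deceased, 0 = recovered).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
sens, spec = sensitivity_specificity(*counts)
```

For a triage model the two numbers answer different questions: sensitivity is the share of high-risk patients the model catches, and specificity is the share of low-risk patients it correctly leaves out of the high-priority group.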
