Advances in Predictions of Oral Bioavailability of Candidate Drugs in Man with New Machine Learning Methodology

Urban Fagerholm; Sven Hellberg; Ola Spjuth

doi:10.3390/molecules26092572

Molecules ◽

10.3390/molecules26092572 ◽

2021 ◽

Vol 26 (9) ◽

pp. 2572

Author(s):

Urban Fagerholm ◽

Sven Hellberg ◽

Ola Spjuth

Keyword(s):

Machine Learning ◽

Oral Bioavailability ◽

Predictive Accuracy ◽

Systemic Exposure ◽

Prediction Errors ◽

Drug Candidates ◽

Physiologically Based Pharmacokinetic ◽

Dosing Regimens ◽

Structural Alerts ◽

Computational Predictions

Oral bioavailability (F) is an essential determinant for the systemic exposure and dosing regimens of drug candidates. F is determined by numerous processes, and computational predictions of human estimates have so far shown limited results. We describe a new methodology where F in humans is predicted directly from chemical structure using an integrated strategy combining 9 machine learning models, 3 sets of structural alerts, and 2 physiologically-based pharmacokinetic models. We evaluate the model on a benchmark dataset consisting of 184 compounds, obtaining a predictive accuracy (Q²) of 0.50, which is successful according to a pharmaceutical industry proposal. Twenty-seven compounds were found (beforehand) to be outside the main applicability domain for the model. We compare our results with interspecies correlations (rat, mouse and dog vs. human) using the same dataset, where animal vs. human correlations (R²) were found to be 0.21 to 0.40 and maximum prediction errors were smaller than maximum interspecies differences. We conclude that our method has sufficient predictive accuracy to be practically useful with applications in human exposure and dose predictions, compound optimization and decision making, with potential to rationalize drug discovery and development and decrease failures and overexposures in early clinical trials with candidate drugs.
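
As a rough illustration of the predictive accuracy measure quoted in this abstract, the sketch below computes a Q² value as 1 minus the ratio of the squared prediction errors to the total variance of the observed values. The abstract does not state which Q² variant the authors used (cross-validated or external test set), so this definition, the function name `q_squared`, and the sample F values are all assumptions made for illustration.

```python
def q_squared(observed, predicted):
    """Return Q^2 = 1 - PRESS / SS_total for paired observed/predicted values.

    Assumed definition: SS_total is taken around the mean of the observed values.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_observed = sum(observed) / len(observed)
    # Sum of squared prediction errors.
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares of the observed values around their mean.
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    if ss_total == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - press / ss_total


# Hypothetical bioavailability fractions, not data from the paper.
observed_f = [0.10, 0.35, 0.60, 0.85]
predicted_f = [0.20, 0.30, 0.50, 0.80]
print(f"Q2 = {q_squared(observed_f, predicted_f):.2f}")
```

Under this definition a perfect prediction gives 1, and a model that always predicts the observed mean gives 0.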


Machine Learning for Predicting Risk of Drug-Induced Autoimmune Diseases by Structural Alerts and Daily Dose

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18137139 ◽

2021 ◽

Vol 18 (13) ◽

pp. 7139

Author(s):

Yue Wu ◽

Jieqiang Zhu ◽

Peter Fu ◽

Weida Tong ◽

Huixiao Hong ◽

...

Keyword(s):

Machine Learning ◽

Autoimmune Diseases ◽

Odds Ratio ◽

Area Under Curve ◽

Predictive Performance ◽

Drug Induced ◽

Drug Candidates ◽

Daily Dose ◽

Structural Alerts ◽

Underlying Mechanisms

An effective approach for assessing a drug’s potential to induce autoimmune diseases (ADs) is needed in drug development. Here, we aim to develop a workflow to examine the association between structural alerts and drug-induced ADs to improve toxicological prescreening tools. Considering reactive metabolite (RM) formation as a well-documented mechanism for drug-induced ADs, we investigated whether the presence of certain RM-related structural alerts was predictive of the risk of drug-induced AD. We constructed a database containing 171 RM-related structural alerts, generated a dataset of 407 AD- and non-AD-associated drugs, and performed statistical analysis. The nitrogen-containing benzene substituent alerts were found to be significantly associated with the risk of drug-induced ADs (odds ratio = 2.95, p = 0.0036). Furthermore, we developed a machine-learning-based predictive model using daily dose and nitrogen-containing benzene substituent alerts as the top inputs and achieved an area under the curve (AUC) of 70%. Additionally, we confirmed the reactivity of the nitrogen-containing benzene substituent aniline and related metabolites using quantum chemistry analysis and explored the underlying mechanisms. These identified structural alerts could be helpful in identifying drug candidates that carry a potential risk of drug-induced ADs to improve their safety profiles.
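
The odds ratio reported in this abstract is a standard 2x2 contingency measure. The sketch below shows how such a ratio is computed from counts of drugs with and without an alert. The function name `odds_ratio` and the counts are hypothetical and are not taken from the study's 407-drug dataset.

```python
def odds_ratio(alert_ad, alert_no_ad, no_alert_ad, no_alert_no_ad):
    """Return the odds ratio (a * d) / (b * c) for a 2x2 contingency table.

    a = drugs with the alert and AD-associated
    b = drugs with the alert and not AD-associated
    c = drugs without the alert and AD-associated
    d = drugs without the alert and not AD-associated
    """
    if alert_no_ad == 0 or no_alert_ad == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (alert_ad * no_alert_no_ad) / (alert_no_ad * no_alert_ad)


# Hypothetical counts for illustration only.
print(odds_ratio(alert_ad=30, alert_no_ad=20, no_alert_ad=120, no_alert_no_ad=237))
```

A value above 1 indicates that the alert is more common among AD-associated drugs than among the others; whether it is statistically significant requires a separate test, as the reported p-value reflects.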


A Novel Amino Acid Sequence-based Computational Approach to Predicting Cell-penetrating Peptides

Current Computer - Aided Drug Design ◽

10.2174/1573409914666180925100355 ◽

2019 ◽

Vol 15 (3) ◽

pp. 206-211 ◽

Cited By ~ 2

Author(s):

Jihui Tang ◽

Jie Ning ◽

Xiaoyan Liu ◽

Baoming Wu ◽

Rongfeng Hu

Keyword(s):

Machine Learning ◽

Amino Acid ◽

Amino Acid Position ◽

Cell Penetrating Peptides ◽

Support Vector ◽

Cell Penetration ◽

Drug Candidates ◽

Machine Learning Model ◽

Cell Penetrating ◽

Novel Method

Introduction: Machine learning is a useful tool for predicting cell-penetrating compounds as drug candidates. Materials and Methods: In this study, we developed a novel method for predicting the membrane-penetrating capability of Cell-Penetrating Peptides (CPPs). For this, we used orthogonal encoding to encode each amino acid, with each amino acid position treated as one variable. IBM SPSS Modeler and a dataset of 533 CPPs were then used for model screening. Results: The results indicated that the Support Vector Machine (SVM) model was suitable for predicting membrane-penetrating capability. For improvement, the three CPPs with the greatest lengths were used to predict CPPs. The penetration capability can be predicted with an accuracy of close to 95%. Conclusion: The results indicate that using amino acid position as a variable can be a promising method for predicting the membrane-penetrating capability of CPPs.
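
The orthogonal encoding mentioned in this abstract can be sketched as a position-wise one-hot encoding, in which every sequence position contributes one block of 20 indicator values. The abstract gives no implementation details, so the residue alphabet order, the zero-padding of short peptides, and the function name `orthogonal_encode` are assumptions for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes


def orthogonal_encode(peptide, max_len):
    """Encode a peptide as a flat 0/1 vector of length max_len * 20.

    Each position gets a block of 20 values with a single 1 marking the
    residue at that position. Positions beyond the end of the peptide are
    left as all zeros (an assumed padding scheme).
    """
    if len(peptide) > max_len:
        raise ValueError("peptide is longer than max_len")
    vector = []
    for position in range(max_len):
        block = [0] * len(AMINO_ACIDS)
        if position < len(peptide):
            index = AMINO_ACIDS.find(peptide[position].upper())
            if index < 0:
                raise ValueError(f"unknown residue: {peptide[position]!r}")
            block[index] = 1
        vector.extend(block)
    return vector


# Hypothetical short peptide, padded to 5 positions.
encoded = orthogonal_encode("RKR", max_len=5)
print(len(encoded), sum(encoded))
```

Vectors of this form can be passed to a classifier such as an SVM, since every peptide maps to the same number of variables regardless of its length.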


A feature-based hybrid recommender system for risk prediction : Machine learning approach (Preprint)

10.2196/preprints.11010 ◽

2020 ◽

Author(s):

Uzair Bhatti

Keyword(s):

Machine Learning ◽

Risk Prediction ◽

Predictive Accuracy ◽

Correct Diagnosis ◽

Recommendation Systems ◽

Data Integrity ◽

Machine Learning Algorithms ◽

Patient Counseling ◽

Hybrid Filtering ◽

Novel Algorithm

BACKGROUND In the era of health informatics, the exponential growth of information generated by health information systems and healthcare organizations demands expert and intelligent recommendation systems. Such systems have become one of the most valuable tools because they reduce problems such as information overload while selecting and suggesting doctors, hospitals, medicine, diagnoses, etc. according to patients’ interests. OBJECTIVE Hybrid filtering is one of the most popular approaches to recommendation, but its major limitations are selectivity and data integrity issues. Most existing recommendation systems and risk prediction algorithms focus on a single domain, whereas cross-domain hybrid filtering can alleviate selectivity and data integrity problems to a greater extent. METHODS We propose a novel algorithm for recommendation and a predictive model that uses the KNN algorithm together with machine learning algorithms and artificial intelligence (AI). We find the factors that directly impact diseases and propose an approach for predicting the correct diagnosis of different diseases. We have constructed a series of models with good reliability for predicting different surgery complications and identified several novel clinical associations. We propose a novel algorithm, pr-KNN, which uses KNN for prediction and recommendation of diseases. RESULTS We compared the performance of our algorithm with that of other machine learning algorithms and found that ours performed better, with predictive accuracy improving by 3.61%. CONCLUSIONS The potential to directly integrate these predictive tools into EHRs may enable personalized medicine and decision-making at the point of care for patient counseling and as a teaching tool. CLINICALTRIAL dataset for the trials of patient attached


Development of A Drug Early Warning System Model for Cardiac Arrest Using Deep Learning: Retrospective Cohort Study (Preprint)

10.2196/preprints.26783 ◽

2020 ◽

Author(s):

Hsiao-Ko Chang ◽

Hui-Chih Wang ◽

Chih-Fen Huang ◽

Feipei Lai

Keyword(s):

Machine Learning ◽

Time Series ◽

Cardiac Arrest ◽

Early Warning ◽

Time Series Data ◽

Predictive Accuracy ◽

Vital Signs ◽

Warning System ◽

Series Data ◽

Dynamic Time

BACKGROUND In most of Taiwan’s medical institutions, congestion is a serious problem for emergency departments. Due to a lack of beds, patients spend more time in emergency retention zones, which makes it difficult to detect cardiac arrest (CA). OBJECTIVE We seek to develop a Drug Early Warning System Model (DEWSM) that includes drug injections and vital signs as its important features. We use it to predict cardiac arrest in emergency departments via drug classification and medical expert suggestion. METHODS We propose this new model for detecting cardiac arrest via drug classification and by using a sliding window; we apply learning-based algorithms to time-series data for a DEWSM. By treating drug features as a dynamic time-series factor for cardiopulmonary resuscitation (CPR) patients, we increase sensitivity, reduce false alarm rates and mortality, and increase the model’s accuracy. To evaluate the proposed model, we use the area under the receiver operating characteristic curve (AUROC). RESULTS Four important findings are as follows: (1) We identify the most important drug predictors: bits (intravenous therapy), and replenishers and regulators of water and electrolytes (fluid and electrolyte supplement). The best AUROC of bits is 85%, meaning that when the drug feature suggested by the medical experts (bits, which affects the vital signs) is used, the model’s classification of CPR patients reaches an AUROC of 85%; that of replenishers and regulators of water and electrolytes is 86%. These two features are the most influential of the drug features in the task. (2) We verify feature selection, in which accounting for drugs improves the accuracy: In Task 1, the best AUROC of vital signs is 77%, and that of all features is 86%. In Task 2, the best AUROC of all features is 85%, which demonstrates that accounting for the drugs significantly affects prediction. (3) We use a better model: In addition to traditional machine learning, this study adds a new AI technology, the long short-term memory (LSTM) model, which has the best time-series accuracy and is comparable to the traditional random forest (RF) model; the two AUROC measures are 85%. The new AI technology currently achieves accuracy comparable to that of the commonly used RF, and the LSTM model can be adjusted in the future to obtain better results. (4) We determine whether the event can be predicted beforehand: The best classifier is still an RF model, in which the observational starting time is 4 hours before the CPR event. Although the accuracy is impaired, the predictive accuracy still reaches 70%. Therefore, we believe that CPR events can be predicted four hours before the event. CONCLUSIONS This paper uses a sliding window to account for dynamic time-series data consisting of the patient’s vital signs and drug injections. The National Early Warning Score (NEWS) focuses only on the score of vital signs and does not include factors related to drug injections. In this study, the experimental results obtained by adding drug injections are better than those obtained with vital signs only. In a comparison with NEWS, we improve predictive accuracy via feature selection, which includes drugs as features. In addition, we use traditional machine learning methods and deep learning (using the LSTM method as the main approach for processing time-series data) as the basis for comparison in this research. The proposed DEWSM, which offers 4-hour predictions, is better than the NEWS in the literature. This also confirms that the doctor’s heuristic rules are consistent with the results found by machine learning algorithms.
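
The sliding-window step described in this abstract can be sketched as cutting a time-ordered series into fixed-width, overlapping segments that a classifier then scores one at a time. The abstract does not give the window width, the step size, or the feature layout, so those parameters, the function name `sliding_windows`, and the sample readings are hypothetical.

```python
def sliding_windows(series, width, step=1):
    """Return fixed-width windows over a time-ordered sequence.

    A series shorter than `width` yields no windows.
    """
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    return [
        list(series[start:start + width])
        for start in range(0, len(series) - width + 1, step)
    ]


# Hypothetical hourly heart-rate readings for one patient.
heart_rate = [82, 85, 90, 97, 110, 124]
for window in sliding_windows(heart_rate, width=4):
    print(window)
```

In a setup like the one described, each window would typically combine vital signs with drug-injection indicators for the same time span, and the label would mark whether a CPR event follows the window.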


A novel multi-stage ensemble model with multiple K-means-based selective undersampling: An application in credit scoring

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201954 ◽

2021 ◽

Vol 40 (5) ◽

pp. 9471-9484

Author(s):

Yilun Jin ◽

Yanan Liu ◽

Wenyu Zhang ◽

Shuai Zhang ◽

Yu Lou

Keyword(s):

Machine Learning ◽

Predictive Accuracy ◽

Credit Scoring ◽

Imbalanced Data ◽

Ensemble Model ◽

Selective Sampling ◽

Machine Learning Methods ◽

Multi Stage ◽

Proposed Model ◽

New Feature

With the advancement of machine learning, credit scoring can be performed better. As one of the widely recognized machine learning methods, ensemble learning has demonstrated significant improvements in the predictive accuracy over individual machine learning models for credit scoring. This study proposes a novel multi-stage ensemble model with multiple K-means-based selective undersampling for credit scoring. First, a new multiple K-means-based undersampling method is proposed to deal with the imbalanced data. Then, a new selective sampling mechanism is proposed to select the better-performing base classifiers adaptively. Finally, a new feature-enhanced stacking method is proposed to construct an effective ensemble model by composing the shortlisted base classifiers. In the experiments, four datasets with four evaluation indicators are used to evaluate the performance of the proposed model, and the experimental results prove the superiority of the proposed model over other benchmark models.


Overcoming the shortcomings of the extended-clearance concept: a framework for developing a physiologically-based pharmacokinetic (PBPK) model to select drug candidates involving transporter-mediated clearance

Expert Opinion on Drug Metabolism & Toxicology ◽

10.1080/17425255.2021.1912012 ◽

2021 ◽

Author(s):

Xiaomin Liang ◽

Yurong Lai

Keyword(s):

Pbpk Model ◽

Drug Candidates ◽

Physiologically Based Pharmacokinetic ◽

Physiologically Based ◽

Clearance Concept


Predicting Absorption-Distribution Properties of Neuroprotective Phosphine-Borane Compounds Using In Silico Modeling and Machine Learning

Molecules ◽

10.3390/molecules26092505 ◽

2021 ◽

Vol 26 (9) ◽

pp. 2505

Author(s):

Raheem Remtulla ◽

Sanjoy Kumar Das ◽

Leonard A. Levin

Keyword(s):

Machine Learning ◽

Neural Networks ◽

In Silico ◽

Disulfide Bonds ◽

Oral Absorption ◽

In Vivo Studies ◽

Drug Candidates ◽

In Silico Methods

Phosphine-borane complexes are novel chemical entities with preclinical efficacy in neuronal and ophthalmic disease models. In vitro and in vivo studies showed that the metabolites of these compounds are capable of cleaving disulfide bonds implicated in the downstream effects of axonal injury. A difficulty in using standard in silico methods for studying these drugs is that most computational tools are not designed for borane-containing compounds. Using in silico and machine learning methodologies, the absorption-distribution properties of these unique compounds were assessed. Features examined with in silico methods included cellular permeability, octanol-water partition coefficient, blood-brain barrier permeability, oral absorption and serum protein binding. The resultant neural networks demonstrated an appropriate level of accuracy and were comparable to existing in silico methodologies. Specifically, they were able to reliably predict pharmacokinetic features of known boron-containing compounds. These methods predicted that phosphine-borane compounds and their metabolites meet the necessary pharmacokinetic features for orally active drug candidates. This study showed that the combination of standard in silico predictive and machine learning models with neural networks is effective in predicting pharmacokinetic features of novel boron-containing compounds as neuroprotective drugs.


Application of a Rough Set-Based Inductive Learning System

Fundamenta Informaticae ◽

10.3233/fi-1993-182-409 ◽

1993 ◽

Vol 18 (2-4) ◽

pp. 209-220

Author(s):

Michael Hadjimichael ◽

Anita Wasilewska

Keyword(s):

Machine Learning ◽

Rough Set ◽

Presidential Election ◽

Predictive Accuracy ◽

Learning Algorithm ◽

Inductive Learning ◽

Real Data ◽

Semantic Content ◽

Learning System ◽

Voter Preferences

We present here an application of Rough Set formalism to Machine Learning. The resulting Inductive Learning algorithm is described, and its application to a set of real data is examined. The data consists of a survey of voter preferences taken during the 1988 presidential election in the U.S.A. Results include an analysis of the predictive accuracy of the generated rules, and an analysis of the semantic content of the rules.


1315. No Dose Adjustment of Metformin with Fostemsavir Coadministration Based on Mechanistic Static and Physiologically Based Pharmacokinetic Models

Open Forum Infectious Diseases ◽

10.1093/ofid/ofaa439.1497 ◽

2020 ◽

Vol 7 (Supplement_1) ◽

pp. S669-S669

Author(s):

Dung N Nguyen ◽

Xiusheng Miao ◽

Mindy Magee ◽

Guoying Tai ◽

Peter D Gorycki ◽

...

Keyword(s):

Dose Adjustment ◽

Pbpk Model ◽

Systemic Exposure ◽

Multidrug Resistant ◽

Pbpk Modeling ◽

Repeat Dose ◽

Physiologically Based Pharmacokinetic ◽

Physiologically Based

Background: Fostemsavir (FTR) is an oral prodrug of the first-in-class attachment inhibitor temsavir (TMR), which is being evaluated in patients with multidrug resistant HIV-1 infection. In vitro studies indicated that TMR and its 2 major metabolites are inhibitors of organic cation transporters (OCT)1, OCT2, and multidrug and toxin extrusion transporters (MATEs). To assess the clinical relevance of OCT and MATE inhibition, a mechanistic static DDI prediction was performed; the calculated Imax,u/IC50 ratios were below the cut-off limits for a DDI flag based on FDA guidelines and above the cut-off limits for MATEs based on EMA guidelines. Methods: Metformin is a commonly used probe substrate for OCT1, OCT2 and MATEs. To predict the potential for a drug interaction between TMR and metformin, a physiologically based pharmacokinetic (PBPK) model for TMR was developed based on its physicochemical properties and on in vitro and in vivo data. The model was verified and validated through comparison with clinical data. The TMR PBPK model accurately described AUC and Cmax within 30% of the observed data for single and repeat dose studies with or without food. The SimCYP models for metformin and ritonavir were qualified using literature data before being applied to DDI prediction for TMR. Results: TMR was simulated at steady state concentrations after repeated oral doses of FTR 600 mg twice daily, which allowed assessment of the potential OCT1, OCT2, and MATEs inhibition by TMR and metabolites. No significant increase in metformin systemic exposure (AUC or Cmax) was predicted with FTR co-administration. In addition, a sensitivity analysis was conducted for either hepatic OCT1 Ki, or renal OCT2 and MATEs Ki values. The model output indicated that a 10-fold more potent Ki value for TMR would be required to have a ~15% increase in metformin exposure. Conclusion: Based on mechanistic static models and PBPK modeling and simulation, the OCT1/2 and MATEs inhibition potential of TMR and its metabolites on metformin pharmacokinetics is not clinically significant. No dose adjustment of metformin is necessary when co-administered with FTR. Disclosures: Xiusheng Miao, PhD, GlaxoSmithKline (Employee); Mindy Magee, Doctor of Pharmacy, GlaxoSmithKline (Employee, Shareholder); Peter D. Gorycki, BEChe, MSc, PhD, GSK (Employee, Shareholder); Katy P. Moore, PharmD, RPh, ViiV Healthcare (Employee)


Identifying neuroanatomical signatures of anorexia nervosa: a multivariate machine learning approach

Psychological Medicine ◽

10.1017/s0033291715000768 ◽

2015 ◽

Vol 45 (13) ◽

pp. 2805-2812 ◽

Cited By ~ 15

Author(s):

L. Lavagnino ◽

F. Amianto ◽

B. Mwangi ◽

F. D'Agata ◽

A. Spalatro ◽

...

Keyword(s):

Machine Learning ◽

Anorexia Nervosa ◽

Predictive Accuracy ◽

Third Ventricle ◽

Healthy Controls ◽

Drive For Thinness ◽

Individual Subject ◽

Machine Learning Approach ◽

Scan Data ◽

Selection Operator

Background: There are currently no neuroanatomical biomarkers of anorexia nervosa (AN) available to make clinical inferences at an individual subject level. We present results of a multivariate machine learning (ML) approach utilizing structural neuroanatomical scan data to differentiate AN patients from matched healthy controls at an individual subject level. Method: Structural neuroimaging scans were acquired from 15 female patients with AN (age = 20, s.d. = 4 years) and 15 demographically matched female controls (age = 22, s.d. = 3 years). Neuroanatomical volumes were extracted using the FreeSurfer software and input into the Least Absolute Shrinkage and Selection Operator (LASSO) multivariate ML algorithm. LASSO was ‘trained’ to identify ‘novel’ individual subjects as either AN patients or healthy controls. Furthermore, the model estimated the probability that an individual subject belonged to the AN group based on an individual scan. Results: The model correctly predicted 25 out of 30 subjects, translating into 83.3% accuracy (sensitivity 86.7%, specificity 80.0%) (p < 0.001; χ² test). Six neuroanatomical regions (cerebellum white matter, choroid plexus, putamen, accumbens, the diencephalon and the third ventricle) were found to be relevant in distinguishing individual AN patients from healthy controls. The predicted probabilities showed a linear relationship with drive for thinness clinical scores (r = 0.52, p < 0.005) and with body mass index (BMI) (r = −0.45, p = 0.01). Conclusions: The model achieved a good predictive accuracy and drive for thinness showed a strong neuroanatomical signature. These results indicate that neuroimaging scans coupled with ML techniques have the potential to provide information at an individual subject level that might be relevant to clinical outcomes.
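
The accuracy, sensitivity and specificity figures in this abstract follow from a 2x2 confusion matrix. The sketch below reproduces the arithmetic. The individual cell counts (13 of 15 patients and 12 of 15 controls classified correctly) are not stated in the abstract; they are inferred here from the reported percentages and group sizes, and the function name `classification_rates` is hypothetical.

```python
def classification_rates(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    total = true_pos + false_neg + true_neg + false_pos
    accuracy = (true_pos + true_neg) / total
    return sensitivity, specificity, accuracy


# Counts inferred from the reported percentages (an assumption):
# 15 AN patients, 15 healthy controls, 25 of 30 classified correctly.
sens, spec, acc = classification_rates(true_pos=13, false_neg=2, true_neg=12, false_pos=3)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```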
