RNAPosers: Machine Learning Classifiers For RNA-Ligand Poses

Mapping Intimacies ◽

10.1101/702449 ◽

2019 ◽

Author(s):

Sahil Chhabra ◽

Jingru Xie ◽

Aaron T. Frank

Keyword(s):

Machine Learning ◽

Small Molecule ◽

3D Structure ◽

Academic Community ◽

Pose Prediction ◽

Scoring Functions ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Validation Set ◽

Leave One Out

Abstract: Determining the 3-dimensional (3D) structures of ribonucleic acid (RNA)-small molecule complexes is critical to understanding molecular recognition in RNA. Computer docking can, in principle, be used to predict the 3D structure of RNA-small molecule complexes. Unfortunately, retrospective analysis has shown that the scoring functions that are typically used to rank poses tend to misclassify non-native poses as native, and vice versa. This misclassification of non-native poses severely limits the utility of computer docking in the context of pose prediction, as well as in virtual screening. Here, we use machine learning to train a set of pose classifiers that estimate the relative “nativeness” of a set of RNA-ligand poses. At the heart of our approach is the use of a pose “fingerprint” that is a composite of a set of atomic fingerprints, which individually encode the local “RNA environment” around ligand atoms. We found that by ranking poses based on the classification scores from our machine learning classifiers, we were able to recover native-like poses better than when we ranked poses based on their docking scores. With a leave-one-out training and testing approach, we found that one of our classifiers could recover poses that were within 2.5 Å of the native poses in ∼80% of the 88 cases we examined, and similarly, on a separate validation set, we could recover such poses in ∼70% of the cases. Our set of classifiers, which we refer to as RNAPosers, should find utility as a tool to aid in RNA-ligand pose prediction, and so we make RNAPosers open to the academic community via https://github.com/atfrank/RNAPosers.
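The evaluation described in this abstract, taking the top-ranked pose for each RNA-ligand case and checking whether it lies within 2.5 Å of the native pose, can be sketched as follows. This is a minimal illustration and not the RNAPosers implementation: the `Pose` record, its field names, and the toy values are hypothetical, and the sketch assumes that a higher classifier score means a more native-like pose while a lower docking score is better.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    rmsd: float              # RMSD to the native pose in angstroms (hypothetical field)
    docking_score: float     # assumed convention: lower is better
    classifier_score: float  # assumed convention: higher is more native-like


def top_pose_is_nativelike(poses, key, reverse, cutoff=2.5):
    """Return True if the best-ranked pose is within `cutoff` angstroms of native."""
    best = sorted(poses, key=key, reverse=reverse)[0]
    return best.rmsd <= cutoff


def recovery_rate(cases, key, reverse, cutoff=2.5):
    """Fraction of cases whose top-ranked pose is native-like."""
    hits = sum(top_pose_is_nativelike(poses, key, reverse, cutoff) for poses in cases)
    return hits / len(cases)


# Toy data: two cases with two poses each (values invented for illustration).
cases = [
    [Pose(1.2, -5.0, 0.9), Pose(6.0, -7.5, 0.2)],
    [Pose(2.0, -6.0, 0.8), Pose(8.1, -4.0, 0.1)],
]

by_docking = recovery_rate(cases, key=lambda p: p.docking_score, reverse=False)
by_classifier = recovery_rate(cases, key=lambda p: p.classifier_score, reverse=True)
print(by_docking, by_classifier)
```

In this toy set, ranking by docking score picks a non-native pose in the first case, while ranking by classifier score recovers a native-like pose in both.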

Safe opioid prescribing: a prognostic machine learning approach to predicting 30-day risk after an opioid dispensation in Alberta, Canada

BMJ Open ◽

10.1136/bmjopen-2020-043964 ◽

2021 ◽

Vol 11 (5) ◽

pp. e043964

Author(s):

Vishal Sharma ◽

Vinaykumar Kulkarni ◽

Dean T Eurich ◽

Luke Kumar ◽

Salim Samanani

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Health Data ◽

Adverse Outcomes ◽

Health Departments ◽

Machine Learning Classifiers ◽

Administrative Health Data ◽

Prescription Monitoring ◽

Learning Classifiers ◽

Validation Set

Objective: To develop machine learning models employing administrative health data that can estimate risk of adverse outcomes within 30 days of an opioid dispensation for use by health departments or prescription monitoring programmes.

Design, setting and participants: This prognostic study was conducted in Alberta, Canada between 2017 and 2018. Participants included all patients 18 years of age and older who received at least one opioid dispensation. Pregnant and cancer patients were excluded.

Exposure: Each opioid dispensation served as an exposure.

Main outcomes/measures: Opioid-related adverse outcomes were identified from linked administrative health data. Machine learning algorithms were trained using 2017 data to predict risk of hospitalisation, emergency department visit and mortality within 30 days of an opioid dispensation. Two validation sets, using 2017 and 2018 data, were used to evaluate model performance. Model discrimination and calibration performance were assessed for all patients and those at higher risk. Machine learning discrimination was compared with current opioid guidelines.

Results: Participants in the 2017 training set (n=275 150) and validation set (n=117 829) had similar baseline characteristics. In the 2017 validation set, c-statistics for the XGBoost, logistic regression and neural network classifiers were 0.87, 0.87 and 0.80, respectively. In the 2018 validation set (n=393 023), the corresponding c-statistics were 0.88, 0.88 and 0.82. C-statistics from the Canadian guidelines ranged from 0.54 to 0.69 while the US guidelines ranged from 0.50 to 0.62. The top five percentiles of predicted risk for the XGBoost and logistic regression classifiers captured 42% of all events and translated into post-test probabilities of 13.38% and 13.45%, respectively, up from the pretest probability of 1.6%.

Conclusion: Machine learning classifiers, especially those incorporating hospitalisation/physician claims data, have better predictive performance than guideline-based or prescription-history-only approaches when predicting 30-day risk of adverse outcomes. Prescription monitoring programmes and health departments with access to administrative data can use machine learning classifiers to identify those at higher risk more effectively than with current guideline-based approaches.
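The relationship between the reported pretest probability, the share of events captured, and the post-test probability can be checked with Bayes' rule. The sketch below is an illustrative calculation, not the authors' code; the function and parameter names are hypothetical, and it assumes the flagged group is exactly the top 5% of patients by predicted risk.

```python
def post_test_probability(pretest, fraction_flagged, fraction_events_captured):
    """Probability of an event given that a patient is flagged as high risk.

    Bayes' rule: P(event | flagged) = P(flagged | event) * P(event) / P(flagged)
    """
    return fraction_events_captured * pretest / fraction_flagged


# Rounded figures from the abstract: 1.6% pretest probability,
# top 5% of predicted risk flagged, 42% of all events captured.
ppv = post_test_probability(
    pretest=0.016,
    fraction_flagged=0.05,
    fraction_events_captured=0.42,
)
print(f"{ppv:.2%}")
```

With these rounded inputs the result is about 13.4%, in line with the reported 13.38% and 13.45%; the small differences are presumably due to rounding of the published inputs.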

Testing Machine Learning Classifiers based on Compositional Metamorphic Relations

International Journal of Performability Engineering ◽

10.23940/ijpe.20.01.p8.6777 ◽

2020 ◽

Vol 16 (1) ◽

pp. 67

Author(s):

Minghua Jia ◽

Xiaodong Wang ◽

Yue Xu ◽

Zhanqi Cui ◽

Ruilin Xie

Keyword(s):

Machine Learning ◽

Testing Machine ◽

Machine Learning Classifiers ◽

Learning Classifiers

Performance Evaluation of Machine Learning Classifiers for Epileptic Seizure Detection

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i8.122129 ◽

2019 ◽

Vol 7 (8) ◽

pp. 122-129

Author(s):

Mirwais Farahi ◽

Doreswamy

Keyword(s):

Machine Learning ◽

Performance Evaluation ◽

Epileptic Seizure ◽

Seizure Detection ◽

Epileptic Seizure Detection ◽

Machine Learning Classifiers ◽

Learning Classifiers

Botnet Detection with Machine Learning Classifiers

Journal of Research on the Lepidoptera ◽

10.36872/lepi/v51i2/301100 ◽

2020 ◽

Vol 51 (2) ◽

pp. 329-335

Author(s):

Pokuri Ashok Kumar

Keyword(s):

Machine Learning ◽

Botnet Detection ◽

Machine Learning Classifiers ◽

Learning Classifiers

SR-MLC: Scalable Resilience Machine Learning Classifiers Approach in Cyber Security

SSRN Electronic Journal ◽

10.2139/ssrn.3492708 ◽

2019 ◽

Author(s):

Anil Lamba ◽

Natasha Dutta

Keyword(s):

Machine Learning ◽

Cyber Security ◽

Machine Learning Classifiers ◽

Learning Classifiers

Machine Learning Classifiers for Efficient Spammers Detection in Twitter OSN

SSRN Electronic Journal ◽

10.2139/ssrn.3734170 ◽

2020 ◽

Author(s):

Praveen Kumar Sadineni

Keyword(s):

Machine Learning ◽

Machine Learning Classifiers ◽

Learning Classifiers

Assessing the Effect of Training Sampling Design on the Performance of Machine Learning Classifiers for Land Cover Mapping Using Multi-Temporal Remote Sensing Data and Google Earth Engine

Remote Sensing ◽

10.3390/rs13081433 ◽

2021 ◽

Vol 13 (8) ◽

pp. 1433

Author(s):

Shobitha Shetty ◽

Prasun Kumar Gupta ◽

Mariana Belgiu ◽

S. K. Srivastav

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Random Sampling ◽

Sampling Design ◽

Remote Sensing Data ◽

Google Earth ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Multi Temporal ◽

Google Earth Engine

Machine learning classifiers are increasingly used for Land Use and Land Cover (LULC) mapping from remote sensing images. However, choosing the right classifier requires understanding the main factors that influence performance. The present study first investigated the effect of training sampling design on the classification results obtained by the Random Forest (RF) classifier and then compared its performance with that of other machine learning classifiers for LULC mapping using multi-temporal satellite remote sensing data and the Google Earth Engine (GEE) platform. We evaluated the impact of three sampling methods, namely Stratified Equal Random Sampling (SRS(Eq)), Stratified Proportional Random Sampling (SRS(Prop)), and Stratified Systematic Sampling (SSS), upon the classification results obtained by the RF-trained LULC model. Our results showed that the SRS(Prop) method favors major classes while achieving good overall accuracy. The SRS(Eq) method provides good class-level accuracies, even for minority classes, whereas the SSS method performs well for areas with large intra-class variability. In the comparison of machine learning classifiers, RF outperformed Classification and Regression Trees (CART), Support Vector Machine (SVM), and Relevance Vector Machine (RVM) with a >95% confidence level. The performance of the CART and SVM classifiers was found to be similar. RVM achieved good classification results with a limited number of training samples.
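The difference between the equal and proportional stratified designs comes down to how a fixed training budget is split across classes. The sketch below illustrates only that allocation step under stated assumptions: the class names, pixel counts, and function names are hypothetical, and it is not the sampling code used in the study.

```python
def equal_allocation(class_sizes, n_samples):
    """Equal stratified design: every class gets the same number of samples
    (capped at the class size)."""
    per_class = n_samples // len(class_sizes)
    return {name: min(per_class, size) for name, size in class_sizes.items()}


def proportional_allocation(class_sizes, n_samples):
    """Proportional stratified design: samples follow each class's share of the
    population. Rounding means the total may differ slightly from n_samples."""
    total = sum(class_sizes.values())
    return {name: round(n_samples * size / total) for name, size in class_sizes.items()}


# Hypothetical land-cover classes and pixel counts, for illustration only.
class_sizes = {"forest": 7000, "cropland": 2500, "water": 500}

equal = equal_allocation(class_sizes, 300)
proportional = proportional_allocation(class_sizes, 300)
print(equal)
print(proportional)
```

Under the proportional design the minority class ("water" here) receives far fewer training samples than under the equal design, which is consistent with the abstract's observation that SRS(Prop) favors major classes while SRS(Eq) supports class-level accuracy for minority classes.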

A review of infant cry analysis and classification

EURASIP Journal on Audio Speech and Music Processing ◽

10.1186/s13636-021-00197-5 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Chunyan Ji ◽

Thosini Bamunu Mudiyanselage ◽

Yutong Gao ◽

Yi Pan

Keyword(s):

Neural Network ◽

Machine Learning ◽

Signal Analysis ◽

Future Research ◽

Prosodic Features ◽

Infant Cry ◽

Machine Learning Classification ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Processing Techniques

Abstract: This paper reviews recent research in infant cry signal analysis and classification. A broad range of literature is reviewed, mainly from the aspects of data acquisition, cross-domain signal processing techniques, and machine learning classification methods. We introduce pre-processing approaches and describe a variety of features such as MFCC, spectrogram, and fundamental frequency. Both acoustic features and prosodic features extracted from different domains can discriminate frame-based signals from one another and can be used to train machine learning classifiers. Together with traditional machine learning classifiers such as KNN, SVM, and GMM, newly developed neural network architectures such as CNN and RNN are applied in infant cry research. We present some significant experimental results on pathological cry identification, cry reason classification, and cry sound detection with some typical databases. This survey systematically studies the previous research in all relevant areas of infant cry and provides insight into the current cutting-edge work in infant cry signal analysis and classification. We also propose future research directions in data processing, feature extraction, and neural network classification to better understand, interpret, and process infant cry signals.

Modeling freight mode choice using machine learning classifiers: a comparative study using Commodity Flow Survey (CFS) data

Transportation Planning and Technology ◽

10.1080/03081060.2021.1927306 ◽

2021 ◽

pp. 1-17

Author(s):

Majbah Uddin ◽

Sabreena Anowar ◽

Naveen Eluru

Keyword(s):

Machine Learning ◽

Comparative Study ◽

Mode Choice ◽

Commodity Flow ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Freight Mode Choice

Integration of machine learning and genome-scale metabolic modeling identifies multi-omics biomarkers for radiation resistance

Nature Communications ◽

10.1038/s41467-021-22989-1 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Joshua E. Lewis ◽

Melissa L. Kemp

Keyword(s):

Machine Learning ◽

Metabolic Flux ◽

Metabolic Modeling ◽

The Cancer Genome Atlas ◽

Radiation Response ◽

Metabolomics Data ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Metabolic Biomarkers ◽

Genome Scale

Abstract: Resistance to ionizing radiation, a first-line therapy for many cancers, is a major clinical challenge. Personalized prediction of tumor radiosensitivity is not currently implemented clinically due to insufficient accuracy of existing machine learning classifiers. Despite the acknowledged role of tumor metabolism in radiation response, metabolomics data is rarely collected in large multi-omics initiatives such as The Cancer Genome Atlas (TCGA) and is consequently omitted from algorithm development. In this study, we circumvent the paucity of personalized metabolomics information by characterizing 915 TCGA patient tumors with genome-scale metabolic Flux Balance Analysis models generated from transcriptomic and genomic datasets. Metabolic biomarkers differentiating radiation-sensitive and -resistant tumors are predicted and experimentally validated, enabling integration of metabolic features with other multi-omics datasets into ensemble-based machine learning classifiers for radiation response. These multi-omics classifiers show improved classification accuracy, identify clinical patient subgroups, and demonstrate the utility of personalized blood-based metabolic biomarkers for radiation sensitivity. The integration of machine learning with genome-scale metabolic modeling represents a significant methodological advancement for identifying prognostic metabolite biomarkers and predicting radiosensitivity for individual patients.
