Prediction of Soil Adsorption Coefficient in Pesticides Using Physicochemical Properties and Molecular Descriptors by Machine Learning Models

Antimalarial Drug Predictions Using Molecular Descriptors and Machine Learning against Plasmodium Falciparum

Biomolecules ◽

10.3390/biom11121750 ◽

2021 ◽

Vol 11 (12) ◽

pp. 1750

Author(s):

Medard Edmund Mswahili ◽

Gati Lother Martin ◽

Jiyoung Woo ◽

Guang J. Choi ◽

Young-Seob Jeong

Keyword(s):

Machine Learning ◽

Plasmodium Falciparum ◽

Feature Selection ◽

Molecular Descriptors ◽

Artemisinin Resistance ◽

Learning Models ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Artificial Neural Network (ANN) ◽

Machine Learning Models

Malaria caused by the Plasmodium falciparum parasite remains by far one of the most threatening and dangerous illnesses. Chloroquine (CQ) and first-line artemisinin-based combination treatment (ACT) have long been the drugs of choice for treating and controlling malaria; however, CQ-resistant and artemisinin-resistant parasites have now emerged in most areas where malaria is endemic. In this work, we developed five machine learning models to predict the antimalarial bioactivity of a drug against Plasmodium falciparum from features (i.e., molecular descriptor values) computed by the PaDEL software from the SMILES of compounds, and compared the models in experiments on our collected data of 4794 instances. We found that three of the five models, namely artificial neural network (ANN), extreme gradient boost (XGB), and random forest (RF), outperform the others in terms of accuracy. We also observed that, using roughly a quarter of the promising descriptors picked by the feature selection algorithm, the five models achieved comparable performance. In addition, the contribution of all molecular descriptors to the models was investigated by comparing their rank values from the feature selection algorithm: the most relevant descriptors, which come from the ‘Autocorrelation’ module, contributed the most, while the ‘Atom type electrotopological state’ descriptors contributed the least.
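The idea of ranking descriptors and keeping roughly a quarter of them can be illustrated with a minimal sketch. The abstract does not name the feature selection algorithm, so the absolute Pearson correlation used here is only a stand-in assumption, and the descriptor names (`desc_a` to `desc_d`) and values are hypothetical toy data, not PaDEL output.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def top_quarter(columns, labels):
    """Rank descriptors by |correlation| with the label and keep the top 25%."""
    scores = {name: abs(pearson(values, labels)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, len(ranked) // 4)  # always keep at least one descriptor
    return ranked[:keep]


# Hypothetical toy descriptors for four compounds (assumption, not real data).
descriptors = {
    "desc_a": [0.1, 0.4, 0.35, 0.8],
    "desc_b": [1.0, 1.0, 1.0, 1.0],  # constant column carries no information
    "desc_c": [5.0, 3.0, 4.0, 1.0],
    "desc_d": [0.2, 0.9, 0.1, 0.5],
}
activity = [0, 0, 1, 1]  # 1 = active, 0 = inactive (toy labels)

selected = top_quarter(descriptors, activity)
print(selected)
```

Any ranking method that assigns one score per descriptor could replace `pearson` here; the reduced column set would then be passed to each of the models being compared.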

An Attempt to Boost Molecular Descriptors with Quantum-Derived Features in Prediction of Maximum Emission Wavelengths of Chromophores

10.26434/chemrxiv.14534136 ◽

2021 ◽

Author(s):

Bartłomiej Fliszkiewicz

Keyword(s):

Machine Learning ◽

Experimental Data ◽

Optical Properties ◽

Molecular Descriptors ◽

Gradient Boosting ◽

Learning Models ◽

Linear Gradient ◽

Maximum Emission ◽

Improve Accuracy ◽

Machine Learning Models

This research assesses the capability of machine learning to predict the maximum emission wavelength of organic compounds. The predictions are based on structure descriptors and fingerprints widely applied in cheminformatics. In an attempt to further improve accuracy, the developed machine learning models were enriched with features derived from quantum mechanics. Multilinear, gradient boosting and random forest regressions were applied. The models were trained and tested on a database of experimental optical property data.
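As a rough illustration of enriching a descriptor set with one extra feature, the sketch below fits a multilinear regression by ordinary least squares with and without an additional column. Everything in it is an assumption made for illustration: the data are synthetic, the column standing in for a quantum-derived feature is random noise, and the target is constructed so that this column matters. It therefore says nothing about the results of the study, and it reports training error rather than held-out error.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 50

# Synthetic stand-ins: three structural descriptors and one "quantum" feature.
structural = rng.normal(size=(n_samples, 3))
quantum = rng.normal(size=(n_samples, 1))

# Toy target in nm, deliberately built to depend on the extra column.
wavelength = (
    450.0
    + structural @ np.array([20.0, -10.0, 5.0])
    + 30.0 * quantum[:, 0]
    + rng.normal(scale=2.0, size=n_samples)
)


def fit_rmse(features, target):
    """Fit a multilinear model with intercept; return RMSE on the training data."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ coef
    return float(np.sqrt(np.mean(residuals ** 2)))


rmse_base = fit_rmse(structural, wavelength)
rmse_enriched = fit_rmse(np.hstack([structural, quantum]), wavelength)
print(rmse_base, rmse_enriched)
```

On training data an added column can never increase the least-squares error, so a fair comparison of feature sets needs a separate test split or cross-validation.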

BitterSweet: Building machine learning models for predicting the bitter and sweet taste of small molecules

10.1101/426692 ◽

2018 ◽

Author(s):

Rudraksh Tuwani ◽

Somin Wadhwa ◽

Ganesh Bagler

Keyword(s):

Machine Learning ◽

Molecular Descriptors ◽

Wide Spectrum ◽

Predictive Performance ◽

Sweet Taste ◽

Future Research ◽

Gustatory System ◽

Learning Models ◽

Integrative Framework ◽

Machine Learning Models

The dichotomy of sweet and bitter tastes is a salient evolutionary feature of the human gustatory system, with an innate attraction to sweet taste and aversion to bitterness. A better understanding of the molecular correlates of the bitter-sweet taste gradient is crucial for identifying natural as well as synthetic compounds of desirable taste on this axis. While previous studies have advanced our understanding of the molecular basis of bitter-sweet taste and contributed models for their identification, there is ample scope to enhance these models by meticulous compilation of bitter-sweet molecules and utilization of a wide spectrum of molecular descriptors. Towards these goals, based on structured data compilation, our study provides an integrative framework with state-of-the-art machine learning models for bitter-sweet taste prediction (BitterSweet). We compare different sets of molecular descriptors for their predictive performance and further identify important features as well as feature blocks. The utility of BitterSweet models is demonstrated by taste prediction on large specialized chemical sets such as FlavorDB, FooDB, SuperSweet, Super Natural II, DSSTox, and DrugBank. To facilitate future research in this direction, we make all datasets and BitterSweet models publicly available, and also present end-to-end software for bitter-sweet taste prediction based on freely available chemical descriptors.

Persistent spectral hypergraph based machine learning (PSH-ML) for protein-ligand binding affinity prediction

Briefings in Bioinformatics ◽

10.1093/bib/bbab127 ◽

2021 ◽

Author(s):

Xiang Liu ◽

Huitao Feng ◽

Jie Wu ◽

Kelin Xia

Keyword(s):

Machine Learning ◽

Ligand Binding ◽

Binding Affinity ◽

Molecular Descriptors ◽

Biological Data ◽

Learning Models ◽

Filtration Process ◽

Binding Affinity Prediction ◽

Affinity Prediction ◽

Machine Learning Models

Molecular descriptors are essential not only to quantitative structure activity/property relationship (QSAR/QSPR) models, but also to machine learning based chemical and biological data analysis. In this paper, we propose persistent spectral hypergraph (PSH) based molecular descriptors or fingerprints for the first time. Our PSH-based molecular descriptors are used in the characterization of molecular structures and interactions, and further combined with machine learning models, in particular gradient boosting tree (GBT), for protein-ligand binding affinity prediction. Different from traditional molecular descriptors, which are usually based on molecular graph models, a hypergraph-based topological representation is proposed for protein–ligand interaction characterization. Moreover, a filtration process is introduced to generate a series of nested hypergraphs at different scales. For each of these hypergraphs, its eigen spectrum information can be obtained from the corresponding (Hodge) Laplacian matrix. PSH studies the persistence and variation of the eigen spectrum of the nested hypergraphs during the filtration process. Molecular descriptors or fingerprints can be generated from persistent attributes, which are statistical or combinatorial functions of PSH, and combined with machine learning models, in particular GBT. We test our PSH-GBT model on the three most commonly used datasets: PDBbind-2007, PDBbind-2013 and PDBbind-2016. Our results, for all these databases, are better than all existing machine learning models with traditional molecular descriptors, as far as we know.
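The filtration idea can be sketched in a much simplified form. The example below is not the PSH method: it uses an ordinary graph Laplacian instead of a hypergraph (Hodge) Laplacian, builds the nested structures with a distance cutoff (an assumption), and runs on four hypothetical 2D points. It only shows how spectral attributes can be tracked as the scale parameter grows.

```python
import numpy as np


def laplacian_spectrum(points, cutoff):
    """Eigenvalues of the graph Laplacian linking points no farther apart than cutoff."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    adjacency = ((dist <= cutoff) & (dist > 0)).astype(float)
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return np.linalg.eigvalsh(laplacian)  # symmetric matrix, ascending eigenvalues


# Hypothetical coordinates standing in for atom positions (toy data).
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]])

features = []
for cutoff in (0.5, 1.0, 1.5, 5.0):  # increasing cutoffs give nested graphs
    eigenvalues = laplacian_spectrum(points, cutoff)
    # Number of zero eigenvalues equals the number of connected components.
    zero_count = int(np.sum(eigenvalues < 1e-8))
    positive = eigenvalues[eigenvalues >= 1e-8]
    smallest_positive = float(positive.min()) if positive.size else 0.0
    features.append((cutoff, zero_count, smallest_positive))

for row in features:
    print(row)
```

Statistics of such per-scale values (counts, minima, means and so on) could then be concatenated into a fixed-length feature vector for a gradient boosting model.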

Improving XGBoost with Imagination Sampling

Communications of the Blyth Institute ◽

10.33014/issn.2640-5652.2.1.holloway.1 ◽

2020 ◽

Vol 2 (1) ◽

pp. 3-6

Author(s):

Eric Holloway

Keyword(s):

Machine Learning ◽

General System ◽

Learning Models ◽

Starting Point ◽

Machine Learning Models

Imagination Sampling is the usage of a person as an oracle for generating or improving machine learning models. Previous work demonstrated a general system for using Imagination Sampling for obtaining multibox models. Here, the possibility of importing such models as the starting point for further automatic enhancement is explored.

Development of Machine Learning Models to Predict Student Performance in Computer Literacy Courses

International Review on Computers and Software (IRECOS) ◽

10.15866/irecos.v13i1.16863 ◽

2018 ◽

Vol 13 (1) ◽

pp. 21

Author(s):

George Anderson ◽

Oduronke T. Eyitayo

Keyword(s):

Machine Learning ◽

Student Performance ◽

Computer Literacy ◽

Learning Models ◽

Machine Learning Models

Experimental Comparison of Machine Learning Models in Malware Packing Detection

2020 21st Asia-Pacific Network Operations and Management Symposium (APNOMS) ◽

10.23919/apnoms50412.2020.9237007 ◽

2020 ◽

Author(s):

Jong-Wouk Kim ◽

Juhong Namgung ◽

Yang-Sae Moon ◽

Mi-Jung Choi

Keyword(s):

Machine Learning ◽

Experimental Comparison ◽

Learning Models ◽

Machine Learning Models

Epigenetic Target Prediction with Accurate Machine Learning Models

10.26434/chemrxiv.13522313 ◽

2021 ◽

Author(s):

Norberto Sánchez-Cruz ◽

Jose L. Medina-Franco

Keyword(s):

Machine Learning ◽

Small Molecules ◽

Predictive Models ◽

Large Scale ◽

Target Prediction ◽

Quantitative Measure ◽

Learning Models ◽

Discovery Research ◽

Drug Discovery Research ◽

Machine Learning Models

Epigenetic targets are a significant focus for drug discovery research, as demonstrated by the eight approved epigenetic drugs for the treatment of cancer and the increasing availability of chemogenomic data related to epigenetics. These data represent a large number of structure-activity relationships that have not been exploited thus far for the development of predictive models to support medicinal chemistry efforts. Herein, we report the first large-scale study of 26318 compounds with a quantitative measure of biological activity for 55 protein targets with epigenetic activity. Through a systematic comparison of machine learning models trained on molecular fingerprints of different design, we built predictive models with high accuracy for the epigenetic target profiling of small molecules. The models were thoroughly validated, showing mean precisions up to 0.952 for the epigenetic target prediction task. Our results indicate that the models reported herein have considerable potential to identify small molecules with epigenetic activity. Therefore, our results were implemented as a freely accessible and easy-to-use web application.
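For readers unfamiliar with the metric, a mean precision over several targets can be computed as in the sketch below. The abstract does not state how the mean was taken, so averaging per-target precision with equal weight is an assumption, and the target names and labels are hypothetical toy values.

```python
def precision(y_true, y_pred):
    """Fraction of predicted actives that are truly active: TP / (TP + FP)."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    predicted_pos = true_pos + false_pos
    return true_pos / predicted_pos if predicted_pos else 0.0


def mean_precision(per_target):
    """Unweighted average of per-target precision (assumed averaging scheme)."""
    scores = [precision(truth, pred) for truth, pred in per_target.values()]
    return sum(scores) / len(scores)


# Hypothetical (truth, prediction) pairs for two targets, 1 = active.
toy_results = {
    "target_1": ([1, 0, 1, 1], [1, 0, 1, 0]),  # 2 true positives, 0 false positives
    "target_2": ([1, 0, 0, 1], [1, 1, 0, 1]),  # 2 true positives, 1 false positive
}

score = mean_precision(toy_results)
print(round(score, 3))
```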