Origin of aromatase inhibitory activity via proteochemometric modeling

10.7287/peerj.preprints.1556 ◽

2015 ◽

Author(s):

Saw Simeon ◽

Ola Spjuth ◽

Maris Lapins ◽

Sunanta Nabu ◽

Virapong Prachayasittikul ◽

...

Keyword(s):

Breast Cancer ◽

Inhibitory Activity ◽

Cross Validation ◽

External Validation ◽

Predictive Performance ◽

Interaction Space ◽

Quantitative Structure Activity Relationship ◽

Inhibitory Mechanisms ◽

Rate Limiting ◽

Good Predictive Performance

Aromatase, which is a rate-limiting enzyme that catalyzes the conversion of androgen to estrogen, plays an essential role in the development of estrogen-dependent breast cancer. Side effects due to aromatase inhibitors (AIs) necessitate the pursuit of novel inhibitor candidates with high selectivity, lower toxicity and increased potency. Designing a novel therapeutic agent against aromatase could be achieved computationally by means of ligand-based and structure-based methods. For over a decade, we have utilized both approaches to design potential AIs for which quantitative structure-activity relationship and molecular docking were used to explore inhibitory mechanisms of AIs towards aromatase. However, such approaches do not consider the effects that aromatase variants have on different AIs. In this study, proteochemometrics modeling was applied to analyze the interaction space between AIs and aromatase variants as a function of their substructural and amino acid features. Good predictive performance was achieved, as rigorously verified by 10-fold cross-validation, external validation, leave-one-compound-out cross-validation, leave-one-protein-out cross-validation and Y-scrambling tests. The investigations presented herein provide important insights into the mechanisms of aromatase inhibitory activity that could aid in the design of novel potent AIs as breast cancer therapeutic agents.
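
The validation schemes named in this abstract can be illustrated with a short sketch. It is not the authors' code: the compound and protein labels are hypothetical, and only the data splitting and label shuffling are shown, with no model fitted. Leave-one-compound-out and leave-one-protein-out cross-validation both hold out every pair that shares one group label, and Y-scrambling shuffles the activity values so that a model refitted on them should lose its predictive performance.

```python
import random
from collections import defaultdict

def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) per distinct group label."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    for g, test_idx in by_group.items():
        held = set(test_idx)
        train_idx = [i for i in range(len(groups)) if i not in held]
        yield g, train_idx, test_idx

def y_scramble(y, seed=0):
    """Return a shuffled copy of the activity values; descriptors are left untouched."""
    rng = random.Random(seed)
    shuffled = list(y)
    rng.shuffle(shuffled)
    return shuffled

# Hypothetical compound-protein pairs (names invented for illustration only).
pairs = [("c1", "wild_type"), ("c1", "variant_a"), ("c2", "wild_type"),
         ("c2", "variant_a"), ("c3", "wild_type"), ("c3", "variant_a")]
compounds = [c for c, _ in pairs]
proteins = [p for _, p in pairs]

loco_splits = list(leave_one_group_out(compounds))  # leave-one-compound-out
lopo_splits = list(leave_one_group_out(proteins))   # leave-one-protein-out
```

Holding out whole groups rather than random rows is what lets these schemes estimate performance on a compound or variant the model has never seen.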

Breast Cancer Detection Using Image Processing and CNN Algorithm with K-Fold Cross-Validation

10.1007/978-981-16-6285-0_39 ◽

2021 ◽

pp. 481-490

Author(s):

Pruthvi Tilekar ◽

Purnima Singh ◽

Nagnath Aherwadi ◽

Sagar Pande ◽

Aditya Khamparia

Keyword(s):

Breast Cancer ◽

Image Processing ◽

Cancer Detection ◽

Cross Validation ◽

Breast Cancer Detection ◽

Fold Cross Validation

Prediction of Tumor Shrinkage Pattern to Neoadjuvant Chemotherapy Using a Multiparametric MRI-Based Machine Learning Model in Patients With Breast Cancer

Frontiers in Bioengineering and Biotechnology ◽

10.3389/fbioe.2021.662749 ◽

2021 ◽

Vol 9 ◽

Author(s):

Yuhong Huang ◽

Wenben Chen ◽

Xiaoling Zhang ◽

Shaofu He ◽

Nan Shao ◽

...

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Cross Validation ◽

Learning Model ◽

Training Dataset ◽

Tumor Shrinkage ◽

Clinicopathologic Characteristics ◽

Testing Dataset ◽

Machine Learning Model ◽

Fold Cross Validation

Aim: After neoadjuvant chemotherapy (NACT), tumor shrinkage pattern is a more reasonable outcome than pathological complete response (pCR) for deciding on possible breast-conserving surgery (BCS). The aim of this article was to establish a machine learning model combining radiomics features from multiparametric MRI (mpMRI) and clinicopathologic characteristics for early prediction of tumor shrinkage pattern prior to NACT in breast cancer. Materials and Methods: This study included 199 patients with breast cancer who successfully completed NACT and subsequently underwent breast surgery. For each patient, 4,198 radiomics features were extracted from the segmented 3D regions of interest (ROI) in mpMRI sequences such as T1-weighted dynamic contrast-enhanced imaging (T1-DCE), fat-suppressed T2-weighted imaging (T2WI), and apparent diffusion coefficient (ADC) map. Feature selection and supervised machine learning algorithms were used to identify the predictors correlated with tumor shrinkage pattern as follows: (1) reducing the feature dimension by using ANOVA and the least absolute shrinkage and selection operator (LASSO) with 10-fold cross-validation, (2) splitting the dataset into a training dataset and a testing dataset, and constructing prediction models using 12 classification algorithms, and (3) assessing model performance through area under the curve (AUC), accuracy, sensitivity, and specificity. We also compared the most discriminative model in different molecular subtypes of breast cancer. Results: The multilayer perceptron (MLP) neural network achieved higher AUC and accuracy than the other classifiers. The radiomics model achieved a mean AUC of 0.975 (accuracy = 0.912) on the training dataset and 0.900 (accuracy = 0.828) on the testing dataset with 30-round 6-fold cross-validation. When incorporating clinicopathologic characteristics, the mean AUC was 0.985 (accuracy = 0.930) on the training dataset and 0.939 (accuracy = 0.870) on the testing dataset. The model further achieved good AUC on the testing dataset with 30-round 5-fold cross-validation in three molecular subtypes of breast cancer, as follows: (1) HR+/HER2–: 0.901 (accuracy = 0.816), (2) HER2+: 0.940 (accuracy = 0.865), and (3) TN: 0.837 (accuracy = 0.811). Conclusions: It is feasible that our machine learning model combining radiomics features and clinical characteristics could provide a potential tool to predict tumor shrinkage patterns prior to NACT. Our prediction model will be valuable in guiding NACT and surgical treatment in breast cancer.
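
The evaluation protocol described in this abstract, AUC averaged over repeated k-fold cross-validation, can be sketched as follows. This is a minimal illustration and not the study's pipeline: `roc_auc` and `repeated_kfold` are hypothetical helper names, the splits are not stratified (an assumption, since the abstract does not say), and no classifier is fitted.

```python
import random

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def repeated_kfold(n_samples, k, rounds, seed=0):
    """Yield (round, train_indices, test_indices) for `rounds` reshuffled k-fold splits."""
    rng = random.Random(seed)
    for r in range(rounds):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k nearly equal folds
        for f in range(k):
            test = folds[f]
            train = [i for j, fold in enumerate(folds) if j != f for i in fold]
            yield r, train, test

# 30 rounds of 6-fold splitting over 12 toy samples gives 180 train/test pairs.
splits = list(repeated_kfold(n_samples=12, k=6, rounds=30))
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In a full run, a model would be fitted on each training split and `roc_auc` computed on the matching test split before averaging across all rounds and folds.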

Application of Feature Selection Methods to Improve Breast Cancer Diagnosis Results (Penerapan Metode Seleksi Fitur untuk Meningkatkan Hasil Diagnosis Kanker Payudara)

Simetris Jurnal Teknik Mesin Elektro dan Ilmu Komputer ◽

10.24176/simet.v7i1.516 ◽

2016 ◽

Vol 7 (1) ◽

pp. 283 ◽

Cited By ~ 1

Author(s):

Elvira Sukma Wahyuni

Keyword(s):

Breast Cancer ◽

Rough Set ◽

Cross Validation ◽

Naive Bayes ◽

Sequential Minimal Optimization ◽

Multi Layer Perceptron ◽

Fold Cross Validation

The main objective of this study is to improve classification performance in breast cancer diagnosis by applying feature selection to several classification algorithms. The study uses the Wisconsin Breast Cancer Database (WBCD). The F-score and Rough Set feature selection methods are paired with several classification algorithms, namely SMO (Sequential Minimal Optimization), Naive Bayes, Multi Layer Perceptron (MLP), and C4.5. The study uses 10-fold cross-validation as its evaluation method. The results show that the MLP and C4.5 classifiers improved significantly in classification performance after being paired with Rough Set and F-score feature selection; Naive Bayes performed best when paired with F-score feature selection alone; and SMO showed no improvement in classification performance when paired with either feature selection method. Keywords: breast cancer, feature selection, classification.
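
As an illustration of F-score feature ranking, the sketch below assumes the common two-class definition: the squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances. The abstract does not state which F-score variant was used, so this choice, the helper names `f_score` and `rank_features`, and the toy data are all assumptions.

```python
def f_score(pos_values, neg_values):
    """Two-class F-score of one feature (needs at least 2 samples per class)."""
    n_pos, n_neg = len(pos_values), len(neg_values)
    mean_pos = sum(pos_values) / n_pos
    mean_neg = sum(neg_values) / n_neg
    mean_all = (sum(pos_values) + sum(neg_values)) / (n_pos + n_neg)
    # Between-class separation of the two class means.
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    # Within-class spread: sum of the two sample variances.
    var_pos = sum((x - mean_pos) ** 2 for x in pos_values) / (n_pos - 1)
    var_neg = sum((x - mean_neg) ** 2 for x in neg_values) / (n_neg - 1)
    within = var_pos + var_neg
    return between / within if within > 0 else float("inf")

def rank_features(rows, labels):
    """Return (feature indices sorted by descending F-score, list of scores)."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, y in zip(rows, labels) if y == 1]
        neg = [row[j] for row, y in zip(rows, labels) if y == 0]
        scores.append(f_score(pos, neg))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores

# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[1, 5], [2, 3], [3, 4], [4, 4], [5, 3], [6, 5]]
y = [1, 1, 1, 0, 0, 0]
order, scores = rank_features(X, y)
```

A classifier would then be trained on only the top-ranked features, with the cutoff chosen by cross-validation.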

Drug repositioning by prediction of drug’s anatomical therapeutic chemical code via network-based inference approaches

Briefings in Bioinformatics ◽

10.1093/bib/bbaa027 ◽

2020 ◽

Cited By ~ 4

Author(s):

Yayuan Peng ◽

Manjiong Wang ◽

Yixiang Xu ◽

Zengrui Wu ◽

Jiye Wang ◽

...

Keyword(s):

Cross Validation ◽

Drug Repositioning ◽

Anatomical Therapeutic Chemical ◽

External Validation ◽

Chemical Properties ◽

Glucose Deprivation ◽

Target Drug ◽

Validation Set ◽

Fold Cross Validation

Drug discovery and development is a time-consuming and costly process. Drug repositioning has therefore become an effective approach to address these issues by identifying new therapeutic or pharmacological actions for existing drugs. The drug’s anatomical therapeutic chemical (ATC) code belongs to a hierarchical classification system with five levels, organized according to the organs or systems on which drugs act and the pharmacological, therapeutic and chemical properties of the drugs. The 2nd-, 3rd- and 4th-level ATC codes retain the therapeutic and pharmacological information of drugs. On the hypothesis that drugs with similar structures or targets would possess similar ATC codes, we exploited a network-based approach to predict the 2nd-, 3rd- and 4th-level ATC codes by constructing substructure drug-ATC (SD-ATC), target drug-ATC (TD-ATC) and Substructure&Target drug-ATC (STD-ATC) networks. In 10-fold cross validation and two external validations, the STD-ATC models outperformed the SD-ATC and TD-ATC ones. Furthermore, with KR as fingerprint, the STD-ATC model was identified as the optimal model, with AUC values of 0.899 ± 0.015, 0.916 and 0.893 for 10-fold cross validation, external validation set 1 and external validation set 2, respectively. To illustrate the predictive capability of the STD-ATC model with KR fingerprint, as a case study, we used that model to predict that 25 FDA-approved drugs (22 of which were actually purchased) have potential activity against heart failure. In vitro experiments confirmed that 8 of the 22 old drugs showed mild to potent cardioprotective activity in both a hypoxia model and an oxygen–glucose deprivation model, which demonstrates that our STD-ATC prediction model would be an effective tool for drug repositioning.

Analyzing performance of classifiers for medical datasets

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.15.11370 ◽

2018 ◽

Vol 7 (2.15) ◽

pp. 136 ◽

Cited By ~ 1

Author(s):

Rosaida Rosly ◽

Mokhairi Makhtar ◽

Mohd Khalid Awang ◽

Mohd Isa Awang ◽

Mohd Nordin Abdul Rahman

Keyword(s):

Breast Cancer ◽

Cross Validation ◽

Ensemble Methods ◽

Data Sets ◽

Ensemble Classifiers ◽

Classification Models ◽

Data Set ◽

Mining Tool ◽

Fold Cross Validation

This paper analyses the performance of classification models built with single classifiers and with ensemble combinations, using the Breast Cancer Wisconsin and Hepatitis data sets as training data. It presents a comparison of different classifiers based on 10-fold cross validation using a data mining tool. In this experiment, various classifiers are implemented, including three popular ensemble methods (boosting, bagging and stacking) for the combinations. The results show that for the Breast Cancer Wisconsin data set, the single Naïve Bayes (NB) classifier and the bagging+NB combination achieved the highest accuracy, both at 97.51%, compared to other combinations of ensemble classifiers. For the Hepatitis data set, the stacking+Multi-Layer Perceptron (MLP) combination achieved a higher accuracy, at 86.25%. Using ensemble classifiers may therefore improve the results. In future, a multi-classifier approach will be proposed by introducing fusion at the classification level between these classifiers to obtain classification with higher accuracy.

Deep Learning Based Pectoral Muscle Segmentation on MIAS Mammograms

10.21203/rs.3.rs-92779/v1 ◽

2020 ◽

Author(s):

Young Jae Kim ◽

Eun Young Yoo ◽

Kwang Gi Kim

Keyword(s):

Breast Cancer ◽

Image Processing ◽

Deep Learning ◽

Cross Validation ◽

Pectoral Muscle ◽

Training Data ◽

Dice Similarity Coefficient ◽

Detection Accuracy ◽

Pectoral Muscle Detection ◽

Fold Cross Validation

Background: The purpose of this study was to propose a deep learning-based method for automated detection of the pectoral muscle, in order to reduce misdetection in a computer-aided diagnosis (CAD) system for diagnosing breast cancer in mammography. This study also aimed to assess the performance of the deep learning method for pectoral muscle detection by comparing it to an image processing-based method using the random sample consensus (RANSAC) algorithm. Methods: Using the 322 images in the Mammographic Image Analysis Society (MIAS) database, the pectoral muscle detection model was trained with the U-Net architecture. Of the total data, 80% was allocated as training data and 20% as test data, and the performance of the deep learning model was tested by 5-fold cross validation. Results: The image processing-based method for pectoral muscle detection using RANSAC showed 92% detection accuracy. Using 5-fold cross validation, the deep learning-based method showed a mean sensitivity of 95.55%, mean specificity of 99.88%, mean accuracy of 99.67%, and mean Dice similarity coefficient (DSC) of 95.88%. Conclusions: The proposed deep learning-based method of pectoral muscle detection performed better than an existing image processing-based method. In the future, by collecting data from various medical institutions and devices to further train the model and improve its reliability, we expect that this model could greatly reduce misdetection rates by CAD systems for breast cancer diagnosis.
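
The segmentation metrics reported in this abstract can be computed from pixel-wise confusion counts. The sketch below is illustrative only: it assumes the masks are flattened lists of 0/1 values of equal length, the function names are hypothetical, and the toy masks are invented.

```python
def confusion_counts(pred, truth):
    """Count true/false positives and negatives over two flat binary masks."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def segmentation_metrics(pred, truth):
    """Return Dice, sensitivity, specificity and accuracy for binary masks.

    Assumes each mask pair contains at least one foreground and one
    background pixel, so no denominator is zero.
    """
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Toy 8-pixel masks (flattened); values are made up for illustration.
pred_mask = [1, 1, 0, 0, 1, 0, 0, 0]
true_mask = [1, 1, 1, 0, 0, 0, 0, 0]
metrics = segmentation_metrics(pred_mask, true_mask)
```

Because background pixels usually dominate a mammogram, specificity and accuracy can be very high even when the Dice coefficient is lower, which is why the Dice value is the more informative overlap measure.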

Machine learning meets pKa

F1000Research ◽

10.12688/f1000research.22090.2 ◽

2020 ◽

Vol 9 ◽

pp. 113 ◽

Cited By ~ 2

Author(s):

Marcel Baltruschat ◽

Paul Czodrowski

Keyword(s):

Machine Learning ◽

Cross Validation ◽

Mean Squared Error ◽

Mean Absolute Error ◽

External Validation ◽

Absolute Error ◽

Source Model ◽

Squared Error ◽

Fold Cross Validation ◽

Better Than

We present a small-molecule pKa prediction tool written entirely in Python. It predicts the macroscopic pKa value and is trained on a literature compilation of monoprotic compounds. Different machine learning models were tested, and random forest performed best in five-fold cross-validation (mean absolute error = 0.682, root mean squared error = 1.032, correlation coefficient r² = 0.82). We test our model on two external validation sets, where it performs comparably to Marvin and better than a recently published open source model. Our Python tool and all data are freely available at https://github.com/czodrowskilab/Machine-learning-meets-pKa.
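
The three error measures quoted in this abstract can be computed as in the sketch below. It is not taken from the linked repository: the function name is hypothetical, the pKa values are invented, and r² is assumed here to be the squared Pearson correlation between measured and predicted values, which the abstract does not spell out.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, squared Pearson correlation) for paired values.

    Assumes both inputs have the same length and non-zero variance.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    ss_t = sum((t - mean_t) ** 2 for t in y_true)
    ss_p = sum((p - mean_p) ** 2 for p in y_pred)
    r2 = cov * cov / (ss_t * ss_p)
    return mae, rmse, r2

# Invented measured and predicted pKa values, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 2.5, 4.5]
mae, rmse, r2 = regression_metrics(measured, predicted)
```

RMSE weights large errors more heavily than MAE, so a gap between the two, as in the reported 1.032 versus 0.682, suggests that a minority of compounds are predicted with comparatively large errors.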

Identifying the Main Risk Factors for CVD Prediction Using Machine Learning Algorithms

10.20944/preprints202108.0471.v1 ◽

2021 ◽

Author(s):

Luis Rolando Guarneros-Nolasco ◽

Nancy Aracely Cruz-Ramos ◽

Giner Alor-Hernández ◽

Lisbeth Rodríguez-Mazahua ◽

José Luis Sánchez-Cervantes

Keyword(s):

Machine Learning ◽

Cross Validation ◽

Performance Metrics ◽

Learning Algorithms ◽

Predictive Performance ◽

Machine Learning Algorithms ◽

Algorithm Performance ◽

Body Regions ◽

Risks Factors ◽

Fold Cross Validation

Cardiovascular diseases (CVDs) are a leading cause of death globally. In CVDs, the heart is unable to deliver enough blood to other body regions. Since effective and accurate diagnosis of CVDs is essential for CVD prevention and treatment, machine learning (ML) techniques can be used effectively and reliably to distinguish patients suffering from a CVD from those who do not have any heart condition. Namely, machine learning algorithms (MLAs) play a key role in the diagnosis of CVDs through predictive models that allow us to identify the main risk factors influencing CVD development. In this study, we analyze the performance of ten MLAs on two datasets for CVD prediction and two for CVD diagnosis. Algorithm performance is analyzed on the top-two and top-four dataset attributes/features with respect to five performance metrics (accuracy, precision, recall, F1-score, and ROC-AUC) using the train-test split technique and k-fold cross-validation. Our study identifies the top two and top four attributes from each CVD diagnosis/prediction dataset. As our main finding, the ten MLAs exhibited appropriate diagnostic and predictive performance; hence, they can be successfully implemented to improve current CVD diagnosis efforts and help patients around the world, especially in regions where medical staff is lacking.
