Predicting Ewing Sarcoma Treatment Outcome Using Infrared Spectroscopy and Machine Learning

Radosław Chaber; Christopher Arthur; Kornelia Łach; Anna Raciborska; Elżbieta Michalak; Katarzyna Bilska; Katarzyna Drabko; Joanna Depciuch; Ewa Kaznowska; Józef Cebulski

doi:10.3390/molecules24061075

Predicting Ewing Sarcoma Treatment Outcome Using Infrared Spectroscopy and Machine Learning

Molecules ◽

10.3390/molecules24061075 ◽

2019 ◽

Vol 24 (6) ◽

pp. 1075 ◽

Cited By ~ 1

Author(s):

Radosław Chaber ◽

Christopher Arthur ◽

Kornelia Łach ◽

Anna Raciborska ◽

Elżbieta Michalak ◽

...

Keyword(s):

Machine Learning ◽

Treatment Outcome ◽

Ewing Sarcoma ◽

Bone Tumour ◽

Ftir Spectra ◽

Support Vector ◽

Learning Approaches ◽

Linear Discriminant ◽

Before And After ◽

Leave One Out

Background: Improved outcome prediction is vital for the delivery of risk-adjusted, appropriate and effective care to paediatric patients with Ewing sarcoma—the second most common paediatric malignant bone tumour. Fourier transform infrared (FTIR) spectroscopy of tissues allows the bulk biochemical content of a biological sample to be probed and makes possible the study and diagnosis of disease. Methods: In this retrospective study, FTIR spectra of sections of biopsy-obtained bone tissue were recorded. Twenty-seven patients (between 5 and 20 years of age) with newly diagnosed Ewing sarcoma of bone were included in this study. The prognostic value of FTIR spectra obtained from Ewing sarcoma (ES) tumours before and after neoadjuvant chemotherapy were analysed in combination with various data-reduction and machine learning approaches. Results: Random forest and linear discriminant analysis supervised learning models were able to correctly predict patient mortality in 92% of cases using leave-one-out cross-validation. The best performing model for predicting patient relapse was a linear Support Vector Machine trained on the observed spectral changes as a result of chemotherapy treatment, which achieved 92% accuracy. Conclusion: FTIR spectra of tumour biopsy samples may predict treatment outcome in paediatric Ewing sarcoma patients with greater than 92% accuracy.

Download Full-text

Application of Machine Learning Approaches for the Design and Study of Anticancer Drugs

Current Drug Targets ◽

10.2174/1389450119666180809122244 ◽

2019 ◽

Vol 20 (5) ◽

pp. 488-500 ◽

Cited By ~ 6

Author(s):

Yan Hu ◽

Yi Lu ◽

Shuo Wang ◽

Mengying Zhang ◽

Xiaosheng Qu ◽

...

Keyword(s):

Machine Learning ◽

Drug Design ◽

Anticancer Drugs ◽

Nearest Neighbor ◽

Cost Effective ◽

Support Vector ◽

Learning Approaches ◽

K Nearest Neighbor ◽

Activity Prediction ◽

Linear Discriminant

Background: Globally the number of cancer patients and deaths are continuing to increase yearly, and cancer has, therefore, become one of the world's highest causes of morbidity and mortality. In recent years, the study of anticancer drugs has become one of the most popular medical topics. Objective: In this review, in order to study the application of machine learning in predicting anticancer drugs activity, some machine learning approaches such as Linear Discriminant Analysis (LDA), Principal components analysis (PCA), Support Vector Machine (SVM), Random forest (RF), k-Nearest Neighbor (kNN), and Naïve Bayes (NB) were selected, and the examples of their applications in anticancer drugs design are listed. Results: Machine learning contributes a lot to anticancer drugs design and helps researchers by saving time and is cost effective. However, it can only be an assisting tool for drug design. Conclusion: This paper introduces the application of machine learning approaches in anticancer drug design. Many examples of success in identification and prediction in the area of anticancer drugs activity prediction are discussed, and the anticancer drugs research is still in active progress. Moreover, the merits of some web servers related to anticancer drugs are mentioned.

Download Full-text

Machine Learning Approaches for Detecting Tropical Cyclone Formation Using Satellite Data

Remote Sensing ◽

10.3390/rs11101195 ◽

2019 ◽

Vol 11 (10) ◽

pp. 1195 ◽

Cited By ~ 6

Author(s):

Minsang Kim ◽

Myung-Sook Park ◽

Jungho Im ◽

Seonyoung Park ◽

Myong-In Lee

Keyword(s):

Machine Learning ◽

Tropical Cyclone ◽

Surface Wind ◽

Support Vector ◽

Learning Approaches ◽

Yearly Variation ◽

Linear Discriminant ◽

Forecast Lead Time ◽

Vector Machines ◽

First Time

This study compared detection skill for tropical cyclone (TC) formation using models based on three different machine learning (ML) algorithms-decision trees (DT), random forest (RF), and support vector machines (SVM)-and a model based on Linear Discriminant Analysis (LDA). Eight predictors were derived from WindSat satellite measurements of ocean surface wind and precipitation over the western North Pacific for 2005–2009. All of the ML approaches performed better with significantly higher hit rates ranging from 94 to 96% compared with LDA performance (~77%), although false alarm rate by MLs is slightly higher (21–28%) than that by LDA (~13%). Besides, MLs could detect TC formation at the time as early as 26–30 h before the first time diagnosed as tropical depression by the JTWC best track, which was also 5 to 9 h earlier than that by LDA. The skill differences across MLs were relatively smaller than difference between MLs and LDA. Large yearly variation in forecast lead time was common in all models due to the limitation in sampling from orbiting satellite. This study highlights that ML approaches provide an improved skill for detecting TC formation compared with conventional linear approaches.

Download Full-text

Diagnostic Performance of 2D and 3D T2WI-Based Radiomics Features With Machine Learning Algorithms to Distinguish Solid Solitary Pulmonary Lesion

Frontiers in Oncology ◽

10.3389/fonc.2021.683587 ◽

2021 ◽

Vol 11 ◽

Author(s):

Qi Wan ◽

Jiaxuan Zhou ◽

Xiaoying Xia ◽

Jianfeng Hu ◽

Peng Wang ◽

...

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Diagnostic Performance ◽

Feature Selection Method ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Approaches ◽

Selection Methods ◽

Linear Discriminant ◽

2D And 3D

ObjectiveTo evaluate the performance of 2D and 3D radiomics features with different machine learning approaches to classify SPLs based on magnetic resonance(MR) T2 weighted imaging (T2WI).Material and MethodsA total of 132 patients with pathologically confirmed SPLs were examined and randomly divided into training (n = 92) and test datasets (n = 40). A total of 1692 3D and 1231 2D radiomics features per patient were extracted. Both radiomics features and clinical data were evaluated. A total of 1260 classification models, comprising 3 normalization methods, 2 dimension reduction algorithms, 3 feature selection methods, and 10 classifiers with 7 different feature numbers (confined to 3–9), were compared. The ten-fold cross-validation on the training dataset was applied to choose the candidate final model. The area under the receiver operating characteristic curve (AUC), precision-recall plot, and Matthews Correlation Coefficient were used to evaluate the performance of machine learning approaches.ResultsThe 3D features were significantly superior to 2D features, showing much more machine learning combinations with AUC greater than 0.7 in both validation and test groups (129 vs. 11). The feature selection method Analysis of Variance(ANOVA), Recursive Feature Elimination(RFE) and the classifier Logistic Regression(LR), Linear Discriminant Analysis(LDA), Support Vector Machine(SVM), Gaussian Process(GP) had relatively better performance. The best performance of 3D radiomics features in the test dataset (AUC = 0.824, AUC-PR = 0.927, MCC = 0.514) was higher than that of 2D features (AUC = 0.740, AUC-PR = 0.846, MCC = 0.404). The joint 3D and 2D features (AUC=0.813, AUC-PR = 0.926, MCC = 0.563) showed similar results as 3D features. Incorporating clinical features with 3D and 2D radiomics features slightly improved the AUC to 0.836 (AUC-PR = 0.918, MCC = 0.620) and 0.780 (AUC-PR = 0.900, MCC = 0.574), respectively.ConclusionsAfter algorithm optimization, 2D feature-based radiomics models yield favorable results in differentiating malignant and benign SPLs, but 3D features are still preferred because of the availability of more machine learning algorithmic combinations with better performance. Feature selection methods ANOVA and RFE, and classifier LR, LDA, SVM and GP are more likely to demonstrate better diagnostic performance for 3D features in the current study.

Download Full-text

Machine Learning Methods Applied to the Prediction of Pseudo-nitzschia spp. Blooms in the Galician Rias Baixas (NW Spain)

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10040199 ◽

2021 ◽

Vol 10 (4) ◽

pp. 199

Author(s):

Francisco M. Bellas Aláez ◽

Jesus M. Torres Palenzuela ◽

Evangelos Spyrakos ◽

Luis González Vilas

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Prediction Models ◽

Support Vector ◽

False Alarms ◽

Learning Approaches ◽

Learning Methods ◽

Machine Learning Methods ◽

Rías Baixas ◽

New Algorithms

This work presents new prediction models based on recent developments in machine learning methods, such as Random Forest (RF) and AdaBoost, and compares them with more classical approaches, i.e., support vector machines (SVMs) and neural networks (NNs). The models predict Pseudo-nitzschia spp. blooms in the Galician Rias Baixas. This work builds on a previous study by the authors (doi.org/10.1016/j.pocean.2014.03.003) but uses an extended database (from 2002 to 2012) and new algorithms. Our results show that RF and AdaBoost provide better prediction results compared to SVMs and NNs, as they show improved performance metrics and a better balance between sensitivity and specificity. Classical machine learning approaches show higher sensitivities, but at a cost of lower specificity and higher percentages of false alarms (lower precision). These results seem to indicate a greater adaptation of new algorithms (RF and AdaBoost) to unbalanced datasets. Our models could be operationally implemented to establish a short-term prediction system.

Download Full-text

Practical CO2—WAG Field Operational Designs Using Hybrid Numerical-Machine-Learning Approaches

Energies ◽

10.3390/en14041055 ◽

2021 ◽

Vol 14 (4) ◽

pp. 1055

Author(s):

Qian Sun ◽

William Ampomah ◽

Junyu You ◽

Martha Cather ◽

Robert Balch

Keyword(s):

Machine Learning ◽

Oil Recovery ◽

History Matching ◽

Optimization Problems ◽

Learning Technologies ◽

Petroleum Engineering ◽

Support Vector ◽

Learning Approaches ◽

Field Development ◽

Proxy Models

Machine-learning technologies have exhibited robust competences in solving many petroleum engineering problems. The accurate predictivity and fast computational speed enable a large volume of time-consuming engineering processes such as history-matching and field development optimization. The Southwest Regional Partnership on Carbon Sequestration (SWP) project desires rigorous history-matching and multi-objective optimization processes, which fits the superiorities of the machine-learning approaches. Although the machine-learning proxy models are trained and validated before imposing to solve practical problems, the error margin would essentially introduce uncertainties to the results. In this paper, a hybrid numerical machine-learning workflow solving various optimization problems is presented. By coupling the expert machine-learning proxies with a global optimizer, the workflow successfully solves the history-matching and CO2 water alternative gas (WAG) design problem with low computational overheads. The history-matching work considers the heterogeneities of multiphase relative characteristics, and the CO2-WAG injection design takes multiple techno-economic objective functions into accounts. This work trained an expert response surface, a support vector machine, and a multi-layer neural network as proxy models to effectively learn the high-dimensional nonlinear data structure. The proposed workflow suggests revisiting the high-fidelity numerical simulator for validation purposes. The experience gained from this work would provide valuable guiding insights to similar CO2 enhanced oil recovery (EOR) projects.

Download Full-text

Detection of Malicious Software by Analyzing Distinct Artifacts Using Machine Learning and Deep Learning Algorithms

Electronics ◽

10.3390/electronics10141694 ◽

2021 ◽

Vol 10 (14) ◽

pp. 1694

Author(s):

Mathew Ashik ◽

A. Jyothish ◽

S. Anandaram ◽

P. Vinod ◽

Francesco Mercaldo ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Support Vector ◽

Malware Analysis ◽

Learning Approaches ◽

Dynamic Features ◽

System Calls ◽

Prevention Methods ◽

Structural Aspects

Malware is one of the most significant threats in today’s computing world since the number of websites distributing malware is increasing at a rapid rate. Malware analysis and prevention methods are increasingly becoming necessary for computer systems connected to the Internet. This software exploits the system’s vulnerabilities to steal valuable information without the user’s knowledge, and stealthily send it to remote servers controlled by attackers. Traditionally, anti-malware products use signatures for detecting known malware. However, the signature-based method does not scale in detecting obfuscated and packed malware. Considering that the cause of a problem is often best understood by studying the structural aspects of a program like the mnemonics, instruction opcode, API Call, etc. In this paper, we investigate the relevance of the features of unpacked malicious and benign executables like mnemonics, instruction opcodes, and API to identify a feature that classifies the executable. Prominent features are extracted using Minimum Redundancy and Maximum Relevance (mRMR) and Analysis of Variance (ANOVA). Experiments were conducted on four datasets using machine learning and deep learning approaches such as Support Vector Machine (SVM), Naïve Bayes, J48, Random Forest (RF), and XGBoost. In addition, we also evaluate the performance of the collection of deep neural networks like Deep Dense network, One-Dimensional Convolutional Neural Network (1D-CNN), and CNN-LSTM in classifying unknown samples, and we observed promising results using APIs and system calls. On combining APIs/system calls with static features, a marginal performance improvement was attained comparing models trained only on dynamic features. Moreover, to improve accuracy, we implemented our solution using distinct deep learning methods and demonstrated a fine-tuned deep neural network that resulted in an F1-score of 99.1% and 98.48% on Dataset-2 and Dataset-3, respectively.

Download Full-text

Analysis of the Nosema Cells Identification for Microscopic Images

Sensors ◽

10.3390/s21093068 ◽

2021 ◽

Vol 21 (9) ◽

pp. 3068

Author(s):

Soumaya Dghim ◽

Carlos M. Travieso-González ◽

Radim Burget

Keyword(s):

Neural Network ◽

Machine Learning ◽

Image Processing ◽

Deep Learning ◽

The Other ◽

Support Vector ◽

Learning Approaches ◽

Microscopic Images ◽

Trained Neural Network ◽

Nosema Disease

The use of image processing tools, machine learning, and deep learning approaches has become very useful and robust in recent years. This paper introduces the detection of the Nosema disease, which is considered to be one of the most economically significant diseases today. This work shows a solution for recognizing and identifying Nosema cells between the other existing objects in the microscopic image. Two main strategies are examined. The first strategy uses image processing tools to extract the most valuable information and features from the dataset of microscopic images. Then, machine learning methods are applied, such as a neural network (ANN) and support vector machine (SVM) for detecting and classifying the Nosema disease cells. The second strategy explores deep learning and transfers learning. Several approaches were examined, including a convolutional neural network (CNN) classifier and several methods of transfer learning (AlexNet, VGG-16 and VGG-19), which were fine-tuned and applied to the object sub-images in order to identify the Nosema images from the other object images. The best accuracy was reached by the VGG-16 pre-trained neural network with 96.25%.

Download Full-text

Forecasting the risk at infractions: an ensemble comparison of machine learning approach

Industrial Management & Data Systems ◽

10.1108/imds-10-2020-0603 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Lei Li ◽

Desheng Wu

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Short Term Memory ◽

Model Performance ◽

Large Data ◽

Support Vector ◽

Learning Approaches ◽

Content Type ◽

Day To Day Operations ◽

Prediction Approach

PurposeThe infraction of securities regulations (ISRs) of listed firms in their day-to-day operations and management has become one of common problems. This paper proposed several machine learning approaches to forecast the risk at infractions of listed corporates to solve financial problems that are not effective and precise in supervision.Design/methodology/approachThe overall proposed research framework designed for forecasting the infractions (ISRs) include data collection and cleaning, feature engineering, data split, prediction approach application and model performance evaluation. We select Logistic Regression, Naïve Bayes, Random Forest, Support Vector Machines, Artificial Neural Network and Long Short-Term Memory Networks (LSTMs) as ISRs prediction models.FindingsThe research results show that prediction performance of proposed models with the prior infractions provides a significant improvement of the ISRs than those without prior, especially for large sample set. The results also indicate when judging whether a company has infractions, we should pay attention to novel artificial intelligence methods, previous infractions of the company, and large data sets.Originality/valueThe findings could be utilized to address the problems of identifying listed corporates' ISRs at hand to a certain degree. Overall, results elucidate the value of the prior infraction of securities regulations (ISRs). This shows the importance of including more data sources when constructing distress models and not only focus on building increasingly more complex models on the same data. This is also beneficial to the regulatory authorities.

Download Full-text

Dynamic Detection of Delayed Cerebral Ischemia Using Machine Learning

10.1101/2020.04.15.20067041 ◽

2020 ◽

Author(s):

Murad Megjhani ◽

Kalijah Terilli ◽

Ayham Alkhachroum ◽

David J. Roh ◽

Sachin Agarwal ◽

...

Keyword(s):

Machine Learning ◽

Cerebral Ischemia ◽

Characteristic Curve ◽

Delayed Cerebral Ischemia ◽

Risk Scores ◽

Support Vector ◽

Model Parameters ◽

Learning Approaches ◽

Physiologic Data ◽

Over Time

AbstractObjectiveTo develop a machine learning based tool, using routine vital signs, to assess delayed cerebral ischemia (DCI) risk over time.MethodsIn this retrospective analysis, physiologic data for 540 consecutive acute subarachnoid hemorrhage patients were collected and annotated as part of a prospective observational cohort study between May 2006 and December 2014. Patients were excluded if (i) no physiologic data was available, (ii) they expired prior to the DCI onset window (< post bleed day 3) or (iii) early angiographic vasospasm was detected on admitting angiogram. DCI was prospectively labeled by consensus of treating physicians. Occurrence of DCI was classified using various machine learning approaches including logistic regression, random forest, support vector machine (linear and kernel), and an ensemble classifier, trained on vitals and subject characteristic features. Hourly risk scores were generated as the posterior probability at time t. We performed five-fold nested cross validation to tune the model parameters and to report the accuracy. All classifiers were evaluated for good discrimination using the area under the receiver operating characteristic curve (AU-ROC) and confusion matrices.ResultsOf 310 patients included in our final analysis, 101 (32.6%) patients developed DCI. We achieved maximal classification of 0.81 [0.75-0.82] AU-ROC. We also predicted 74.7 % of all DCI events 12 hours before typical clinical detection with a ratio of 3 true alerts for every 2 false alerts.ConclusionA data-driven machine learning based detection tool offered hourly assessments of DCI risk and incorporated new physiologic information over time.

Download Full-text

Use of machine learning to identify autism with natural social gaze behavior (Preprint)

10.2196/preprints.29328 ◽

2021 ◽

Author(s):

Zhong Zhao ◽

Haiming Tang ◽

Xiaobin Zhang ◽

Xingda Qu ◽

Jianping Lu

Keyword(s):

Machine Learning ◽

Eye Tracking ◽

Classification Accuracy ◽

Autism Spectrum ◽

Support Vector ◽

Gaze Behavior ◽

Linear Discriminant ◽

Face To Face ◽

Area Of Interest ◽

Social Gaze

BACKGROUND Abnormal gaze behavior is a prominent feature of the autism spectrum disorder (ASD). Previous eye tracking studies had participants watch images (i.e., picture, video and webpage), and the application of machine learning (ML) on these data showed promising results in identify ASD individuals. Given the fact that gaze behavior differs in face-to-face interaction from image viewing tasks, no study has investigated whether natural social gaze behavior could accurately identify ASD. OBJECTIVE The objective of this study was to examine whether and what area of interest (AOI)-based features extracted from the natural social gaze behavior could identify ASD. METHODS Both children with ASD and typical development (TD) were eye-tracked when they were engaged in a face-to-face conversation with an interviewer. Four ML classifiers (support vector machine, SVM; linear discriminant analysis, LDA; decision tree, DT; and random forest, RF) were used to determine the maximum classification accuracy and the corresponding features. RESULTS A maximum classification accuracy of 84.62% were achieved with three classifiers (LDA, DT and RF). Results showed that the mouth, but not the eyes AOI, was a powerful feature in detecting ASD. CONCLUSIONS Natural gaze behavior could be leveraged to identify ASD, suggesting that ASD might be objectively screened with eye tracking technology in everyday social interaction. In addition, the comparison between our and previous findings suggests that eye tracking features that could identify ASD might be culture dependent and context sensitive.

Download Full-text