Development of a Neurodegenerative Disease Gait Classification Algorithm Using Multiscale Sample Entropy and Machine Learning Classifiers

Entropy ◽  
2020 ◽  
Vol 22 (12) ◽  
pp. 1340
Author(s):  
Quoc Duy Nam Nguyen ◽  
An-Bang Liu ◽  
Che-Wei Lin

The prevalence of neurodegenerative diseases (NDDs) has grown rapidly in recent years, and NDD screening has received much attention. Because NDDs cause gait abnormalities, screening for NDDs using gait signals is feasible. The aim of this study is to develop an NDD classification algorithm based on gait force (GF) using multiscale sample entropy (MSE) and machine learning models. The PhysioNet NDD gait database is used to validate the proposed algorithm. In the preprocessing stage, new signals were generated by taking the first- and second-order differences of the GF signal, and the signals were divided into time windows of various lengths (10/20/30/60 s). In feature extraction, statistical and MSE features were calculated from the GF signal. Owing to the imbalanced nature of the PhysioNet NDD gait database, the synthetic minority oversampling technique (SMOTE) was used to rebalance the data of each class. Support vector machine (SVM) and k-nearest neighbors (KNN) models were used as the classifiers. The best classification accuracies for healthy controls (HC) vs. Parkinson’s disease (PD), HC vs. Huntington’s disease (HD), HC vs. amyotrophic lateral sclerosis (ALS), PD vs. HD, PD vs. ALS, HD vs. ALS, and HC vs. PD vs. HD vs. ALS were 99.90%, 99.80%, 100%, 99.75%, 99.90%, 99.55%, and 99.68%, respectively, obtained with KNN under the 10-s time window. This study successfully developed an NDD gait classification algorithm based on MSE and machine learning classifiers.
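The multiscale sample entropy feature the abstract describes can be sketched in a few lines of numpy. This is a minimal illustration of the general MSE recipe (non-overlapping coarse-graining followed by sample entropy at each scale), using one common convention for the tolerance (r times the standard deviation of the coarse-grained signal) and Chebyshev-distance template matching; it is not the paper's implementation.

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Sample entropy: -log of the ratio of (m+1)-point template matches
    to m-point template matches, with Chebyshev tolerance r * std(x)."""
    x = np.asarray(x, dtype=float)
    tol = r * np.std(x)
    n = len(x)

    def count_matches(dim):
        # All embedding vectors of length `dim`, then pairwise comparisons.
        emb = np.array([x[i:i + dim] for i in range(n - dim + 1)])
        count = 0
        for i in range(len(emb) - 1):
            dist = np.max(np.abs(emb[i + 1:] - emb[i]), axis=1)
            count += int(np.sum(dist <= tol))
        return count

    b = count_matches(m)       # matches of length m
    a = count_matches(m + 1)   # matches of length m + 1
    return -np.log(a / b) if a > 0 and b > 0 else float("inf")

def multiscale_sample_entropy(x, scales=5, m=2, r=0.2):
    """Coarse-grain x at scales 1..scales (non-overlapping averages),
    then compute the sample entropy of each coarse-grained series."""
    x = np.asarray(x, dtype=float)
    values = []
    for s in range(1, scales + 1):
        n = len(x) // s
        coarse = x[:n * s].reshape(n, s).mean(axis=1)
        values.append(sample_entropy(coarse, m, r))
    return values
```

In the pipeline described above, such a function would be applied to each 10/20/30/60-s window of the (differenced) gait-force signal, and the per-scale entropies would join the statistical features fed to the SVM/KNN classifiers.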

2019 ◽  
Vol 16 (2) ◽  
pp. 5-16
Author(s):  
Amit Singh ◽  
Ivan Li ◽  
Otto Hannuksela ◽  
Tjonnie Li ◽  
Kyungmin Kim

Gravitational waves are theorized to be gravitationally lensed when they propagate near massive objects. Such lensing effects cause potentially detectable repeated gravitational wave patterns in ground- and space-based gravitational wave detectors. These effects are difficult to discriminate when the lens is small and the repeated patterns superpose. Traditionally, matched filtering techniques are used to identify gravitational-wave signals, but here we instead aim to use machine learning techniques. In this work, we implement supervised machine learning classifiers (support vector machine, random forest, multi-layer perceptron) to discriminate such lensing patterns in gravitational wave data. We train the classifiers with spectrograms of both lensed and unlensed waves, using both point-mass and singular isothermal sphere lens models. As a result, the classifiers return F1 scores ranging from 0.852 to 0.996, with precisions from 0.917 to 0.992 and recalls from 0.796 to 1.000, depending on the type of classifier and lens model used. This supports the idea that machine learning classifiers can correctly identify lensed gravitational wave signals. It also suggests that machine learning classifiers may in the future serve as an alternative means of identifying lensed gravitational wave events, allowing us to study gravitational wave sources and massive astronomical objects through further analysis. KEYWORDS: Gravitational Waves; Gravitational Lensing; Geometrical Optics; Machine Learning; Classification; Support Vector Machine; Random Forest; Multi-layer Perceptron
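The classifier-comparison setup the abstract describes (SVM, random forest, and MLP trained on spectrogram features, scored by F1/precision/recall) can be sketched with scikit-learn. The synthetic feature matrix below is a toy stand-in for flattened spectrograms, not real gravitational-wave data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Toy stand-in: flattened "spectrogram" feature vectors; label 1 = lensed.
X, y = make_classification(n_samples=600, n_features=64, n_informative=10,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0)

results = {}
for name, clf in [("SVM", SVC()),
                  ("RandomForest", RandomForestClassifier(random_state=0)),
                  ("MLP", MLPClassifier(hidden_layer_sizes=(32,),
                                        max_iter=2000, random_state=0))]:
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    results[name] = (f1_score(y_te, pred), precision_score(y_te, pred),
                     recall_score(y_te, pred))
    print(name, results[name])
```

With real lensed/unlensed spectrograms, only the data-loading step would change; the fit/score loop is the same comparison the paper reports.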


2022 ◽  
Vol 12 (2) ◽  
pp. 828
Author(s):  
Tebogo Bokaba ◽  
Wesley Doorsamy ◽  
Babu Sena Paul

Road traffic accidents (RTAs) are a major cause of injuries and fatalities worldwide. In recent years, there has been growing global interest in analysing RTAs, specifically in analysing and modelling accident data to better understand and assess their causes and effects. This study analysed the performance of widely used machine learning classifiers on a real-life RTA dataset from Gauteng, South Africa. The study aimed to assess prediction model designs for RTAs to assist transport authorities and policymakers. It considered the naïve Bayes, logistic regression, k-nearest neighbour, AdaBoost, support vector machine, and random forest classifiers, together with five missing-data methods. These classifiers were evaluated using five metrics: accuracy, root-mean-square error, precision, recall, and receiver operating characteristic curves. The assessment also involved parameter tuning and incorporated dimensionality reduction techniques. The empirical results and analyses show that the random forest (RF) classifier, combined with multiple imputation by chained equations, yielded the best performance of all the combinations tested.
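The winning combination reported above, random forest plus multiple imputation by chained equations (MICE), can be sketched with scikit-learn, whose `IterativeImputer` is a MICE-style chained-equations imputer. The data here are synthetic with entries knocked out at random, standing in for the real RTA dataset:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import IterativeImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for accident records, with ~10% of entries missing.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.1] = np.nan

# Chained-equations imputation feeding a random forest, as in the paper's
# best-performing combination; evaluated with 5-fold cross-validation.
model = make_pipeline(IterativeImputer(random_state=0),
                      RandomForestClassifier(random_state=0))
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(scores.mean())
```

Placing the imputer inside the pipeline matters: it is refit on each training fold, so no information from the held-out fold leaks into the imputation.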


2019 ◽  
Vol 8 (4) ◽  
pp. 2187-2191

Music is an essential part of life, and the emotion it carries is key to its perception and usage. Music Emotion Recognition (MER) is the task of identifying the emotion in musical tracks and classifying them accordingly. The objective of this paper is to assess the effectiveness of popular machine learning classifiers, namely XGBoost, Random Forest, Decision Trees, Support Vector Machine (SVM), K-Nearest Neighbour (KNN), and Gaussian Naive Bayes, on the task of MER. Using the MIREX-like dataset [17] to test these classifiers, the effects of oversampling algorithms such as the Synthetic Minority Oversampling Technique (SMOTE) [22] and Random Oversampling (ROS) were also evaluated. Overall, the Gaussian Naive Bayes classifier gave the highest accuracy, 40.33%; the other classifiers gave accuracies between 20.44% and 38.67%. Thus, a limit on classification accuracy appears to have been reached with these classifiers when using traditional musical or statistical features derived from the music as input. In view of this, deep learning-based approaches using Convolutional Neural Networks (CNNs) [13] with spectrograms of the music clips are a promising alternative for MER.
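The core idea of the SMOTE oversampling used above, synthesizing minority-class points by interpolating between a minority sample and one of its nearest minority neighbours, fits in a short numpy sketch. This is the bare algorithmic idea, not the library implementation (in practice one would use `imblearn.over_sampling.SMOTE`):

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, rng=None):
    """Minimal SMOTE sketch: create n_new synthetic minority samples by
    interpolating a random minority point toward one of its k nearest
    minority-class neighbours."""
    if rng is None:
        rng = np.random.default_rng(0)
    X_min = np.asarray(X_min, dtype=float)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        dists = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(dists)[1:k + 1]   # skip the point itself
        j = rng.choice(neighbours)
        gap = rng.random()                        # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.vstack(synthetic)
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the convex hull of the original minority data, unlike plain random oversampling, which only duplicates existing points.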


2020 ◽  
Vol 10 (19) ◽  
pp. 6683
Author(s):  
Andrea Murari ◽  
Emmanuele Peluso ◽  
Michele Lungaroni ◽  
Riccardo Rossi ◽  
Michela Gelfusa ◽  
...  

The inadequacies of basic physics models for disruption prediction have induced the community to increasingly rely on data mining tools. In the last decade, it has been shown how machine learning predictors can achieve a much better performance than those obtained with manually identified thresholds or empirical descriptions of the plasma stability limits. The main criticisms of these techniques focus therefore on two different but interrelated issues: poor “physics fidelity” and limited interpretability. Insufficient “physics fidelity” refers to the fact that the mathematical models of most data mining tools do not reflect the physics of the underlying phenomena. Moreover, they implement a black box approach to learning, which results in very poor interpretability of their outputs. To overcome or at least mitigate these limitations, a general methodology has been devised and tested, with the objective of combining the predictive capability of machine learning tools with the expression of the operational boundary in terms of traditional equations more suited to understanding the underlying physics. The proposed approach relies on the application of machine learning classifiers (such as Support Vector Machines or Classification Trees) and Symbolic Regression via Genetic Programming directly to experimental databases. The results are very encouraging. The obtained equations of the boundary between the safe and disruptive regions of the operational space present almost the same performance as the machine learning classifiers, based on completely independent learning techniques. Moreover, these models possess significantly better predictive power than traditional representations, such as the Hugill or the beta limit. More importantly, they are realistic and intuitive mathematical formulas, which are well suited to supporting theoretical understanding and to benchmarking empirical models. They can also be deployed easily and efficiently in real-time feedback systems.
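The comparison at the heart of this approach, a black-box classifier versus an explicit boundary equation evaluated on the same data, can be illustrated with a toy example. Everything here is hypothetical: a synthetic 2-D "operational space" whose true disruption boundary is a simple product law, an off-the-shelf SVM, and a candidate analytic formula standing in for what symbolic regression would discover:

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

# Hypothetical 2-D operational space: a point "disrupts" when x1 * x2 > 1.
rng = np.random.default_rng(0)
X = rng.uniform(0.1, 2.0, size=(500, 2))
y = (X[:, 0] * X[:, 1] > 1.0).astype(int)

# Black-box predictor: an RBF-kernel SVM fit to the labelled points.
svm = SVC().fit(X, y)
acc_svm = accuracy_score(y, svm.predict(X))

# A symbolic-regression step would search over formulas; here we simply
# evaluate the candidate boundary x1 * x2 > c with c = 1.
acc_formula = accuracy_score(y, (X[:, 0] * X[:, 1] > 1.0).astype(int))
print(acc_svm, acc_formula)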


2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Hyo-Sik Ham ◽  
Hwan-Hee Kim ◽  
Myung-Sup Kim ◽  
Mi-Jung Choi

Currently, many Internet of Things (IoT) services are monitored and controlled through smartphone applications. By combining IoT with smartphones, many convenient IoT services have been provided to users. However, such services have adverse underlying effects, including invasion of privacy and information leakage. In most cases, mobile devices accumulate important personal user information as various services and contents are provided through them. Accordingly, attackers are expanding the scope of their attacks beyond the existing PC and Internet environment to mobile devices. In this paper, we apply a linear support vector machine (SVM) to detect Android malware and compare its malware detection performance with that of other machine learning classifiers. Through experimental validation, we show that the SVM outperforms the other machine learning classifiers.
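A linear SVM over binary app features is straightforward to sketch with scikit-learn. The feature matrix below is hypothetical, standing in for indicator features such as requested permissions or API-call flags that malware detectors commonly use; the paper's actual feature set is not specified in the abstract:

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Hypothetical binary indicator features (e.g. permission / API-call
# flags per app); label 1 = malware, generated from a linear rule.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 30)).astype(float)
w = rng.normal(size=30)
y = (X @ w > np.median(X @ w)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LinearSVC(max_iter=5000).fit(X_tr, y_tr)
acc = accuracy_score(y_te, clf.predict(X_te))
print(acc)
```

A practical advantage of the linear kernel, relevant to on-device detection, is that the trained model reduces to a single weight vector, so scoring a new app is one dot product.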


Electronics ◽  
2021 ◽  
Vol 10 (16) ◽  
pp. 1955
Author(s):  
Ikram Sumaiya Thaseen ◽  
Vanitha Mohanraj ◽  
Sakthivel Ramachandran ◽  
Kishore Sanapala ◽  
Sang-Soo Yeo

In recent years, different botnet variants have been targeting government and private organizations, and there is a crucial need to develop a robust framework for securing the IoT (Internet of Things) network. In this paper, a Hadoop-based framework is proposed to identify malicious IoT traffic using modified Tomek-link under-sampling integrated with automated hyper-parameter tuning of machine learning classifiers. The novelty of this paper is the use of a big data platform for benchmark IoT datasets to minimize computational time. The IoT benchmark datasets are loaded into the Hadoop Distributed File System (HDFS) environment. Three machine learning approaches, namely naive Bayes (NB), k-nearest neighbor (KNN), and support vector machine (SVM), are used for categorizing IoT traffic. Artificial immune network optimization is deployed during cross-validation to obtain the best classifier parameters. Experimental analysis is performed on the Hadoop platform. Average accuracies of 99% and 90% are obtained for the BoT-IoT and ToN-IoT datasets, respectively. The lower accuracy on the ToN-IoT dataset is due to the huge number of data samples captured at the edge and fog layers, whereas for the BoT-IoT dataset only the 5% subset of training and test samples released by the dataset developers is considered for experimental analysis. The overall accuracy is improved by 19% in comparison with state-of-the-art techniques. The computational time for the huge datasets is reduced by 3–4 hours through MapReduce in HDFS.
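The Tomek-link under-sampling step named above has a simple definition that fits in a short sketch: two points of opposite classes that are each other's nearest neighbour form a Tomek link, and under-sampling removes the majority-class member of each link. This is the textbook rule in plain numpy (the paper uses a modified variant, and a library version exists as `imblearn.under_sampling.TomekLinks`); the O(n²) distance matrix is for illustration only:

```python
import numpy as np

def tomek_links(X, y, majority_label):
    """Remove the majority-class member of every Tomek link: a pair of
    opposite-class points that are mutual nearest neighbours."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    nn = dists.argmin(axis=1)          # nearest neighbour of each point
    drop = set()
    for i in range(len(X)):
        j = nn[i]
        if nn[j] == i and y[i] != y[j]:        # mutual NN, opposite classes
            drop.add(i if y[i] == majority_label else j)
    keep = [i for i in range(len(X)) if i not in drop]
    return X[keep], y[keep]
```

Unlike SMOTE, this cleans the class boundary rather than balancing counts outright: only majority points sitting directly against a minority point are discarded.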


Author(s):  
Hamza Turabieh ◽  
Ahmad S. Alghamdi

Wi-Fi technology is now everywhere, both inside and outside buildings, and it enables indoor localization services (ILS). Determining an indoor user's location is a hard and complex problem. Several applications highlight the importance of indoor user localization, such as disaster management, health care zones, Internet of Things (IoT) applications, and public settlement planning. Measurements of Wi-Fi signal strength (i.e., the Received Signal Strength Indicator (RSSI)) can be used to determine indoor user location. In this paper, we propose a hybrid model that combines a wrapper feature selection algorithm with machine learning classifiers to determine indoor user location. We employ the Minimum Redundancy Maximum Relevance (mRMR) algorithm for feature selection, choosing the most active access points (APs) based on RSSI values. Six different machine learning classifiers were used in this work: Decision Tree (DT), Support Vector Machine (SVM), k-nearest neighbors (kNN), Linear Discriminant Analysis (LDA), Ensemble-Bagged Tree (EBaT), and Ensemble-Boosted Tree (EBoT). We examined all classifiers on a public dataset obtained from the UCI repository. The results show that EBoT outperforms all the other classifiers in terms of accuracy.
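The mRMR selection step, greedily picking features that are maximally relevant to the target and minimally redundant with features already chosen, can be sketched in numpy. This simplified version uses absolute Pearson correlation as a stand-in for the mutual-information measures mRMR is usually defined with, so it illustrates the greedy structure rather than the exact criterion:

```python
import numpy as np

def mrmr_corr(X, y, n_select):
    """Simplified mRMR sketch: greedily add the feature maximizing
    (relevance to y) - (mean redundancy with already-selected features),
    with |Pearson correlation| standing in for mutual information."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_feat = X.shape[1]
    relevance = np.array([abs(np.corrcoef(X[:, j], y)[0, 1])
                          for j in range(n_feat)])
    selected = [int(relevance.argmax())]        # start with most relevant
    while len(selected) < n_select:
        best, best_score = None, -np.inf
        for j in range(n_feat):
            if j in selected:
                continue
            redundancy = np.mean([abs(np.corrcoef(X[:, j], X[:, s])[0, 1])
                                  for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

In the indoor-localization setting above, each column of X would be the RSSI readings of one access point, so the selected indices are the most informative, least redundant APs.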


2021 ◽  
Vol 50 (8) ◽  
pp. 2479-2497
Author(s):  
Buvana M. ◽  
Muthumayil K.

COVID-19 is one of the most symptomatic diseases. Early and precise prediction based on physiological measurements, combined with preventive practices such as keeping a reasonable distance from others, wearing a mask, cleanliness, medication, a balanced diet, and staying safely at home when unwell, will minimize the risk of COVID-19. To evaluate the collected COVID-19 prediction datasets, five machine learning classifiers were used: Naïve Bayes, Support Vector Machine (SVM), Logistic Regression, K-Nearest Neighbour (KNN), and Decision Tree. COVID-19 datasets from the repository were combined and re-examined to remove incomplete entries, and a total of 2500 cases were utilized in this study. Features such as fever, body pain, runny nose, difficulty in breathing, sore throat, and nasal congestion are considered the most important for distinguishing patients who have COVID-19 from those who do not. We exhibit the prediction functionality of the five machine learning classifiers. A publicly available dataset was used to train and assess the models. With an overall accuracy of 99.88 percent, the ensemble model performed commendably, and it performed better than the existing methods and studies. As a result, the presented model is trustworthy and can be used to screen COVID-19 patients in a timely and efficient manner.
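An ensemble over the five classifiers named above can be sketched with scikit-learn's `VotingClassifier`. The abstract does not say how the paper combines the models, so soft voting here is an assumption, and the synthetic feature matrix merely stands in for the six symptom features (fever, body pain, etc.):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the symptom features described in the abstract.
X, y = make_classification(n_samples=500, n_features=6, n_informative=4,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Soft voting averages the five models' predicted class probabilities.
ensemble = VotingClassifier([
    ("nb", GaussianNB()),
    ("svm", SVC(probability=True)),
    ("lr", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier()),
    ("dt", DecisionTreeClassifier(random_state=0)),
], voting="soft")
ensemble.fit(X_tr, y_tr)
acc = accuracy_score(y_te, ensemble.predict(X_te))
print(acc)
```

`probability=True` on the SVC is required because soft voting needs calibrated class probabilities from every member.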


10.2196/17478 ◽  
2020 ◽  
Vol 22 (8) ◽  
pp. e17478 ◽  
Author(s):  
Shyam Visweswaran ◽  
Jason B Colditz ◽  
Patrick O’Halloran ◽  
Na-Rae Han ◽  
Sanya B Taneja ◽  
...  

Background Twitter presents a valuable and relevant social media platform to study the prevalence of information and sentiment on vaping that may be useful for public health surveillance. Machine learning classifiers that identify vaping-relevant tweets and characterize sentiments in them can underpin a Twitter-based vaping surveillance system. Compared with traditional machine learning classifiers that are reliant on annotations that are expensive to obtain, deep learning classifiers offer the advantage of requiring fewer annotated tweets by leveraging the large numbers of readily available unannotated tweets. Objective This study aims to derive and evaluate traditional and deep learning classifiers that can identify tweets relevant to vaping, tweets of a commercial nature, and tweets with provape sentiments. Methods We continuously collected tweets that matched vaping-related keywords over 2 months from August 2018 to October 2018. From this data set of tweets, a set of 4000 tweets was selected, and each tweet was manually annotated for relevance (vape relevant or not), commercial nature (commercial or not), and sentiment (provape or not). Using the annotated data, we derived traditional classifiers that included logistic regression, random forest, linear support vector machine, and multinomial naive Bayes. In addition, using the annotated data set and a larger unannotated data set of tweets, we derived deep learning classifiers that included a convolutional neural network (CNN), long short-term memory (LSTM) network, LSTM-CNN network, and bidirectional LSTM (BiLSTM) network. The unannotated tweet data were used to derive word vectors that deep learning classifiers can leverage to improve performance. 
Results The LSTM-CNN performed the best for relevance, with the highest area under the receiver operating characteristic curve (AUC) of 0.96 (95% CI 0.93-0.98). For distinguishing commercial from noncommercial tweets, all deep learning classifiers, including LSTM-CNN, performed better than the traditional classifiers, with an AUC of 0.99 (95% CI 0.98-0.99). For provape sentiment, BiLSTM performed the best, with an AUC of 0.83 (95% CI 0.78-0.89). Overall, LSTM-CNN performed the best across all 3 classification tasks. Conclusions We derived and evaluated traditional machine learning and deep learning classifiers to identify vaping-relevant, commercial, and provape tweets. Overall, deep learning classifiers such as LSTM-CNN had superior performance and had the added advantage of requiring no preprocessing. The performance of these classifiers supports the development of a vaping surveillance system.
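The traditional-classifier baseline in this study, a linear model over tweet text scored by AUC, can be sketched with scikit-learn. The eight example tweets and labels below are invented for illustration; only the pipeline shape (vectorize text, fit logistic regression, score by AUC) mirrors the methods described:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline

# Tiny hypothetical tweet set; label 1 = vaping-relevant.
tweets = ["new vape juice flavours in stock", "loving my new vape mod",
          "e-cig clouds all day", "vaping after work again",
          "great football match tonight", "traffic was terrible today",
          "coffee and a good book", "rainy weather this weekend"]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(tweets, labels)
auc = roc_auc_score(labels, clf.predict_proba(tweets)[:, 1])
print(auc)  # training-set AUC on this toy, separable example
```

The deep learning classifiers in the paper replace the TF-IDF step with word vectors learned from the large unannotated tweet corpus, which is how they benefit from data that the traditional pipeline cannot use.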


2021 ◽  
Vol 79 (2) ◽  
pp. 125-135
Author(s):  
Binghua Cao ◽  
Enze Cai ◽  
Mengbao Fan

Internal discontinuities are critical factors that can lead to premature failure of thermal barrier coatings (TBCs). This paper proposes a technique that combines terahertz (THz) time-domain spectroscopy and machine learning classifiers to identify discontinuities in TBCs. First, the finite-difference time-domain method was used to build a theoretical model of THz signals due to discontinuities in TBCs. Then, simulations were carried out to compute THz waveforms of different discontinuities in TBCs. Further, six machine learning classifiers were employed to classify these different discontinuities. Principal component analysis (PCA) was used for dimensionality reduction, and the Grid Search method was utilized to optimize the hyperparameters of the designed machine learning classifiers. Accuracy and running time were used to characterize their performances. The results show that the support vector machine (SVM) has a better performance than the others in TBC discontinuity classification. Using PCA, the average accuracy of the SVM classifier is 94.3%, and the running time is 65.6 ms.
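The PCA-plus-grid-searched-SVM pipeline described above can be sketched with scikit-learn. The synthetic multiclass data stand in for the simulated THz waveforms, and the parameter grid is an illustrative assumption, since the abstract does not list the hyperparameter ranges searched:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Synthetic stand-in for THz waveform features; 3 discontinuity classes.
X, y = make_classification(n_samples=300, n_features=50, n_informative=8,
                           n_classes=3, n_clusters_per_class=1,
                           random_state=0)

# PCA for dimensionality reduction, then an SVM, with grid search over
# the SVM hyperparameters, mirroring the pipeline in the abstract.
pipe = Pipeline([("pca", PCA(n_components=10)), ("svm", SVC())])
search = GridSearchCV(pipe, {"svm__C": [0.1, 1, 10],
                             "svm__gamma": ["scale", 0.01]}, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Putting PCA inside the pipeline ensures the projection is refit on each cross-validation training fold, so the grid search's scores are not biased by the held-out data.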

