scholarly journals The classification of motor imagery response: an accuracy enhancement through the ensemble of random subspace k-NN

2021 ◽  
Vol 7 ◽  
pp. e374
Author(s):  
Mamunur Rashid ◽  
Bifta Sama Bari ◽  
Md Jahid Hasan ◽  
Mohd Azraai Mohd Razman ◽  
Rabiu Muazu Musa ◽  
...  

Brain-computer interface (BCI) is a viable alternative communication strategy for patients of neurological disorders as it facilitates the translation of human intent into device commands. The performance of BCIs primarily depends on the efficacy of the feature extraction and feature selection techniques, as well as the classification algorithms employed. More often than not, high dimensional feature set contains redundant features that may degrade a given classifier’s performance. In the present investigation, an ensemble learning-based classification algorithm, namely random subspace k-nearest neighbour (k-NN) has been proposed to classify the motor imagery (MI) data. The common spatial pattern (CSP) has been applied to extract the features from the MI response, and the effectiveness of random forest (RF)-based feature selection algorithm has also been investigated. In order to evaluate the efficacy of the proposed method, an experimental study has been implemented using four publicly available MI dataset (BCI Competition III dataset 1 (data-1), dataset IIIA (data-2), dataset IVA (data-3) and BCI Competition IV dataset II (data-4)). It was shown that the ensemble-based random subspace k-NN approach achieved the superior classification accuracy (CA) of 99.21%, 93.19%, 93.57% and 90.32% for data-1, data-2, data-3 and data-4, respectively against other models evaluated, namely linear discriminant analysis, support vector machine, random forest, Naïve Bayes and the conventional k-NN. In comparison with other classification approaches reported in the recent studies, the proposed method enhanced the accuracy by 2.09% for data-1, 1.29% for data-2, 4.95% for data-3 and 5.71% for data-4, respectively. Moreover, it is worth highlighting that the RF feature selection technique employed in the present study was able to significantly reduce the feature dimension without compromising the overall CA. The outcome from the present study implies that the proposed method may significantly enhance the accuracy of MI data classification.

2020 ◽  
Vol 16 (2) ◽  
pp. 155014772090523
Author(s):  
ZhenLong Li ◽  
HaoXin Wang ◽  
YaoWei Zhang ◽  
XiaoHua Zhao

A method for drunk driving detection using Feature Selection based on the Random Forest was proposed. First, driving behavior data were collected using a driving simulator at Beijing University of Technology. Second, the features were selected according to the Feature Importance in the random forest. Third, a dummy variable was introduced to encode the geometric characteristics of different roads so that drunk driving under different road conditions can be detected with the same classifier based on the random forest. Finally, the linear discriminant analysis, support vector machine, and AdaBoost classifiers were used and compared with the random forest. The accuracy, F1 score, receiver operating characteristic curve, and area under the curve value were used to evaluate the performance of the classifiers. The results show that Accelerator Depth, Speed, Distance to the Center of the Lane, Acceleration, Engine Revolution, Brake Depth, and Steering Angle have important influences on identifying the drivers’ states and can be used to detect drunk driving. Specifically, the classifiers with Accelerator Depth outperformed the other classifiers without Accelerator Depth. This means that Accelerator Depth is an important feature. Both the AdaBoost and random forest classifiers have an accuracy of 81.48%, which verified the effectiveness of the proposed method.


2013 ◽  
Vol 23 (06) ◽  
pp. 1350026 ◽  
Author(s):  
WEI-YEN HSU

In this study, we propose a recognition system for single-trial analysis of motor imagery (MI) electroencephalogram (EEG) data. Applying event-related brain potential (ERP) data acquired from the sensorimotor cortices, the system chiefly consists of automatic artifact elimination, feature extraction, feature selection and classification. In addition to the use of independent component analysis, a similarity measure is proposed to further remove the electrooculographic (EOG) artifacts automatically. Several potential features, such as wavelet-fractal features, are then extracted for subsequent classification. Next, quantum-behaved particle swarm optimization (QPSO) is used to select features from the feature combination. Finally, selected sub-features are classified by support vector machine (SVM). Compared with without artifact elimination, feature selection using a genetic algorithm (GA) and feature classification with Fisher's linear discriminant (FLD) on MI data from two data sets for eight subjects, the results indicate that the proposed method is promising in brain–computer interface (BCI) applications.


2021 ◽  
Author(s):  
Ashutosh Tripathi ◽  
Naman Bhoj ◽  
Mayank Khari ◽  
Bishwajeet Pandey

Abstract With the rise of internet usage malwares pose a great threat to user security and privacy. Therefore, to mitigate the problem it is essential to develop an efficient malware detection framework. In our research we experimented with various machine learning and feature scaling algorithms. Chi-Square was used as the feature selection technique which selected a set of 48 features from a feature space of 128 features. The empirical results provide us with conclusive evidence that Random Forest is the best algorithm for detection of malware achieving an accuracy of 91.300% on our dataset followed by Gradient Boosting and Support Vector Machines.


Sensors ◽  
2019 ◽  
Vol 19 (20) ◽  
pp. 4523 ◽  
Author(s):  
Carlos Cabo ◽  
Celestino Ordóñez ◽  
Fernando Sáchez-Lasheras ◽  
Javier Roca-Pardiñas ◽  
and Javier de Cos-Juez

We analyze the utility of multiscale supervised classification algorithms for object detection and extraction from laser scanning or photogrammetric point clouds. Only the geometric information (the point coordinates) was considered, thus making the method independent of the systems used to collect the data. A maximum of five features (input variables) was used, four of them related to the eigenvalues obtained from a principal component analysis (PCA). PCA was carried out at six scales, defined by the diameter of a sphere around each observation. Four multiclass supervised classification models were tested (linear discriminant analysis, logistic regression, support vector machines, and random forest) in two different scenarios, urban and forest, formed by artificial and natural objects, respectively. The results obtained were accurate (overall accuracy over 80% for the urban dataset, and over 93% for the forest dataset), in the range of the best results found in the literature, regardless of the classification method. For both datasets, the random forest algorithm provided the best solution/results when discrimination capacity, computing time, and the ability to estimate the relative importance of each variable are considered together.


Author(s):  
Gang Liu ◽  
Chunlei Yang ◽  
Sen Liu ◽  
Chunbao Xiao ◽  
Bin Song

A feature selection method based on mutual information and support vector machine (SVM) is proposed in order to eliminate redundant feature and improve classification accuracy. First, local correlation between features and overall correlation is calculated by mutual information. The correlation reflects the information inclusion relationship between features, so the features are evaluated and redundant features are eliminated with analyzing the correlation. Subsequently, the concept of mean impact value (MIV) is defined and the influence degree of input variables on output variables for SVM network based on MIV is calculated. The importance weights of the features described with MIV are sorted by descending order. Finally, the SVM classifier is used to implement feature selection according to the classification accuracy of feature combination which takes MIV order of feature as a reference. The simulation experiments are carried out with three standard data sets of UCI, and the results show that this method can not only effectively reduce the feature dimension and high classification accuracy, but also ensure good robustness.


Complexity ◽  
2022 ◽  
Vol 2022 ◽  
pp. 1-11
Author(s):  
Marium Mehmood ◽  
Nasser Alshammari ◽  
Saad Awadh Alanazi ◽  
Fahad Ahmad

The liver is the human body’s mandatory organ, but detecting liver disease at an early stage is very difficult due to the hiddenness of symptoms. Liver diseases may cause loss of energy or weakness when some irregularities in the working of the liver get visible. Cancer is one of the most common diseases of the liver and also the most fatal of all. Uncontrolled growth of harmful cells is developed inside the liver. If diagnosed late, it may cause death. Treatment of liver diseases at an early stage is, therefore, an important issue as is designing a model to diagnose early disease. Firstly, an appropriate feature should be identified which plays a more significant part in the detection of liver cancer at an early stage. Therefore, it is essential to extract some essential features from thousands of unwanted features. So, these features will be mined using data mining and soft computing techniques. These techniques give optimized results that will be helpful in disease diagnosis at an early stage. In these techniques, we use feature selection methods to reduce the dataset’s feature, which include Filter, Wrapper, and Embedded methods. Different Regression algorithms are then applied to these methods individually to evaluate the result. Regression algorithms include Linear Regression, Ridge Regression, LASSO Regression, Support Vector Regression, Decision Tree Regression, Multilayer Perceptron Regression, and Random Forest Regression. Based on the accuracy and error rates generated by these Regression algorithms, we have evaluated our results. The result shows that Random Forest Regression with the Wrapper Method from all the deployed Regression techniques is the best and gives the highest R2-Score of 0.8923 and lowest MSE of 0.0618.


F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 2673 ◽  
Author(s):  
Daniel Kristiyanto ◽  
Kevin E. Anderson ◽  
Ling-Hong Hung ◽  
Ka Yee Yeung

Prostate cancer is the most common cancer among men in developed countries. Androgen deprivation therapy (ADT) is the standard treatment for prostate cancer. However, approximately one third of all patients with metastatic disease treated with ADT develop resistance to ADT. This condition is called metastatic castrate-resistant prostate cancer (mCRPC). Patients who do not respond to hormone therapy are often treated with a chemotherapy drug called docetaxel. Sub-challenge 2 of the Prostate Cancer DREAM Challenge aims to improve the prediction of whether a patient with mCRPC would discontinue docetaxel treatment due to adverse effects. Specifically, a dataset containing three distinct clinical studies of patients with mCRPC treated with docetaxel was provided. We  applied the k-nearest neighbor method for missing data imputation, the hill climbing algorithm and random forest importance for feature selection, and the random forest algorithm for classification. We also empirically studied the performance of many classification algorithms, including support vector machines and neural networks. Additionally, we found using random forest importance for feature selection provided slightly better results than the more computationally expensive method of hill climbing.


Author(s):  
Harsha A K

Abstract: Since the advent of encryption, there has been a steady increase in malware being transmitted over encrypted networks. Traditional approaches to detect malware like packet content analysis are inefficient in dealing with encrypted data. In the absence of actual packet contents, we can make use of other features like packet size, arrival time, source and destination addresses and other such metadata to detect malware. Such information can be used to train machine learning classifiers in order to classify malicious and benign packets. In this paper, we offer an efficient malware detection approach using classification algorithms in machine learning such as support vector machine, random forest and extreme gradient boosting. We employ an extensive feature selection process to reduce the dimensionality of the chosen dataset. The dataset is then split into training and testing sets. Machine learning algorithms are trained using the training set. These models are then evaluated against the testing set in order to assess their respective performances. We further attempt to tune the hyper parameters of the algorithms, in order to achieve better results. Random forest and extreme gradient boosting algorithms performed exceptionally well in our experiments, resulting in area under the curve values of 0.9928 and 0.9998 respectively. Our work demonstrates that malware traffic can be effectively classified using conventional machine learning algorithms and also shows the importance of dimensionality reduction in such classification problems. Keywords: Malware Detection, Extreme Gradient Boosting, Random Forest, Feature Selection.


Author(s):  
Beaulah Jeyavathana Rajendran ◽  
Kanimozhi K. V.

Tuberculosis is one of the hazardous infectious diseases that can be categorized by the evolution of tubercles in the tissues. This disease mainly affects the lungs and also the other parts of the body. The disease can be easily diagnosed by the radiologists. The main objective of this chapter is to get best solution selected by means of modified particle swarm optimization is regarded as optimal feature descriptor. Five stages are being used to detect tuberculosis disease. They are pre-processing an image, segmenting the lungs and extracting the feature, feature selection and classification. These stages that are used in medical image processing to identify the tuberculosis. In the feature extraction, the GLCM approach is used to extract the features and from the extracted feature sets the optimal features are selected by random forest. Finally, support vector machine classifier method is used for image classification. The experimentation is done, and intermediate results are obtained. The proposed system accuracy results are better than the existing method in classification.


Author(s):  
Ling Zou ◽  
Xinguang Wang ◽  
Guodong Shi ◽  
Zhenghua Ma

Accurate classification of EEG left and right hand motor imagery is an important issue in brain-computer interface. Firstly, discrete wavelet transform method was used to decompose the average power of C3 electrode and C4 electrode in left-right hands imagery movement during some periods of time. The reconstructed signal of approximation coefficient A6 on the sixth level was selected to build up a feature signal. Secondly, the performances by Fisher Linear Discriminant Analysis with two different threshold calculation ways and Support Vector Machine methods were compared. The final classification results showed that false classification rate by Support Vector Machine was lower and gained an ideal classification results.


Sign in / Sign up

Export Citation Format

Share Document