Dysphonic Voice Pattern Analysis of Patients in Parkinson’s Disease Using Minimum Interclass Probability Risk Feature Selection and Bagging Ensemble Learning Methods

Computational and Mathematical Methods in Medicine ◽

10.1155/2017/4201984 ◽

2017 ◽

Vol 2017 ◽

pp. 1-11 ◽

Cited By ~ 6

Author(s):

Yunfeng Wu ◽

Pinnan Chen ◽

Yuchen Yao ◽

Xiaoquan Ye ◽

Yugui Xiao ◽

...

Keyword(s):

Feature Selection ◽

Pattern Analysis ◽

Characteristic Curve ◽

Support Vector ◽

Linear Discriminant ◽

Leibler Divergence ◽

Ensemble Algorithm ◽

Highly Correlated ◽

Feature Selection Approach ◽

Bagging Ensemble

Analysis of quantified voice patterns is useful in the detection and assessment of dysphonia and related phonation disorders. In this paper, we first study the linear correlations between 22 voice parameters of fundamental frequency variability, amplitude variations, and nonlinear measures. The highly correlated vocal parameters are combined using linear discriminant analysis. Based on the probability density functions estimated by the Parzen-window technique, we propose an interclass probability risk (ICPR) method that selects the vocal parameters with small ICPR values as dominant features, and we compare it with the modified Kullback-Leibler divergence (MKLD) feature selection approach. The experimental results show that generalized logistic regression analysis (GLRA), the support vector machine (SVM), and the Bagging ensemble algorithm provide better classification results with the ICPR features as input than the same classifiers do with the MKLD-selected features. The SVM is much better at distinguishing normal vocal patterns, with a specificity of 0.8542. Among the three classification methods, the Bagging ensemble algorithm with ICPR features correctly identifies 90.77% of vocal patterns, with the highest sensitivity (0.9796) and the largest area under the receiver operating characteristic curve (0.9558). The classification results demonstrate the effectiveness of our feature selection and pattern analysis methods for dysphonic voice detection and measurement.
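
The abstract above does not give the ICPR formula, so the following is only a minimal sketch of the underlying idea: estimate each class's density for one feature with a Gaussian Parzen window, then score how much the two densities overlap. The function names, the bandwidth, and the overlap integral are assumptions made for illustration and are not the authors' definition of ICPR.

```python
import math

def parzen_density(x, samples, bandwidth):
    """Gaussian Parzen-window estimate of the density at point x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

def overlap_risk(class_a, class_b, bandwidth=0.5, steps=2000):
    """Numerically integrate min(p_a, p_b) with the midpoint rule.

    Illustrative overlap score only (hypothetical stand-in, not the
    paper's ICPR): near 0 for well-separated classes, near 1 when the
    two class densities coincide.
    """
    lo = min(class_a + class_b) - 5.0 * bandwidth
    hi = max(class_a + class_b) + 5.0 * bandwidth
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += min(parzen_density(x, class_a, bandwidth),
                     parzen_density(x, class_b, bandwidth))
    return total * dx
```

Under this sketch, a feature whose two class samples barely overlap gets a small score and would be preferred, which mirrors the abstract's rule of keeping parameters with small risk values.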

Random forest–based feature selection and detection method for drunk driving recognition

International Journal of Distributed Sensor Networks ◽

10.1177/1550147720905234 ◽

2020 ◽

Vol 16 (2) ◽

pp. 155014772090523

Author(s):

ZhenLong Li ◽

HaoXin Wang ◽

YaoWei Zhang ◽

XiaoHua Zhao

Keyword(s):

Feature Selection ◽

Random Forest ◽

Driving Simulator ◽

Characteristic Curve ◽

Area Under The Curve ◽

Drunk Driving ◽

Support Vector ◽

Linear Discriminant ◽

Dummy Variable ◽

University Of Technology

A method for drunk driving detection using feature selection based on the random forest is proposed. First, driving behavior data were collected using a driving simulator at Beijing University of Technology. Second, features were selected according to their feature importance in the random forest. Third, a dummy variable was introduced to encode the geometric characteristics of different roads so that drunk driving under different road conditions can be detected with the same random forest classifier. Finally, linear discriminant analysis, support vector machine, and AdaBoost classifiers were compared with the random forest. Accuracy, F1 score, the receiver operating characteristic curve, and the area under the curve were used to evaluate the classifiers. The results show that Accelerator Depth, Speed, Distance to the Center of the Lane, Acceleration, Engine Revolution, Brake Depth, and Steering Angle have an important influence on identifying the drivers’ states and can be used to detect drunk driving. Specifically, the classifiers that used Accelerator Depth outperformed those that did not, which indicates that Accelerator Depth is an important feature. Both the AdaBoost and random forest classifiers reached an accuracy of 81.48%, which verifies the effectiveness of the proposed method.
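
The dummy-variable step can be illustrated with a short sketch. The abstract does not say how road geometry was coded, so the road labels, the feature values, and the one-column-per-category layout below are all hypothetical.

```python
def dummy_encode(road_types):
    """Map each road label to a 0/1 indicator row, one column per category."""
    categories = sorted(set(road_types))
    rows = [[1 if road == c else 0 for c in categories] for road in road_types]
    return categories, rows

def add_road_dummies(samples, road_types):
    """Append the road indicator columns to each row of driving features."""
    _, dummies = dummy_encode(road_types)
    return [row + extra for row, extra in zip(samples, dummies)]

# Hypothetical rows: [accelerator depth, speed] recorded on two road types.
features = [[0.3, 60.0], [0.5, 80.0], [0.1, 40.0]]
roads = ["curve", "straight", "curve"]
augmented = add_road_dummies(features, roads)
```

With the road indicators appended, samples from different road geometries can be fed to a single classifier, which is the purpose the abstract describes.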

A nonlinear support vector machine‐based feature selection approach for fault detection and diagnosis: Application to the Tennessee Eastman process

AIChE Journal ◽

10.1002/aic.16497 ◽

2019 ◽

Vol 65 (3) ◽

pp. 992-1005 ◽

Cited By ~ 14

Author(s):

Melis Onel ◽

Chris A. Kieslich ◽

Efstratios N. Pistikopoulos

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Fault Detection ◽

Fault Detection And Diagnosis ◽

Support Vector ◽

Tennessee Eastman Process ◽

Selection Approach ◽

Detection And Diagnosis ◽

Feature Selection Approach ◽

Nonlinear Support

SVR-FFS: A novel forward feature selection approach for high-frequency time series forecasting using support vector regression

Expert Systems with Applications ◽

10.1016/j.eswa.2020.113729 ◽

2020 ◽

Vol 160 ◽

pp. 113729 ◽

Cited By ~ 2

Author(s):

José Manuel Valente ◽

Sebastián Maldonado

Keyword(s):

Time Series ◽

Feature Selection ◽

Support Vector Regression ◽

High Frequency ◽

Time Series Forecasting ◽

Support Vector ◽

Selection Approach ◽

Feature Selection Approach

Prediction of human disease-associated phosphorylation sites with combined feature selection approach and support vector machine

IET Systems Biology ◽

10.1049/iet-syb.2014.0051 ◽

2015 ◽

Vol 9 (4) ◽

pp. 155-163 ◽

Cited By ~ 9

Author(s):

Xiaoyi Xu ◽

Ao Li ◽

Minghui Wang

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Human Disease ◽

Support Vector ◽

Phosphorylation Sites ◽

Selection Approach ◽

Feature Selection Approach ◽

Combined Feature

Sparse Least Squares Support Vector Machines Based on Genetic Algorithms: A Feature Selection Approach

Advances in Computational Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-030-20518-8_42 ◽

2019 ◽

pp. 500-511

Author(s):

Pedro Hericson Machado Araújo ◽

Ajalmar R. Rocha Neto

Keyword(s):

Genetic Algorithms ◽

Feature Selection ◽

Support Vector Machines ◽

Least Squares ◽

Support Vector ◽

Vector Machines ◽

Selection Approach ◽

Feature Selection Approach

Diagnostic Performance of 2D and 3D T2WI-Based Radiomics Features With Machine Learning Algorithms to Distinguish Solid Solitary Pulmonary Lesion

Frontiers in Oncology ◽

10.3389/fonc.2021.683587 ◽

2021 ◽

Vol 11 ◽

Author(s):

Qi Wan ◽

Jiaxuan Zhou ◽

Xiaoying Xia ◽

Jianfeng Hu ◽

Peng Wang ◽

...

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Diagnostic Performance ◽

Feature Selection Method ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Approaches ◽

Selection Methods ◽

Linear Discriminant ◽

2D And 3D

Objective: To evaluate the performance of 2D and 3D radiomics features with different machine learning approaches to classify solid solitary pulmonary lesions (SPLs) based on magnetic resonance (MR) T2-weighted imaging (T2WI). Material and Methods: A total of 132 patients with pathologically confirmed SPLs were examined and randomly divided into training (n = 92) and test (n = 40) datasets. A total of 1692 3D and 1231 2D radiomics features per patient were extracted. Both radiomics features and clinical data were evaluated. A total of 1260 classification models, comprising 3 normalization methods, 2 dimension reduction algorithms, 3 feature selection methods, and 10 classifiers with 7 different feature numbers (confined to 3–9), were compared. Ten-fold cross-validation on the training dataset was applied to choose the candidate final model. The area under the receiver operating characteristic curve (AUC), the precision-recall plot, and the Matthews correlation coefficient (MCC) were used to evaluate the performance of the machine learning approaches. Results: The 3D features were significantly superior to the 2D features, showing many more machine learning combinations with an AUC greater than 0.7 in both validation and test groups (129 vs. 11). The feature selection methods analysis of variance (ANOVA) and recursive feature elimination (RFE) and the classifiers logistic regression (LR), linear discriminant analysis (LDA), support vector machine (SVM), and Gaussian process (GP) had relatively better performance. The best performance of 3D radiomics features in the test dataset (AUC = 0.824, AUC-PR = 0.927, MCC = 0.514) was higher than that of 2D features (AUC = 0.740, AUC-PR = 0.846, MCC = 0.404). The joint 3D and 2D features (AUC = 0.813, AUC-PR = 0.926, MCC = 0.563) showed results similar to those of the 3D features. Incorporating clinical features with 3D and 2D radiomics features slightly improved the AUC to 0.836 (AUC-PR = 0.918, MCC = 0.620) and 0.780 (AUC-PR = 0.900, MCC = 0.574), respectively. Conclusions: After algorithm optimization, 2D feature-based radiomics models yield favorable results in differentiating malignant and benign SPLs, but 3D features are still preferred because more machine learning algorithmic combinations with better performance are available. The feature selection methods ANOVA and RFE and the classifiers LR, LDA, SVM, and GP are more likely to demonstrate better diagnostic performance for 3D features in the current study.
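
The figure of 1260 models follows from multiplying the option counts given in the abstract (3 × 2 × 3 × 10 × 7). The sketch below enumerates such a grid; the option labels are placeholders, since the abstract names only some of the methods.

```python
from itertools import product

def build_grid():
    """Enumerate every pipeline combination described in the abstract."""
    normalizers = [f"normalizer_{i}" for i in range(3)]    # placeholder names
    reducers = [f"reducer_{i}" for i in range(2)]          # placeholder names
    selectors = [f"selector_{i}" for i in range(3)]        # placeholder names
    classifiers = [f"classifier_{i}" for i in range(10)]   # placeholder names
    feature_counts = range(3, 10)                          # 3 to 9 features
    return list(product(normalizers, reducers, selectors,
                        classifiers, feature_counts))

grid = build_grid()
print(len(grid))  # 1260
```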

A Composite Hybrid Feature Selection Learning-Based Optimization of Genetic Algorithm For Breast Cancer Detection

10.20944/preprints202003.0298.v1 ◽

2020 ◽

Author(s):

Ahmed Abdullah Farid ◽

Gamal Selim ◽

Hatem Khater

Keyword(s):

Breast Cancer ◽

Genetic Algorithm ◽

Feature Selection ◽

Early Stage ◽

Fitness Function ◽

Support Vector ◽

Initial Population ◽

Tree Classifier ◽

Selection Approach ◽

Feature Selection Approach

Breast cancer is a significant health issue across the world. It is the most widely diagnosed cancer in women, and early-stage diagnosis and therapy increase patient safety. This paper proposes a composite hybrid feature selection model based on an optimized genetic algorithm (CHFS-BOGA) to predict breast cancer. This hybrid feature selection approach combines the advantages of three filter feature selection approaches with an optimized genetic algorithm (OGA) to select the best features and improve the performance and scalability of the classification process. We build the OGA by improving the generation of the initial population and the genetic operators, using the results of the filter approaches as prior information and using the C4.5 decision tree classifier as the fitness function instead of probability and random selection. The authors collected the available updated Wisconsin breast cancer data from the UCI machine learning repository, with a total of 569 rows and 32 columns. The dataset was evaluated using the Explorer interface of the open-source Weka data mining software. The results show that the proposed hybrid feature selection approach significantly outperforms the single filter approaches and principal component analysis (PCA) for optimum feature selection. These characteristics are good indicators for the return prediction. The highest accuracy achieved with the support vector machine (SVM) classifier before applying CHFS-BOGA was 97.3%. The highest accuracy after applying it (CHFS-BOGA-SVM) was 98.25% on a 70.0% training split with the remainder used for testing, and 100% on the full training set. Moreover, the area under the receiver operating characteristic (ROC) curve was equal to 1.0. The results showed that the proposed CHFS-BOGA-SVM system was able to accurately classify the type of breast tumor, whether malignant or benign.
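
A rough sketch of the idea of seeding a genetic algorithm with filter results is shown below. It is not the authors' CHFS-BOGA: the filter scores, the toy fitness function (the abstract uses a C4.5 decision tree instead), the selection scheme, and all parameter values are assumptions chosen to keep the example small.

```python
import random

def seeded_population(filter_scores, size, rng):
    """Create bit masks in which higher-scored features are more likely to be on."""
    top = max(filter_scores)
    return [[1 if rng.random() < s / top else 0 for s in filter_scores]
            for _ in range(size)]

def evolve(fitness, filter_scores, size=20, generations=30,
           mutation_rate=0.05, seed=0):
    """Return the best feature mask found by a small genetic algorithm."""
    rng = random.Random(seed)
    population = seeded_population(filter_scores, size, rng)
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: size // 2]       # keep the better half unchanged
        children = []
        while len(parents) + len(children) < size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))      # single-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]          # bit-flip mutation
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Hypothetical setup: features 0, 1 and 4 are useful, the rest are noise.
USEFUL = [1, 1, 0, 0, 1, 0]
SCORES = [0.9, 0.8, 0.1, 0.2, 0.7, 0.1]  # made-up positive filter scores

def toy_fitness(mask):
    """Reward selected useful features and penalize selected noise features."""
    return sum(1.0 if u else -0.5 for m, u in zip(mask, USEFUL) if m)

best_mask = evolve(toy_fitness, SCORES)
```

Because the better half of each generation is carried over unchanged, the best fitness in the population never decreases from one generation to the next.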

Feature Selection for Bankruptcy Prediction

Nature-Inspired Computing Design, Development, and Applications ◽

10.4018/978-1-4666-1574-8.ch009 ◽

2012 ◽

pp. 158-178

Author(s):

A. Gaspar-Cunha ◽

F. Mendes ◽

J. Duarte ◽

A. Vieira ◽

B. Ribeiro ◽

...

Keyword(s):

Logistic Regression ◽

Feature Selection ◽

Financial Statements ◽

Bankruptcy Prediction ◽

Decision Makers ◽

Support Vector ◽

Financial Health ◽

Vector Machines ◽

Feature Selection Approach ◽

A Company

In this work, a Multi-Objective Evolutionary Algorithm (MOEA) was applied to feature selection in the problem of bankruptcy prediction. The algorithm maximizes the accuracy of the classifier while keeping the number of features low. This two-objective problem, minimizing the number of features while maximizing accuracy, was fully analyzed using Logistic Regression (LR) and Support Vector Machine (SVM) classifiers. The parameters required by both classifiers were optimized simultaneously, and the validity of the proposed methodology was tested using a database containing the financial statements of 1200 medium-sized private French companies. Extensive tests show that the MOEA is an efficient feature selection approach. The best results were obtained when both the accuracy and the classifier parameters were optimized. The proposed method can provide useful information for decision makers in characterizing the financial health of a company.
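
The two-objective trade-off can be made concrete with a small non-dominated filter. This sketch only shows how candidate solutions are compared on (number of features, accuracy); it is not the evolutionary algorithm itself, and the candidate values are invented.

```python
def pareto_front(candidates):
    """Keep (n_features, accuracy) pairs that no other pair dominates.

    A pair dominates another if it uses no more features, is no less
    accurate, and is strictly better on at least one of the two.
    """
    front = []
    for n, acc in candidates:
        dominated = any(
            m <= n and other >= acc and (m < n or other > acc)
            for m, other in candidates
        )
        if not dominated:
            front.append((n, acc))
    return sorted(set(front))

# Hypothetical feature subsets evaluated by some classifier.
candidates = [(3, 0.80), (5, 0.85), (5, 0.82), (8, 0.85), (10, 0.90)]
front = pareto_front(candidates)
```

Here the subsets with 5 features at 0.82 accuracy and 8 features at 0.85 accuracy drop out, because the 5-feature subset at 0.85 is at least as good on both objectives.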

An ensemble feature selection approach using hybrid kernel based SVM for network intrusion detection system

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v23.i1.pp558-565 ◽

2021 ◽

Vol 23 (1) ◽

pp. 558

Author(s):

Gaddam Venu Gopal ◽

Gatram Rama Mohan Babu

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Network Intrusion Detection ◽

Support Vector ◽

Feature Subset ◽

Network Intrusion ◽

Feature Selection Approach ◽

Hybrid Kernel

Feature selection is the process of identifying a relevant feature subset that guides the machine learning algorithm in a well-defined manner. In this paper, a novel ensemble feature selection approach that comprises Relief Attribute Evaluation and a hybrid kernel-based support vector machine (HK-SVM) is proposed as a feature selection method for network intrusion detection systems (NIDS). A hybrid approach that combines Gaussian and polynomial methods is used as the kernel for the support vector machine (SVM). The key issue is to select a feature subset that yields good accuracy at minimal computational cost. The proposed approach is implemented and compared with the classical SVM and a simple kernel. Kyoto2006+, a benchmark intrusion detection dataset, is used for experimental evaluation, and observations are then drawn.
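
The abstract does not state how the Gaussian and polynomial kernels are combined. One common option, assumed here purely for illustration, is a weighted sum, which remains a valid kernel when the weights are non-negative. All parameter names and default values below are hypothetical.

```python
import math

def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree

def hybrid_kernel(x, y, weight=0.5, gamma=0.5, degree=2, coef0=1.0):
    """Weighted sum of the two kernels, with weight between 0 and 1 (assumed form)."""
    return (weight * gaussian_kernel(x, y, gamma)
            + (1.0 - weight) * polynomial_kernel(x, y, degree, coef0))
```

A function of this shape could be passed to any SVM implementation that accepts a user-defined kernel; the mixing weight then becomes one more hyperparameter to tune.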

Recursive Cluster Elimination based Rank Function (SVM-RCE-R) implemented in KNIME

F1000Research ◽

10.12688/f1000research.26880.1 ◽

2020 ◽

Vol 9 ◽

pp. 1255 ◽

Cited By ~ 1

Author(s):

Malik Yousef ◽

Burcu Bakir-Gungor ◽

Amhar Jabeer ◽

Gokhan Goy ◽

Rehman Qureshi ◽

...

Keyword(s):

Feature Selection ◽

Simple Structure ◽

Selection Process ◽

Ranking Function ◽

Support Vector ◽

Scientific Publications ◽

Vector Machines ◽

Feature Selection Approach ◽

Sensitivity Specificity ◽

Excel File

In our earlier study, we proposed a novel feature selection approach, Recursive Cluster Elimination with Support Vector Machines (SVM-RCE), and implemented it in MATLAB. Interest in this approach has grown over time, and several researchers have incorporated SVM-RCE into their studies, resulting in a substantial number of scientific publications. This increased interest encouraged us to reconsider how feature selection, particularly in biological datasets, can benefit from considering the relationships of those genes in the selection process, which led to our development of SVM-RCE-R. The usefulness of SVM-RCE-R is further supported by the development of the maTE tool, which uses a similar approach to identify microRNA (miRNA) targets. We have now implemented the SVM-RCE-R algorithm in KNIME to make it easier to apply and more accessible to the biomedical community. The use of SVM-RCE-R in KNIME is simple and intuitive, allowing researchers to begin their data analysis immediately without having to consult an information technology specialist. The input for the KNIME tool is an Excel file (or a text or CSV file) with a simple structure, and the output is also an Excel file. The KNIME version also incorporates new features not available in the previous version. One of these features is a user-specific ranking function that lets the user provide the weights of the accuracy, sensitivity, specificity, f-measure, area under the curve, and precision in the ranking function, so that the user can select for greater sensitivity or greater specificity as needed. The results show that the ranking function has an impact on the performance of SVM-RCE-R. Some of the clusters that achieve high scores for a specified ranking can also have high scores in other metrics. This finding motivates future studies to suggest the optimal ranking function.
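
A user-weighted ranking function of the kind described can be sketched as a weighted sum of the six metrics. The weighted-sum form, the metric keys, and the cluster values below are assumptions for illustration and are not taken from the KNIME workflow.

```python
METRICS = ("accuracy", "sensitivity", "specificity",
           "f_measure", "auc", "precision")

def rank_score(metrics, weights):
    """Weighted sum of the metrics; metrics without a weight count as zero."""
    return sum(weights.get(name, 0.0) * metrics[name] for name in METRICS)

def rank_clusters(cluster_metrics, weights):
    """Return cluster names ordered from the highest score to the lowest."""
    return sorted(cluster_metrics,
                  key=lambda name: rank_score(cluster_metrics[name], weights),
                  reverse=True)

# Hypothetical metric values for two gene clusters.
clusters = {
    "cluster_A": {"accuracy": 0.80, "sensitivity": 0.95, "specificity": 0.60,
                  "f_measure": 0.80, "auc": 0.85, "precision": 0.70},
    "cluster_B": {"accuracy": 0.80, "sensitivity": 0.60, "specificity": 0.95,
                  "f_measure": 0.75, "auc": 0.85, "precision": 0.90},
}
by_sensitivity = rank_clusters(clusters, {"sensitivity": 1.0})
by_specificity = rank_clusters(clusters, {"specificity": 1.0})
```

Changing the weights reorders the same clusters, which illustrates how a user could steer the selection toward sensitivity or specificity.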
