A Large-Scale Study of the Impact of Feature Selection Techniques on Defect Classification Models

A comprehensive investigation of the impact of feature selection techniques on crashing fault residence prediction models

Information and Software Technology ◽

10.1016/j.infsof.2021.106652 ◽

2021 ◽

pp. 106652

Author(s):

Kunsong Zhao ◽

Zhou Xu ◽

Meng Yan ◽

Tao Zhang ◽

Dan Yang ◽

...

Keyword(s):

Feature Selection ◽

Prediction Models ◽

Comprehensive Investigation ◽

The Impact ◽

Feature Selection Techniques

THE ACTIVITY OF F.F. ERISMAN IN MOSCOW LOW TERRITORIAL ORGANIZATIONS (ON THE 175TH ANNIVERSARY OF HIS BIRTH)

Hygiene and Sanitation ◽

10.18821/0016-9900-2018-97-4-375-377 ◽

2018 ◽

Vol 97 (4) ◽

pp. 375-377

Author(s):

Irina V. Egorysheva

Keyword(s):

Large Scale ◽

Dental Hygienist ◽

Statistical Research ◽

Large Scale Study ◽

The Creation ◽

The Impact ◽

Work And Life

The article is devoted to the participation of the outstanding dental hygienist F. F. Erisman in the development of the Moscow low territorial sanitary organization. Under his leadership, a large-scale study of the impact of working and living conditions on the health of plant workers was carried out; it served as a model for similar sanitary-statistical research in a number of rural provinces. F. F. Erisman actively participated in the work of the sanitary organization of the Moscow gubernia Zemstvo and in the creation of the first district sanitary bureau.

Feature selection techniques and comparative studies for large-scale manufacturing processes

The International Journal of Advanced Manufacturing Technology ◽

10.1007/s00170-004-2434-7 ◽

2006 ◽

Vol 28 (9-10) ◽

pp. 1006-1011 ◽

Cited By ~ 7

Author(s):

Buhwan Jeong ◽

Hyunbo Cho

Keyword(s):

Feature Selection ◽

Comparative Studies ◽

Large Scale ◽

Manufacturing Processes ◽

Feature Selection Techniques

Widespread effects of DNA methylation and intra-motif dependencies revealed by novel transcription factor binding models

10.1101/2020.10.21.348193 ◽

2020 ◽

Author(s):

Jan Grau ◽

Florian Schmidt ◽

Marcel H. Schulz

Keyword(s):

Dna Methylation ◽

Transcription Factor ◽

Large Scale ◽

Transcription Factor Binding ◽

Cpg Methylation ◽

Motif Analysis ◽

Factor Binding ◽

Large Scale Study ◽

Binding Models ◽

The Impact

Several studies suggested that transcription factor (TF) binding to DNA may be impaired or enhanced by DNA methylation. We present MeDeMo, a toolbox for TF motif analysis that combines information about DNA methylation with models capturing intra-motif dependencies. In a large-scale study using ChIP-seq data for 335 TFs, we identify novel TFs that are affected by DNA methylation. Overall, we find that CpG methylation decreases the likelihood of binding for the majority of TFs. For a considerable subset of TFs, we show that intra-motif dependencies are pivotal for accurately modelling the impact of DNA methylation on TF binding.

An Ensemble Voted Feature Selection Technique for Predictive Modeling of Malwares of Android

International Journal of Information System Modeling and Design ◽

10.4018/ijismd.2019040103 ◽

2019 ◽

Vol 10 (2) ◽

pp. 46-69

Author(s):

Abhishek Bhattacharya ◽

Radha Tamal Goswami ◽

Kuntal Mukherjee ◽

Nhu Gia Nguyen

Keyword(s):

Feature Selection ◽

Predictive Modeling ◽

Data Partitioning ◽

Coefficient Of Determination ◽

Feature Selection Technique ◽

Selection Technique ◽

Feature Selector ◽

The Impact ◽

Feature Selection Techniques ◽

Installation Time

Each Android application requests a set of permissions at installation time, and these permissions can be used as features in permission-based identification of Android malware. Recently, ensemble feature selection techniques have received increasing attention over conventional techniques in different applications. In this work, a cluster-based voted ensemble feature selection technique combining five base wrapper approaches from R libraries is proposed for identifying the most prominent set of features in the predictive modeling of Android malware. The proposed method preserves both desirable properties of an ensemble feature selector: accuracy and diversity. Moreover, five different data partitioning ratios are considered, and the impact of those ratios on the predictive model is measured using the coefficient of determination (r-square) and root mean square error. The proposed strategy yields significantly better outcomes in terms of the number of selected features and classification accuracy.
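To make the voting idea in the abstract above concrete, here is a minimal Python sketch. It is not the authors' method: the clustering step and the five R wrapper approaches are omitted, the permission names and the vote threshold are hypothetical, and the function names `vote_features`, `r_squared`, and `rmse` are chosen only for illustration. It shows a plain majority vote over the outputs of several base selectors, plus the two error measures the abstract names.

```python
from collections import Counter
from math import sqrt


def vote_features(selections, min_votes):
    """Keep every feature chosen by at least `min_votes` base selectors."""
    # set(chosen) guards against a selector listing the same feature twice.
    votes = Counter(f for chosen in selections for f in set(chosen))
    return sorted(f for f, n in votes.items() if n >= min_votes)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error of the predictions."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical outputs of five base selectors (illustrative names only).
selections = [
    ["SEND_SMS", "READ_CONTACTS", "INTERNET"],
    ["SEND_SMS", "INTERNET"],
    ["SEND_SMS", "READ_CONTACTS", "CAMERA"],
    ["SEND_SMS", "READ_CONTACTS"],
    ["INTERNET", "SEND_SMS"],
]
selected = vote_features(selections, min_votes=3)
print(selected)
```

Raising `min_votes` trades a smaller feature set against the risk of dropping features that only some selectors find useful; the threshold of three out of five here is an assumption, not a value from the paper.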

Feature Selection Techniques for the Analysis of Discriminative Features in Temporal and Frontal Lobe Epilepsy: A Comparative Study

The Open Biomedical Engineering Journal ◽

10.2174/1874120702115010001 ◽

2021 ◽

Vol 15 (1) ◽

pp. 1-15

Author(s):

Behrooz Abbaszadeh ◽

Cesar Alexandre Domingues Teixeira ◽

Mustapha C.E. Yagoub

Keyword(s):

Feature Selection ◽

Frontal Lobe ◽

Time Domain ◽

Life Quality ◽

Epileptic Seizures ◽

Ar Model ◽

Model Parameters ◽

Frontal Lobe Epilepsy ◽

The Impact ◽

Feature Selection Techniques

Background: Because about 30% of epileptic patients suffer from refractory epilepsy, an efficient automatic seizure prediction tool is in great demand to improve their quality of life. Methods: In this work, time-domain features discriminating preictal from interictal states were extracted from the intracranial electroencephalogram of twelve patients, i.e., six with temporal and six with frontal lobe epilepsy. The performance of three types of feature selection methods was compared using the Matthews correlation coefficient (MCC). Results: Kruskal-Wallis, a non-parametric approach, was found to perform better than the other approaches, owing to its simple and less resource-intensive strategy while maintaining the highest MCC score. The impact of dividing the electroencephalogram signals into various sub-bands was investigated as well. The superior performance of Kruskal-Wallis suggests the importance of univariate features like complexity and interquartile ratio (IQR), along with autoregressive (AR) model parameters and the maximum (MAX) cross-correlation, for efficiently predicting epileptic seizures. Conclusion: The proposed approach has the potential to be implemented on a low-power device by considering a few simple time-domain characteristics for a specific sub-band. It should be noted that, as there is not a great deal of literature on frontal lobe epilepsy, the results of this work can be considered promising.
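The two quantities named in the abstract above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the paper's pipeline: the feature values are made up, the Kruskal-Wallis H statistic is computed without the tie correction, and the helper names (`average_ranks`, `kruskal_wallis_h`, `mcc`) are hypothetical. Features are ranked by how strongly H separates the preictal and interictal groups, and MCC scores a binary prediction.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for 0/1 labels (0.0 if undefined)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical feature values: (preictal samples, interictal samples).
features = {
    "complexity": ([1, 2, 3], [4, 5, 6]),
    "iqr": ([1, 4, 5], [2, 3, 6]),
}
ranking = sorted(features, key=lambda f: kruskal_wallis_h(features[f]), reverse=True)
print(ranking)
```

Because each feature is scored on its own, this kind of filter needs only sorting and a few sums per feature, which is consistent with the abstract's remark about low resource consumption.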

Tuning impact in Latin America: is there implementation beyond design?

Tuning Journal for Higher Education ◽

10.18543/tjhe-3(1)-2015pp187-216 ◽

2015 ◽

Vol 3 (1) ◽

pp. 187 ◽

Cited By ~ 2

Author(s):

Pablo Beneitone ◽

Maria Yarosh

Keyword(s):

Latin America ◽

Present Article ◽

General Framework ◽

Large Scale ◽

Institutional Support ◽

World Regions ◽

Large Scale Study ◽

Teaching Learning ◽

Two Stages ◽

The Impact

Deusto International Tuning Academy is undertaking a large-scale study to analyse the impact Tuning projects may have had in participating universities. More particularly, the study hopes to provide an unambiguous answer regarding the presence or absence of the implementation of a competence-based student-centred approach in the different world regions where Tuning projects have taken place. The present article focuses only on Latin America where two Tuning projects have been developed. It describes the findings of the first two stages of the study. After reporting the data, the authors argue that there is evidence of a Tuning impact in each of three intended impact domains: (1) understanding of the importance of a shift from content- to competence-based education; (2) provision of institutional support necessary to facilitate this change; and (3) appropriate teaching, learning and assessment within the general framework of the study plans and degree profiles.

The Impact of Feature Selection Techniques on the Performance of Predicting Parkinson’s Disease

International Journal of Information Technology and Computer Science ◽

10.5815/ijitcs.2018.11.02 ◽

2018 ◽

Vol 10 (11) ◽

pp. 14-29

Author(s):

Abdullah Al Imran ◽

Ananya Rahman ◽

Humayoun Kabir ◽

Shamsur Rahim

Keyword(s):

Parkinson's Disease ◽

Feature Selection ◽

The Impact ◽

Feature Selection Techniques

Feature Selection for Interpatient Supervised Heart Beat Classification

Computational Intelligence and Neuroscience ◽

10.1155/2011/643816 ◽

2011 ◽

Vol 2011 ◽

pp. 1-9 ◽

Cited By ~ 15

Author(s):

G. Doquire ◽

G. de Lannoy ◽

D. François ◽

M. Verleysen

Keyword(s):

Feature Selection ◽

Domain Knowledge ◽

Heart Beat ◽

Classification Models ◽

Feature Sets ◽

Beat Classification ◽

Feature Selection Techniques ◽

Optimal Feature

Supervised and interpatient classification of heart beats is essential in many applications requiring long-term monitoring of the cardiac function. Several classification models able to cope with the strong class imbalance and a large variety of feature sets have been proposed for this task. In practice, over 200 features are often considered, and the features retained in the final model are chosen either using domain knowledge or by an exhaustive search in the feature sets, without evaluating the relevance of each individual feature included in the classifier. As a consequence, the results obtained by these models can be suboptimal and difficult to interpret. In this work, feature selection techniques are considered to extract optimal feature subsets for state-of-the-art ECG classification models. The performances are evaluated on real ambulatory recordings and compared to previously reported feature choices using the same models. Results indicate that only a small number of individual features actually serve the classification and that better performances can be achieved by removing useless features.

Developing Roadway Safety Models for Winter Weather Conditions Using a Feature Selection Algorithm

Journal of Advanced Transportation ◽

10.1155/2020/8824943 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13

Author(s):

Bryce Hallmark ◽

Jing Dong

Keyword(s):

Feature Selection ◽

Large Scale ◽

Negative Binomial ◽

Weather Conditions ◽

Winter Weather ◽

Crash Frequency ◽

Frequency Model ◽

Roadway Safety ◽

Selection Framework ◽

The Impact

Inclement winter weather such as snow, sleet, and freezing rain significantly impacts roadway safety. To assess the safety implications of winter weather, maintenance operations, and traffic operations, various crash frequency models have been developed. In this study, several datasets, including for weather, snowplow operations, and traffic information, were combined to develop a robust crash frequency model for winter weather conditions. When developing statistical models using such large-scale multivariate datasets, one of the challenges is to determine which explanatory variables should be included in the model. This paper presents a feature selection framework using a machine-learning algorithm known as the Boruta algorithm and exhaustive search to select a list of variables to be included in a negative binomial crash frequency model. This paper’s proposed feature selection framework generates consistent and intuitive results because the feature selection process reduces the complexity of interactions among different variables in the dataset. This enables our crash frequency model to better help agencies identify effective ways to improve roadway safety via winter maintenance operations. For example, increased plowing operations before the start of storms are associated with a decrease in crash rates. Thus, pretreatment operations can play a significant role in mitigating the impact of winter storms.
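The shadow-feature idea behind Boruta, mentioned in the abstract above, can be illustrated with a much simplified Python sketch. This is not the Boruta algorithm itself and not the authors' framework: real Boruta uses random forest importances and a statistical test over iterations, whereas this sketch substitutes absolute Pearson correlation as a stand-in importance and a simple majority rule. The variable names, data, and the `shadow_select` helper are hypothetical, and fitting the negative binomial model is not shown.

```python
import random
from math import sqrt


def abs_corr(x, y):
    """Absolute Pearson correlation; a stand-in for a real importance score."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / sqrt(vx * vy)


def shadow_select(columns, target, trials=20, seed=0):
    """Keep features that beat the best shuffled 'shadow' copy in most trials."""
    rng = random.Random(seed)
    hits = {name: 0 for name in columns}
    for _ in range(trials):
        # Shadows are shuffled copies, so any link to the target is by chance.
        shadow_best = 0.0
        for values in columns.values():
            shuffled = list(values)
            rng.shuffle(shuffled)
            shadow_best = max(shadow_best, abs_corr(shuffled, target))
        for name, values in columns.items():
            if abs_corr(values, target) > shadow_best:
                hits[name] += 1
    return sorted(name for name, h in hits.items() if h > trials / 2)


# Hypothetical data: crash counts and two candidate explanatory variables.
crashes = list(range(20))
columns = {
    "snowfall_cm": [2 * c + 1 for c in crashes],   # perfectly linear in target
    "sensor_id": [(7 * c) % 5 for c in crashes],   # arbitrary, weakly related
}
selected = shadow_select(columns, crashes)
print(selected)
```

The variables that survive such a filter would then be passed to the count model; the abstract pairs this step with an exhaustive search over the surviving variables, which this sketch does not attempt.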
