A Study of New Variable Selection Method Within a Framework of Real-Coded Genetic Algorithm

New Frontiers in Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-030-31605-1_4 ◽

2019 ◽

pp. 50-64

Author(s):

Takahiro Obata ◽

Setsuya Kurahashi

Keyword(s):

Genetic Algorithm ◽

Variable Selection ◽

Selection Method ◽

Variable Selection Method ◽

Real Coded Genetic Algorithm

Effect of genetic algorithm as a variable selection method on different chemometric models applied for the analysis of binary mixture of amoxicillin and flucloxacillin: A comparative study

Spectrochimica Acta Part A Molecular and Biomolecular Spectroscopy ◽

10.1016/j.saa.2015.11.024 ◽

2016 ◽

Vol 156 ◽

pp. 54-62 ◽

Cited By ~ 15

Author(s):

Khalid A.M. Attia ◽

Mohammed W.I. Nassar ◽

Mohamed B. El-Zeiny ◽

Ahmed Serag

Keyword(s):

Genetic Algorithm ◽

Binary Mixture ◽

Variable Selection ◽

Comparative Study ◽

Selection Method ◽

Variable Selection Method

Exploring environmental coverages of species: a new variable contribution estimation methodology for rulesets from the genetic algorithm for rule-set prediction

PeerJ ◽

10.7717/peerj.8968 ◽

2020 ◽

Vol 8 ◽

pp. e8968

Author(s):

Anni Yang ◽

Juan Pablo Gomez ◽

Jason K. Blackburn

Keyword(s):

Genetic Algorithm ◽

Variable Selection ◽

Explanatory Power ◽

Selection Procedure ◽

Environmental Variable ◽

Selection Method ◽

Variable Importance ◽

Ecological Niche Models ◽

Variable Selection Method ◽

Variable Contribution

Variable contribution estimation for, and determination of variable importance within, ecological niche models (ENMs) remain an important area of research with continuing challenges. Most ENM algorithms provide normally exhaustive searches through variable space; however, selecting variables to include in models is a first challenge. The estimation of the explanatory power of variables and the selection of the most appropriate variable set within models can be a second challenge. Although some ENMs incorporate the variable selection rubric inside the algorithms, there is no integrated rubric to evaluate the variable importance in the Genetic Algorithm for Ruleset Production (GARP). Here, we designed a novel variable selection methodology based on the rulesets generated from a GARP experiment. The importance of the variables in a GARP experiment can be estimated based on the prevalence of each environmental variable in the dominant presence rules of the best subset of models and its coverage. We tested the performance of this variable selection method based on simulated species with both weak and strong responses to simulated environmental covariates. The variable selection method generally performed well during the simulations, with over 2/3 of the trials correctly identifying most covariates. We then predicted the distribution of Toxostoma rufum (a bird with a cosmopolitan distribution) in the continental United States (US) and applied our variable selection procedure as a real-world example. We found that the distribution of T. rufum could be accurately modeled with 13 or 10 of 21 variables, using a UI cutoff of 0.5 or 0.25, respectively, arriving at parsimonious environmental coverages with good model accuracy. We also provide tools to simulate species distributions for testing ENM approaches using R.

Variable selection method for quantitative trait analysis based on parallel genetic algorithm

Annals of Human Genetics ◽

10.1111/j.1469-1809.2009.00548.x ◽

2010 ◽

Vol 74 (1) ◽

pp. 88-96 ◽

Cited By ~ 5

Author(s):

Siuli Mukhopadhyay ◽

Varghese George ◽

Hongyan Xu

Keyword(s):

Genetic Algorithm ◽

Variable Selection ◽

Quantitative Trait ◽

Selection Method ◽

Parallel Genetic Algorithm ◽

Variable Selection Method ◽

Trait Analysis ◽

Quantitative Trait Analysis

A Variable Selection Method for Pulverizing Capability Prediction of Tumbling Mill Based on Improved Hybrid Genetic Algorithm

Information Technology And Control ◽

10.5755/j01.itc.40.3.629 ◽

2011 ◽

Vol 40 (3) ◽

Author(s):

Lixin Jia ◽

Hui Cao ◽

Gangquan Si ◽

Yanbin Zhang

Keyword(s):

Genetic Algorithm ◽

Variable Selection ◽

Hybrid Genetic Algorithm ◽

Selection Method ◽

Variable Selection Method ◽

Tumbling Mill

Exploring Environmental Coverages of Species: A New Variable Selection Methodology for Rulesets from the Genetic Algorithm for Ruleset Prediction

10.1101/531079 ◽

2019 ◽

Author(s):

Anni Yang ◽

Juan Pablo Gomez ◽

Jason K. Blackburn

Keyword(s):

Genetic Algorithm ◽

Variable Selection ◽

Explanatory Power ◽

Selection Procedure ◽

Selection Method ◽

Organic Content ◽

Variable Importance ◽

Distribution Models ◽

Variable Selection Method ◽

Selection Methodology

Variable selection for, and determination of variable importance within, species distribution models (SDMs) remain an important area of research with continuing challenges. Most SDM algorithms provide normally exhaustive searches through variable space; however, selecting variables to include in models is a first challenge. The estimation of the explanatory power of variables and the selection of the most appropriate variable set within models can be a second challenge. Although some SDMs incorporate the variable selection rubric inside the algorithms, there is no integrated rubric to evaluate the variable importance in the Genetic Algorithm for Ruleset Production (GARP). Here, we designed a novel variable selection methodology based on the rulesets generated from a GARP experiment. The importance of the variables in a GARP experiment can be estimated based on the prevalence of each environmental variable in the dominant presence rules of the best subset of models and its coverage. We tested the performance of this variable selection method based on simulated species with both weak and strong responses to simulated environmental covariates. The variable selection method generally performed well during the simulations, with over 2/3 of the trials correctly identifying most covariates. We then predicted the distribution of Bacillus anthracis (the bacterium that causes anthrax) in the continental United States (US) and applied our variable selection procedure as a real-world example. We found that the distribution of B. anthracis was primarily determined by organic content, soil pH, calcic vertisols, vegetation, sand fraction, elevation, and seasonality in temperature and moisture.

Variable Selection Method Based on Partial Mutual Information and Its Application to NOx Emission Prediction

2020 39th Chinese Control Conference (CCC) ◽

10.23919/ccc50068.2020.9189070 ◽

2020 ◽

Author(s):

QIN Tianmu ◽

ZHANG Jinzhe ◽

YOU Mo ◽

YANG Tingting

Keyword(s):

Mutual Information ◽

Variable Selection ◽

Selection Method ◽

NOx Emission ◽

Variable Selection Method

Predictive and Descriptive CoMFA Models: The Effect of Variable Selection

Combinatorial Chemistry & High Throughput Screening ◽

10.2174/1386207321666180212162028 ◽

2018 ◽

Vol 21 (2) ◽

pp. 117-124 ◽

Cited By ~ 4

Author(s):

Bakhtyar Sepehri ◽

Nematollah Omidikia ◽

Mohsen Kompany-Zareh ◽

Raouf Ghavami

Keyword(s):

Variable Selection ◽

Predictive Power ◽

Selection Method ◽

Data Sets ◽

Data Set ◽

CoMFA Model ◽

Variable Selection Method

Aims & Scope: In this research, 8 variable selection approaches were used to investigate the effect of variable selection on the predictive power and stability of CoMFA models. Materials & Methods: Three data sets, comprising 36 EPAC antagonists, 79 CD38 inhibitors and 57 ATAD2 bromodomain inhibitors, were modelled by CoMFA. First, for all three data sets, CoMFA models with all CoMFA descriptors were created; then, by applying each variable selection method, a new CoMFA model was developed, so that 9 CoMFA models were built for each data set. The results show that noisy and uninformative variables affect CoMFA results. Based on the created models, applying 5 variable selection approaches (FFD, SRD-FFD, IVE-PLS, SRD-UVE-PLS and SPA-jackknife) increases the predictive power and stability of CoMFA models significantly. Result & Conclusion: Among them, SPA-jackknife removes most of the variables while FFD retains most of them. FFD and IVE-PLS are time-consuming processes, while SRD-FFD and SRD-UVE-PLS need only a few seconds to run. Also, applying FFD, SRD-FFD, IVE-PLS and SRD-UVE-PLS preserves the CoMFA contour map information for both fields.

Performance of smoothly clipped absolute deviation as a variable selection method in the artificial neural network‐based QSAR studies

Journal of Chemometrics ◽

10.1002/cem.3338 ◽

2021 ◽

Author(s):

Zeinab Mozafari ◽

Mansour Arab Chamjangali ◽

Mohammad Arashi ◽

Nasser Goudarzi

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Variable Selection ◽

Selection Method ◽

Absolute Deviation ◽

QSAR Studies ◽

Variable Selection Method ◽

Smoothly Clipped Absolute Deviation ◽

Artificial Neural

Variable Selection in the Regularized Simultaneous Component Analysis Method for Multi-Source Data Integration

Scientific Reports ◽

10.1038/s41598-019-54673-2 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 1

Author(s):

Zhengguo Gu ◽

Niek C. de Schipper ◽

Katrijn Van Deun

Keyword(s):

Variable Selection ◽

Data Integration ◽

Component Analysis ◽

Selection Method ◽

GPS Data ◽

Positioning Systems ◽

Variable Selection Method ◽

Diary Data ◽

Simultaneous Component Analysis ◽

Travel Diary

Interdisciplinary research often involves analyzing data obtained from different data sources with respect to the same subjects, objects, or experimental units. For example, global positioning systems (GPS) data have been coupled with travel diary data, resulting in a better understanding of traveling behavior. The GPS data and the travel diary data are very different in nature, and, to analyze the two types of data jointly, one often uses data integration techniques, such as the regularized simultaneous component analysis (regularized SCA) method. Regularized SCA is an extension of the (sparse) principal component analysis model to the cases where at least two data blocks are jointly analyzed, which - in order to reveal the joint and unique sources of variation - heavily relies on proper selection of the set of variables (i.e., component loadings) in the components. Regularized SCA requires a proper variable selection method to either identify the optimal values for tuning parameters or stably select variables. By means of two simulation studies with various noise and sparseness levels in simulated data, we compare six variable selection methods, which are cross-validation (CV) with the “one-standard-error” rule, repeated double CV (rdCV), BIC, Bolasso with CV, stability selection, and index of sparseness (IS) - a lesser-known (compared to the first five methods) but computationally efficient method. Results show that IS is the best-performing variable selection method.
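
As an illustration of the first method named in the abstract above, the "one-standard-error" rule can be sketched in a few lines of Python. This is a generic sketch, not the authors' regularized SCA implementation: the function name `one_standard_error_choice`, the input layout, and the use of "number of retained variables" as the complexity measure are all assumptions made for the example. The rule picks the sparsest candidate whose mean CV error is within one standard error of the lowest mean CV error.

```python
import math
from statistics import mean, stdev


def one_standard_error_choice(candidates):
    """Return the complexity chosen by the one-standard-error rule.

    candidates: list of (complexity, fold_errors) pairs, where complexity is
    a hypothetical sparsity measure (here: number of retained variables) and
    fold_errors holds one CV error per fold (at least two folds).
    """
    summary = []
    for complexity, fold_errors in candidates:
        avg = mean(fold_errors)
        # Standard error of the mean CV error across folds.
        se = stdev(fold_errors) / math.sqrt(len(fold_errors))
        summary.append((complexity, avg, se))

    # Candidate with the lowest mean CV error.
    best = min(summary, key=lambda item: item[1])
    threshold = best[1] + best[2]

    # Among candidates within one standard error of the best,
    # prefer the one with the fewest retained variables.
    eligible = [item for item in summary if item[1] <= threshold]
    return min(eligible, key=lambda item: item[0])[0]


# Made-up fold errors for four candidate models (illustrative data only).
candidates = [
    (2, [0.30, 0.34, 0.32, 0.31, 0.33]),
    (5, [0.21, 0.27, 0.24, 0.22, 0.26]),
    (9, [0.20, 0.26, 0.23, 0.21, 0.25]),
    (14, [0.21, 0.27, 0.23, 0.22, 0.25]),
]
chosen = one_standard_error_choice(candidates)
print("chosen number of variables:", chosen)
```

In this made-up example the 9-variable model has the lowest mean error, but the 5-variable model falls within one standard error of it, so the rule trades a small amount of accuracy for a sparser model.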
