Longitudinal variable selection by cross-validation in the case of many covariates

2007 ◽  
Vol 26 (4) ◽  
pp. 919-930 ◽  
Author(s):  
E. Cantoni ◽  
C. Field ◽  
J. Mills Flemming ◽  
E. Ronchetti

2014 ◽  
Vol 70 (5) ◽  
Author(s):  
Nor Fazila Rasaruddin ◽  
Mas Ezatul Nadia Mohd Ruah ◽  
Mohamed Noor Hasan ◽  
Mohd Zuli Jaafar

This paper presents the determination of the iodine value (IV) of pure and frying palm oils using Partial Least Squares (PLS) regression with variable selection. A total of 28 samples of pure and frying palm oils were acquired from markets; seven were considered high-priced palm oils, while the remaining were low-priced. PLS regression models were developed for the determination of IV from Fourier Transform Infrared (FTIR) spectra recorded in absorbance mode over the range 650 cm⁻¹ to 4000 cm⁻¹. A Savitzky-Golay derivative was applied before developing the prediction models. The models were constructed from wavelengths selected in the FTIR region using the selectivity ratio (SR) plot and the correlation coefficient with the IV parameter. Each model was validated through the root mean square error of cross-validation (RMSECV) and the cross-validation correlation coefficient (R²cv). With SR-based selection, the best models were the mean-centred model for the pure samples and the model combining row scaling and standardization for the frying samples. With correlation-coefficient-based selection, the best models were the model combining row scaling and standardization for the pure samples and the mean-centred model for the frying samples. Row scaling of the variables is not necessary when developing the models, since its effect on model quality is insignificant.
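
For readers who want to experiment with this kind of workflow, the sketch below (not the authors' actual pipeline) applies correlation-coefficient wavelength selection followed by PLS regression scored by RMSECV and R²cv. The data are synthetic stand-ins for the FTIR spectra and IV values; the sample count matches the study, but the selection threshold, component count, fold count, and all names are illustrative assumptions.

```python
# Minimal sketch: correlation-based wavelength selection + PLS with RMSECV.
# Synthetic data stands in for the FTIR spectra and IV values; the threshold
# and component count are illustrative assumptions, not values from the paper.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_samples, n_wavelengths = 28, 500                 # 28 samples as in the study
X = rng.normal(size=(n_samples, n_wavelengths))    # stand-in absorbance spectra
y = X[:, :10].sum(axis=1) + rng.normal(scale=0.5, size=n_samples)  # stand-in IV

# Variable selection: keep wavelengths whose |correlation| with IV is high.
corr = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(n_wavelengths)])
selected = np.abs(corr) > 0.3                      # illustrative cutoff
X_sel = X[:, selected]

# Mean centring is handled internally by PLSRegression; scale=True would add
# standardization of the variables on top of the centring.
pls = PLSRegression(n_components=3, scale=False)
y_cv = cross_val_predict(pls, X_sel, y, cv=7).ravel()   # 7-fold cross-validation

rmsecv = np.sqrt(np.mean((y - y_cv) ** 2))
r2cv = np.corrcoef(y, y_cv)[0, 1] ** 2
print(f"RMSECV = {rmsecv:.3f}, R2cv = {r2cv:.3f}")
```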


2014 ◽  
Vol 23 (7) ◽  
pp. 811-820 ◽  
Author(s):  
Kévin Le Rest ◽  
David Pinaud ◽  
Pascal Monestiez ◽  
Joël Chadoeuf ◽  
Vincent Bretagnolle

Stats ◽  
2021 ◽  
Vol 4 (4) ◽  
pp. 868-892 ◽  
Author(s):  
Yuchen Chen ◽  
Yuhong Yang

Previous research has extensively discussed the selection of regularization parameters in the application of regularization methods to high-dimensional regression. The popular “One Standard Error Rule” (1se rule) used with cross-validation (CV) selects the most parsimonious model whose prediction error is not much worse than the minimum CV error. This paper examines the validity of the 1se rule from a theoretical angle and studies its estimation accuracy and its performance in regression estimation and variable selection, particularly for the Lasso in a regression framework. Our theoretical result shows that when a regression procedure produces an estimator converging relatively fast to the true regression function, the standard error estimation formula in the 1se rule is asymptotically justified. The numerical results show the following: 1. the 1se rule in general does not provide a good estimate of the intended standard deviation of the cross-validation error; the estimation bias can be 50–100% upwards or downwards in various situations; 2. the results support that the 1se rule usually outperforms regular CV in sparse variable selection and alleviates the over-selection tendency of the Lasso; 3. in regression estimation or prediction, the 1se rule often performs worse. In addition, comparisons are made over two real data sets: Boston Housing Prices (large sample size n, small/moderate number of variables p) and Bardet–Biedl data (large p, small n). Data-guided simulations are carried out to provide insight into the relative performance of the 1se rule and regular CV.
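
To make the 1se rule concrete, the sketch below implements its standard construction (not the authors' code) for the Lasso: after cross-validation, it computes the fold-wise standard error of the CV error at each penalty and picks the largest penalty whose mean CV error stays within one standard error of the minimum. The data are synthetic, and the fold count is an illustrative choice.

```python
# Minimal sketch of the "One Standard Error Rule" for the Lasso: among all
# penalties whose mean CV error is within one standard error of the minimum,
# choose the largest (most parsimonious) one. Synthetic data for illustration.
import numpy as np
from sklearn.linear_model import Lasso, LassoCV

rng = np.random.default_rng(0)
n, p = 100, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 2.0                                # sparse truth: 5 active variables
y = X @ beta + rng.normal(size=n)

cv = LassoCV(cv=10, random_state=0).fit(X, y)

# mse_path_ has shape (n_alphas, n_folds); alphas_ is in decreasing order.
mean_mse = cv.mse_path_.mean(axis=1)
se_mse = cv.mse_path_.std(axis=1, ddof=1) / np.sqrt(cv.mse_path_.shape[1])

i_min = mean_mse.argmin()
threshold = mean_mse[i_min] + se_mse[i_min]
# First index meeting the threshold = largest alpha, i.e. the sparsest model.
i_1se = np.argmax(mean_mse <= threshold)
alpha_1se = cv.alphas_[i_1se]

lasso_1se = Lasso(alpha=alpha_1se).fit(X, y)
print(f"alpha_min = {cv.alpha_:.4f}, alpha_1se = {alpha_1se:.4f}")
print(f"nonzero coefs, min-CV vs 1se: "
      f"{np.sum(Lasso(alpha=cv.alpha_).fit(X, y).coef_ != 0)} vs "
      f"{np.sum(lasso_1se.coef_ != 0)}")
```

Because LassoCV stores alphas_ in decreasing order, taking the first index that meets the threshold yields the sparsest admissible model, which is the over-selection-alleviating behaviour the paper evaluates in its finding 2.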


2008 ◽  
Vol 48 (2) ◽  
pp. 370-383 ◽  
Author(s):  
Dmitry A. Konovalov ◽  
Nigel Sim ◽  
Eric Deconinck ◽  
Yvan Vander Heyden ◽  
Danny Coomans

2013 ◽  
Vol 03 (02) ◽  
pp. 79-102 ◽  
Author(s):  
Hans C. van Houwelingen ◽  
Willi Sauerbrei
