Bayesian Forecasting with Highly Correlated Predictors

2012 ◽  
Author(s):  
Dimitris Korobilis


2013 ◽  
Vol 233 (4) ◽  
pp. 526-549 ◽  
Author(s):  
Ivan Savin

Summary: This study presents a first comparative analysis of Lasso-type (Lasso, adaptive Lasso, elastic net) and heuristic subset selection methods. Although the Lasso has proved successful in many situations, it has limitations: in particular, it yields inconsistent results when predictors are pairwise highly correlated. An alternative to the Lasso is model selection based on information criteria (IC), which remains consistent in this situation. However, these criteria are hard to optimize because the search space is discrete. To overcome this problem, an optimization heuristic (a Genetic Algorithm) is applied. Results of a Monte Carlo simulation study, together with an application to an empirical problem, illustrate the performance of the methods.
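The Lasso's inconsistency under pairwise high correlation can be seen directly in a small experiment. The following is a minimal sketch (not the author's code) using scikit-learn: two near-duplicate predictors are generated, and the elastic net's L2 component spreads weight across the pair while the Lasso concentrates it.

```python
# Minimal sketch: Lasso vs. elastic net on two nearly identical predictors.
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)      # pairwise correlation close to 1
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=n)   # both predictors carry the signal

lasso = Lasso(alpha=0.1, max_iter=10000).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5, max_iter=10000).fit(X, y)

# The Lasso tends to allocate the correlated pair's weight arbitrarily,
# while the elastic net assigns nearly equal coefficients to both.
print("Lasso coefficients:      ", lasso.coef_)
print("Elastic net coefficients:", enet.coef_)
```

The instability of the Lasso's allocation across the pair (it depends on which column the solver happens to favour) is exactly the inconsistency the abstract refers to.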


2020 ◽  
pp. 1471082X2092097
Author(s):  
Lauren J Beesley ◽  
Jeremy MG Taylor

Multistate modelling is a strategy for jointly modelling related time-to-event outcomes: it can handle complicated outcome relationships, has appealing interpretations, can provide insight into different aspects of disease development and can be useful for making individualized predictions. A challenge with using multistate modelling in practice is the large number of parameters, and variable selection and shrinkage strategies are needed for these models to gain wider adoption. Applying existing selection and shrinkage strategies in the multistate setting can be challenging due to complicated patterns of data missingness, the inclusion of highly correlated predictors and hierarchical parameter relationships. In this article, we discuss how to modify and implement several existing Bayesian variable selection and shrinkage methods in a general multistate modelling setting. We compare the performance of these methods in terms of parameter estimation and model selection in a multistate cure model of recurrence and death in patients treated for head and neck cancer. This work can be viewed as a case study of variable selection and shrinkage in a complicated modelling setting with missing data.
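The shrinkage idea underlying these methods can be illustrated in its simplest conjugate form, well short of the article's multistate machinery: with a N(0, τ²) prior on the regression coefficients and known noise variance σ², the posterior mean is a ridge-type estimator that pulls the least-squares solution towards zero and stabilizes highly correlated predictors. A minimal sketch (the variance values are illustrative assumptions):

```python
# Bayesian shrinkage in the simplest conjugate case: normal prior, normal noise.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n)   # a highly correlated pair
y = X[:, 0] + rng.normal(size=n)

sigma2, tau2 = 1.0, 0.5                         # assumed noise / prior variances

# OLS solution vs. posterior mean under beta ~ N(0, tau2 * I):
# the posterior mean solves (X'X + (sigma2/tau2) I) beta = X'y.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
beta_post = np.linalg.solve(X.T @ X + (sigma2 / tau2) * np.eye(p), X.T @ y)

print("OLS:           ", beta_ols.round(2))
print("Posterior mean:", beta_post.round(2))
```

The posterior mean always has a smaller norm than the OLS estimate; the spike-and-slab and other priors discussed in the article extend this basic mechanism to perform selection as well as shrinkage.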


Author(s):  
Mariella Gregorich ◽  
Susanne Strohmaier ◽  
Daniela Dunkler ◽  
Georg Heinze

Regression models have been used for decades to explore and quantify the association between a dependent response and several independent variables in environmental sciences, epidemiology and public health. However, researchers often encounter situations in which some independent variables exhibit high bivariate correlation, or may even be collinear. Improper statistical handling of this situation will almost certainly produce models of little practical use and misleading interpretations. By means of two example studies, we demonstrate how diagnostic tools for collinearity or near-collinearity may fail to guide the analyst. Instead, the most appropriate way of handling collinearity should be driven by the research question at hand and, in particular, by the distinction between predictive and explanatory aims.
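The best-known diagnostic of this kind is the variance inflation factor (VIF), commonly flagged as problematic above 10. A minimal sketch (illustrative data, not the studies discussed in the article) computes it as the diagonal of the inverse correlation matrix of the predictors:

```python
# Variance inflation factors for a near-collinear design.
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=n)  # near-collinear with x1
x3 = rng.normal(size=n)                     # independent of the others
X = np.column_stack([x1, x2, x3])

# VIF_j is the j-th diagonal entry of the inverse correlation matrix;
# equivalently 1 / (1 - R^2_j) from regressing x_j on the other predictors.
R = np.corrcoef(X, rowvar=False)
vif = np.diag(np.linalg.inv(R))
print(dict(zip(["x1", "x2", "x3"], vif.round(1))))
```

A large VIF identifies which coefficients are poorly determined, but, as the article argues, it does not by itself tell the analyst what to do about it; that depends on whether the aim is prediction or explanation.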


2020 ◽  
Vol 4 (1) ◽  
pp. 203-215
Author(s):  
Asep Andri Fauzi ◽  
Agus M. Soleh ◽  
Anik Djuraidah

Highly correlated predictors and nonlinear relationships between the response and the predictors can degrade the performance of predictive modelling, especially with the ordinary least squares (OLS) method. A simple remedy is to use alternative methods such as Partial Least Squares Regression (PLSR), Support Vector Regression with a radial basis function kernel (SVR-RBF), and Random Forest Regression (RFR). The purpose of this study is to compare OLS, PLSR, SVR-RBF, and RFR using simulated data. The methods were evaluated by the root mean square error of prediction (RMSEP). The results showed that in the linear model, SVR-RBF and RFR have large RMSEP, so OLS and PLSR are better than SVR-RBF and RFR, and PLSR provides a much more stable prediction than OLS in the case of highly correlated predictors and small sample size. For nonlinear data, RFR produced the smallest RMSEP when the data contained highly correlated predictors.

