Bayesian Forecasting with Highly Correlated Predictors

2012 ◽  
Author(s):  
Dimitris Korobilis


2013 ◽  
Vol 233 (4) ◽  
pp. 526-549 ◽  
Author(s):  
Ivan Savin

Summary: This study presents a first comparative analysis of Lasso-type (Lasso, adaptive Lasso, elastic net) and heuristic subset selection methods. Although the Lasso has proved successful in many situations, it has limitations: in particular, it yields inconsistent results when predictors are pairwise highly correlated. An alternative to the Lasso is model selection based on information criteria (IC), which remains consistent in this situation. However, these criteria are hard to optimize because the search space is discrete. To overcome this problem, an optimization heuristic (a Genetic Algorithm) is applied. Results of a Monte Carlo simulation study, together with an application to an empirical problem, illustrate the performance of the methods.
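The Lasso's inconsistency under pairwise high correlation can be seen directly in a small experiment. The following is a minimal sketch (not the author's code) using scikit-learn: two near-duplicate predictors are generated, and the elastic net's L2 component spreads weight across the pair while the Lasso concentrates it.

```python
# Minimal sketch: Lasso vs. elastic net on two nearly identical predictors.
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)      # pairwise correlation close to 1
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=n)   # both predictors carry the signal

lasso = Lasso(alpha=0.1, max_iter=10000).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5, max_iter=10000).fit(X, y)

# The Lasso tends to allocate the correlated pair's weight arbitrarily,
# while the elastic net assigns nearly equal coefficients to both.
print("Lasso coefficients:      ", lasso.coef_)
print("Elastic net coefficients:", enet.coef_)
```

The instability of the Lasso's allocation across the pair (it depends on which column the solver happens to favour) is exactly the inconsistency the abstract refers to.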


2020 ◽  
pp. 1471082X2092097
Author(s):  
Lauren J Beesley ◽  
Jeremy MG Taylor

Multistate modelling is a strategy for jointly modelling related time-to-event outcomes: it can handle complicated outcome relationships, has appealing interpretations, can provide insight into different aspects of disease development and can be useful for making individualized predictions. A challenge with using multistate modelling in practice is the large number of parameters, and variable selection and shrinkage strategies are needed for these models to gain wider adoption. Applying existing selection and shrinkage strategies in the multistate setting can be challenging due to complicated patterns of data missingness, the inclusion of highly correlated predictors and hierarchical parameter relationships. In this article, we discuss how to modify and implement several existing Bayesian variable selection and shrinkage methods in a general multistate modelling setting. We compare the performance of these methods in terms of parameter estimation and model selection in a multistate cure model of recurrence and death in patients treated for head and neck cancer. This work can be viewed as a case study of variable selection and shrinkage in a complicated modelling setting with missing data.
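The shrinkage idea underlying these methods can be illustrated in its simplest conjugate form, well short of the article's multistate machinery: with a N(0, τ²) prior on the regression coefficients and known noise variance σ², the posterior mean is a ridge-type estimator that pulls the least-squares solution towards zero and stabilizes highly correlated predictors. A minimal sketch (the variance values are illustrative assumptions):

```python
# Bayesian shrinkage in the simplest conjugate case: normal prior, normal noise.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n)   # a highly correlated pair
y = X[:, 0] + rng.normal(size=n)

sigma2, tau2 = 1.0, 0.5                         # assumed noise / prior variances

# OLS solution vs. posterior mean under beta ~ N(0, tau2 * I):
# the posterior mean solves (X'X + (sigma2/tau2) I) beta = X'y.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
beta_post = np.linalg.solve(X.T @ X + (sigma2 / tau2) * np.eye(p), X.T @ y)

print("OLS:           ", beta_ols.round(2))
print("Posterior mean:", beta_post.round(2))
```

The posterior mean always has a smaller norm than the OLS estimate; the spike-and-slab and other priors discussed in the article extend this basic mechanism to perform selection as well as shrinkage.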


Author(s):  
Mariella Gregorich ◽  
Susanne Strohmaier ◽  
Daniela Dunkler ◽  
Georg Heinze

Regression models have been used for decades to explore and quantify the association between a dependent response and several independent variables in environmental sciences, epidemiology and public health. However, researchers often encounter situations in which some independent variables exhibit high bivariate correlation, or may even be collinear. Improper statistical handling of this situation will almost certainly produce models of little practical use and misleading interpretations. By means of two example studies, we demonstrate how diagnostic tools for collinearity or near-collinearity may fail to guide the analyst. Instead, the most appropriate way of handling collinearity should be driven by the research question at hand and, in particular, by the distinction between predictive and explanatory aims.
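The best-known diagnostic of this kind is the variance inflation factor (VIF), commonly flagged as problematic above 10. A minimal sketch (illustrative data, not the studies discussed in the article) computes it as the diagonal of the inverse correlation matrix of the predictors:

```python
# Variance inflation factors for a near-collinear design.
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=n)  # near-collinear with x1
x3 = rng.normal(size=n)                     # independent of the others
X = np.column_stack([x1, x2, x3])

# VIF_j is the j-th diagonal entry of the inverse correlation matrix;
# equivalently 1 / (1 - R^2_j) from regressing x_j on the other predictors.
R = np.corrcoef(X, rowvar=False)
vif = np.diag(np.linalg.inv(R))
print(dict(zip(["x1", "x2", "x3"], vif.round(1))))
```

A large VIF identifies which coefficients are poorly determined, but, as the article argues, it does not by itself tell the analyst what to do about it; that depends on whether the aim is prediction or explanation.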


2020 ◽  
Vol 4 (1) ◽  
pp. 203-215
Author(s):  
Asep Andri Fauzi ◽  
Agus M. Soleh ◽  
Anik Djuraidah

Highly correlated predictors and nonlinear relationships between the response and the predictors can degrade the performance of predictive modelling, especially with the ordinary least squares (OLS) method. A simple remedy is to use alternative methods such as Partial Least Squares Regression (PLSR), Support Vector Regression with a radial basis function kernel (SVR-RBF), and Random Forest Regression (RFR). The purpose of this study is to compare OLS, PLSR, SVR-RBF, and RFR using simulated data. The methods were evaluated by the root mean square error of prediction (RMSEP). The results showed that in the linear model, SVR-RBF and RFR have large RMSEP, so OLS and PLSR are better than SVR-RBF and RFR, and PLSR provides a much more stable prediction than OLS in the case of highly correlated predictors and small sample size. For nonlinear data, RFR produced the smallest RMSEP when the data contained highly correlated predictors.

