The Comparison Between Different Approaches to Overcome the Multicollinearity Problem in Linear Regression Models

Hazim Mansoor Gorgees; Fatimah Assim Mahdi

doi:10.30526/31.1.1841

Ibn AL-Haitham Journal For Pure and Applied Science ◽

10.30526/31.1.1841 ◽

2018 ◽

Vol 31 (1) ◽

pp. 212

Author(s):

Hazim Mansoor Gorgees ◽

Fatimah Assim Mahdi

Keyword(s):

Ridge Regression ◽

Regression Models ◽

Estimation Method ◽

Principal Component ◽

Ordinary Least Squares ◽

Linear Regression Models ◽

Principal Component Method ◽

Regression Estimators ◽

The Mean ◽

Collinearity Problem

In the presence of the multi-collinearity problem, parameter estimation based on the ordinary least squares procedure is unsatisfactory. In 1970, Hoerl and Kennard introduced an alternative method known as the ridge regression estimator. In this estimator, the ridge parameter plays an important role. Many statisticians have proposed methods for selecting the biasing constant (ridge parameter). Another popular method for dealing with the multi-collinearity problem is the principal component method. In this paper, we use simulation to compare the performance of the principal component estimator with several ordinary ridge regression estimators that differ in the value of the biasing constant (ridge parameter). The mean square error (MSE) is used as the criterion to assess the performance of these estimators.
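
The kind of comparison described in this abstract can be sketched as a small Monte Carlo experiment. The sketch below is illustrative only and is not the authors' code: the data-generating design (a shared component weighted by `rho`), the true coefficient vector, the Hoerl-Kennard-Baldwin choice of the ridge parameter `k`, and the number of retained principal components are all assumptions made for the example.

```python
import numpy as np

def simulate_mse(n=50, p=4, rho=0.99, sigma=1.0, n_components=1,
                 reps=500, seed=0):
    """Monte Carlo MSE of the OLS, ridge, and principal component estimators.

    All settings are hypothetical; they are not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    beta = np.ones(p) / np.sqrt(p)  # assumed true coefficients (unit length)
    sse = {"ols": 0.0, "ridge": 0.0, "pcr": 0.0}
    for _ in range(reps):
        # Collinear predictors: every column shares the component z0.
        z = rng.standard_normal((n, p))
        z0 = rng.standard_normal((n, 1))
        X = np.sqrt(1.0 - rho**2) * z + rho * z0
        X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize columns
        y = X @ beta + sigma * rng.standard_normal(n)
        y = y - y.mean()

        XtX = X.T @ X
        Xty = X.T @ y

        # Ordinary least squares.
        b_ols = np.linalg.solve(XtX, Xty)

        # Ridge with k = p * s^2 / (b'b), one common choice of ridge parameter.
        resid = y - X @ b_ols
        s2 = resid @ resid / (n - p - 1)
        k = p * s2 / (b_ols @ b_ols)
        b_ridge = np.linalg.solve(XtX + k * np.eye(p), Xty)

        # Principal component estimator: keep the leading eigenvectors only.
        evals, evecs = np.linalg.eigh(XtX)  # eigenvalues in ascending order
        V = evecs[:, -n_components:]
        lam = evals[-n_components:]
        b_pcr = V @ ((V.T @ Xty) / lam)

        for name, b in (("ols", b_ols), ("ridge", b_ridge), ("pcr", b_pcr)):
            sse[name] += np.sum((b - beta) ** 2)
    return {name: total / reps for name, total in sse.items()}

if __name__ == "__main__":
    print(simulate_mse())
```

Because the assumed true coefficient vector lies close to the leading eigenvector of the predictor correlation matrix, this particular setup favours the principal component estimator; other choices of `beta` may change the ranking.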

Ridge regression estimators in the linear regression models with non-spherical errors

Communications in Statistics - Theory and Methods ◽

10.1080/03610929308831147 ◽

1993 ◽

Vol 22 (8) ◽

pp. 2275-2284 ◽

Cited By ~ 1

Author(s):

Anoop Chaturvedi

Keyword(s):

Linear Regression ◽

Ridge Regression ◽

Regression Models ◽

Linear Regression Models ◽

Regression Estimators

Assessing Early Heterogeneity in Doubling Times of the COVID-19 Epidemic across Prefectures in Mainland China, January–February, 2020

Epidemiologia ◽

10.3390/epidemiologia2010009 ◽

2021 ◽

Vol 2 (1) ◽

pp. 95-113 ◽

Cited By ~ 1

Author(s):

Isaac Chun-Hai Fung ◽

Xiaolu Zhou ◽

Chi-Ngai Cheung ◽

Sylvia K. Ofori ◽

Kamalich Muniz-Rodriguez ◽

...

Keyword(s):

Regression Models ◽

Doubling Time ◽

Metropolitan Areas ◽

Mainland China ◽

Linear Regression Models ◽

Case Count ◽

Daily Time Series ◽

The Mean ◽

Doubling Times ◽

Daily Time

To describe the geographical heterogeneity of COVID-19 across prefectures in mainland China, we estimated doubling times from daily time series of the cumulative case count between 24 January and 24 February 2020. We analyzed the prefecture-level COVID-19 case burden using linear regression models and used the local Moran’s I to test for spatial autocorrelation and clustering. Four hundred prefectures (~98% population) had at least one COVID-19 case and 39 prefectures had zero cases by 24 February 2020. Excluding Wuhan and those prefectures where there was only one case or none, 76 (17.3% of 439) prefectures had an arithmetic mean of the epidemic doubling time <2 d. Low-population prefectures had a higher per capita cumulative incidence than high-population prefectures during the study period. An increase in population size was associated with a very small reduction in the mean doubling time (−0.012, 95% CI, −0.017, −0.006) where the cumulative case count doubled ≥3 times. Spatial analysis revealed high case count clusters in Hubei and Heilongjiang and fast epidemic growth in several metropolitan areas by mid-February 2020. Prefectures in Hubei and neighboring provinces and several metropolitan areas in coastal and northeastern China experienced rapid growth with cumulative case count doubling multiple times with a small mean doubling time.
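
One simple way to obtain doubling times from a cumulative case series is sketched below. It is an assumption-laden illustration, not the study's procedure: the function name `doubling_times`, the use of linear interpolation between daily observations, and the choice of the first observed count as the starting level are all hypothetical.

```python
import numpy as np

def doubling_times(cumulative):
    """Return successive doubling times (in days) of a cumulative count.

    Assumes one observation per day, a positive first count, and a
    non-decreasing series. Crossing times are linearly interpolated.
    """
    c = np.asarray(cumulative, dtype=float)
    if c[0] <= 0:
        raise ValueError("the first cumulative count must be positive")
    days = np.arange(len(c), dtype=float)

    # Levels c0, 2*c0, 4*c0, ... that the series actually reaches.
    n_doublings = int(np.floor(np.log2(c[-1] / c[0])))
    levels = c[0] * 2.0 ** np.arange(n_doublings + 1)

    # Drop repeated counts so the interpolation grid is strictly increasing.
    values, first_index = np.unique(c, return_index=True)
    crossing_days = np.interp(levels, values, days[first_index])
    return np.diff(crossing_days)

if __name__ == "__main__":
    times = doubling_times([10, 15, 20, 30, 40, 90])
    print(times, times.mean())  # individual doubling times and their mean
```

The arithmetic mean of the returned values corresponds to the mean doubling time referred to in the abstract, under the assumptions stated above.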

Abstract WMP14: Association of Systolic Blood Pressure, LDL Cholesterol, and Hgb A1c With White Matter Hyperintensity on MRI in the ACCORDION MIND Study

Stroke ◽

10.1161/str.51.suppl_1.wmp14 ◽

2020 ◽

Vol 51 (Suppl_1) ◽

Author(s):

Adam H de Havenon ◽

Tanya Turan ◽

Rebecca Gottesman ◽

Sharon Yeatts ◽

Shyam Prabhakaran ◽

...

Keyword(s):

Blood Pressure ◽

Risk Factor ◽

Regression Models ◽

Ldl Cholesterol ◽

Vascular Risk Factor ◽

White Matter Hyperintensity ◽

Vascular Risk ◽

Linear Regression Models ◽

Risk Factor Profile ◽

The Mean

Introduction: While retrospective studies have shown that poor control of vascular risk factors is associated with progression of white matter hyperintensity (WMH), it has not been studied prospectively. Hypothesis: We hypothesize that higher systolic blood pressure (SBP) mean, LDL cholesterol, and Hgb A1c will be correlated with WMH progression in diabetics. Methods: This is a secondary analysis of the Memory in Diabetes (MIND) substudy of the Action to Control Cardiovascular Risk in Diabetes Follow-on Study (ACCORDION). The primary outcome was WMH progression, evaluated by fitting linear regression models to the WMH volume on the month 80 MRI and adjusting for the WMH volume on the baseline MRI. The primary predictors were the mean values of SBP, LDL, and A1c from baseline to month 80. We defined a good vascular risk factor profile as mean SBP <120 mm Hg and mean LDL <120 mg/dL. Results: We included 292 patients, with a mean (SD) age of 62.6 (5.3) years and 55.8% male. The mean numbers of SBP, LDL, and A1c measurements per patient were 17, 5, and 12, respectively. We identified 86 (29.4%) patients with a good vascular risk factor profile. In the linear regression models, mean SBP and LDL were associated with WMH progression, and in a second fully adjusted model they both remained associated with WMH progression (Table). Those with a good vascular risk factor profile had less WMH progression (β Coefficient -0.80, 95% CI -1.42, -0.18, p=0.012). Conclusions: Our data reinforce prior research showing that higher SBP and LDL are associated with progression of WMH in diabetics, likely secondary to chronic microvascular ischemia, and suggest that control of these factors may have protective effects. This study has unique strengths, including prospective serial measurement of the exposures, validated algorithmic measurement methodology for WMH, and rigorous adjudication of study data. Clinical trials are needed to investigate the effect of vascular risk factor reduction on WMH progression.

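KAJIAN METODE ORDINARY LEAST SQUARE DAN ROBUST ESTIMASI M PADA MODEL REGRESI LINIER SEDERHANA YANG MEMUAT OUTLIER (A Study of the Ordinary Least Squares Method and Robust M-Estimation in Simple Linear Regression Models Containing Outliers)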

Jurnal Ilmiah Matematika dan Pendidikan Matematika ◽

10.20884/1.jmp.2020.12.1.1934 ◽

2020 ◽

Vol 11 (1) ◽

pp. 21

Author(s):

Zahrotul Aflakhah ◽

Jajang Jajang ◽

Agustini Tripena Br. Sb.

Keyword(s):

Regression Model ◽

Sample Size ◽

Regression Models ◽

Estimation Method ◽

Ordinary Least Squares ◽

Least Square ◽

Simple Linear Regression ◽

Ordinary Least Square ◽

M Estimation

This research examines the ordinary least squares (OLS) method and the robust M-estimation method, comparing Tukey bisquare and Huber weighting in simple linear regression models that contain outliers. Data are generated by simulation with varying percentages of outliers and sample sizes. A simple linear regression model is fitted to each data set, and the percentage of outliers, RSE, and MAD values are then calculated. The results show that the RSE and MAD values produced by the simple linear regression model fitted with OLS are influenced by the percentage of outliers. In contrast, the robust M-estimation regression models with sample sizes of 30, 60, 90, 120, and 150 yield RSE values that are unstable as the percentage of outliers changes, and MAD values that are not affected by the percentage of outliers or the sample size. The robust M-estimation method with Tukey bisquare weighting performs as well as that with Huber weighting.
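
The contrast between OLS and M-estimation with Huber or Tukey bisquare weights can be illustrated with iteratively reweighted least squares (IRLS). This is a minimal sketch under stated assumptions, not the authors' implementation: the tuning constants 1.345 and 4.685 are the conventional defaults, the scale is re-estimated from the median absolute deviation at every iteration, and the simulated data (a true line y = 1 + 2x with 10% shifted responses) are invented for the example.

```python
import numpy as np

def huber_weights(u, c=1.345):
    """Huber weights: 1 inside [-c, c], c/|u| outside."""
    a = np.abs(u)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))

def tukey_weights(u, c=4.685):
    """Tukey bisquare weights: smooth descent to 0 at |u| = c."""
    return np.where(np.abs(u) <= c, (1.0 - (u / c) ** 2) ** 2, 0.0)

def m_estimate(x, y, weight_fn, n_iter=50, tol=1e-8):
    """Fit y = b0 + b1 * x by IRLS, starting from the OLS solution."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust residual scale from the median absolute deviation.
        scale = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        sw = np.sqrt(weight_fn(r / scale))
        new_beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 10.0, 60)
    y = 1.0 + 2.0 * x + rng.standard_normal(60)
    y[-6:] += 30.0  # contaminate 10% of the responses (hypothetical outliers)

    X = np.column_stack([np.ones_like(x), x])
    print("OLS:  ", np.linalg.lstsq(X, y, rcond=None)[0])
    print("Huber:", m_estimate(x, y, huber_weights))
    print("Tukey:", m_estimate(x, y, tukey_weights))
```

Starting the bisquare iteration from the OLS fit is a simplification; with heavier contamination a more robust starting value would usually be preferable.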

An Effective Hybrid Semi-Parametric Regression Strategy for Rainfall Forecasting Combining Linear and Nonlinear Regression

Modeling Applications and Theoretical Innovations in Interdisciplinary Evolutionary Computation ◽

10.4018/978-1-4666-3628-6.ch017 ◽

2013 ◽

pp. 273-289

Author(s):

Jiansheng Wu

Keyword(s):

Nonlinear Regression ◽

Regression Models ◽

Principal Component ◽

Linear Regression Models ◽

Rainfall Forecasting ◽

Forecasting Accuracy ◽

Parametric Regression ◽

Nonlinear Regression Models ◽

Linear And Nonlinear ◽

Artificial Neural Network Ann

Rainfall forecasting is an important research topic in disaster prevention and reduction. Rainfall exhibits rather complex system dynamics under the influence of different meteorological factors, with both linear and nonlinear patterns. Recently, many approaches to improving forecasting accuracy have been introduced. The artificial neural network (ANN), which performs a nonlinear mapping between inputs and outputs, has played a crucial role in forecasting rainfall data. In this paper, an effective hybrid semi-parametric regression ensemble (SRE) model is presented for rainfall forecasting. In this model, three linear regression models are used to capture the linear characteristics of rainfall, and three nonlinear regression models based on ANN are used to capture its nonlinear characteristics. Semi-parametric regression is used to build the ensemble model, based on the principal component analysis technique. Empirical results reveal that predictions from the SRE model are generally better than those obtained using other models under the same evaluation measures. The SRE model proposed in this paper can be used as a promising alternative forecasting tool for rainfall to achieve greater forecasting accuracy and improve prediction quality.

GENERAL TRIMMED ESTIMATION: ROBUST APPROACH TO NONLINEAR AND LIMITED DEPENDENT VARIABLE MODELS

Econometric Theory ◽

10.1017/s0266466608080596 ◽

2008 ◽

Vol 24 (6) ◽

pp. 1500-1529 ◽

Cited By ~ 35

Author(s):

Pavel Čížek

Keyword(s):

Regression Models ◽

Breakdown Point ◽

Robust Estimators ◽

Linear Regression Models ◽

Mixing Conditions ◽

Least Trimmed Squares ◽

Limited Dependent Variable ◽

Data Contamination ◽

High Breakdown Point ◽

Regression Estimators

High-breakdown-point regression estimators protect against large errors and data contamination. We generalize the concept of trimming used by many of these robust estimators, such as the least trimmed squares and maximum trimmed likelihood, and propose a general trimmed estimator, which renders robust estimators applicable far beyond the standard (non)linear regression models. We derive here the consistency and asymptotic distribution of the proposed general trimmed estimator under mild β-mixing conditions and demonstrate its applicability in nonlinear regression and limited dependent variable models.

Use of principal component scores in multiple linear regression models for simulation of chlorophyll-a and phytoplankton abundance at a karst deep reservoir, southwest of China

Acta Ecologica Sinica ◽

10.1016/j.chnaes.2013.11.009 ◽

2014 ◽

Vol 34 (1) ◽

pp. 72-78 ◽

Cited By ~ 3

Author(s):

Li Qiuhua ◽

Shang Lihai ◽

Gao Tingjing ◽

Zhang Lei ◽

Ou Teng ◽

...

Keyword(s):

Linear Regression ◽

Multiple Linear Regression ◽

Chlorophyll A ◽

Regression Models ◽

Principal Component ◽

Linear Regression Models ◽

Phytoplankton Abundance ◽

Deep Reservoir ◽

Component Scores ◽

Multiple Linear Regression Models

A Calibration Tutorial for Spectral Data. Part 1: Data Pretreatment and Principal Component Regression Using Matlab

Journal of Near Infrared Spectroscopy ◽

10.1255/jnirs.93 ◽

1996 ◽

Vol 4 (1) ◽

pp. 225-242 ◽

Cited By ~ 6

Author(s):

Paul Geladi ◽

Harald Martens

Keyword(s):

Least Squares ◽

Spectral Data ◽

Partial Least Squares ◽

Regression Models ◽

Principal Component Regression ◽

Principal Component ◽

Ordinary Least Squares ◽

Least Squares Regression ◽

Statistical Parameters ◽

Calibration Samples

Regression and calibration play an important role in analytical chemistry. All analytical instrumentation is dependent on a calibration that uses some regression model for a set of calibration samples. The ordinary least squares (OLS) method of building a multivariate linear regression (MLR) model has strict limitations. Therefore, biased or regularised regression models have been introduced. Some selected ones are ridge regression (RR), principal component regression (PCR) and partial least squares regression (PLS or PLSR). Also, artificial neural networks (ANN) based on back-propagation can be used as regression models. In order to understand regression models, more is needed than just a set of statistical parameters. A deeper understanding of the underlying chemistry and physics is always equally important. For spectral data this means that a basic understanding of spectra and their errors is useful and that spectral representation should be included in judging the usefulness of the data treatment. A “constructed” spectrometric example is introduced. It consists of real spectrometric measurements in the range 408–1176 nm for 26 calibration samples and 10 test samples. The main response variable is litmus concentration, but other constituents such as bromocresol green and ZnO are added as interferents and also the pH is changed. The example is introduced as a tutorial. All calculations are shown in detail in Matlab. This makes it easy for the reader to follow and understand the calculations. It also makes the calculations completely traceable. The raw data are available as a file. In Part 1, the emphasis is on pretreatment of the data and on visualisation in different stages of the calculations. Part 1 ends with principal component regression calculations. Partial least squares calculations and some ANN results are presented in Part 2.

Modified Ridge Regression Estimator with the Application of Peanut Production in Pakistan

Asian Journal of Advanced Research and Reports ◽

10.9734/ajarr/2019/v7i230172 ◽

2019 ◽

pp. 1-8

Author(s):

Asifa Mubeen ◽

Nasir Jamal ◽

Muhammad Hanif ◽

Usman Shahzad

Keyword(s):

Ridge Regression ◽

Real Data ◽

Production Data ◽

Regression Estimator ◽

Mean Square ◽

Data Set ◽

Regression Estimators ◽

Peanut Production ◽

Ridge Regression Estimator ◽

The Mean

The main objective of the present study was to develop a new ridge regression estimator and to fit the ridge regression model to peanut production data from Pakistan. The data cover peanut production and its growth rate in Pakistan. The properties of the proposed estimator are discussed, and its mean square error is compared with that of some existing ridge regression estimators. The real peanut production data set is used to assess the performance of the proposed and existing estimators. Numerical results on the real data set show that the proposed ridge regression estimator provides the best results compared with the reviewed ones.

On Some Ridge Regression Estimators for Logistic Regression Models

10.25148/etd.fidc006547 ◽

2018 ◽

Author(s):

Ulyana P Williams

Keyword(s):

Logistic Regression ◽

Ridge Regression ◽

Regression Models ◽

Logistic Regression Models ◽

Regression Estimators
