The Comparison Between Different Approaches to Overcome the Multicollinearity Problem in Linear Regression Models

2018 ◽  
Vol 31 (1) ◽  
pp. 212
Author(s):  
Hazim Mansoor Gorgees ◽  
Fatimah Assim Mahdi

In the presence of multicollinearity, parameter estimation based on the ordinary least squares procedure is unsatisfactory. In 1970, Hoerl and Kennard introduced an alternative method known as the ridge regression estimator. In this estimator, the ridge parameter plays an important role in estimation. Many statisticians have proposed methods for selecting the biasing constant (ridge parameter). Another popular method for dealing with multicollinearity is the principal component method. In this paper, we employ simulation to compare the performance of the principal component estimator with several types of ordinary ridge regression estimators that differ in the value of the biasing constant (ridge parameter). The mean square error (MSE) is used as the criterion to assess the performance of these estimators.
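The kind of comparison described in this abstract can be sketched as a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions, not the authors' design: the shared-factor data generator, the Hoerl-Kennard-Baldwin choice of the ridge parameter (`hkb_k`), the number of retained components `r`, and all function names are assumptions introduced here.

```python
import numpy as np

def make_collinear_data(n, p, rho, beta, sigma, rng):
    """Predictors share a common factor, so pairwise correlation is about rho**2."""
    z = rng.standard_normal((n, p + 1))
    X = np.sqrt(1.0 - rho**2) * z[:, :p] + rho * z[:, [p]]
    y = X @ beta + sigma * rng.standard_normal(n)
    return X, y

def ols(X, y):
    """Ordinary least squares (no intercept; the simulated model has none)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def ridge(X, y, k):
    """Ordinary ridge estimator (X'X + kI)^{-1} X'y with biasing constant k."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def hkb_k(X, y):
    """One commonly cited ridge parameter: k = p * s^2 / (b'b), b = OLS estimate."""
    n, p = X.shape
    b = ols(X, y)
    resid = y - X @ b
    s2 = resid @ resid / (n - p)
    return p * s2 / (b @ b)

def pcr(X, y, r):
    """Principal component estimator keeping the r largest-eigenvalue components."""
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)
    V = eigvecs[:, np.argsort(eigvals)[::-1][:r]]
    Z = X @ V
    gamma = np.linalg.solve(Z.T @ Z, Z.T @ y)
    return V @ gamma

def simulate_mse(n=50, p=4, rho=0.99, sigma=1.0, r=1, reps=500, seed=0):
    """Average squared estimation error of each estimator over `reps` data sets."""
    rng = np.random.default_rng(seed)
    beta = np.ones(p) / np.sqrt(p)  # assumed true coefficients, unit length
    totals = {"ols": 0.0, "ridge": 0.0, "pcr": 0.0}
    for _ in range(reps):
        X, y = make_collinear_data(n, p, rho, beta, sigma, rng)
        estimates = {
            "ols": ols(X, y),
            "ridge": ridge(X, y, hkb_k(X, y)),
            "pcr": pcr(X, y, r),
        }
        for name, b in estimates.items():
            totals[name] += np.sum((b - beta) ** 2)
    return {name: total / reps for name, total in totals.items()}

if __name__ == "__main__":
    print(simulate_mse())
```

The ranking of the estimators depends on the assumed coefficient vector, the degree of collinearity, and the noise level, so the output of this sketch should not be read as reproducing the paper's findings.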

Epidemiologia ◽  
2021 ◽  
Vol 2 (1) ◽  
pp. 95-113 ◽  
Author(s):  
Isaac Chun-Hai Fung ◽  
Xiaolu Zhou ◽  
Chi-Ngai Cheung ◽  
Sylvia K. Ofori ◽  
Kamalich Muniz-Rodriguez ◽  
...  

To describe the geographical heterogeneity of COVID-19 across prefectures in mainland China, we estimated doubling times from daily time series of the cumulative case count between 24 January and 24 February 2020. We analyzed the prefecture-level COVID-19 case burden using linear regression models and used the local Moran’s I to test for spatial autocorrelation and clustering. Four hundred prefectures (~98% population) had at least one COVID-19 case and 39 prefectures had zero cases by 24 February 2020. Excluding Wuhan and those prefectures where there was only one case or none, 76 (17.3% of 439) prefectures had an arithmetic mean of the epidemic doubling time <2 d. Low-population prefectures had a higher per capita cumulative incidence than high-population prefectures during the study period. An increase in population size was associated with a very small reduction in the mean doubling time (−0.012, 95% CI, −0.017, −0.006) where the cumulative case count doubled ≥3 times. Spatial analysis revealed high case count clusters in Hubei and Heilongjiang and fast epidemic growth in several metropolitan areas by mid-February 2020. Prefectures in Hubei and neighboring provinces and several metropolitan areas in coastal and northeastern China experienced rapid growth with cumulative case count doubling multiple times with a small mean doubling time.
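As a rough illustration of the doubling-time idea, the sketch below finds the times at which a cumulative case count reaches 2, 4, 8, ... times its first nonzero value and returns the gaps between those times. The abstract does not spell out the exact procedure, so the linear interpolation between days and the function names `doubling_times` and `mean_doubling_time` are assumptions made for this example.

```python
import numpy as np

def _crossing_time(c, target):
    """Day (fractional) at which the cumulative series first reaches `target`."""
    idx = int(np.argmax(c >= target))  # first day with count >= target
    if idx == 0:
        return 0.0
    lo, hi = c[idx - 1], c[idx]        # lo < target <= hi, so hi > lo
    return (idx - 1) + (target - lo) / (hi - lo)

def doubling_times(cumulative):
    """Successive doubling times (in days) of a daily cumulative case count."""
    c = np.asarray(cumulative, dtype=float)
    c = c[np.argmax(c > 0):] if np.any(c > 0) else c[:0]  # drop leading zeros
    if c.size == 0:
        return np.array([])
    times = [0.0]
    target = 2.0 * c[0]
    while target <= c[-1]:
        times.append(_crossing_time(c, target))
        target *= 2.0
    return np.diff(times)

def mean_doubling_time(cumulative):
    """Arithmetic mean of the doubling times, or NaN if the count never doubled."""
    d = doubling_times(cumulative)
    return float(d.mean()) if d.size else float("nan")
```

For example, `doubling_times([1, 2, 4, 8, 16])` yields four doubling times of one day each, while a series that never doubles yields an empty array.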


Stroke ◽  
2020 ◽  
Vol 51 (Suppl_1) ◽  
Author(s):  
Adam H de Havenon ◽  
Tanya Turan ◽  
Rebecca Gottesman ◽  
Sharon Yeatts ◽  
Shyam Prabhakaran ◽  
...  

Introduction: While retrospective studies have shown that poor control of vascular risk factors is associated with progression of white matter hyperintensity (WMH), it has not been studied prospectively. Hypothesis: We hypothesize that higher systolic blood pressure (SBP) mean, LDL cholesterol, and Hgb A1c will be correlated with WMH progression in diabetics. Methods: This is a secondary analysis of the Memory in Diabetes (MIND) substudy of the Action to Control Cardiovascular Risk in Diabetes Follow-on Study (ACCORDION). The primary outcome was WMH progression, evaluated by fitting linear regression models to the WMH volume on the month 80 MRI and adjusting for the WMH volume on the baseline MRI. The primary predictors were the mean values of SBP, LDL, and A1c from baseline to month 80. We defined a good vascular risk factor profile as mean SBP <120 mm Hg and mean LDL <120 mg/dL. Results: We included 292 patients, with a mean (SD) age of 62.6 (5.3) years and 55.8% male. The mean number of SBP, LDL, and A1c measurements per patient was 17, 5, and 12. We identified 86 (29.4%) patients with good vascular risk factor profile. In the linear regression models, mean SBP and LDL were associated with WMH progression and in a second fully adjusted model they both remained associated with WMH progression (Table). Those with a good vascular risk factor profile had less WMH progression (β Coefficient -0.80, 95% CI -1.42, -0.18, p=0.012). Conclusions: Our data reinforce prior research showing that higher SBP and LDL are associated with progression of WMH in diabetics, likely secondary to chronic microvascular ischemia, and suggest that control of these factors may have protective effects. This study has unique strengths, including prospective serial measurement of the exposures, validated algorithmic measurement methodology for WMH, and rigorous adjudication of study data. Clinical trials are needed to investigate the effect of vascular risk factor reduction on WMH progression.


2020 ◽  
Vol 11 (1) ◽  
pp. 21
Author(s):  
Zahrotul Aflakhah ◽  
Jajang Jajang ◽  
Agustini Tripena Br. Sb.

This research discusses the ordinary least squares (OLS) method and the robust M-estimation method, comparing Tukey bisquare and Huber weighting in simple linear regression models that contain outliers. Data are generated through simulation with varying percentages of outliers and sample sizes. A simple linear regression model is fitted to each data set, and the percentage of outliers and the RSE and MAD values are then calculated. The results show that the RSE and MAD values produced by a simple linear regression model fitted with the OLS method are influenced by the percentage of outliers. However, the robust M-estimation regression model with sample sizes of 30, 60, 90, 120, and 150 yields RSE values that are unstable as the percentage of outliers changes, and MAD values that are not affected by the percentage of outliers or the sample size. The robust M-estimation method with Tukey bisquare weighting performs as well as the one with Huber weighting.
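M-estimation with Huber or Tukey bisquare weighting is often computed by iteratively reweighted least squares (IRLS). The sketch below is a minimal version for simple linear regression, assuming an OLS starting value, a residual scale re-estimated each iteration from the normalized median absolute residual, and the conventional tuning constants 1.345 and 4.685. These choices and the function names are assumptions for illustration and may differ from the study's setup.

```python
import numpy as np

def huber_weights(u, c=1.345):
    """Huber weights: 1 inside [-c, c], c/|u| outside."""
    a = np.abs(u)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))

def bisquare_weights(u, c=4.685):
    """Tukey bisquare weights: (1 - (u/c)^2)^2 inside [-c, c], 0 outside."""
    return np.where(np.abs(u) <= c, (1.0 - (u / c) ** 2) ** 2, 0.0)

def m_estimate(x, y, weight_fn, max_iter=50, tol=1e-8):
    """IRLS M-estimate of (intercept, slope) for y = b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS starting value
    for _ in range(max_iter):
        resid = y - X @ beta
        scale = np.median(np.abs(resid)) / 0.6745    # robust residual scale
        if scale == 0.0:
            break                                    # at least half the points fit exactly
        sw = np.sqrt(weight_fn(resid / scale))
        # Weighted least squares via row scaling by sqrt(weight).
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta
```

Calling `m_estimate(x, y, huber_weights)` and `m_estimate(x, y, bisquare_weights)` on the same contaminated sample gives the two fits being compared. The bisquare weights drop to zero for large residuals, so its result is more sensitive to the starting value than the Huber fit.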


Author(s):  
Jiansheng Wu

Rainfall forecasting is an important research topic in disaster prevention and reduction. Rainfall exhibits rather complex system dynamics under the influence of different meteorological factors, including both linear and nonlinear patterns. Recently, many approaches to improve forecasting accuracy have been introduced. Artificial neural networks (ANN), which perform a nonlinear mapping between inputs and outputs, have played a crucial role in forecasting rainfall data. In this paper, an effective hybrid semi-parametric regression ensemble (SRE) model is presented for rainfall forecasting. In this model, three linear regression models are used to capture the linear characteristics of rainfall, and three nonlinear regression models based on ANN capture its nonlinear characteristics. Semi-parametric regression is then used to build the ensemble model, based on the principal component analysis technique. Empirical results reveal that predictions from the SRE model are generally better than those obtained using other models under the same evaluation measurements. The SRE model proposed in this paper can be used as a promising alternative forecasting tool for rainfall to achieve greater forecasting accuracy and improve prediction quality.


2008 ◽  
Vol 24 (6) ◽  
pp. 1500-1529 ◽  
Author(s):  
Pavel Čížek

High-breakdown-point regression estimators protect against large errors and data contamination. We generalize the concept of trimming used by many of these robust estimators, such as the least trimmed squares and maximum trimmed likelihood, and propose a general trimmed estimator, which renders robust estimators applicable far beyond the standard (non)linear regression models. We derive here the consistency and asymptotic distribution of the proposed general trimmed estimator under mild β-mixing conditions and demonstrate its applicability in nonlinear regression and limited dependent variable models.
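To make the notion of trimming concrete, the sketch below fits least trimmed squares (LTS), one of the estimators named above, for an ordinary linear model. It is not the general trimmed estimator proposed in the paper. It assumes a simple heuristic search: random elemental starts followed by concentration steps that repeatedly refit on the `h` observations with the smallest squared residuals. The function name `lts_fit` and its parameters are hypothetical.

```python
import numpy as np

def lts_fit(X, y, h, n_starts=50, n_csteps=20, seed=0):
    """Approximate LTS: minimize the sum of the h smallest squared residuals."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_beta, best_obj = None, np.inf
    for _ in range(n_starts):
        # Start from an exact fit to p randomly chosen observations.
        subset = rng.choice(n, size=p, replace=False)
        beta = np.linalg.lstsq(X[subset], y[subset], rcond=None)[0]
        keep_prev = None
        for _ in range(n_csteps):
            r2 = (y - X @ beta) ** 2
            keep = np.sort(np.argsort(r2)[:h])       # h best-fitting points
            if keep_prev is not None and np.array_equal(keep, keep_prev):
                break                                # subset stable: local optimum
            beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
            keep_prev = keep
        obj = np.sort((y - X @ beta) ** 2)[:h].sum() # trimmed objective
        if obj < best_obj:
            best_beta, best_obj = beta, obj
    return best_beta, best_obj
```

Here `X` is expected to include a column of ones if an intercept is wanted, and `h` controls the trimming: values nearer `n / 2` tolerate more contamination, while values nearer `n` behave more like ordinary least squares.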


1996 ◽  
Vol 4 (1) ◽  
pp. 225-242 ◽  
Author(s):  
Paul Geladi ◽  
Harald Martens

Regression and calibration play an important role in analytical chemistry. All analytical instrumentation is dependent on a calibration that uses some regression model for a set of calibration samples. The ordinary least squares (OLS) method of building a multivariate linear regression (MLR) model has strict limitations. Therefore, biased or regularised regression models have been introduced. Some selected ones are ridge regression (RR), principal component regression (PCR) and partial least squares regression (PLS or PLSR). Also, artificial neural networks (ANN) based on back-propagation can be used as regression models. In order to understand regression models more is needed than just a set of statistical parameters. A deeper understanding of the underlying chemistry and physics is always equally important. For spectral data this means that a basic understanding of spectra and their errors is useful and that spectral representation should be included in judging the usefulness of the data treatment. A “constructed” spectrometric example is introduced. It consists of real spectrometric measurements in the range 408–1176 nm for 26 calibration samples and 10 test samples. The main response variable is litmus concentration, but other constituents such as bromocresolgreen and ZnO are added as interferents and also the pH is changed. The example is introduced as a tutorial. All calculations are shown in detail in Matlab. This makes it easy for the reader to follow and understand the calculations. It also makes the calculations completely traceable. The raw data are available as a file. In Part 1, the emphasis is on pretreatment of the data and on visualisation in different stages of the calculations. Part 1 ends with principal component regression calculations. Partial least squares calculations and some ANN results are presented in Part 2.


Author(s):  
Asifa Mubeen ◽  
Nasir Jamal ◽  
Muhammad Hanif ◽  
Usman Shahzad

The main objective of the present study was to develop a new ridge regression estimator and to fit the ridge regression model to peanut production data from Pakistan. The data cover the peanut production and growth rate of Pakistan. The properties of the proposed estimator are discussed, and its mean square error is compared with that of some existing ridge regression estimators. The real peanut production data set is used to assess the performance of the proposed and existing estimators. Numerical results on the real data set show that the proposed ridge regression estimator provides the best results compared with the reviewed ones.

