Weighted Multicollinearity in Logistic Regression: Diagnostics and Biased Estimation Techniques with an Example from Lake Acidification

Brian D. Marx; Eric P. Smith

doi:10.1139/f90-131

Canadian Journal of Fisheries and Aquatic Sciences ◽

10.1139/f90-131 ◽

1990 ◽

Vol 47 (6) ◽

pp. 1128-1135 ◽

Cited By ~ 9

Author(s):

Brian D. Marx ◽

Eric P. Smith

Keyword(s):

Logistic Regression ◽

Parameter Estimation ◽

Maximum Likelihood ◽

Water Chemistry ◽

Acid Precipitation ◽

Estimation Methods ◽

Parameter Estimates ◽

Regression Diagnostics ◽

Data Set ◽

Biased Estimation

A historical data set from the Adirondack region of New York is revisited to study the relationship between water chemistry variables associated with acid precipitation and the presence/absence of brook trout (Salvelinus fontinalis) and lake trout (Salvelinus namaycush). For the trout species data sets, water chemistry variables associated with acid precipitation, for example pH and alkalinity, are highly correlated. Regression models to assess their effects on the probability of the presence of fish species are therefore affected by multicollinearity. Because the appropriate regressions are logistic, correction techniques based on least squares do not work. Maximum likelihood parameter estimation is highly unstable for the trout presence/absence data. Developments in weighted multicollinearity diagnostics are used to evaluate maximum likelihood logistic regression parameter estimates. Further, an application of biased parameter estimation is presented as an alternative to traditional maximum likelihood logistic regression. Biased estimation methods, such as ridge, principal component, or Stein estimation, can substantially reduce the variance of the parameter estimates and the prediction variance for certain future observations. In many cases, only a slight modification to the converged maximum likelihood estimator is necessary.
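The idea of a weighted diagnostic followed by a small adjustment to the converged maximum likelihood estimate can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' procedure: the data are synthetic, the diagnostic is taken to be the condition number of the scaled information matrix X'WX, and the ridge step uses one common form, (X'WX + kI)^(-1) X'WX b, applied to all coefficients including the intercept. The function names and the value k = 1.0 are hypothetical.

```python
import numpy as np

def fit_logistic_mle(X, y, n_iter=50, tol=1e-10):
    """Newton-Raphson maximum likelihood fit for logistic regression."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = prob * (1.0 - prob)  # logistic weights
        step = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - prob))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def information_matrix(X, beta):
    """Weighted cross-product matrix X'WX evaluated at beta."""
    prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = prob * (1.0 - prob)
    return X.T @ (w[:, None] * X)

def weighted_condition_number(X, beta):
    """Condition number of X'WX after scaling it to unit diagonal (assumed scaling)."""
    info = information_matrix(X, beta)
    d = 1.0 / np.sqrt(np.diag(info))
    eig = np.linalg.eigvalsh(info * np.outer(d, d))  # ascending eigenvalues
    return np.sqrt(eig[-1] / eig[0])

def ridge_from_mle(X, beta_mle, k):
    """One-step ridge adjustment of the converged MLE (assumed form)."""
    info = information_matrix(X, beta_mle)
    return np.linalg.solve(info + k * np.eye(len(beta_mle)), info @ beta_mle)

# Synthetic stand-in for two highly correlated predictors (not the trout data).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
true_beta = np.array([0.5, 1.0, 1.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

beta_mle = fit_logistic_mle(X, y)
cond = weighted_condition_number(X, beta_mle)
beta_ridge = ridge_from_mle(X, beta_mle, k=1.0)
print("weighted condition number:", cond)
print("MLE:", beta_mle)
print("ridge-adjusted:", beta_ridge)
```

A large weighted condition number flags near-dependence among the weighted columns, and the ridge step shrinks the estimate most along the poorly determined directions. In practice the predictors would usually be centered and scaled first, and k would be chosen by a data-driven rule.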

Hierarchical Maximum Likelihood Parameter Estimation for Cumulative Prospect Theory: Improving the Reliability of Individual Risk Parameter Estimates

SSRN Electronic Journal ◽

10.2139/ssrn.2425670 ◽

2014 ◽

Cited By ~ 4

Author(s):

Ryan O. Murphy ◽

Robert H.W. ten Brincke

Keyword(s):

Parameter Estimation ◽

Maximum Likelihood ◽

Prospect Theory ◽

Cumulative Prospect Theory ◽

Individual Risk ◽

Parameter Estimates ◽

Risk Parameter

Properties and methods of estimation for a bivariate exponentiated Fréchet distribution

Mathematica Slovaca ◽

10.1515/ms-2017-0426 ◽

2020 ◽

Vol 70 (5) ◽

pp. 1211-1230

Author(s):

Abdus Saboor ◽

Hassan S. Bakouch ◽

Fernando A. Moala ◽

Sheraz Hussain

Keyword(s):

Maximum Likelihood ◽

Probability Density Function ◽

Probability Density ◽

Density Function ◽

Superior Performance ◽

Estimation Methods ◽

Conditional Probability Density ◽

Data Set ◽

Fréchet Distribution ◽

Frechet Distribution

Abstract. In this paper, a bivariate extension of the exponentiated Fréchet distribution is introduced, namely a bivariate exponentiated Fréchet (BvEF) distribution whose marginals are univariate exponentiated Fréchet distributions. Several properties of the proposed distribution are discussed, such as the joint survival function, joint probability density function, marginal probability density function, conditional probability density function, moments, and marginal and bivariate moment generating functions. Moreover, the proposed distribution is obtained by the Marshall-Olkin survival copula. Estimation of the parameters is investigated by maximum likelihood with the observed information matrix. In addition to maximum likelihood estimation, we consider Bayesian inference and least squares estimation and compare these three methodologies for the BvEF. A simulation study is carried out to compare the performance of the estimators obtained by the presented estimation methods. The proposed bivariate distribution and other related bivariate distributions are fitted to a real-life paired data set. It is shown that the BvEF distribution has superior performance among the compared distributions according to several goodness-of-fit tests.

ESTIMATION OF THE BINARY LOGISTIC REGRESSION MODEL PARAMETER USING BOOTSTRAP RE-SAMPLING

Latin American Applied Research - An international journal ◽

10.52292/j.laar.2018.228 ◽

2018 ◽

Vol 48 (3) ◽

pp. 199-204 ◽

Cited By ~ 1

Author(s):

R. LI ◽

J. ZHOU ◽

L. WANG

Keyword(s):

Logistic Regression ◽

Parameter Estimation ◽

Maximum Likelihood ◽

Regression Model ◽

Logistic Regression Model ◽

Binary Logistic Regression ◽

Parametric Bootstrap ◽

Binary Logistic Regression Model ◽

Bayesian Bootstrap ◽

Non Parametric

In this paper, the non-parametric bootstrap and non-parametric Bayesian bootstrap methods are applied to parameter estimation in the binary logistic regression model. A real data study and a simulation study are conducted to compare the non-parametric bootstrap, the non-parametric Bayesian bootstrap, and the maximum likelihood methods. The results show that all three methods are effective for parameter estimation in the binary logistic regression model. In the small-sample case, the non-parametric Bayesian bootstrap method performs relatively better than the non-parametric bootstrap and the maximum likelihood methods.
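The two resampling schemes can be sketched as follows. This is a minimal sketch, not the paper's implementation: the data are synthetic, the function names are hypothetical, the non-parametric bootstrap is expressed through resampling counts used as case weights, and the Bayesian bootstrap is assumed to use Dirichlet(1, ..., 1) weights rescaled to sum to n.

```python
import numpy as np

def fit_weighted_logistic(X, y, weights, n_iter=50, tol=1e-10):
    """Newton-Raphson fit of a case-weighted logistic regression."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = weights * prob * (1.0 - prob)
        step = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (weights * (y - prob)))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def bootstrap_estimates(X, y, n_rep, rng, bayesian=False):
    """Return an (n_rep, p) array of refitted coefficients."""
    n = len(y)
    draws = np.empty((n_rep, X.shape[1]))
    for r in range(n_rep):
        if bayesian:
            # Bayesian bootstrap: continuous Dirichlet weights on the observations.
            weights = n * rng.dirichlet(np.ones(n))
        else:
            # Non-parametric bootstrap: how often each row is drawn with replacement.
            weights = np.bincount(rng.integers(0, n, size=n), minlength=n).astype(float)
        draws[r] = fit_weighted_logistic(X, y, weights)
    return draws

# Synthetic data with one predictor (illustrative only).
rng = np.random.default_rng(0)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([-0.5, 1.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

beta_mle = fit_weighted_logistic(X, y, np.ones(n))
boot = bootstrap_estimates(X, y, n_rep=200, rng=rng)
bayes = bootstrap_estimates(X, y, n_rep=200, rng=rng, bayesian=True)
print("MLE:", beta_mle)
print("bootstrap mean:", boot.mean(axis=0))
print("Bayesian bootstrap mean:", bayes.mean(axis=0))
```

Averaging the replicates gives a point estimate for each scheme, and their spread gives a standard error. With small samples a resample can produce separated data, in which case the refit would need a safeguard that this sketch omits.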

New Lindley Half Cauchy Distribution: Theory and Applications

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d4734.119420 ◽

2020 ◽

Vol 9 (4) ◽

pp. 1-7

Keyword(s):

Maximum Likelihood ◽

Real Data ◽

Cauchy Distribution ◽

Least Square ◽

Cumulative Distribution ◽

Likelihood Method ◽

Estimation Methods ◽

Model Parameters ◽

Data Set ◽

Von Mises

In this paper, we define a new two-parameter distribution, the new Lindley half Cauchy (NLHC) distribution, using the Lindley-G family of distributions; it accommodates increasing, decreasing, and a variety of monotone failure rates. The statistical properties of the proposed distribution, such as the probability density function, cumulative distribution function, quantiles, and measures of skewness and kurtosis, are presented. We briefly describe three well-known estimation methods, namely maximum likelihood (MLE), least squares (LSE), and Cramér-von Mises (CVM). All computations are performed in R. Using the maximum likelihood method, we construct asymptotic confidence intervals for the model parameters. We empirically verify the potential of the new distribution for modeling a real data set.
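For reference, the least squares and Cramér-von Mises criteria and the asymptotic interval are usually written as below. These are the generic textbook forms, not expressions taken from the paper. The symbols are assumptions: F(x; θ) is the model distribution function, x_(1) ≤ ... ≤ x_(n) is the ordered sample, and I(θ̂) is the observed information matrix at the maximum likelihood estimate.

```latex
% Least squares estimator (generic form)
\hat{\theta}_{\mathrm{LSE}} = \arg\min_{\theta} \sum_{i=1}^{n}
  \left[ F\!\left(x_{(i)}; \theta\right) - \frac{i}{n+1} \right]^{2}

% Cramér-von Mises estimator (generic form)
\hat{\theta}_{\mathrm{CVM}} = \arg\min_{\theta} \left\{ \frac{1}{12n} + \sum_{i=1}^{n}
  \left[ F\!\left(x_{(i)}; \theta\right) - \frac{2i-1}{2n} \right]^{2} \right\}

% Asymptotic 100(1-\alpha)\% confidence interval for the j-th parameter
\hat{\theta}_{j} \pm z_{1-\alpha/2} \sqrt{ \left[ I^{-1}(\hat{\theta}) \right]_{jj} }
```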

Efficient Maximum-Likelihood Inference For The Isolation-With-Initial-Migration Model With Potentially Asymmetric Gene Flow

10.1101/052894 ◽

2016 ◽

Author(s):

Rui J. Costa ◽

Hilde Wilkinson-Herbots

Keyword(s):

Gene Flow ◽

Maximum Likelihood ◽

Dna Sequences ◽

Computing Time ◽

Likelihood Method ◽

Ancestral Population ◽

Parameter Estimates ◽

Fast Method ◽

Data Set ◽

Im Model

Abstract. The isolation-with-migration (IM) model is commonly used to make inferences about gene flow during speciation, using polymorphism data. However, Becquet and Przeworski (2009) report that the parameter estimates obtained by fitting the IM model are very sensitive to the model's assumptions (including the assumption of constant gene flow until the present). This paper is concerned with the isolation-with-initial-migration (IIM) model of Wilkinson-Herbots (2012), which drops precisely this assumption. In the IIM model, one ancestral population divides into two descendant subpopulations, between which there is an initial period of gene flow and a subsequent period of isolation. We derive a very fast method of fitting an extended version of the IIM model, which also allows for asymmetric gene flow and unequal population sizes. This is a maximum-likelihood method, applicable to data on the number of segregating sites between pairs of DNA sequences from a large number of independent loci. In addition to obtaining parameter estimates, our method can also be used to distinguish between alternative models representing different evolutionary scenarios, by means of likelihood ratio tests. We illustrate the procedure on pairs of Drosophila sequences from approximately 30,000 loci. The computing time needed to fit the most complex version of the model to this data set is only a couple of minutes. The R code to fit the IIM model can be found in the supplementary files of this paper.

Parameter Estimation of Binomial Logistic Regression Based on Classical (Maximum Likelihood) and Bayesian (MCMC) Approach for Screening B-Thalassemia

International Journal of Intelligent Information Processing ◽

10.4156/ijiip.vol3.issue1.9 ◽

2012 ◽

Vol 3 (1) ◽

pp. 90-100 ◽

Cited By ~ 3

Author(s):

Patcharaporn Paokanta ◽

Napat Harnpornchai ◽

Nopasit Chakpitak ◽

Michele Ceccarelli ◽

Somdet Srichairatanakool

Keyword(s):

Logistic Regression ◽

Parameter Estimation ◽

Maximum Likelihood ◽

Bayesian Mcmc ◽

Binomial Logistic Regression ◽

Mcmc Approach

Challenges in Nonlinear Structural Equation Modeling

Methodology ◽

10.1027/1614-2241.3.3.100 ◽

2007 ◽

Vol 3 (3) ◽

pp. 100-114 ◽

Cited By ~ 47

Author(s):

Polina Dimitruk ◽

Karin Schermelleh-Engel ◽

Augustin Kelava ◽

Helfried Moosbrugger

Keyword(s):

Structural Equation Modeling ◽

Maximum Likelihood ◽

Structural Equation ◽

Nonlinear Effects ◽

Structural Equations ◽

Estimation Methods ◽

Equation Modeling ◽

Parameter Estimates ◽

Nonlinear Structural ◽

Nonlinear Structural Equation

Abstract. Challenges in evaluating nonlinear effects in multiple regression analyses include reliability, validity, multicollinearity, and dichotomization of continuous variables. While reliability and validity issues are solved by employing nonlinear structural equation modeling, multicollinearity remains a problem which may even be aggravated when using latent variable approaches. Further challenges of nonlinear latent analyses comprise the distribution of latent product terms, a problem especially relevant for approaches using maximum likelihood estimation methods based on multivariate normally distributed variables, and unbiased estimates of nonlinear effects under multicollinearity. The only methods that explicitly take the nonnormality of nonlinear latent models into account are latent moderated structural equations (LMS) and quasi-maximum likelihood (QML). In a small simulation study both methods yielded unbiased parameter estimates and correct estimates of standard errors for inferential statistics. The advantages and limitations of nonlinear structural equation modeling are discussed.

A Comparison of the Internal Validity of Alternative Parameter Estimation Methods in Decompositional Multiattribute Preference Models

Journal of Marketing Research ◽

10.1177/002224377901600303 ◽

1979 ◽

Vol 16 (3) ◽

pp. 313-322 ◽

Cited By ~ 46

Author(s):

Arun K. Jain ◽

Franklin Acito ◽

Naresh K. Malhotra ◽

Vijay Mahajan

Keyword(s):

Parameter Estimation ◽

Data Collection ◽

Data Base ◽

Internal Validity ◽

Detailed Comparison ◽

Estimation Methods ◽

Parameter Estimates ◽

Alternative Approaches ◽

Preference Models ◽

Parameter Estimation Methods

Since 1971, interest in the use of decompositional multiattribute preference models in marketing has been increasing. The applications have varied in terms of the type of data used, behavior predicted, and methods used for estimating parameters. The authors examine the effect of different data collection and estimation procedures on parameter estimates and their stability and validity. An actual data base is used. A detailed comparison is made of the alternative approaches of parameter estimation and suggestions are given for the potential users of decompositional multiattribute preference models.

Comparison between the maximum likelihood and the bayesian estimation methods for logistic regression model (case study: risk of low birth weight in Indonesia)

Journal of Physics Conference Series ◽

10.1088/1742-6596/2106/1/012001 ◽

2021 ◽

Vol 2106 (1) ◽

pp. 012001

Author(s):

P R Sihombing ◽

S R Rohimah ◽

A Kurnia

Keyword(s):

Logistic Regression ◽

Birth Weight ◽

Maximum Likelihood ◽

Low Birth Weight ◽

Regression Model ◽

Bayesian Estimation ◽

Logistic Regression Model ◽

Estimation Method ◽

Estimation Methods ◽

Model Case

Abstract. This study aims to compare the efficacy of logistic regression models for identifying the risk factors of low birth weight in Indonesia using maximum likelihood estimation (MLE) and Bayesian estimation. The data used in this study are secondary data derived from the 2017 Indonesian Demographic Health Survey, with a total sample of 16,344 newborn babies. The best logistic regression model was selected on the basis of the smaller Schwarz Bayesian Information Criterion (BIC) value. The logistic regression model with the Bayesian estimation method has a smaller BIC value than the MLE method. Twin births, female sex of the baby, at-risk maternal age, birth spacing that is too close, iron deficiency, low education, low economic status, and inadequate drinking water sources were associated with a higher risk of low birth weight.
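For reference, the criterion is usually defined as below. This is the generic definition, with the fitted likelihood and the parameter count k as assumed symbols. How the study evaluates it for the Bayesian fit is not stated in the abstract. The only number taken from the source is the sample size n = 16,344.

```latex
% Bayesian (Schwarz) information criterion; smaller values are preferred
\mathrm{BIC} = -2 \ln \hat{L} + k \ln n ,
\qquad \ln n = \ln 16344 \approx 9.70
```

With this sample size, each additional parameter adds roughly 9.7 to the criterion, so it must raise the maximized log-likelihood by about 4.85 to be favored.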

Conditional Maximum Likelihood Estimation in Probability-Branched Multistage Designs

10.31234/osf.io/ew27f ◽

2021 ◽

Author(s):

Jan Steinfeld ◽

Alexander Robitzsch

Keyword(s):

Parameter Estimation ◽

Maximum Likelihood ◽

Likelihood Estimation ◽

Parameter Estimate ◽

Item Parameter ◽

Parameter Estimates ◽

Conditional Maximum Likelihood ◽

Item Parameter Estimates ◽

Item Parameter Estimation ◽

Conditional Maximum

This article describes the conditional maximum likelihood-based item parameter estimation in probabilistic multistage designs. In probabilistic multistage designs, the routing is not solely based on a raw score j and a cut score c as well as a rule for routing into a module such as j < c or j ≤ c but is based on a probability p(j) for each raw score j. It can be shown that the use of a conventional conditional maximum likelihood parameter estimate in multistage designs leads to severely biased item parameter estimates. Zwitser and Maris (2013) were able to show that with deterministic routing, the integration of the design into the item parameter estimation leads to unbiased estimates. This article extends this approach to probabilistic routing and, at the same time, represents a generalization. In a simulation study, it is shown that the item parameter estimation in probabilistic designs leads to unbiased item parameter estimates.
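The difference between deterministic and probabilistic routing can be sketched as below. This is a minimal illustration, not the article's code: the module label "A", the cut score, and the probability values are hypothetical, and only the routing step is shown, not the conditional maximum likelihood estimation.

```python
import numpy as np

def route_deterministic(raw_score, cut_score):
    """Route to module A exactly when the raw score j is below the cut score c (rule j < c)."""
    return raw_score < cut_score

def route_probabilistic(raw_score, p, rng):
    """Route to module A with probability p[j] for raw score j."""
    return rng.random() < p[raw_score]

rng = np.random.default_rng(1)
max_score = 5
cut_score = 3

# A step function p(j) in {0, 1} reproduces the deterministic rule.
p_step = np.array([1.0 if j < cut_score else 0.0 for j in range(max_score + 1)])
same = all(
    route_probabilistic(j, p_step, rng) == route_deterministic(j, cut_score)
    for j in range(max_score + 1)
)

# A smooth p(j) sends test takers with the same raw score to different modules.
p_smooth = np.array([0.95, 0.85, 0.65, 0.35, 0.15, 0.05])
share_a = np.mean([route_probabilistic(2, p_smooth, rng) for _ in range(10_000)])
print("step function matches deterministic rule:", same)
print("share routed to module A at raw score 2:", share_a)
```

Seen this way, deterministic routing is the special case in which every p(j) is 0 or 1, which is consistent with the article describing its approach as a generalization.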
