Effects of Monotone and Nonmonotone Attrition on Parameter Estimates in Regression Models with Educational Data: Demographic Effects on Achievement, Aspirations, and Attitudes

1998 ◽  
Vol 33 (2) ◽  
pp. 555 ◽  
Author(s):  
David T. Burkam ◽  
Valerie E. Lee
Author(s):  
Jeremy Freese

This article presents a method and program for identifying poorly fitting observations for maximum-likelihood regression models for categorical dependent variables. After estimating a model, the program leastlikely will list the observations that have the lowest predicted probabilities of observing the value of the outcome category that was actually observed. For example, when run after estimating a binary logistic regression model, leastlikely will list the observations with a positive outcome that had the lowest predicted probabilities of a positive outcome and the observations with a negative outcome that had the lowest predicted probabilities of a negative outcome. These can be considered the observations in which the outcome is most surprising given the values of the independent variables and the parameter estimates and, like observations with large residuals in ordinary least squares regression, may warrant individual inspection. Use of the program is illustrated with examples using binary and ordered logistic regression.
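The listing rule described in this abstract can be sketched in a few lines. The following is a minimal illustration for the binary case only, not the leastlikely program itself: the function name `least_likely`, its arguments, and the toy data are hypothetical, and the predicted probabilities are assumed to come from a model that has already been fitted.

```python
import numpy as np

def least_likely(y, prob_positive, n=5):
    """For each observed outcome (0 and 1), return the indices of the n
    observations whose observed outcome had the lowest predicted probability.

    y             : observed binary outcomes (0/1)
    prob_positive : predicted P(y = 1) from an already-fitted model (assumed given)
    """
    y = np.asarray(y)
    p = np.asarray(prob_positive, dtype=float)
    # Probability the model assigned to the outcome that was actually observed.
    p_observed = np.where(y == 1, p, 1.0 - p)
    result = {}
    for outcome in (0, 1):
        idx = np.flatnonzero(y == outcome)
        # Sort this outcome group from most to least surprising.
        order = idx[np.argsort(p_observed[idx], kind="stable")]
        result[outcome] = order[:n].tolist()
    return result

# Hypothetical toy data: six observations with fitted probabilities of y = 1.
y = [1, 1, 0, 0, 1, 0]
p = [0.9, 0.2, 0.1, 0.8, 0.6, 0.4]
surprising = least_likely(y, p, n=2)
```

Here observation 1 (a positive outcome predicted at 0.2) and observation 3 (a negative outcome predicted at 0.8 for a positive) head their respective lists, which is the sense in which such cases may warrant individual inspection.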


2007 ◽  
Vol 13 (2) ◽  
pp. 261-272 ◽  
Author(s):  
Helmut Küchenhoff ◽  
Ralf Bender ◽  
Ingo Langner

1997 ◽  
Vol 1 (1) ◽  
pp. 71-80 ◽  
Author(s):  
P. S. P. Cowpertwait ◽  
P. E. O'Connell

Abstract. A single-site Neyman-Scott Poisson cluster model of rainfall, with convective and stratiform cells, is fitted to data for 112 sites scattered throughout the UK using harmonic variables to account for seasonality. The model is regionalised by regressing the estimates of the harmonic variables on site-dependent variables (e.g. altitude) to enable rainfall to be simulated at any ungauged site in the UK. An assessment of the residual errors indicates that the regression models can be used with reasonable confidence for urban sites. Furthermore, the regional variations of the model parameter estimates are found to be in agreement with meteorological knowledge and observation. Simulated 1 h extreme rainfalls are found to compare favourably with observed historical values, although some lack-of-fit is evident for higher aggregation levels.


1989 ◽  
Vol 5 (3) ◽  
pp. 363-384 ◽  
Author(s):  
Russell Davidson ◽  
James G. MacKinnon

We consider several issues related to Durbin-Wu-Hausman tests; that is, tests based on the comparison of two sets of parameter estimates. We first review a number of results about these tests in linear regression models, discuss what determines their power, and propose a simple way to improve power in certain cases. We then show how in a general nonlinear setting they may be computed as “score” tests by means of slightly modified versions of any artificial linear regression that can be used to calculate Lagrange multiplier tests, and explore some of the implications of this result. In particular, we show how to create a variant of the information matrix test that tests for parameter consistency. We examine the conventional information matrix test and our new version in the context of binary-choice models, and provide a simple way to compute both tests using artificial regressions.


2003 ◽  
Vol 76 (1) ◽  
pp. 19-25 ◽  
Author(s):  
F Jaffrézic ◽  
P Minini

Abstract. Advantages of the use of test-day records for genetic evaluation of dairy cattle are now widely accepted. In particular, longitudinal models such as random regression avoid using ad hoc extrapolation procedures to reconstruct complete lactations as they provide individual predictions even for incomplete data. However, these predictions and parameter estimates obtained in the model do not take into account the lactation length. This can be an important drawback for phenotypic and genetic analysis of milk production of cows with shorter lactations. The aim of this paper is to propose a methodology that would correct these predictions, weighting them by the probability at each point in time of each cow being dried off. The proposed procedure is easy to implement and calculations are fast to compute. A simulation study and an application on real data for daily milk records show that the proposed methodology provides a more accurate estimation for individual cumulative production as well as genetic values, and avoids predicting negative productions at the end of the lactation as is often the case with random regression models.
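One possible reading of the weighting idea in this abstract is sketched below. It rests on an assumption that the abstract does not spell out: each daily prediction is multiplied by the probability that the cow is still in milk on that day, taken here as one minus the probability of having been dried off by then. The function name and the numbers are hypothetical, and the authors' actual procedure may differ.

```python
import numpy as np

def expected_cumulative_yield(predicted_daily_yield, prob_dried_off_by_day):
    """Sum daily predictions, each weighted by the probability that the cow
    is still lactating on that day.

    Assumption (not from the abstract): weight = 1 - P(dried off by day t).
    """
    y_hat = np.asarray(predicted_daily_yield, dtype=float)
    p_dry = np.asarray(prob_dried_off_by_day, dtype=float)
    p_in_milk = 1.0 - p_dry
    return float(np.sum(y_hat * p_in_milk))

# Hypothetical four-day example; the last raw prediction is negative, as can
# happen with random regression curves late in lactation.
y_hat = [30.0, 20.0, 10.0, -2.0]
p_dry = [0.0, 0.0, 0.5, 1.0]
total = expected_cumulative_yield(y_hat, p_dry)
```

Under this reading, a negative prediction on a day when the cow is almost certainly dry contributes essentially nothing to the cumulative total.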


2019 ◽  
Author(s):  
Ziqi Li ◽  
Alexander Stewart Fotheringham ◽  
Taylor M. Oshan ◽  
Levi John Wolf

Bandwidth, a key parameter in geographically weighted regression models, is closely related to the spatial scale at which the underlying spatially heterogeneous processes being examined take place. Generally, a single optimal bandwidth (geographically weighted regression) or a set of covariate-specific optimal bandwidths (multiscale geographically weighted regression) is chosen based on some criterion such as the Akaike Information Criterion (AIC), and then parameter estimation and inference are conditional on the choice of this bandwidth. In this paper, we find that bandwidth selection is subject to uncertainty in both single-scale and multi-scale geographically weighted regression models and demonstrate that this uncertainty can be measured and accounted for. Based on simulation studies and an empirical example of obesity rates in Phoenix, we show that bandwidth uncertainties can be quantitatively measured by Akaike weights, and confidence intervals for bandwidths can be obtained. Understanding bandwidth uncertainty offers important insights about the scales over which different processes operate, especially when comparing covariate-specific bandwidths. Additionally, unconditional parameter estimates that account for bandwidth selection uncertainty can be computed based on Akaike weights.
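Akaike weights follow the standard formula w_i = exp(-Δ_i / 2) / Σ_j exp(-Δ_j / 2), where Δ_i is the difference between the AIC of candidate i and the smallest AIC. The sketch below applies that formula to a set of candidate bandwidths and forms a weight-averaged parameter estimate. The bandwidths, AIC values, and estimates are invented for illustration, and the function names are hypothetical; this is not the authors' implementation.

```python
import numpy as np

def akaike_weights(aic_values):
    """Convert AIC values for candidate models into Akaike weights."""
    aic = np.asarray(aic_values, dtype=float)
    delta = aic - aic.min()              # AIC differences from the best model
    rel_likelihood = np.exp(-0.5 * delta)
    return rel_likelihood / rel_likelihood.sum()

def weighted_estimate(estimates, weights):
    """Average per-bandwidth estimates using the Akaike weights."""
    return float(np.dot(weights, np.asarray(estimates, dtype=float)))

# Hypothetical candidates: three bandwidths, the AIC of the model fitted at
# each, and the resulting estimate of one coefficient at one location.
bandwidths = [50, 100, 150]
aic = [210.0, 208.0, 212.0]
beta = [0.40, 0.50, 0.55]

w = akaike_weights(aic)
beta_avg = weighted_estimate(beta, w)
```

The weights sum to one and are largest for the bandwidth with the lowest AIC, so they can be read as the relative support for each candidate bandwidth.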


2010 ◽  
Vol 67 (2) ◽  
pp. 218-222 ◽  
Author(s):  
Lídia Raquel de Carvalho ◽  
Sheila Zambello de Pinho ◽  
Martha Maria Mischan

In biological experiments in which growth curves are fitted to sample data, treatments applied to the experimental material can affect the parameter estimates. In these cases the interest is in comparing the growth functions in order to distinguish treatments. Three methods that verify the equality of parameters in nonlinear regression models were compared: (i) developed by Carvalho in 1996, performing ANOVA on estimates of parameters of individual fits; (ii) suggested by Regazzi in 2003, using the likelihood ratio method; and (iii) constructing a pooled variance from individual variances. The parametric tests, F and Tukey, were employed when the parameter estimators came close to having the properties of linear model estimators, that is, unbiasedness, normal distribution and minimum variance. The first and second methods presented similar results, but the third method is simpler in calculations and uses all information contained in the original data.


1989 ◽  
Vol 19 (5) ◽  
pp. 664-673 ◽  
Author(s):  
Andrew J. R. Gillespie ◽  
Tiberius Cunia

Biomass tables are often constructed from cluster samples by means of ordinary least squares regression estimation procedures. These procedures assume that sample observations are uncorrelated, which ignores the intracluster correlation of cluster samples and results in underestimates of the model error. We tested alternative estimation procedures by simulation under a variety of cluster sampling methods, to determine combinations of sampling and estimation procedures that yield accurate parameter estimates and reliable estimates of error. Modified, generalized, and jack-knife least squares procedures gave accurate parameter and error estimates when sample trees were selected with equal probability. Regression models that did not include height as a predictor variable yielded biased parameter estimates when sample trees were selected with probability proportional to tree size. Models that included height did not yield biased estimates. There was no discernible gain in precision associated with sampling with probability proportional to size. Random coefficient regressions generally gave biased point estimates with poor precision, regardless of sampling method.

