Endogeneity in Probit Response Models

2010 ◽  
Vol 18 (2) ◽  
pp. 138-150 ◽  
Author(s):  
David A. Freedman ◽  
Jasjeet S. Sekhon

We look at conventional methods for removing endogeneity bias in regression models, including the linear model and the probit model. It is known that the usual Heckman two-step procedure should not be used in the probit model: from a theoretical perspective, it is unsatisfactory, and likelihood methods are superior. However, serious numerical problems occur when standard software packages try to maximize the biprobit likelihood function, even if the number of covariates is small. We draw conclusions for statistical practice. Finally, we prove the conditions under which parameters in the model are identifiable. The conditions for identification are delicate; we believe these results are new.
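The numerical fragility the abstract describes is easy to reproduce, since the biprobit log-likelihood itself is short. Below is a minimal Python sketch (simulated data, illustrative coefficient values, and scipy's bivariate normal CDF; this is not the authors' implementation):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative data-generating process: an endogenous binary regressor c
# driven by z, and a probit outcome y, with correlated normal errors.
rng = np.random.default_rng(42)
n = 300
x = rng.normal(size=n)
z = rng.normal(size=n)                          # instrument-like covariate
rho = 0.5
u, v = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n).T
c = (0.8 * z + u > 0).astype(float)             # endogenous binary regressor
y = (0.5 * x + 1.0 * c + v > 0).astype(float)   # probit outcome

def biprobit_negloglik(theta):
    """theta = (a_z, b_x, b_c, r): negative joint log-likelihood of (c, y)
    under a bivariate probit with error correlation r."""
    a_z, b_x, b_c, r = theta
    q1, q2 = 2 * c - 1, 2 * y - 1               # sign flips for 0/1 outcomes
    m1, m2 = q1 * (a_z * z), q2 * (b_x * x + b_c * c)
    s = q1 * q2                                 # sign of effective correlation
    ll = 0.0
    for sv in (1.0, -1.0):                      # group by correlation sign
        idx = s == sv
        if idx.any():
            p = multivariate_normal.cdf(
                np.column_stack([m1[idx], m2[idx]]),
                mean=[0, 0], cov=[[1, sv * r], [sv * r, 1]])
            ll += np.log(np.clip(p, 1e-300, 1)).sum()
    return -ll

nll_true = biprobit_negloglik([0.8, 0.5, 1.0, 0.5])   # at the true values
nll_null = biprobit_negloglik([0.0, 0.0, 0.0, 0.0])   # at an all-zero point
```

Handing this function to a generic optimizer such as scipy.optimize.minimize is exactly the step where, as the abstract notes, standard packages can run into trouble, particularly when the correlation is pushed toward ±1.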

Author(s):  
Eduardo de Freitas Costa ◽  
Silvana Schneider ◽  
Giulia Bagatini Carlotto ◽  
Tainá Cabalheiro ◽  
Mauro Ribeiro de Oliveira Júnior

The dynamics of the wild boar population has become a pressing issue not only for ecological purposes, but also for agricultural and livestock production. Data on wild boar dispersal distance can have a complex structure, including an excess of zeros and right-censored observations, and are therefore challenging to model. We propose two zero-inflated right-censored regression models, assuming Weibull and gamma distributions. First, we present the construction of the likelihood function; we then apply both models to simulated datasets, demonstrating that both regression models behave well. The simulation results point to the consistency and asymptotic unbiasedness of the developed methods. Afterwards, we fitted both models to a simulated dataset of wild boar dispersal including an excess of zeros, right-censored observations, and two covariates: age and sex. The models were useful for drawing inferences about wild boar dispersal, correctly describing data that mimic a situation where males disperse more than females and age has a positive effect on dispersal. These results help overcome some limitations on inference in zero-inflated right-censored datasets, especially concerning wild boar populations. Users are provided with an R function to run the proposed models.
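As a rough illustration of how such a likelihood is assembled, here is a hedged Python sketch of the zero-inflated right-censored Weibull case (the paper supplies an R function; the data, parameter values, and censoring rule below are invented for the sketch):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

def zi_weibull_negloglik(params, y, delta):
    """Zero-inflated right-censored Weibull negative log-likelihood.
    params = (logit of zero probability, log shape, log scale);
    delta = 1 for an observed event, 0 for a right-censored one."""
    p = 1.0 / (1.0 + np.exp(-params[0]))
    shape, scale = np.exp(params[1]), np.exp(params[2])
    zero = y == 0
    pos = ~zero
    # zeros contribute log p; positives contribute log(1-p) plus either
    # the Weibull density (observed) or the survival function (censored)
    ll = np.log(p) * zero.sum() + np.log1p(-p) * pos.sum()
    ll += np.sum(delta[pos] * weibull_min.logpdf(y[pos], shape, scale=scale)
                 + (1 - delta[pos]) * weibull_min.logsf(y[pos], shape, scale=scale))
    return -ll

# Invented dispersal-style data: 30% structural zeros, Weibull(1.5, 10)
# distances, administrative censoring at distance 15.
rng = np.random.default_rng(1)
n = 2000
is_zero = rng.random(n) < 0.3
t = weibull_min.rvs(1.5, scale=10.0, size=n, random_state=rng)
y = np.where(is_zero, 0.0, np.minimum(t, 15.0))
delta = np.where(is_zero, 1.0, (t < 15.0).astype(float))

start = np.array([0.0, 0.0, np.log(y[y > 0].mean())])
res = minimize(zi_weibull_negloglik, start, args=(y, delta), method="Nelder-Mead")
p_hat = 1.0 / (1.0 + np.exp(-res.x[0]))
shape_hat = np.exp(res.x[1])
```

The gamma variant only swaps the density and survival terms; the zero-inflation and censoring bookkeeping are identical.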


2021 ◽  
Vol 16 (3) ◽  
pp. 225-227
Author(s):  
Stan Lipovetsky

The work describes a series of techniques designed to obtain regression models resistant to multicollinearity and having other features needed for meaningful results. These models include enhanced ridge regressions with several regularization parameters, regressions by data segments and by levels of the dependent variable, latent class models, unitary response models, orthogonal and equidistant regressions, minimization in the Lp-metric, and other criteria and models. All the approaches have been implemented in practice in various projects and found useful for decision making in economics, management, marketing research, and other fields requiring data modeling and analysis.
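For the ridge variant with several regularization parameters, the closed form is a small generalization of ordinary ridge; the sketch below uses invented collinear data and is not Lipovetsky's implementation:

```python
import numpy as np

def generalized_ridge(X, y, lambdas):
    """Ridge regression with one regularization parameter per predictor:
    beta = (X'X + diag(lambdas))^{-1} X'y. Setting all lambdas to zero
    reduces to ordinary least squares."""
    L = np.diag(np.asarray(lambdas, dtype=float))
    return np.linalg.solve(X.T @ X + L, X.T @ y)

# Two deliberately collinear predictors (illustrative data).
rng = np.random.default_rng(7)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)           # nearly a copy of x1
X = np.column_stack([x1, x2])
y = x1 + 0.5 * x2 + 0.5 * rng.normal(size=n)

beta_ols = generalized_ridge(X, y, [0.0, 0.0])    # reduces to OLS
beta_eq = generalized_ridge(X, y, [5.0, 5.0])     # ordinary ridge
beta_uneq = generalized_ridge(X, y, [1.0, 50.0])  # per-predictor penalties
```

With equal penalties the solution is the familiar ridge shrinkage; unequal penalties let the analyst damp individual coefficients that multicollinearity has inflated.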


2014 ◽  
Vol 70 (a1) ◽  
pp. C1269-C1269
Author(s):  
Ethan Merritt

Tools for validating structural models of proteins are relatively mature and widely implemented. New protein crystallographers are introduced early on to the importance of monitoring conformance with expected φ/ψ values, favored rotamers, and local stereochemistry. The protein model is validated by the PDB at the time of deposition using criteria that are also available in the standard software packages used to refine the model being deposited. By contrast, crystallographers are typically much less familiar with procedures to validate key non-protein components of the model – cofactors, substrates, inhibitors, etc. It has been estimated that as many as a third of all ligands in the PDB exhibit preventable errors of some sort, ranging from minor deviations in expected bond angles to wholly implausible placement in the binding pocket. Following recommendations from the wwPDB Validation Task Force, the PDB recently began validating ligand geometry as an integral part of deposition processing. This means that many crystallographers will soon receive for the first time a "grade" on the quality of ligands in the structure they have just deposited. Some will be surprised, as I was following my first PDB deposition of 2014, at how easily bad ligand geometry can slip through the cracks in supposedly robust structure refinement protocols that their lab has used for many years. I will illustrate use of current tools for generating ligand restraints to guide model refinement. One is the jligand+coot+cprodrg pipeline integrated into the CCP4 suite. Another is the Grade web server provided as a community resource by Global Phasing Ltd. Furthermore, I will show examples from recent in-house refinements of how things can still go wrong even if you do use these tools, and how we recovered. The new PDB deposition checks may expose errors in your ligand descriptions after the fact. This presentation may help you avoid introducing those errors in the first place.


2019 ◽  
Vol 4 (1) ◽  
Author(s):  
Monica Chagoyen ◽  
Juan A G Ranea ◽  
Florencio Pazos

Due to the large interdependence between the molecular components of living systems, many phenomena, including those related to pathologies, cannot be explained in terms of a single gene or a small number of genes. Molecular networks, representing different types of relationships between molecular entities, embody these large sets of interdependences in a framework that allows them to be mined for information from a systemic point of view. These networks, often generated from high-throughput omics datasets, are used to study the complex phenomena of human pathologies systemically. Complementing the reductionist approach of molecular biology, based on the detailed study of a small number of genes, systemic approaches to human diseases consider that these are better reflected in large and intricate networks of relationships between genes. These networks, and not single genes, provide both better markers for diagnosing diseases and targets for treating them. Network approaches are being used to gain insight into the molecular basis of complex diseases and to interpret the large datasets associated with them, such as genomic variants. The network formalism is also suitable for integrating large, heterogeneous and multilevel datasets associated with diseases, from the molecular level to organismal and epidemiological scales. Many of these approaches are available to nonexpert users through standard software packages.


2020 ◽  
pp. 096228022096495
Author(s):  
Julio M Singer ◽  
Francisco MM Rocha ◽  
Antonio Carlos Pedroso-de-Lima ◽  
Giovani L Silva ◽  
Giuliana C Coatti ◽  
...  

We consider random changepoint segmented regression models to analyse data from a study conducted to verify whether treatment with stem cells may delay the onset of a symptom of amyotrophic lateral sclerosis in genetically modified mice. The proposed models capture the biological aspects of the data, accommodating a smooth transition between the periods with and without symptoms. An additional changepoint is considered to avoid negative predicted responses. Given the nonlinear nature of the model, we propose an algorithm to estimate the fixed parameters and to predict the random effects by fitting linear mixed models iteratively via standard software. We compare the variances obtained in the final step with bootstrapped and robust ones.
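A fixed-effects simplification of the segmented-regression idea can be profiled over a grid of candidate changepoints. The sketch below (invented data; no random effects, smooth transition, or second changepoint) illustrates the basic mechanics, not the authors' iterative mixed-model algorithm:

```python
import numpy as np

def fit_segmented(t, y, grid):
    """Profile out the changepoint tau of y = b0 + b1*t + b2*(t - tau)_+ + e:
    refit an ordinary linear model at each candidate tau, keep the best fit."""
    best_sse, best_tau, best_beta = np.inf, None, None
    for tau in grid:
        X = np.column_stack([np.ones_like(t), t, np.clip(t - tau, 0.0, None)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = float(np.sum((y - X @ beta) ** 2))
        if sse < best_sse:
            best_sse, best_tau, best_beta = sse, tau, beta
    return best_tau, best_beta

# Illustrative data: a mild trend until an onset at t = 12, then a steeper
# decline (all numbers invented for the sketch).
rng = np.random.default_rng(3)
t = np.linspace(0.0, 20.0, 300)
y = (1.0 + 0.2 * t - 1.5 * np.clip(t - 12.0, 0.0, None)
     + 0.5 * rng.normal(size=t.size))
tau_hat, beta_hat = fit_segmented(t, y, np.arange(2.0, 18.25, 0.25))
```

In the authors' setting the changepoint and the regression coefficients are subject-specific random effects, which is why they iterate linear mixed-model fits rather than a single grid search; the hinge term `(t - tau)_+` is the shared ingredient.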


2010 ◽  
Vol 64 (2) ◽  
pp. 339-356 ◽  
Author(s):  
Caroline A. Hartzell ◽  
Matthew Hoddie ◽  
Molly Bauer

Previous studies that have explored the effects of economic liberalization on civil war have employed aggregate measures of openness and have failed to account for potential endogeneity bias. In this research note, we suggest two improvements to the study of the relationship between liberalization and civil war. First, emphasizing that it is processes that systematically create new economic winners and losers, rather than particular levels of economic openness, that have the potential to generate conflict, we consider the effects of one oft-used means of liberalizing economies: the adoption by countries of International Monetary Fund (IMF) structural adjustment programs. Second, we use a bivariate probit model to address issues of endogeneity bias. Analyzing all data available for the period between 1970 and 1999, we identify an association between the adoption of IMF programs and the onset of civil war. This finding suggests that IMF programs to promote economic openness may unintentionally be creating an environment conducive to domestic conflict.


2007 ◽  
Vol 2007 ◽  
pp. 1-12
Author(s):  
Alastair Scott ◽  
Chris Wild

We look at fitting regression models using data from stratified cluster samples when the strata may depend in some way on the observed responses within clusters. One important subclass of examples is that of family studies in genetic epidemiology, where the probability of selecting a family into the study depends on the incidence of disease within the family. We develop the survey-weighted estimating equation approach for this problem, with particular emphasis on the estimation of superpopulation parameters. Full maximum likelihood for this class of problems involves modelling the population distribution of the covariates, which is simply not feasible when there is a large number of potential covariates. We discuss efficient semiparametric maximum likelihood methods in which the covariate distribution is left completely unspecified, and we further discuss the relative efficiencies of these two approaches.
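In the simplest unclustered analogue, survey-weighted estimating equations amount to a weighted logistic score solved by Newton iterations. The sketch below uses invented outcome-dependent sampling probabilities and is a generic illustration of the weighting idea, not the authors' clustered method:

```python
import numpy as np

def weighted_logistic_score(beta, X, y, w):
    """Survey-weighted score: sum_i w_i * (y_i - p_i(beta)) * x_i."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return X.T @ (w * (y - p))

def solve_weighted_score(X, y, w, iters=25):
    """Newton iterations on the weighted estimating equations."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        H = X.T @ (X * (w * p * (1.0 - p))[:, None])   # weighted information
        beta = beta + np.linalg.solve(H, X.T @ (w * (y - p)))
    return beta

# Population with outcome-dependent sampling (an ascertainment-style toy):
# cases kept with probability 0.9, controls with 0.3 (values invented).
rng = np.random.default_rng(5)
N = 5000
x = rng.normal(size=N)
y = (rng.random(N) < 1.0 / (1.0 + np.exp(-x))).astype(float)  # true slope 1
pi = np.where(y == 1, 0.9, 0.3)
keep = rng.random(N) < pi
X = np.column_stack([np.ones(keep.sum()), x[keep]])
ys, w = y[keep], 1.0 / pi[keep]          # inverse-probability weights
beta_hat = solve_weighted_score(X, ys, w)
```

Without the weights, the outcome-dependent sampling would bias the intercept; weighting each unit by the inverse of its selection probability restores consistency for the superpopulation parameters.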


2012 ◽  
Vol 4 (1) ◽  
Author(s):  
Aaron Smith

This article develops a new Markov breaks (MB) model for forecasting and making inference in linear regression models with breaks that are stochastic in both timing and magnitude. The MB model permits an arbitrarily large number of abrupt breaks in the regression coefficients and error variance, but it maintains a low-dimensional state space, and therefore it is computationally straightforward. In particular, the likelihood function can be computed analytically using a single iterative pass through the data and thereby avoids Monte Carlo integration. The model generates forecasts and conditional coefficient predictions using a probability weighted average over regressions that include progressively more historical data. I employ the MB model to study the predictive ability of the yield curve for quarterly GDP growth. I show evidence of breaks in the predictive relationship, and the MB model outperforms competing breaks models in an out-of-sample forecasting experiment.
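The "probability weighted average over regressions that include progressively more historical data" can be sketched in a few lines. The version below weights each candidate post-break regression by a concentrated Gaussian likelihood; it is a stylized illustration of the idea, not the exact MB recursion:

```python
import numpy as np

def pw_forecast(x, y, x_next, min_obs=10):
    """Forecast y_{t+1} as a probability-weighted average of OLS forecasts,
    each candidate regression using only the data since a putative most
    recent break (a stylized sketch; weights here are concentrated Gaussian
    log-likelihoods of each post-break sample)."""
    t = len(y)
    forecasts, logws = [], []
    for j in range(0, t - min_obs + 1):        # j = putative last break date
        Xj = np.column_stack([np.ones(t - j), x[j:]])
        beta, *_ = np.linalg.lstsq(Xj, y[j:], rcond=None)
        resid = y[j:] - Xj @ beta
        s2 = max(resid @ resid / len(resid), 1e-12)
        logws.append(-0.5 * len(resid) * (np.log(2.0 * np.pi * s2) + 1.0))
        forecasts.append(beta[0] + beta[1] * x_next)
    logws = np.asarray(logws)
    w = np.exp(logws - logws.max())            # stabilized softmax weights
    w /= w.sum()
    return float(w @ np.asarray(forecasts))

# Illustrative stable relationship y = 2x + noise; with no break, the
# weighted forecast should agree with the full-sample regression.
rng = np.random.default_rng(9)
x = rng.normal(size=80)
y = 2.0 * x + 0.1 * rng.normal(size=80)
f = pw_forecast(x, y, x_next=1.0)
```

After a genuine break, the short post-break windows fit better and draw the weight away from the full sample, which is what lets this style of model adapt without Monte Carlo integration.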

