Endogeneity in Probit Response Models

2010 ◽  
Vol 18 (2) ◽  
pp. 138-150 ◽  
Author(s):  
David A. Freedman ◽  
Jasjeet S. Sekhon

We look at conventional methods for removing endogeneity bias in regression models, including the linear model and the probit model. It is known that the usual Heckman two-step procedure should not be used in the probit model: from a theoretical perspective, it is unsatisfactory, and likelihood methods are superior. However, serious numerical problems occur when standard software packages try to maximize the biprobit likelihood function, even if the number of covariates is small. We draw conclusions for statistical practice. Finally, we prove the conditions under which parameters in the model are identifiable. The conditions for identification are delicate; we believe these results are new.
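The numerical fragility the abstract describes is easy to reproduce, since the biprobit log-likelihood itself is short. Below is a minimal Python sketch (simulated data, illustrative coefficient values, and scipy's bivariate normal CDF; this is not the authors' implementation):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Illustrative data-generating process: an endogenous binary regressor c
# driven by z, and a probit outcome y, with correlated normal errors.
rng = np.random.default_rng(42)
n = 300
x = rng.normal(size=n)
z = rng.normal(size=n)                          # instrument-like covariate
rho = 0.5
u, v = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n).T
c = (0.8 * z + u > 0).astype(float)             # endogenous binary regressor
y = (0.5 * x + 1.0 * c + v > 0).astype(float)   # probit outcome

def biprobit_negloglik(theta):
    """theta = (a_z, b_x, b_c, r): negative joint log-likelihood of (c, y)
    under a bivariate probit with error correlation r."""
    a_z, b_x, b_c, r = theta
    q1, q2 = 2 * c - 1, 2 * y - 1               # sign flips for 0/1 outcomes
    m1, m2 = q1 * (a_z * z), q2 * (b_x * x + b_c * c)
    s = q1 * q2                                 # sign of effective correlation
    ll = 0.0
    for sv in (1.0, -1.0):                      # group by correlation sign
        idx = s == sv
        if idx.any():
            p = multivariate_normal.cdf(
                np.column_stack([m1[idx], m2[idx]]),
                mean=[0, 0], cov=[[1, sv * r], [sv * r, 1]])
            ll += np.log(np.clip(p, 1e-300, 1)).sum()
    return -ll

nll_true = biprobit_negloglik([0.8, 0.5, 1.0, 0.5])   # at the true values
nll_null = biprobit_negloglik([0.0, 0.0, 0.0, 0.0])   # at an all-zero point
```

Handing this function to a generic optimizer such as scipy.optimize.minimize is exactly the step where, as the abstract notes, standard packages can run into trouble, particularly when the correlation is pushed toward ±1.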

Author(s):  
Eduardo de Freitas Costa ◽  
Silvana Schneider ◽  
Giulia Bagatini Carlotto ◽  
Tainá Cabalheiro ◽  
Mauro Ribeiro de Oliveira Júnior

The dynamics of the wild boar population has become a pressing issue not only for ecological purposes, but also for agricultural and livestock production. Data on wild boar dispersal distance can have a complex structure, including an excess of zeros and right-censored observations, and are therefore challenging to model. We propose two zero-inflated right-censored regression models, assuming Weibull and gamma distributions. First, we present the construction of the likelihood function; we then apply both models to simulated datasets, demonstrating that both regression models behave well. The simulation results point to the consistency and asymptotic unbiasedness of the developed methods. Afterwards, we fitted both models to a simulated dataset of wild boar dispersal including an excess of zeros, right-censored observations, and two covariates: age and sex. The models were useful for drawing inferences about wild boar dispersal, correctly describing data that mimic a situation where males disperse more than females and age has a positive effect on dispersal. These results help overcome some limitations on inference in zero-inflated right-censored datasets, especially concerning wild boar populations. Users are provided with an R function to run the proposed models.
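As a rough illustration of how such a likelihood is assembled, here is a hedged Python sketch of the zero-inflated right-censored Weibull case (the paper supplies an R function; the data, parameter values, and censoring rule below are invented for the sketch):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

def zi_weibull_negloglik(params, y, delta):
    """Zero-inflated right-censored Weibull negative log-likelihood.
    params = (logit of zero probability, log shape, log scale);
    delta = 1 for an observed event, 0 for a right-censored one."""
    p = 1.0 / (1.0 + np.exp(-params[0]))
    shape, scale = np.exp(params[1]), np.exp(params[2])
    zero = y == 0
    pos = ~zero
    # zeros contribute log p; positives contribute log(1-p) plus either
    # the Weibull density (observed) or the survival function (censored)
    ll = np.log(p) * zero.sum() + np.log1p(-p) * pos.sum()
    ll += np.sum(delta[pos] * weibull_min.logpdf(y[pos], shape, scale=scale)
                 + (1 - delta[pos]) * weibull_min.logsf(y[pos], shape, scale=scale))
    return -ll

# Invented dispersal-style data: 30% structural zeros, Weibull(1.5, 10)
# distances, administrative censoring at distance 15.
rng = np.random.default_rng(1)
n = 2000
is_zero = rng.random(n) < 0.3
t = weibull_min.rvs(1.5, scale=10.0, size=n, random_state=rng)
y = np.where(is_zero, 0.0, np.minimum(t, 15.0))
delta = np.where(is_zero, 1.0, (t < 15.0).astype(float))

start = np.array([0.0, 0.0, np.log(y[y > 0].mean())])
res = minimize(zi_weibull_negloglik, start, args=(y, delta), method="Nelder-Mead")
p_hat = 1.0 / (1.0 + np.exp(-res.x[0]))
shape_hat = np.exp(res.x[1])
```

The gamma variant only swaps the density and survival terms; the zero-inflation and censoring bookkeeping are identical.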


2021 ◽  
Vol 16 (3) ◽  
pp. 225-227
Author(s):  
Stan Lipovetsky

The work describes a series of techniques designed to obtain regression models resistant to multicollinearity and having other features needed for meaningful results. These models include enhanced ridge regressions with several regularization parameters, regressions by data segments and by levels of the dependent variable, latent class models, unitary response models, orthogonal and equidistant regressions, minimization in the Lp-metric, and other criteria and models. All the approaches have been implemented in practice in various projects and found useful for decision making in economics, management, marketing research, and other fields requiring data modeling and analysis.
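For the ridge variant with several regularization parameters, the closed form is a small generalization of ordinary ridge; the sketch below uses invented collinear data and is not Lipovetsky's implementation:

```python
import numpy as np

def generalized_ridge(X, y, lambdas):
    """Ridge regression with one regularization parameter per predictor:
    beta = (X'X + diag(lambdas))^{-1} X'y. Setting all lambdas to zero
    reduces to ordinary least squares."""
    L = np.diag(np.asarray(lambdas, dtype=float))
    return np.linalg.solve(X.T @ X + L, X.T @ y)

# Two deliberately collinear predictors (illustrative data).
rng = np.random.default_rng(7)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)           # nearly a copy of x1
X = np.column_stack([x1, x2])
y = x1 + 0.5 * x2 + 0.5 * rng.normal(size=n)

beta_ols = generalized_ridge(X, y, [0.0, 0.0])    # reduces to OLS
beta_eq = generalized_ridge(X, y, [5.0, 5.0])     # ordinary ridge
beta_uneq = generalized_ridge(X, y, [1.0, 50.0])  # per-predictor penalties
```

With equal penalties the solution is the familiar ridge shrinkage; unequal penalties let the analyst damp individual coefficients that multicollinearity has inflated.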


2014 ◽  
Vol 70 (a1) ◽  
pp. C1269-C1269
Author(s):  
Ethan Merritt

Tools for validating structural models of proteins are relatively mature and widely implemented. New protein crystallographers are introduced early on to the importance of monitoring conformance with expected φ/ψ values, favored rotamers, and local stereochemistry. The protein model is validated by the PDB at the time of deposition using criteria that are also available in the standard software packages used to refine the model being deposited. By contrast, crystallographers are typically much less familiar with procedures to validate key non-protein components of the model – cofactors, substrates, inhibitors, etc. It has been estimated that as many as a third of all ligands in the PDB exhibit preventable errors of some sort, ranging from minor deviations in expected bond angles to wholly implausible placement in the binding pocket. Following recommendations from the wwPDB Validation Task Force, the PDB recently began validating ligand geometry as an integral part of deposition processing. This means that many crystallographers will soon receive for the first time a "grade" on the quality of ligands in the structure they have just deposited. Some will be surprised, as I was following my first PDB deposition of 2014, at how easily bad ligand geometry can slip through the cracks in supposedly robust structure refinement protocols that their lab has used for many years. I will illustrate use of current tools for generating ligand restraints to guide model refinement. One is the jligand+coot+cprodrg pipeline integrated into the CCP4 suite. Another is the Grade web server provided as a community resource by Global Phasing Ltd. Furthermore, I will show examples from recent in-house refinements of how things can still go wrong even if you do use these tools, and how we recovered. The new PDB deposition checks may expose errors in your ligand descriptions after the fact. This presentation may help you avoid introducing those errors in the first place.


2019 ◽  
Vol 4 (1) ◽  
Author(s):  
Monica Chagoyen ◽  
Juan A G Ranea ◽  
Florencio Pazos

Due to the large interdependence between the molecular components of living systems, many phenomena, including those related to pathologies, cannot be explained in terms of a single gene or a small number of genes. Molecular networks, representing different types of relationships between molecular entities, embody these large sets of interdependences in a framework that allows them to be mined for information from a systemic point of view. These networks, often generated from high-throughput omics datasets, are used to study the complex phenomena of human pathologies systemically. Complementing the reductionist approach of molecular biology, based on the detailed study of a small number of genes, systemic approaches to human diseases consider that these are better reflected in large and intricate networks of relationships between genes. These networks, and not single genes, provide both better markers for diagnosing diseases and targets for treating them. Network approaches are being used to gain insight into the molecular basis of complex diseases and to interpret the large datasets associated with them, such as genomic variants. The network formalism is also suitable for integrating large, heterogeneous and multilevel datasets associated with diseases, from the molecular level to organismal and epidemiological scales. Many of these approaches are available to nonexpert users through standard software packages.


2020 ◽  
pp. 096228022096495
Author(s):  
Julio M Singer ◽  
Francisco MM Rocha ◽  
Antonio Carlos Pedroso-de-Lima ◽  
Giovani L Silva ◽  
Giuliana C Coatti ◽  
...  

We consider random changepoint segmented regression models to analyse data from a study conducted to verify whether treatment with stem cells may delay the onset of a symptom of amyotrophic lateral sclerosis in genetically modified mice. The proposed models capture the biological aspects of the data, accommodating a smooth transition between the periods with and without symptoms. An additional changepoint is considered to avoid negative predicted responses. Given the nonlinear nature of the model, we propose an algorithm to estimate the fixed parameters and to predict the random effects by fitting linear mixed models iteratively via standard software. We compare the variances obtained in the final step with bootstrapped and robust ones.
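A fixed-effects simplification of the segmented-regression idea can be profiled over a grid of candidate changepoints. The sketch below (invented data; no random effects, smooth transition, or second changepoint) illustrates the basic mechanics, not the authors' iterative mixed-model algorithm:

```python
import numpy as np

def fit_segmented(t, y, grid):
    """Profile out the changepoint tau of y = b0 + b1*t + b2*(t - tau)_+ + e:
    refit an ordinary linear model at each candidate tau, keep the best fit."""
    best_sse, best_tau, best_beta = np.inf, None, None
    for tau in grid:
        X = np.column_stack([np.ones_like(t), t, np.clip(t - tau, 0.0, None)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = float(np.sum((y - X @ beta) ** 2))
        if sse < best_sse:
            best_sse, best_tau, best_beta = sse, tau, beta
    return best_tau, best_beta

# Illustrative data: a mild trend until an onset at t = 12, then a steeper
# decline (all numbers invented for the sketch).
rng = np.random.default_rng(3)
t = np.linspace(0.0, 20.0, 300)
y = (1.0 + 0.2 * t - 1.5 * np.clip(t - 12.0, 0.0, None)
     + 0.5 * rng.normal(size=t.size))
tau_hat, beta_hat = fit_segmented(t, y, np.arange(2.0, 18.25, 0.25))
```

In the authors' setting the changepoint and the regression coefficients are subject-specific random effects, which is why they iterate linear mixed-model fits rather than a single grid search; the hinge term `(t - tau)_+` is the shared ingredient.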


2010 ◽  
Vol 64 (2) ◽  
pp. 339-356 ◽  
Author(s):  
Caroline A. Hartzell ◽  
Matthew Hoddie ◽  
Molly Bauer

Previous studies that have explored the effects of economic liberalization on civil war have employed aggregate measures of openness and have failed to account for potential endogeneity bias. In this research note, we suggest two improvements to the study of the relationship between liberalization and civil war. First, emphasizing that it is processes that systematically create new economic winners and losers, rather than particular levels of economic openness, that have the potential to generate conflict, we consider the effects of one oft-used means of liberalizing economies: the adoption by countries of International Monetary Fund (IMF) structural adjustment programs. Second, we use a bivariate probit model to address issues of endogeneity bias. Analyzing all data available for the period between 1970 and 1999, we identify an association between the adoption of IMF programs and the onset of civil war. This finding suggests that IMF programs to promote economic openness may unintentionally be creating an environment conducive to domestic conflict.


2007 ◽  
Vol 2007 ◽  
pp. 1-12
Author(s):  
Alastair Scott ◽  
Chris Wild

We look at fitting regression models using data from stratified cluster samples when the strata may depend in some way on the observed responses within clusters. One important subclass of examples is that of family studies in genetic epidemiology, where the probability of selecting a family into the study depends on the incidence of disease within the family. We develop the survey-weighted estimating equation approach for this problem, with particular emphasis on the estimation of superpopulation parameters. Full maximum likelihood for this class of problems involves modelling the population distribution of the covariates, which is simply not feasible when there is a large number of potential covariates. We discuss efficient semiparametric maximum likelihood methods in which the covariate distribution is left completely unspecified, and we further discuss the relative efficiencies of these two approaches.
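In the simplest unclustered analogue, survey-weighted estimating equations amount to a weighted logistic score solved by Newton iterations. The sketch below uses invented outcome-dependent sampling probabilities and is a generic illustration of the weighting idea, not the authors' clustered method:

```python
import numpy as np

def weighted_logistic_score(beta, X, y, w):
    """Survey-weighted score: sum_i w_i * (y_i - p_i(beta)) * x_i."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return X.T @ (w * (y - p))

def solve_weighted_score(X, y, w, iters=25):
    """Newton iterations on the weighted estimating equations."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        H = X.T @ (X * (w * p * (1.0 - p))[:, None])   # weighted information
        beta = beta + np.linalg.solve(H, X.T @ (w * (y - p)))
    return beta

# Population with outcome-dependent sampling (an ascertainment-style toy):
# cases kept with probability 0.9, controls with 0.3 (values invented).
rng = np.random.default_rng(5)
N = 5000
x = rng.normal(size=N)
y = (rng.random(N) < 1.0 / (1.0 + np.exp(-x))).astype(float)  # true slope 1
pi = np.where(y == 1, 0.9, 0.3)
keep = rng.random(N) < pi
X = np.column_stack([np.ones(keep.sum()), x[keep]])
ys, w = y[keep], 1.0 / pi[keep]          # inverse-probability weights
beta_hat = solve_weighted_score(X, ys, w)
```

Without the weights, the outcome-dependent sampling would bias the intercept; weighting each unit by the inverse of its selection probability restores consistency for the superpopulation parameters.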


2012 ◽  
Vol 4 (1) ◽  
Author(s):  
Aaron Smith

This article develops a new Markov breaks (MB) model for forecasting and making inference in linear regression models with breaks that are stochastic in both timing and magnitude. The MB model permits an arbitrarily large number of abrupt breaks in the regression coefficients and error variance, but it maintains a low-dimensional state space, and therefore it is computationally straightforward. In particular, the likelihood function can be computed analytically using a single iterative pass through the data and thereby avoids Monte Carlo integration. The model generates forecasts and conditional coefficient predictions using a probability weighted average over regressions that include progressively more historical data. I employ the MB model to study the predictive ability of the yield curve for quarterly GDP growth. I show evidence of breaks in the predictive relationship, and the MB model outperforms competing breaks models in an out-of-sample forecasting experiment.
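The "probability weighted average over regressions that include progressively more historical data" can be sketched in a few lines. The version below weights each candidate post-break regression by a concentrated Gaussian likelihood; it is a stylized illustration of the idea, not the exact MB recursion:

```python
import numpy as np

def pw_forecast(x, y, x_next, min_obs=10):
    """Forecast y_{t+1} as a probability-weighted average of OLS forecasts,
    each candidate regression using only the data since a putative most
    recent break (a stylized sketch; weights here are concentrated Gaussian
    log-likelihoods of each post-break sample)."""
    t = len(y)
    forecasts, logws = [], []
    for j in range(0, t - min_obs + 1):        # j = putative last break date
        Xj = np.column_stack([np.ones(t - j), x[j:]])
        beta, *_ = np.linalg.lstsq(Xj, y[j:], rcond=None)
        resid = y[j:] - Xj @ beta
        s2 = max(resid @ resid / len(resid), 1e-12)
        logws.append(-0.5 * len(resid) * (np.log(2.0 * np.pi * s2) + 1.0))
        forecasts.append(beta[0] + beta[1] * x_next)
    logws = np.asarray(logws)
    w = np.exp(logws - logws.max())            # stabilized softmax weights
    w /= w.sum()
    return float(w @ np.asarray(forecasts))

# Illustrative stable relationship y = 2x + noise; with no break, the
# weighted forecast should agree with the full-sample regression.
rng = np.random.default_rng(9)
x = rng.normal(size=80)
y = 2.0 * x + 0.1 * rng.normal(size=80)
f = pw_forecast(x, y, x_next=1.0)
```

After a genuine break, the short post-break windows fit better and draw the weight away from the full sample, which is what lets this style of model adapt without Monte Carlo integration.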

