Regression Discontinuity Designs Using Covariates

2019 ◽  
Vol 101 (3) ◽  
pp. 442-451 ◽  
Author(s):  
Sebastian Calonico ◽  
Matias D. Cattaneo ◽  
Max H. Farrell ◽  
Rocío Titiunik

We study regression discontinuity designs when covariates are included in the estimation. We examine local polynomial estimators that include discrete or continuous covariates in an additive separable way, but without imposing any parametric restrictions on the underlying population regression functions. We recommend a covariate-adjustment approach that retains consistency under intuitive conditions and characterize the potential for estimation and inference improvements. We also present new covariate-adjusted mean-squared error expansions and robust bias-corrected inference procedures, with heteroskedasticity-consistent and cluster-robust standard errors. We provide an empirical illustration and an extensive simulation study. All methods are implemented in R and Stata software packages.
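The additive-separable covariate adjustment described above can be sketched as a single local linear regression: within the bandwidth, regress the outcome on an intercept, a treatment indicator, separate slopes on each side of the cutoff, and the covariates with a common coefficient. This is a minimal illustration with a uniform kernel and a user-supplied bandwidth, not the authors' rdrobust implementation, which additionally handles bandwidth selection and robust bias-corrected inference:

```python
import numpy as np

def rd_cov_adjusted(y, x, z, cutoff=0.0, h=1.0):
    """Local linear RD estimate with additive covariate adjustment.
    Illustrative sketch only: uniform kernel, fixed bandwidth h,
    covariates z entered linearly with a single common coefficient."""
    x = np.asarray(x, float) - cutoff          # center the running variable
    y = np.asarray(y, float)
    z = np.asarray(z, float).reshape(len(x), -1)
    keep = np.abs(x) <= h                      # restrict to the bandwidth
    y, x, z = y[keep], x[keep], z[keep]
    t = (x >= 0).astype(float)                 # treated-side indicator
    # Design: intercept, jump, slope on each side, covariates
    X = np.column_stack([np.ones_like(x), t, x * (1 - t), x * t, z])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(beta[1])                      # coefficient on the jump
```

All polynomial terms vanish at the cutoff, so the coefficient on the treatment indicator is the estimated discontinuity.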

2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Otávio Bartalotti

In regression discontinuity designs (RD), for a given bandwidth, researchers can estimate standard errors based on different variance formulas obtained under different asymptotic frameworks. In the traditional approach the bandwidth shrinks to zero as sample size increases; alternatively, the bandwidth could be treated as fixed. The main theoretical results for RD rely on the former, while most applications in the literature treat the estimates as parametric, implementing the usual heteroskedasticity-robust standard errors. This paper develops the “fixed-bandwidth” alternative asymptotic theory for RD designs, which sheds light on the connection between both approaches. I provide alternative formulas (approximations) for the bias and variance of common RD estimators, and conditions under which both approximations are equivalent. Simulations document the improvements in test coverage that fixed-bandwidth approximations achieve relative to traditional approximations, especially when there is local heteroskedasticity. Feasible estimators of fixed-bandwidth standard errors are easy to implement and are akin to treating RD estimators as locally parametric, validating the common empirical practice of using heteroskedasticity-robust standard errors in RD settings. Bias mitigation approaches are discussed, and a novel bootstrap higher-order bias correction procedure based on the fixed-bandwidth asymptotics is suggested.
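The "locally parametric" practice the paper validates amounts to computing an Eicker-Huber-White sandwich variance for the local regression. A minimal sketch of that generic calculation (standard OLS machinery, not the paper's fixed-bandwidth formulas):

```python
import numpy as np

def hc0_se(X, y):
    """OLS coefficients with HC0 (Eicker-Huber-White)
    heteroskedasticity-robust standard errors -- the kind of variance
    estimate that fixed-bandwidth asymptotics justify when the local
    polynomial fit is treated as locally parametric. Sketch only; no
    small-sample corrections."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta                         # residuals
    meat = X.T @ (X * e[:, None] ** 2)       # sum_i e_i^2 x_i x_i'
    V = XtX_inv @ meat @ XtX_inv             # sandwich variance
    return beta, np.sqrt(np.diag(V))
```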


2020 ◽  
Author(s):  
Zhuan Pei ◽  
David Lee ◽  
David Card ◽  
Andrea Weber

2018 ◽  
Vol 108 (8) ◽  
pp. 2277-2304 ◽  
Author(s):  
Michal Kolesár ◽  
Christoph Rothe

We consider inference in regression discontinuity designs when the running variable only takes a moderate number of distinct values. In particular, we study the common practice of using confidence intervals (CIs) based on standard errors that are clustered by the running variable as a means to make inference robust to model misspecification (Lee and Card 2008). We derive theoretical results and present simulation and empirical evidence showing that these CIs do not guard against model misspecification, and that they have poor coverage properties. We therefore recommend against using these CIs in practice. We instead propose two alternative CIs with guaranteed coverage properties under easily interpretable restrictions on the conditional expectation function. (JEL C13, C51, J13, J31, J64, J65)
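For reference, the practice the paper argues against computes a Liang-Zeger cluster-robust variance with clusters defined by the distinct values of the running variable. A minimal sketch of that calculation (generic OLS machinery, not the authors' proposed honest CIs):

```python
import numpy as np

def cluster_se(X, y, groups):
    """OLS with cluster-robust (Liang-Zeger) standard errors, clustering
    on `groups` -- e.g. the distinct values of a discrete running
    variable, the practice the paper shows does not protect against
    misspecification. Sketch only; no degrees-of-freedom corrections."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        m = groups == g
        s = X[m].T @ e[m]                    # cluster score sum
        meat += np.outer(s, s)
    V = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(V))
```

With singleton clusters this reduces to the HC0 heteroskedasticity-robust variance.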


2016 ◽  
Vol 5 (1) ◽  
Author(s):  
Patrick Button

Parametric (polynomial) models are popular in research employing regression discontinuity designs and are required when data are discrete. However, researchers often choose a parametric model based on data inspection or pretesting. These approaches lead to standard errors and confidence intervals that are too small because they do not incorporate model uncertainty. I propose using Frequentist model averaging to incorporate model uncertainty into parametric models. My Monte Carlo experiments show that Frequentist model averaging leads to mean square error and coverage probability improvements over pretesting. An application to [Lee, D. S. 2008. “Randomized Experiments From Non-Random Selection in US House Elections.”
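The idea can be sketched by averaging the RD jump estimate across polynomial orders rather than picking one by pretesting. The smoothed-AIC weights below are one illustrative weighting scheme, not necessarily the FMA weights studied in the paper:

```python
import numpy as np

def fma_rd_estimate(y, x, max_order=4):
    """Frequentist-model-averaged RD jump estimate: fit global polynomial
    RD models of order 1..max_order (separate polynomials on each side of
    the cutoff at 0) and average the jump coefficients with smoothed-AIC
    weights. Illustrative sketch only."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    n = len(y)
    t = (x >= 0).astype(float)
    estimates, aics = [], []
    for p in range(1, max_order + 1):
        cols = [np.ones(n), t]
        for k in range(1, p + 1):
            cols += [x ** k * (1 - t), x ** k * t]   # side-specific terms
        X = np.column_stack(cols)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = ((y - X @ beta) ** 2).sum()
        aics.append(n * np.log(rss / n) + 2 * X.shape[1])
        estimates.append(beta[1])                    # jump at the cutoff
    a = np.array(aics)
    w = np.exp(-(a - a.min()) / 2)                   # smoothed-AIC weights
    w /= w.sum()
    return float(w @ np.array(estimates))
```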


2011 ◽  
Vol 2011 ◽  
pp. 1-11 ◽  
Author(s):  
Liyun Su

This study characterizes and predicts stock index series in the Shenzhen stock market using multivariate local polynomial regression. Based on the nonlinearity and chaos of the stock index time series, multivariate local polynomial prediction methods and a univariate local polynomial prediction method, all of which use phase space reconstruction according to Takens' theorem, are considered. To fit the stock index series, the single series is transformed into a bivariate series. To evaluate the results, the multivariate predictor for bivariate time series based on the multivariate local polynomial model is compared with the univariate predictor on the same Shenzhen stock index data. The numerical results for the Shenzhen component index show that the prediction mean squared error of the multivariate predictor is much smaller than that of the univariate predictor and improves on the three existing methods. Even when only the last half of the training data is used, the multivariate predictor's mean squared error remains smaller than the univariate predictor's. The multivariate local polynomial prediction model for multivariate time series is a useful tool for stock market price prediction.
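The phase space reconstruction step both predictors rely on is a time-delay embedding in the sense of Takens' theorem: each scalar observation is replaced by a vector of lagged values. A minimal sketch:

```python
import numpy as np

def delay_embed(series, dim, tau):
    """Time-delay embedding (Takens' theorem): map a scalar series into
    points (x_t, x_{t+tau}, ..., x_{t+(dim-1)*tau}) in a dim-dimensional
    reconstructed phase space. `dim` is the embedding dimension and
    `tau` the delay, both chosen by the analyst."""
    x = np.asarray(series, float)
    n = len(x) - (dim - 1) * tau              # number of embedded points
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])
```

Local polynomial prediction then fits a low-order polynomial to the nearest neighbors of the current embedded point and extrapolates one step ahead.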


Author(s):  
Matias D. Cattaneo ◽  
Rocío Titiunik ◽  
Gonzalo Vazquez-Bare

In this article, we introduce two commands, rdpow and rdsampsi, that conduct power calculations and survey sample selection when using local polynomial estimation and inference methods in regression-discontinuity designs. rdpow conducts power calculations using modern robust bias-corrected local polynomial inference procedures and allows for new hypothetical sample sizes and bandwidth selections, among other features. rdsampsi uses power calculations to compute the minimum sample size required to achieve a desired level of power, given estimated or user-supplied bandwidths, biases, and variances. Together, these commands are useful when devising new experiments or surveys in regression-discontinuity designs, which will later be analyzed using modern local polynomial techniques for estimation, inference, and falsification. Because our commands use the community-contributed Stata (and R) package rdrobust for the underlying bandwidth, bias, and variance estimation, all the options currently available in rdrobust can also be used for power calculations and sample-size selection, including preintervention covariate adjustment, clustered sampling, and many bandwidth selectors. Finally, we also provide companion R functions with the same syntax and capabilities.
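The core calculation behind such commands is the power of a shifted two-sided z-test. The sketch below is deliberately simplified: it ignores bias correction and bandwidth re-selection, and it assumes (unrealistically for nonparametric RD, where the effective rate is slower) that the standard error shrinks at the parametric sqrt(n) rate:

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()

def rd_power(tau, se, alpha=0.05):
    """Power of a two-sided z-test of a zero RD effect against a true
    effect tau, given standard error se. Sketch of the calculation
    behind rdpow-style commands; no bias correction."""
    z = _N.inv_cdf(1 - alpha / 2)
    shift = tau / se
    return 1 - _N.cdf(z - shift) + _N.cdf(-z - shift)

def min_sample_size(tau, se0, n0, power=0.8, alpha=0.05):
    """Smallest n achieving the target power, assuming (simplistically)
    the standard error scales like se0 * sqrt(n0 / n) from a pilot
    estimate (se0, n0). Sketch of an rdsampsi-style calculation."""
    n = n0
    while rd_power(tau, se0 * sqrt(n0 / n), alpha) < power:
        n += 1
    return n
```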


2018 ◽  
Vol 28 (5) ◽  
pp. 1311-1327 ◽  
Author(s):  
Faisal M Zahid ◽  
Christian Heumann

Missing data is a common issue that can cause problems in estimation and inference in biomedical, epidemiological and social research. Multiple imputation is an increasingly popular approach for handling missing data. In case of a large number of covariates with missing data, existing multiple imputation software packages may not work properly and often produce errors. We propose a multiple imputation algorithm called mispr based on sequential penalized regression models. Each variable with missing values is assumed to have a different distributional form and is imputed with its own imputation model using the ridge penalty. In the case of a large number of predictors with respect to the sample size, the use of a quadratic penalty guarantees unique estimates for the parameters and leads to better predictions than the usual Maximum Likelihood Estimation (MLE), with a good compromise between bias and variance. As a result, the proposed algorithm performs well and provides imputed values that are better even for a large number of covariates with small samples. The results are compared with the existing software packages mice, VIM and Amelia in simulation studies. The missing at random mechanism was the main assumption in the simulation study. The imputation performance of the proposed algorithm is evaluated with mean squared imputation error and mean absolute imputation error. The mean squared error, parameter estimates with their standard errors and confidence intervals are also computed to compare the performance in the regression context. The proposed algorithm is observed to be a good competitor to the existing algorithms, with smaller mean squared imputation error, mean absolute imputation error and mean squared error. The algorithm’s performance becomes considerably better than that of the existing algorithms with increasing number of covariates, especially when the number of predictors is close to or even greater than the sample size.
Two real-life datasets are also used to examine the performance of the proposed algorithm using simulations.
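The sequential ridge idea can be sketched in a few lines: initialize missing entries with column means, then repeatedly re-impute each incomplete column from the others via a ridge regression. This is a single-imputation sketch for continuous columns only; mispr itself is a multiple-imputation procedure with distribution-specific models per variable:

```python
import numpy as np

def ridge_impute(X, lam=1.0, iters=5):
    """Single-imputation sketch of sequential penalized regression:
    mean-initialize missing entries, then cycle over incomplete columns,
    fitting a ridge regression (closed form, penalty lam) on the observed
    rows and predicting the missing ones. Continuous columns only."""
    X = np.array(X, float)
    miss = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[miss] = np.take(col_means, np.where(miss)[1])   # initial fill
    for _ in range(iters):
        for j in np.where(miss.any(axis=0))[0]:
            obs = ~miss[:, j]
            Z = np.delete(X, j, axis=1)               # all other columns
            Zc = np.column_stack([np.ones(len(Z)), Z])
            # Ridge solution (the intercept is penalized too -- a
            # simplification acceptable for a sketch)
            A = Zc[obs].T @ Zc[obs] + lam * np.eye(Zc.shape[1])
            beta = np.linalg.solve(A, Zc[obs].T @ X[obs, j])
            X[miss[:, j], j] = Zc[miss[:, j]] @ beta  # re-impute
    return X
```

The quadratic penalty keeps the linear system in `np.linalg.solve` invertible even when there are more predictors than observed rows, which is the regime the abstract highlights.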


2020 ◽  
Vol 11 (1) ◽  
pp. 1-39 ◽  
Author(s):  
Timothy B. Armstrong ◽  
Michal Kolesár

We consider the problem of constructing honest confidence intervals (CIs) for a scalar parameter of interest, such as the regression discontinuity parameter, in nonparametric regression based on kernel or local polynomial estimators. To ensure that our CIs are honest, we use critical values that take into account the possible bias of the estimator upon which the CIs are based. We show that this approach leads to CIs that are more efficient than conventional CIs that achieve coverage by undersmoothing or subtracting an estimate of the bias. We give sharp efficiency bounds for using different kernels, and derive the optimal bandwidth for constructing honest CIs. We show that using the bandwidth that minimizes the maximum mean squared error results in CIs that are nearly efficient and that in this case, the critical value depends only on the rate of convergence. For the common case in which the rate of convergence is n^(-2/5), the appropriate critical value for 95% CIs is 2.18, rather than the usual 1.96 critical value. We illustrate our results in a Monte Carlo analysis and an empirical application.
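The 2.18 figure can be reproduced directly: at the MSE-optimal bandwidth the squared worst-case bias equals one quarter of the variance, so the bias-to-standard-error ratio is t = 1/2, and the honest critical value is the level-0.95 quantile of |N(t, 1)|. A sketch that solves for it by bisection (not the authors' software):

```python
from statistics import NormalDist

_N = NormalDist()

def honest_cv(t, level=0.95, tol=1e-8):
    """Honest critical value: smallest c with P(|Z + t| <= c) >= level,
    Z standard normal, t an upper bound on worst-case bias / std. error.
    Coverage P(|Z + t| <= c) = Phi(c - t) - Phi(-c - t), increasing in c,
    so bisection applies."""
    lo, hi = 0.0, 10.0
    while hi - lo > tol:
        c = (lo + hi) / 2
        if _N.cdf(c - t) - _N.cdf(-c - t) < level:
            lo = c
        else:
            hi = c
    return (lo + hi) / 2
```

With t = 0 this recovers the usual 1.96; with t = 1/2 it gives the 2.18 reported in the abstract.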

