Regression Discontinuity Designs Using Covariates

2019 ◽  
Vol 101 (3) ◽  
pp. 442-451 ◽  
Author(s):  
Sebastian Calonico ◽  
Matias D. Cattaneo ◽  
Max H. Farrell ◽  
Rocío Titiunik

We study regression discontinuity designs when covariates are included in the estimation. We examine local polynomial estimators that include discrete or continuous covariates in an additive separable way, but without imposing any parametric restrictions on the underlying population regression functions. We recommend a covariate-adjustment approach that retains consistency under intuitive conditions and characterize the potential for estimation and inference improvements. We also present new covariate-adjusted mean-squared error expansions and robust bias-corrected inference procedures, with heteroskedasticity-consistent and cluster-robust standard errors. We provide an empirical illustration and an extensive simulation study. All methods are implemented in R and Stata software packages.
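The additive-separable covariate adjustment described above can be sketched as a single local linear regression: within the bandwidth, regress the outcome on an intercept, a treatment indicator, separate slopes on each side of the cutoff, and the covariates with a common coefficient. This is a minimal illustration with a uniform kernel and a user-supplied bandwidth, not the authors' rdrobust implementation, which additionally handles bandwidth selection and robust bias-corrected inference:

```python
import numpy as np

def rd_cov_adjusted(y, x, z, cutoff=0.0, h=1.0):
    """Local linear RD estimate with additive covariate adjustment.
    Illustrative sketch only: uniform kernel, fixed bandwidth h,
    covariates z entered linearly with a single common coefficient."""
    x = np.asarray(x, float) - cutoff          # center the running variable
    y = np.asarray(y, float)
    z = np.asarray(z, float).reshape(len(x), -1)
    keep = np.abs(x) <= h                      # restrict to the bandwidth
    y, x, z = y[keep], x[keep], z[keep]
    t = (x >= 0).astype(float)                 # treated-side indicator
    # Design: intercept, jump, slope on each side, covariates
    X = np.column_stack([np.ones_like(x), t, x * (1 - t), x * t, z])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(beta[1])                      # coefficient on the jump
```

All polynomial terms vanish at the cutoff, so the coefficient on the treatment indicator is the estimated discontinuity.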

2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Otávio Bartalotti

In regression discontinuity designs (RD), for a given bandwidth, researchers can estimate standard errors based on different variance formulas obtained under different asymptotic frameworks. In the traditional approach the bandwidth shrinks to zero as sample size increases; alternatively, the bandwidth could be treated as fixed. The main theoretical results for RD rely on the former, while most applications in the literature treat the estimates as parametric, implementing the usual heteroskedasticity-robust standard errors. This paper develops the “fixed-bandwidth” alternative asymptotic theory for RD designs, which sheds light on the connection between both approaches. I provide alternative formulas (approximations) for the bias and variance of common RD estimators, and conditions under which both approximations are equivalent. Simulations document the improvements in test coverage that fixed-bandwidth approximations achieve relative to traditional approximations, especially when there is local heteroskedasticity. Feasible estimators of fixed-bandwidth standard errors are easy to implement and are akin to treating RD estimators as locally parametric, validating the common empirical practice of using heteroskedasticity-robust standard errors in RD settings. Bias mitigation approaches are discussed, and a novel bootstrap higher-order bias correction procedure based on the fixed-bandwidth asymptotics is suggested.
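The "locally parametric" practice the paper validates amounts to computing an Eicker-Huber-White sandwich variance for the local regression. A minimal sketch of that generic calculation (standard OLS machinery, not the paper's fixed-bandwidth formulas):

```python
import numpy as np

def hc0_se(X, y):
    """OLS coefficients with HC0 (Eicker-Huber-White)
    heteroskedasticity-robust standard errors -- the kind of variance
    estimate that fixed-bandwidth asymptotics justify when the local
    polynomial fit is treated as locally parametric. Sketch only; no
    small-sample corrections."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta                         # residuals
    meat = X.T @ (X * e[:, None] ** 2)       # sum_i e_i^2 x_i x_i'
    V = XtX_inv @ meat @ XtX_inv             # sandwich variance
    return beta, np.sqrt(np.diag(V))
```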


2020 ◽  
Author(s):  
Zhuan Pei ◽  
David Lee ◽  
David Card ◽  
Andrea Weber

2018 ◽  
Vol 108 (8) ◽  
pp. 2277-2304 ◽  
Author(s):  
Michal Kolesár ◽  
Christoph Rothe

We consider inference in regression discontinuity designs when the running variable only takes a moderate number of distinct values. In particular, we study the common practice of using confidence intervals (CIs) based on standard errors that are clustered by the running variable as a means to make inference robust to model misspecification (Lee and Card 2008). We derive theoretical results and present simulation and empirical evidence showing that these CIs do not guard against model misspecification, and that they have poor coverage properties. We therefore recommend against using these CIs in practice. We instead propose two alternative CIs with guaranteed coverage properties under easily interpretable restrictions on the conditional expectation function. (JEL C13, C51, J13, J31, J64, J65)
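For reference, the practice the paper argues against computes a Liang-Zeger cluster-robust variance with clusters defined by the distinct values of the running variable. A minimal sketch of that calculation (generic OLS machinery, not the authors' proposed honest CIs):

```python
import numpy as np

def cluster_se(X, y, groups):
    """OLS with cluster-robust (Liang-Zeger) standard errors, clustering
    on `groups` -- e.g. the distinct values of a discrete running
    variable, the practice the paper shows does not protect against
    misspecification. Sketch only; no degrees-of-freedom corrections."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        m = groups == g
        s = X[m].T @ e[m]                    # cluster score sum
        meat += np.outer(s, s)
    V = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(V))
```

With singleton clusters this reduces to the HC0 heteroskedasticity-robust variance.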


2016 ◽  
Vol 5 (1) ◽  
Author(s):  
Patrick Button

Parametric (polynomial) models are popular in research employing regression discontinuity designs and are required when data are discrete. However, researchers often choose a parametric model based on data inspection or pretesting. These approaches lead to standard errors and confidence intervals that are too small because they do not incorporate model uncertainty. I propose using Frequentist model averaging to incorporate model uncertainty into parametric models. My Monte Carlo experiments show that Frequentist model averaging leads to mean square error and coverage probability improvements over pretesting. An application to [Lee, D. S. 2008. “Randomized Experiments From Non-Random Selection in US House Elections.”
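The idea can be sketched by averaging the RD jump estimate across polynomial orders rather than picking one by pretesting. The smoothed-AIC weights below are one illustrative weighting scheme, not necessarily the FMA weights studied in the paper:

```python
import numpy as np

def fma_rd_estimate(y, x, max_order=4):
    """Frequentist-model-averaged RD jump estimate: fit global polynomial
    RD models of order 1..max_order (separate polynomials on each side of
    the cutoff at 0) and average the jump coefficients with smoothed-AIC
    weights. Illustrative sketch only."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    n = len(y)
    t = (x >= 0).astype(float)
    estimates, aics = [], []
    for p in range(1, max_order + 1):
        cols = [np.ones(n), t]
        for k in range(1, p + 1):
            cols += [x ** k * (1 - t), x ** k * t]   # side-specific terms
        X = np.column_stack(cols)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = ((y - X @ beta) ** 2).sum()
        aics.append(n * np.log(rss / n) + 2 * X.shape[1])
        estimates.append(beta[1])                    # jump at the cutoff
    a = np.array(aics)
    w = np.exp(-(a - a.min()) / 2)                   # smoothed-AIC weights
    w /= w.sum()
    return float(w @ np.array(estimates))
```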


2011 ◽  
Vol 2011 ◽  
pp. 1-11 ◽  
Author(s):  
Liyun Su

This study characterizes and predicts stock index series in the Shenzhen stock market using multivariate local polynomial regression. Based on the nonlinearity and chaos of the stock index time series, multivariate local polynomial prediction methods and a univariate local polynomial prediction method, all of which use phase space reconstruction according to Takens' theorem, are considered. To fit the stock index series, the single series is transformed into a bivariate series. To evaluate the results, the multivariate predictor for bivariate time series based on the multivariate local polynomial model is compared with the univariate predictor on the same Shenzhen stock index data. The numerical results for the Shenzhen component index show that the prediction mean squared error of the multivariate predictor is much smaller than that of the univariate predictor and improves on the three existing methods. Even when only the last half of the training data is used, the multivariate predictor's mean squared error remains smaller than the univariate predictor's. The multivariate local polynomial prediction model for multivariate time series is a useful tool for stock market price prediction.
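The phase space reconstruction step both predictors rely on is a time-delay embedding in the sense of Takens' theorem: each scalar observation is replaced by a vector of lagged values. A minimal sketch:

```python
import numpy as np

def delay_embed(series, dim, tau):
    """Time-delay embedding (Takens' theorem): map a scalar series into
    points (x_t, x_{t+tau}, ..., x_{t+(dim-1)*tau}) in a dim-dimensional
    reconstructed phase space. `dim` is the embedding dimension and
    `tau` the delay, both chosen by the analyst."""
    x = np.asarray(series, float)
    n = len(x) - (dim - 1) * tau              # number of embedded points
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])
```

Local polynomial prediction then fits a low-order polynomial to the nearest neighbors of the current embedded point and extrapolates one step ahead.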


Author(s):  
Matias D. Cattaneo ◽  
Rocío Titiunik ◽  
Gonzalo Vazquez-Bare

In this article, we introduce two commands, rdpow and rdsampsi, that conduct power calculations and survey sample selection when using local polynomial estimation and inference methods in regression-discontinuity designs. rdpow conducts power calculations using modern robust bias-corrected local polynomial inference procedures and allows for new hypothetical sample sizes and bandwidth selections, among other features. rdsampsi uses power calculations to compute the minimum sample size required to achieve a desired level of power, given estimated or user-supplied bandwidths, biases, and variances. Together, these commands are useful when devising new experiments or surveys in regression-discontinuity designs, which will later be analyzed using modern local polynomial techniques for estimation, inference, and falsification. Because our commands use the community-contributed Stata (and R) package rdrobust for the underlying bandwidth, bias, and variance estimation, all the options currently available in rdrobust can also be used for power calculations and sample-size selection, including preintervention covariate adjustment, clustered sampling, and many bandwidth selectors. Finally, we also provide companion R functions with the same syntax and capabilities.
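The core calculation behind such commands is the power of a shifted two-sided z-test. The sketch below is deliberately simplified: it ignores bias correction and bandwidth re-selection, and it assumes (unrealistically for nonparametric RD, where the effective rate is slower) that the standard error shrinks at the parametric sqrt(n) rate:

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()

def rd_power(tau, se, alpha=0.05):
    """Power of a two-sided z-test of a zero RD effect against a true
    effect tau, given standard error se. Sketch of the calculation
    behind rdpow-style commands; no bias correction."""
    z = _N.inv_cdf(1 - alpha / 2)
    shift = tau / se
    return 1 - _N.cdf(z - shift) + _N.cdf(-z - shift)

def min_sample_size(tau, se0, n0, power=0.8, alpha=0.05):
    """Smallest n achieving the target power, assuming (simplistically)
    the standard error scales like se0 * sqrt(n0 / n) from a pilot
    estimate (se0, n0). Sketch of an rdsampsi-style calculation."""
    n = n0
    while rd_power(tau, se0 * sqrt(n0 / n), alpha) < power:
        n += 1
    return n
```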


2018 ◽  
Vol 28 (5) ◽  
pp. 1311-1327 ◽  
Author(s):  
Faisal M Zahid ◽  
Christian Heumann

Missing data is a common issue that can cause problems in estimation and inference in biomedical, epidemiological and social research. Multiple imputation is an increasingly popular approach for handling missing data. In case of a large number of covariates with missing data, existing multiple imputation software packages may not work properly and often produce errors. We propose a multiple imputation algorithm called mispr based on sequential penalized regression models. Each variable with missing values is assumed to have a different distributional form and is imputed with its own imputation model using the ridge penalty. In the case of a large number of predictors with respect to the sample size, the use of a quadratic penalty guarantees unique estimates for the parameters and leads to better predictions than the usual Maximum Likelihood Estimation (MLE), with a good compromise between bias and variance. As a result, the proposed algorithm performs well and provides imputed values that are better even for a large number of covariates with small samples. The results are compared with the existing software packages mice, VIM and Amelia in simulation studies. The missing at random mechanism was the main assumption in the simulation study. The imputation performance of the proposed algorithm is evaluated with mean squared imputation error and mean absolute imputation error. The mean squared error, parameter estimates with their standard errors and confidence intervals are also computed to compare the performance in the regression context. The proposed algorithm is observed to be a good competitor to the existing algorithms, with smaller mean squared imputation error, mean absolute imputation error and mean squared error. The algorithm’s performance becomes considerably better than that of the existing algorithms with increasing number of covariates, especially when the number of predictors is close to or even greater than the sample size.
Two real-life datasets are also used to examine the performance of the proposed algorithm using simulations.
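The sequential ridge idea can be sketched in a few lines: initialize missing entries with column means, then repeatedly re-impute each incomplete column from the others via a ridge regression. This is a single-imputation sketch for continuous columns only; mispr itself is a multiple-imputation procedure with distribution-specific models per variable:

```python
import numpy as np

def ridge_impute(X, lam=1.0, iters=5):
    """Single-imputation sketch of sequential penalized regression:
    mean-initialize missing entries, then cycle over incomplete columns,
    fitting a ridge regression (closed form, penalty lam) on the observed
    rows and predicting the missing ones. Continuous columns only."""
    X = np.array(X, float)
    miss = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[miss] = np.take(col_means, np.where(miss)[1])   # initial fill
    for _ in range(iters):
        for j in np.where(miss.any(axis=0))[0]:
            obs = ~miss[:, j]
            Z = np.delete(X, j, axis=1)               # all other columns
            Zc = np.column_stack([np.ones(len(Z)), Z])
            # Ridge solution (the intercept is penalized too -- a
            # simplification acceptable for a sketch)
            A = Zc[obs].T @ Zc[obs] + lam * np.eye(Zc.shape[1])
            beta = np.linalg.solve(A, Zc[obs].T @ X[obs, j])
            X[miss[:, j], j] = Zc[miss[:, j]] @ beta  # re-impute
    return X
```

The quadratic penalty keeps the linear system in `np.linalg.solve` invertible even when there are more predictors than observed rows, which is the regime the abstract highlights.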


2020 ◽  
Vol 11 (1) ◽  
pp. 1-39 ◽  
Author(s):  
Timothy B. Armstrong ◽  
Michal Kolesár

We consider the problem of constructing honest confidence intervals (CIs) for a scalar parameter of interest, such as the regression discontinuity parameter, in nonparametric regression based on kernel or local polynomial estimators. To ensure that our CIs are honest, we use critical values that take into account the possible bias of the estimator upon which the CIs are based. We show that this approach leads to CIs that are more efficient than conventional CIs that achieve coverage by undersmoothing or subtracting an estimate of the bias. We give sharp efficiency bounds for using different kernels, and derive the optimal bandwidth for constructing honest CIs. We show that using the bandwidth that minimizes the maximum mean squared error results in CIs that are nearly efficient and that in this case, the critical value depends only on the rate of convergence. For the common case in which the rate of convergence is n^(-2/5), the appropriate critical value for 95% CIs is 2.18, rather than the usual 1.96 critical value. We illustrate our results in a Monte Carlo analysis and an empirical application.
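The 2.18 figure can be reproduced directly: at the MSE-optimal bandwidth the squared worst-case bias equals one quarter of the variance, so the bias-to-standard-error ratio is t = 1/2, and the honest critical value is the level-0.95 quantile of |N(t, 1)|. A sketch that solves for it by bisection (not the authors' software):

```python
from statistics import NormalDist

_N = NormalDist()

def honest_cv(t, level=0.95, tol=1e-8):
    """Honest critical value: smallest c with P(|Z + t| <= c) >= level,
    Z standard normal, t an upper bound on worst-case bias / std. error.
    Coverage P(|Z + t| <= c) = Phi(c - t) - Phi(-c - t), increasing in c,
    so bisection applies."""
    lo, hi = 0.0, 10.0
    while hi - lo > tol:
        c = (lo + hi) / 2
        if _N.cdf(c - t) - _N.cdf(-c - t) < level:
            lo = c
        else:
            hi = c
    return (lo + hi) / 2
```

With t = 0 this recovers the usual 1.96; with t = 1/2 it gives the 2.18 reported in the abstract.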

