Simple and honest confidence intervals in nonparametric regression

2020 ◽  
Vol 11 (1) ◽  
pp. 1-39 ◽  
Author(s):  
Timothy B. Armstrong ◽  
Michal Kolesár

We consider the problem of constructing honest confidence intervals (CIs) for a scalar parameter of interest, such as the regression discontinuity parameter, in nonparametric regression based on kernel or local polynomial estimators. To ensure that our CIs are honest, we use critical values that take into account the possible bias of the estimator upon which the CIs are based. We show that this approach leads to CIs that are more efficient than conventional CIs that achieve coverage by undersmoothing or subtracting an estimate of the bias. We give sharp efficiency bounds for using different kernels, and derive the optimal bandwidth for constructing honest CIs. We show that using the bandwidth that minimizes the maximum mean-squared error results in CIs that are nearly efficient, and that in this case the critical value depends only on the rate of convergence. For the common case in which the rate of convergence is n^(-2/5), the appropriate critical value for 95% CIs is 2.18, rather than the usual 1.96 critical value. We illustrate our results in a Monte Carlo analysis and an empirical application.
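A minimal numerical sketch of the headline recommendation (not the authors' implementation; the estimate and standard error below are hypothetical): at the MSE-optimal bandwidth with an n^(-2/5) rate, the honest 95% CI simply replaces the usual 1.96 critical value with 2.18.

```python
import numpy as np

def honest_ci_95(estimate, std_error, critical_value=2.18):
    """Bias-aware ("honest") 95% CI around a kernel / local polynomial estimate.

    At the bandwidth minimizing the maximum MSE, with convergence rate n^(-2/5),
    the critical value accounting for worst-case smoothing bias is 2.18 rather
    than the usual 1.96.
    """
    half_width = critical_value * std_error
    return estimate - half_width, estimate + half_width

# Hypothetical regression discontinuity estimate and standard error.
low, high = honest_ci_95(estimate=0.42, std_error=0.10)
print(f"honest 95% CI: [{low:.3f}, {high:.3f}]")
```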

Soil Research ◽  
2015 ◽  
Vol 53 (8) ◽  
pp. 907 ◽  
Author(s):  
David Clifford ◽  
Yi Guo

Given the wide variety of ways one can measure and record soil properties, it is not uncommon to have multiple overlapping predictive maps for a particular soil property. One is then faced with the challenge of choosing the best prediction at a particular point, either by selecting one of the maps or by combining them in some optimal manner. This question was recently examined in detail when Malone et al. (2014) compared four different methods for combining a digital soil mapping product with a disaggregation product based on legacy data. These authors also examined the issue of how to compute confidence intervals for the resulting map based on confidence intervals associated with the original input products. In this paper, we propose a new method to combine models, called adaptive gating, which is inspired by the use of gating functions in mixtures of experts, a machine-learning approach to forming hierarchical classifiers. We compare it here with two standard approaches: inverse-variance weights and a regression-based approach. One of the benefits of the adaptive gating approach is that it allows weights to vary based on covariate information or across geographic space. As such, it presents a method that explicitly takes full advantage of the spatial nature of the maps we are trying to blend. We also suggest a conservative method for combining confidence intervals. We show that the root mean-squared error of predictions from the adaptive gating approach is similar to that of other standard approaches under cross-validation. However, under independent validation the adaptive gating approach performs better than the alternatives, and as such it warrants further study in other areas of application and further development to reduce its computational complexity.
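A rough sketch of the inverse-variance baseline and one conservative way to combine intervals (an envelope rule, used here only as a stand-in assumption; the adaptive gating model itself, with covariate- and space-varying weights, is not reproduced). All arrays are hypothetical co-located predictions:

```python
import numpy as np

# Hypothetical co-located predictions and standard errors from two input maps.
pred_a, se_a = np.array([4.1, 5.0, 3.8]), np.array([0.6, 0.4, 0.5])
pred_b, se_b = np.array([4.6, 4.7, 3.2]), np.array([0.3, 0.5, 0.7])

# Inverse-variance weights: precision-weighted blend of the two maps.
w_a = se_a**-2 / (se_a**-2 + se_b**-2)
blend = w_a * pred_a + (1.0 - w_a) * pred_b

# A conservative combined interval (assumed envelope rule): take the widest
# bounds spanned by the two input 90% intervals at each location.
z = 1.645
lower = np.minimum(pred_a - z * se_a, pred_b - z * se_b)
upper = np.maximum(pred_a + z * se_a, pred_b + z * se_b)
print(blend, lower, upper)
```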


2021 ◽  
Vol 8 (4) ◽  
pp. 309-332
Author(s):  
Efosa Michael Ogbeide ◽  
Joseph Erunmwosa Osemwenkhae

Density estimation is an important aspect of statistics, and statistical inference often requires knowledge of the density of the observed data. A common method of density estimation is kernel density estimation (KDE), a nonparametric approach that requires a kernel function and a window size (smoothing parameter H) and that supports both density estimation and pattern recognition. This work focuses on the use of a modified intersection of confidence intervals (MICIH) approach to density estimation. Nigerian crime rate data reported to the Police, as published by the National Bureau of Statistics, were used to demonstrate the new approach, which is data-driven in the multivariate kernel density setting. The main way to improve density estimation is to reduce the mean squared error (MSE), so the errors of this approach were evaluated and some improvements were observed. The aim is adaptive kernel density estimation, achieved through a sufficiently smooth technique in which the adaptation is driven by bandwidth selection. When applied, the MICIH estimates showed improvements over existing methods: the MICIH approach has reduced mean squared error and a relatively faster rate of convergence compared with some other approaches, and it reduces the points of discontinuity in the graphical densities of the datasets, helping to correct discontinuities and display an adaptive density. Keywords: approach, bandwidth, estimate, error, kernel density
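For illustration, the classical intersection-of-confidence-intervals (ICI) rule for pointwise bandwidth selection can be sketched as below; this is a univariate Gaussian-kernel version under textbook asymptotics, not the modified multivariate MICIH procedure of the paper, and all data are simulated:

```python
import numpy as np

def kde_at(x, data, h):
    """Gaussian kernel density estimate at the point x with bandwidth h."""
    u = (x - data) / h
    return np.mean(np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)) / h

def ici_bandwidth(x, data, bandwidths, kappa=2.0):
    """Classical intersection-of-confidence-intervals rule at a single point x.

    Scan bandwidths from small to large and keep enlarging while the running
    intersection of the pointwise intervals f_hat +/- kappa*sd stays non-empty.
    """
    n = len(data)
    lo, hi = -np.inf, np.inf
    chosen = min(bandwidths)
    for h in sorted(bandwidths):
        f = kde_at(x, data, h)
        sd = np.sqrt(f / (2.0 * np.sqrt(np.pi) * n * h))  # asymptotic KDE std. dev.
        lo, hi = max(lo, f - kappa * sd), min(hi, f + kappa * sd)
        if lo > hi:          # intervals no longer intersect: stop growing h
            break
        chosen = h
    return chosen

rng = np.random.default_rng(0)
sample = rng.normal(size=500)
print(ici_bandwidth(0.0, sample, bandwidths=np.linspace(0.05, 1.0, 20)))
```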


1990 ◽  
Vol 6 (4) ◽  
pp. 466-479 ◽  
Author(s):  
Donald W.K. Andrews ◽  
Yoon-Jae Whang

This paper considers series estimators of additive interactive regression (AIR) models. AIR models are nonparametric regression models that generalize additive regression models by allowing interactions between different regressor variables. They place more restrictions on the regression function, however, than do fully nonparametric regression models. By doing so, they attempt to circumvent the curse of dimensionality that afflicts the estimation of fully nonparametric regression models. In this paper, we present a finite sample bound and asymptotic rate of convergence results for the mean average squared error of series estimators that show that AIR models do circumvent the curse of dimensionality. A lower bound on the rate of convergence of these estimators is shown to depend on the order of the AIR model and the smoothness of the regression function, but not on the dimension of the regressor vector. Series estimators with fixed and data-dependent truncation parameters are considered.
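A hedged sketch of a second-order AIR series estimator with a fixed truncation parameter: polynomial terms for each regressor form the additive part and pairwise products form the interactive part, fitted by ordinary least squares. The basis choice, degree, and data are illustrative assumptions, not the paper's specification:

```python
import numpy as np
from itertools import combinations

def air_design(X, degree=3, order=2):
    """Series design matrix for an additive interactive regression (AIR) model.

    Polynomial terms in each regressor give the additive part; for order >= 2,
    pairwise products of those terms give the (two-way) interactive part.
    `degree` plays the role of a fixed truncation parameter.
    """
    n, d = X.shape
    powers = [[X[:, j] ** p for p in range(1, degree + 1)] for j in range(d)]
    cols = [np.ones(n)]
    for j in range(d):                      # additive components
        cols.extend(powers[j])
    if order >= 2:                          # pairwise interaction components
        for j, k in combinations(range(d), 2):
            cols.extend(a * b for a in powers[j] for b in powers[k])
    return np.column_stack(cols)

# Toy data whose regression function is additive plus one two-way interaction.
rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 3))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + X[:, 0] * X[:, 1] + rng.normal(scale=0.1, size=200)
Z = air_design(X)
beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
fitted = Z @ beta
```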


2014 ◽  
Vol 2014 ◽  
pp. 1-7
Author(s):  
Christophe Chesneau

We investigate the estimation of a multiplicative separable regression function from a bidimensional nonparametric regression model with random design. We present a general estimator for this problem and study its mean integrated squared error (MISE) properties. A wavelet version of this estimator is developed. In some situations, we prove that it attains the standard unidimensional rate of convergence under the MISE over Besov balls.
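As an illustrative sketch only (a kernel analogue, not the wavelet estimator of the paper): if the design coordinates are independent, the multiplicative structure can be recovered from two marginal regressions via E[Y|X1=x1]·E[Y|X2=x2]/E[Y]. The bandwidth and data below are assumptions:

```python
import numpy as np

def nw_mean(x0, x, y, h):
    """Nadaraya-Watson estimate of E[Y | X = x0] with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

def separable_fit(x1_0, x2_0, x1, x2, y, h=0.2):
    """Estimate f(x1, x2) = f1(x1) * f2(x2) from the two marginal regressions.

    With independent design coordinates, E[Y|X1=x1] * E[Y|X2=x2] / E[Y]
    equals f1(x1) * f2(x2), so two univariate smoothers suffice.
    """
    return nw_mean(x1_0, x1, y, h) * nw_mean(x2_0, x2, y, h) / np.mean(y)

rng = np.random.default_rng(2)
x1, x2 = rng.uniform(1, 2, 500), rng.uniform(1, 2, 500)
y = (1 + x1) * np.exp(x2) + rng.normal(scale=0.1, size=500)
print(separable_fit(1.5, 1.5, x1, x2, y), (1 + 1.5) * np.exp(1.5))
```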


2016 ◽  
Vol 2 (11) ◽  
Author(s):  
William Stewart

For modern linkage studies involving many small families, Stewart et al. (2009)[1] introduced an efficient estimator of disease gene location that averages location estimates from random subsamples of the dense SNP data. Their estimator has lower mean squared error than competing estimators and yields narrower confidence intervals (CIs) as well. However, when the number of families is small and the pedigree structure is large (possibly extended), the computational feasibility and statistical properties of the estimator are not known. We use simulation and real data to show that (1) for this extremely important but often overlooked study design, CIs based on the averaged estimator are narrower than CIs based on a single subsample, and (2) the reduction in CI length is proportional to the square root of the expected Monte Carlo error. As a proof of principle, we applied the averaged estimator to the dense SNP data of four large, extended, specific language impairment (SLI) pedigrees, and reduced the single-subsample CI by 18%. In summary, confidence intervals based on the averaged estimator should minimize re-sequencing costs beneath linkage peaks and reduce the number of candidate genes to investigate.
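A schematic of the subsample-averaging idea (the linkage-scan estimator itself is abstracted away as a user-supplied function, and all names and numbers are hypothetical):

```python
import numpy as np

def subsample_average_location(n_snps, estimate_location, n_subsamples=50,
                               subsample_size=200, seed=0):
    """Average disease-gene location estimates over random SNP subsamples.

    `estimate_location` is a hypothetical user-supplied function mapping a set
    of SNP indices to a location estimate (e.g. from a linkage scan). Averaging
    over subsamples shrinks the Monte Carlo error and hence the CI length.
    """
    rng = np.random.default_rng(seed)
    estimates = np.array([
        estimate_location(rng.choice(n_snps, size=subsample_size, replace=False))
        for _ in range(n_subsamples)
    ])
    return estimates.mean(), estimates.std(ddof=1) / np.sqrt(n_subsamples)

# Toy demonstration with a noisy stand-in for a real linkage-scan estimator.
toy_rng = np.random.default_rng(1)
toy_estimator = lambda snp_indices: 37.5 + toy_rng.normal(scale=2.0)  # hypothetical
mean_location, monte_carlo_se = subsample_average_location(5000, toy_estimator)
print(mean_location, monte_carlo_se)
```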


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Christophe Chesneau ◽  
Maher Kachour ◽  
Fabien Navarro

We investigate the estimation of the density-weighted average derivative from biased data. An estimator integrating a plug-in approach and wavelet projections is constructed. We prove that it attains the parametric rate of convergence 1/n under the mean squared error.
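For intuition only, a kernel plug-in analogue for unbiased univariate data can be sketched from the integration-by-parts identity E[f(X) g'(X)] = -2 E[Y f'(X)]; this ignores the biased-sampling correction and the wavelet projection that are the paper's actual contributions:

```python
import numpy as np

def density_weighted_avg_derivative(x, y, h=0.3):
    """Kernel plug-in estimate of the density-weighted average derivative (univariate).

    Uses the identity E[f(X) g'(X)] = -2 E[Y f'(X)], replacing the density
    derivative f' by a leave-one-out Gaussian kernel estimate. A kernel sketch
    for unbiased data, not the paper's wavelet estimator for biased data.
    """
    n = len(x)
    u = (x[:, None] - x[None, :]) / h                   # pairwise (x_i - x_j) / h
    kern_deriv = -u * np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    np.fill_diagonal(kern_deriv, 0.0)                   # leave-one-out
    f_prime = kern_deriv.sum(axis=1) / ((n - 1) * h**2)
    return -2.0 * np.mean(y * f_prime)

rng = np.random.default_rng(5)
x = rng.normal(size=300)
y = 2.0 * x + rng.normal(scale=0.5, size=300)
print(density_weighted_avg_derivative(x, y))
```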


2021 ◽  
pp. 096228022110342
Author(s):  
Denis Talbot ◽  
Awa Diop ◽  
Mathilde Lavigne-Robichaud ◽  
Chantal Brisson

Background The change-in-estimate approach is a popular method for selecting confounders in epidemiology. Epidemiologic textbooks and articles recommend it over significance testing of coefficients, but concerns have been raised about its validity. Few simulation studies have investigated its performance. Methods An extensive simulation study was conducted to compare different implementations of the change-in-estimate method. The implementations were also compared when estimating the association of body mass index with diastolic blood pressure in the PROspective Québec Study on Work and Health. Results All methods were liable to introduce important bias and to produce confidence intervals that included the true effect much less often than expected in at least some scenarios. Overall, mixed results were obtained regarding the accuracy of estimators, as measured by the mean squared error. No implementation adequately differentiated confounders from non-confounders. In the real data analysis, none of the implementations decreased the estimated standard error. Conclusion Based on these results, it is questionable whether change-in-estimate methods are beneficial in general, given their limited ability to improve the precision of estimates without introducing bias and their inability to yield valid confidence intervals or to identify true confounders.
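One common backward implementation of the change-in-estimate criterion (a 10% threshold on the exposure coefficient) can be sketched as follows; the threshold, stopping rule, and data are illustrative assumptions, not a specific implementation studied in the paper:

```python
import numpy as np
import statsmodels.api as sm

def exposure_coef(y, exposure, covariates):
    """OLS coefficient on the exposure after adjusting for `covariates` (list of 1-D arrays)."""
    X = np.column_stack([exposure, *covariates]) if covariates else exposure
    return sm.OLS(y, sm.add_constant(X)).fit().params[1]

def change_in_estimate(y, exposure, candidates, threshold=0.10):
    """Backward change-in-estimate selection (one common implementation).

    `candidates` maps names to covariate arrays. Repeatedly drop the candidate
    whose removal changes the exposure coefficient by the smallest relative
    amount, stopping once every remaining removal would change it by >= threshold.
    """
    kept = dict(candidates)
    while kept:
        beta_full = exposure_coef(y, exposure, list(kept.values()))
        changes = {name: abs(exposure_coef(y, exposure,
                                           [v for n, v in kept.items() if n != name])
                             - beta_full) / abs(beta_full)
                   for name in kept}
        weakest = min(changes, key=changes.get)
        if changes[weakest] >= threshold:
            break
        del kept[weakest]
    return list(kept)

# Toy data: z1 confounds the exposure-outcome relation, z2 is pure noise.
rng = np.random.default_rng(6)
z1, z2 = rng.normal(size=500), rng.normal(size=500)
a = 0.8 * z1 + rng.normal(size=500)
y = 0.5 * a + 1.0 * z1 + rng.normal(size=500)
print(change_in_estimate(y, a, {"z1": z1, "z2": z2}))
```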


2011 ◽  
Vol 2011 ◽  
pp. 1-11 ◽  
Author(s):  
Liyun Su

This study attempts to characterize and predict stock index series in the Shenzhen stock market using multivariate local polynomial regression. Because the stock index time series is nonlinear and chaotic, multivariate local polynomial prediction methods and a univariate local polynomial prediction method, all based on phase space reconstruction according to Takens' theorem, are considered. To fit the stock index series, the single series is expanded into a bivariate series. To evaluate the results, the multivariate predictor for bivariate time series based on the multivariate local polynomial model is compared with the univariate predictor on the same Shenzhen stock index data. The numerical results for the Shenzhen component index show that the prediction mean squared error of the multivariate predictor is much smaller than that of the univariate one and better than that of the three existing methods. Even when only the last half of the training data is used, the multivariate predictor still achieves a smaller prediction mean squared error than the univariate predictor. The multivariate local polynomial prediction model for multiple time series is thus a useful tool for stock market price prediction.
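A compact sketch of the underlying machinery, phase-space reconstruction followed by a local (linear) polynomial fit on nearest neighbours, for a single simulated series; the embedding dimension, delay, neighbourhood size, and data are assumptions, and the paper's bivariate extension is not reproduced:

```python
import numpy as np

def embed(series, dim, tau):
    """Phase-space reconstruction (Takens): each row is a delay vector."""
    n = len(series) - (dim - 1) * tau
    return np.column_stack([series[i * tau: i * tau + n] for i in range(dim)])

def local_linear_predict(series, dim=3, tau=1, k=20):
    """One-step-ahead prediction via a local linear fit on the k nearest delay vectors."""
    X = embed(series, dim, tau)
    targets = series[(dim - 1) * tau + 1:]       # value following each delay vector
    X_train, query = X[:-1], X[-1]               # the last vector has no known target
    nearest = np.argsort(np.linalg.norm(X_train - query, axis=1))[:k]
    Z = np.column_stack([np.ones(k), X_train[nearest]])
    beta, *_ = np.linalg.lstsq(Z, targets[nearest], rcond=None)
    return np.concatenate([[1.0], query]) @ beta

# Hypothetical univariate "index" series; the paper pairs it with a second series.
rng = np.random.default_rng(3)
s = np.sin(np.linspace(0, 30, 400)) + 0.05 * rng.normal(size=400)
print(local_linear_predict(s))
```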


2019 ◽  
Vol 101 (3) ◽  
pp. 442-451 ◽  
Author(s):  
Sebastian Calonico ◽  
Matias D. Cattaneo ◽  
Max H. Farrell ◽  
Rocío Titiunik

We study regression discontinuity designs when covariates are included in the estimation. We examine local polynomial estimators that include discrete or continuous covariates in an additive separable way, but without imposing any parametric restrictions on the underlying population regression functions. We recommend a covariate-adjustment approach that retains consistency under intuitive conditions and characterize the potential for estimation and inference improvements. We also present new covariate-adjusted mean-squared error expansions and robust bias-corrected inference procedures, with heteroskedasticity-consistent and cluster-robust standard errors. We provide an empirical illustration and an extensive simulation study. All methods are implemented in R and Stata software packages.
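A minimal sketch of additive covariate adjustment in a local linear RD regression with a triangular kernel (not the authors' packaged R/Stata implementation, and without the bias-corrected robust inference the paper develops); the bandwidth and data are hypothetical:

```python
import numpy as np

def rd_covariate_adjusted(y, x, z, h, cutoff=0.0):
    """Local linear RD estimate with additive covariate adjustment.

    Within bandwidth h of the cutoff, fit by triangular-kernel weighted least
    squares: y ~ 1 + D + (x - c) + D*(x - c) + z, with D = 1{x >= c}.
    The coefficient on D is the covariate-adjusted RD effect.
    """
    xc = x - cutoff
    d = (xc >= 0).astype(float)
    w = np.clip(1.0 - np.abs(xc) / h, 0.0, None)        # triangular kernel weights
    X = np.column_stack([np.ones_like(xc), d, xc, d * xc, z])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[1]

# Simulated example with a true jump of 0.5 at the cutoff and one covariate.
rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, 1000)
z = rng.normal(size=(1000, 1))
y = 0.5 * (x >= 0) + x + 0.3 * z[:, 0] + rng.normal(scale=0.2, size=1000)
print(rd_covariate_adjusted(y, x, z, h=0.4))
```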


Methodology ◽  
2010 ◽  
Vol 6 (2) ◽  
pp. 71-82 ◽  
Author(s):  
Byron J. Gajewski ◽  
Diane K. Boyle ◽  
Sarah Thompson

We demonstrate the utility of a Bayesian-based approach for calculating intervals of Cronbach's alpha from a psychological instrument having ordinal responses with a dynamic scale. A small number of response options on an instrument will bias traditional interval estimates. Ordinal-based solutions are problematic because there is no clear mechanism for handling the dynamic scale. One way to remedy the bias is to adjust with a Bayesian approach, which corrects the bias and allows theoretically simple calculations of Cronbach's alpha and its intervals. We demonstrate the calculations of the Bayesian approach and compare it with more traditional methods using both credible (or confidence) intervals and mean squared error. Practical advice is offered.
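For comparison purposes only, a plain-frequentist sketch of Cronbach's alpha with a percentile-bootstrap interval is shown below; the paper's Bayesian interval instead comes from a posterior over an ordinal-response model, which is not reproduced here, and the simulated scores are illustrative:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_subjects, n_items) matrix of item scores."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var_sum / total_var)

def bootstrap_alpha_interval(items, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for alpha, shown for comparison only."""
    rng = np.random.default_rng(seed)
    n = items.shape[0]
    draws = [cronbach_alpha(items[rng.integers(0, n, n)]) for _ in range(n_boot)]
    tail = 100 * (1.0 - level) / 2.0
    return tuple(np.percentile(draws, [tail, 100.0 - tail]))

# Simulated ordinal responses (5 items, 5-point scale) driven by one latent trait.
rng = np.random.default_rng(7)
latent = rng.normal(size=(200, 1))
scores = np.clip(np.rint(2.0 + latent + rng.normal(scale=0.7, size=(200, 5))), 0, 4)
print(cronbach_alpha(scores), bootstrap_alpha_interval(scores))
```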

