Simple and honest confidence intervals in nonparametric regression

2020 ◽  
Vol 11 (1) ◽  
pp. 1-39 ◽  
Author(s):  
Timothy B. Armstrong ◽  
Michal Kolesár

We consider the problem of constructing honest confidence intervals (CIs) for a scalar parameter of interest, such as the regression discontinuity parameter, in nonparametric regression based on kernel or local polynomial estimators. To ensure that our CIs are honest, we use critical values that take into account the possible bias of the estimator upon which the CIs are based. We show that this approach leads to CIs that are more efficient than conventional CIs that achieve coverage by undersmoothing or subtracting an estimate of the bias. We give sharp efficiency bounds for using different kernels, and derive the optimal bandwidth for constructing honest CIs. We show that using the bandwidth that minimizes the maximum mean-squared error results in CIs that are nearly efficient, and that in this case the critical value depends only on the rate of convergence. For the common case in which the rate of convergence is n^(-2/5), the appropriate critical value for 95% CIs is 2.18, rather than the usual 1.96 critical value. We illustrate our results in a Monte Carlo analysis and an empirical application.
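A minimal numerical sketch of the headline recommendation (not the authors' implementation; the estimate and standard error below are hypothetical): at the MSE-optimal bandwidth with an n^(-2/5) rate, the honest 95% CI simply replaces the usual 1.96 critical value with 2.18.

```python
import numpy as np

def honest_ci_95(estimate, std_error, critical_value=2.18):
    """Bias-aware ("honest") 95% CI around a kernel / local polynomial estimate.

    At the bandwidth minimizing the maximum MSE, with convergence rate n^(-2/5),
    the critical value accounting for worst-case smoothing bias is 2.18 rather
    than the usual 1.96.
    """
    half_width = critical_value * std_error
    return estimate - half_width, estimate + half_width

# Hypothetical regression discontinuity estimate and standard error.
low, high = honest_ci_95(estimate=0.42, std_error=0.10)
print(f"honest 95% CI: [{low:.3f}, {high:.3f}]")
```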

Soil Research ◽  
2015 ◽  
Vol 53 (8) ◽  
pp. 907 ◽  
Author(s):  
David Clifford ◽  
Yi Guo

Given the wide variety of ways one can measure and record soil properties, it is not uncommon to have multiple overlapping predictive maps for a particular soil property. One is then faced with the challenge of choosing the best prediction at a particular point, either by selecting one of the maps or by combining them in some optimal manner. This question was recently examined in detail when Malone et al. (2014) compared four different methods for combining a digital soil mapping product with a disaggregation product based on legacy data. These authors also examined the issue of how to compute confidence intervals for the resulting map based on confidence intervals associated with the original input products. In this paper, we propose a new method to combine models, called adaptive gating, which is inspired by the use of gating functions in mixtures of experts, a machine-learning approach to forming hierarchical classifiers. We compare it here with two standard approaches: inverse-variance weights and a regression-based approach. One of the benefits of the adaptive gating approach is that it allows weights to vary based on covariate information or across geographic space. As such, it presents a method that explicitly takes full advantage of the spatial nature of the maps we are trying to blend. We also suggest a conservative method for combining confidence intervals. We show that the root mean-squared error of predictions from the adaptive gating approach is similar to that of other standard approaches under cross-validation. However, under independent validation the adaptive gating approach performs better than the alternatives, and as such it warrants further study in other areas of application and further development to reduce its computational complexity.
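A rough sketch of the inverse-variance baseline and one conservative way to combine intervals (an envelope rule, used here only as a stand-in assumption; the adaptive gating model itself, with covariate- and space-varying weights, is not reproduced). All arrays are hypothetical co-located predictions:

```python
import numpy as np

# Hypothetical co-located predictions and standard errors from two input maps.
pred_a, se_a = np.array([4.1, 5.0, 3.8]), np.array([0.6, 0.4, 0.5])
pred_b, se_b = np.array([4.6, 4.7, 3.2]), np.array([0.3, 0.5, 0.7])

# Inverse-variance weights: precision-weighted blend of the two maps.
w_a = se_a**-2 / (se_a**-2 + se_b**-2)
blend = w_a * pred_a + (1.0 - w_a) * pred_b

# A conservative combined interval (assumed envelope rule): take the widest
# bounds spanned by the two input 90% intervals at each location.
z = 1.645
lower = np.minimum(pred_a - z * se_a, pred_b - z * se_b)
upper = np.maximum(pred_a + z * se_a, pred_b + z * se_b)
print(blend, lower, upper)
```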


2021 ◽  
Vol 8 (4) ◽  
pp. 309-332
Author(s):  
Efosa Michael Ogbeide ◽  
Joseph Erunmwosa Osemwenkhae

Density estimation is an important aspect of statistics, and statistical inference often requires knowledge of the density of the observed data. A common method of density estimation is kernel density estimation (KDE), a nonparametric approach that requires a kernel function and a window size (smoothing parameter H) and that supports both density estimation and pattern recognition. This work focuses on the use of a modified intersection of confidence intervals (MICIH) approach to density estimation. Nigerian crime rate data reported to the Police, as published by the National Bureau of Statistics, were used to demonstrate the new approach, which is data-driven in the multivariate kernel density setting. The main way to improve density estimation is to reduce the mean squared error (MSE), so the errors of this approach were evaluated and some improvements were observed. The aim is adaptive kernel density estimation, achieved through a sufficiently smooth technique in which the adaptation is driven by bandwidth selection. When applied, the MICIH estimates showed improvements over existing methods: the MICIH approach has reduced mean squared error and a relatively faster rate of convergence compared with some other approaches, and it reduces the points of discontinuity in the graphical densities of the datasets, helping to correct discontinuities and display an adaptive density. Keywords: approach, bandwidth, estimate, error, kernel density
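For illustration, the classical intersection-of-confidence-intervals (ICI) rule for pointwise bandwidth selection can be sketched as below; this is a univariate Gaussian-kernel version under textbook asymptotics, not the modified multivariate MICIH procedure of the paper, and all data are simulated:

```python
import numpy as np

def kde_at(x, data, h):
    """Gaussian kernel density estimate at the point x with bandwidth h."""
    u = (x - data) / h
    return np.mean(np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)) / h

def ici_bandwidth(x, data, bandwidths, kappa=2.0):
    """Classical intersection-of-confidence-intervals rule at a single point x.

    Scan bandwidths from small to large and keep enlarging while the running
    intersection of the pointwise intervals f_hat +/- kappa*sd stays non-empty.
    """
    n = len(data)
    lo, hi = -np.inf, np.inf
    chosen = min(bandwidths)
    for h in sorted(bandwidths):
        f = kde_at(x, data, h)
        sd = np.sqrt(f / (2.0 * np.sqrt(np.pi) * n * h))  # asymptotic KDE std. dev.
        lo, hi = max(lo, f - kappa * sd), min(hi, f + kappa * sd)
        if lo > hi:          # intervals no longer intersect: stop growing h
            break
        chosen = h
    return chosen

rng = np.random.default_rng(0)
sample = rng.normal(size=500)
print(ici_bandwidth(0.0, sample, bandwidths=np.linspace(0.05, 1.0, 20)))
```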


1990 ◽  
Vol 6 (4) ◽  
pp. 466-479 ◽  
Author(s):  
Donald W.K. Andrews ◽  
Yoon-Jae Whang

This paper considers series estimators of additive interactive regression (AIR) models. AIR models are nonparametric regression models that generalize additive regression models by allowing interactions between different regressor variables. They place more restrictions on the regression function, however, than do fully nonparametric regression models. By doing so, they attempt to circumvent the curse of dimensionality that afflicts the estimation of fully nonparametric regression models. In this paper, we present a finite sample bound and asymptotic rate of convergence results for the mean average squared error of series estimators that show that AIR models do circumvent the curse of dimensionality. A lower bound on the rate of convergence of these estimators is shown to depend on the order of the AIR model and the smoothness of the regression function, but not on the dimension of the regressor vector. Series estimators with fixed and data-dependent truncation parameters are considered.
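A hedged sketch of a second-order AIR series estimator with a fixed truncation parameter: polynomial terms for each regressor form the additive part and pairwise products form the interactive part, fitted by ordinary least squares. The basis choice, degree, and data are illustrative assumptions, not the paper's specification:

```python
import numpy as np
from itertools import combinations

def air_design(X, degree=3, order=2):
    """Series design matrix for an additive interactive regression (AIR) model.

    Polynomial terms in each regressor give the additive part; for order >= 2,
    pairwise products of those terms give the (two-way) interactive part.
    `degree` plays the role of a fixed truncation parameter.
    """
    n, d = X.shape
    powers = [[X[:, j] ** p for p in range(1, degree + 1)] for j in range(d)]
    cols = [np.ones(n)]
    for j in range(d):                      # additive components
        cols.extend(powers[j])
    if order >= 2:                          # pairwise interaction components
        for j, k in combinations(range(d), 2):
            cols.extend(a * b for a in powers[j] for b in powers[k])
    return np.column_stack(cols)

# Toy data whose regression function is additive plus one two-way interaction.
rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 3))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + X[:, 0] * X[:, 1] + rng.normal(scale=0.1, size=200)
Z = air_design(X)
beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
fitted = Z @ beta
```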


2014 ◽  
Vol 2014 ◽  
pp. 1-7
Author(s):  
Christophe Chesneau

We investigate the estimation of a multiplicative separable regression function from a bidimensional nonparametric regression model with random design. We present a general estimator for this problem and study its mean integrated squared error (MISE) properties. A wavelet version of this estimator is developed. In some situations, we prove that it attains the standard unidimensional rate of convergence under the MISE over Besov balls.
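As an illustrative sketch only (a kernel analogue, not the wavelet estimator of the paper): if the design coordinates are independent, the multiplicative structure can be recovered from two marginal regressions via E[Y|X1=x1]·E[Y|X2=x2]/E[Y]. The bandwidth and data below are assumptions:

```python
import numpy as np

def nw_mean(x0, x, y, h):
    """Nadaraya-Watson estimate of E[Y | X = x0] with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

def separable_fit(x1_0, x2_0, x1, x2, y, h=0.2):
    """Estimate f(x1, x2) = f1(x1) * f2(x2) from the two marginal regressions.

    With independent design coordinates, E[Y|X1=x1] * E[Y|X2=x2] / E[Y]
    equals f1(x1) * f2(x2), so two univariate smoothers suffice.
    """
    return nw_mean(x1_0, x1, y, h) * nw_mean(x2_0, x2, y, h) / np.mean(y)

rng = np.random.default_rng(2)
x1, x2 = rng.uniform(1, 2, 500), rng.uniform(1, 2, 500)
y = (1 + x1) * np.exp(x2) + rng.normal(scale=0.1, size=500)
print(separable_fit(1.5, 1.5, x1, x2, y), (1 + 1.5) * np.exp(1.5))
```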


2016 ◽  
Vol 2 (11) ◽  
Author(s):  
William Stewart

For modern linkage studies involving many small families, Stewart et al. (2009)[1] introduced an efficient estimator of disease gene location that averages location estimates from random subsamples of the dense SNP data. Their estimator has lower mean squared error than competing estimators and yields narrower confidence intervals (CIs) as well. However, when the number of families is small and the pedigree structure is large (possibly extended), the computational feasibility and statistical properties of the estimator are not known. We use simulation and real data to show that (1) for this extremely important but often overlooked study design, CIs based on the averaged estimator are narrower than CIs based on a single subsample, and (2) the reduction in CI length is proportional to the square root of the expected Monte Carlo error. As a proof of principle, we applied the averaged estimator to the dense SNP data of four large, extended, specific language impairment (SLI) pedigrees, and reduced the single-subsample CI by 18%. In summary, confidence intervals based on the averaged estimator should minimize re-sequencing costs beneath linkage peaks and reduce the number of candidate genes to investigate.
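A schematic of the subsample-averaging idea (the linkage-scan estimator itself is abstracted away as a user-supplied function, and all names and numbers are hypothetical):

```python
import numpy as np

def subsample_average_location(n_snps, estimate_location, n_subsamples=50,
                               subsample_size=200, seed=0):
    """Average disease-gene location estimates over random SNP subsamples.

    `estimate_location` is a hypothetical user-supplied function mapping a set
    of SNP indices to a location estimate (e.g. from a linkage scan). Averaging
    over subsamples shrinks the Monte Carlo error and hence the CI length.
    """
    rng = np.random.default_rng(seed)
    estimates = np.array([
        estimate_location(rng.choice(n_snps, size=subsample_size, replace=False))
        for _ in range(n_subsamples)
    ])
    return estimates.mean(), estimates.std(ddof=1) / np.sqrt(n_subsamples)

# Toy demonstration with a noisy stand-in for a real linkage-scan estimator.
toy_rng = np.random.default_rng(1)
toy_estimator = lambda snp_indices: 37.5 + toy_rng.normal(scale=2.0)  # hypothetical
mean_location, monte_carlo_se = subsample_average_location(5000, toy_estimator)
print(mean_location, monte_carlo_se)
```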


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Christophe Chesneau ◽  
Maher Kachour ◽  
Fabien Navarro

We investigate the estimation of the density-weighted average derivative from biased data. An estimator integrating a plug-in approach and wavelet projections is constructed. We prove that it attains the parametric rate of convergence 1/n under the mean squared error.
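For intuition only, a kernel plug-in analogue for unbiased univariate data can be sketched from the integration-by-parts identity E[f(X) g'(X)] = -2 E[Y f'(X)]; this ignores the biased-sampling correction and the wavelet projection that are the paper's actual contributions:

```python
import numpy as np

def density_weighted_avg_derivative(x, y, h=0.3):
    """Kernel plug-in estimate of the density-weighted average derivative (univariate).

    Uses the identity E[f(X) g'(X)] = -2 E[Y f'(X)], replacing the density
    derivative f' by a leave-one-out Gaussian kernel estimate. A kernel sketch
    for unbiased data, not the paper's wavelet estimator for biased data.
    """
    n = len(x)
    u = (x[:, None] - x[None, :]) / h                   # pairwise (x_i - x_j) / h
    kern_deriv = -u * np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    np.fill_diagonal(kern_deriv, 0.0)                   # leave-one-out
    f_prime = kern_deriv.sum(axis=1) / ((n - 1) * h**2)
    return -2.0 * np.mean(y * f_prime)

rng = np.random.default_rng(5)
x = rng.normal(size=300)
y = 2.0 * x + rng.normal(scale=0.5, size=300)
print(density_weighted_avg_derivative(x, y))
```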


2021 ◽  
pp. 096228022110342
Author(s):  
Denis Talbot ◽  
Awa Diop ◽  
Mathilde Lavigne-Robichaud ◽  
Chantal Brisson

Background The change-in-estimate approach is a popular method for selecting confounders in epidemiology. Epidemiologic textbooks and articles recommend it over significance testing of coefficients, but concerns have been raised about its validity. Few simulation studies have investigated its performance. Methods An extensive simulation study was conducted to compare different implementations of the change-in-estimate method. The implementations were also compared when estimating the association of body mass index with diastolic blood pressure in the PROspective Québec Study on Work and Health. Results All methods were liable to introduce important bias and to produce confidence intervals that included the true effect much less often than expected in at least some scenarios. Overall, mixed results were obtained regarding the accuracy of estimators, as measured by the mean squared error. No implementation adequately differentiated confounders from non-confounders. In the real data analysis, none of the implementations decreased the estimated standard error. Conclusion Based on these results, it is questionable whether change-in-estimate methods are beneficial in general, given their limited ability to improve the precision of estimates without introducing bias and their inability to yield valid confidence intervals or to identify true confounders.
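One common backward implementation of the change-in-estimate criterion (a 10% threshold on the exposure coefficient) can be sketched as follows; the threshold, stopping rule, and data are illustrative assumptions, not a specific implementation studied in the paper:

```python
import numpy as np
import statsmodels.api as sm

def exposure_coef(y, exposure, covariates):
    """OLS coefficient on the exposure after adjusting for `covariates` (list of 1-D arrays)."""
    X = np.column_stack([exposure, *covariates]) if covariates else exposure
    return sm.OLS(y, sm.add_constant(X)).fit().params[1]

def change_in_estimate(y, exposure, candidates, threshold=0.10):
    """Backward change-in-estimate selection (one common implementation).

    `candidates` maps names to covariate arrays. Repeatedly drop the candidate
    whose removal changes the exposure coefficient by the smallest relative
    amount, stopping once every remaining removal would change it by >= threshold.
    """
    kept = dict(candidates)
    while kept:
        beta_full = exposure_coef(y, exposure, list(kept.values()))
        changes = {name: abs(exposure_coef(y, exposure,
                                           [v for n, v in kept.items() if n != name])
                             - beta_full) / abs(beta_full)
                   for name in kept}
        weakest = min(changes, key=changes.get)
        if changes[weakest] >= threshold:
            break
        del kept[weakest]
    return list(kept)

# Toy data: z1 confounds the exposure-outcome relation, z2 is pure noise.
rng = np.random.default_rng(6)
z1, z2 = rng.normal(size=500), rng.normal(size=500)
a = 0.8 * z1 + rng.normal(size=500)
y = 0.5 * a + 1.0 * z1 + rng.normal(size=500)
print(change_in_estimate(y, a, {"z1": z1, "z2": z2}))
```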


2011 ◽  
Vol 2011 ◽  
pp. 1-11 ◽  
Author(s):  
Liyun Su

This study attempts to characterize and predict stock index series in the Shenzhen stock market using multivariate local polynomial regression. Because the stock index time series is nonlinear and chaotic, multivariate local polynomial prediction methods and a univariate local polynomial prediction method, all based on phase space reconstruction according to Takens' theorem, are considered. To fit the stock index series, the single series is expanded into a bivariate series. To evaluate the results, the multivariate predictor for bivariate time series based on the multivariate local polynomial model is compared with the univariate predictor on the same Shenzhen stock index data. The numerical results for the Shenzhen component index show that the prediction mean squared error of the multivariate predictor is much smaller than that of the univariate one and better than that of the three existing methods. Even when only the last half of the training data is used, the multivariate predictor still achieves a smaller prediction mean squared error than the univariate predictor. The multivariate local polynomial prediction model for multiple time series is thus a useful tool for stock market price prediction.
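A compact sketch of the underlying machinery, phase-space reconstruction followed by a local (linear) polynomial fit on nearest neighbours, for a single simulated series; the embedding dimension, delay, neighbourhood size, and data are assumptions, and the paper's bivariate extension is not reproduced:

```python
import numpy as np

def embed(series, dim, tau):
    """Phase-space reconstruction (Takens): each row is a delay vector."""
    n = len(series) - (dim - 1) * tau
    return np.column_stack([series[i * tau: i * tau + n] for i in range(dim)])

def local_linear_predict(series, dim=3, tau=1, k=20):
    """One-step-ahead prediction via a local linear fit on the k nearest delay vectors."""
    X = embed(series, dim, tau)
    targets = series[(dim - 1) * tau + 1:]       # value following each delay vector
    X_train, query = X[:-1], X[-1]               # the last vector has no known target
    nearest = np.argsort(np.linalg.norm(X_train - query, axis=1))[:k]
    Z = np.column_stack([np.ones(k), X_train[nearest]])
    beta, *_ = np.linalg.lstsq(Z, targets[nearest], rcond=None)
    return np.concatenate([[1.0], query]) @ beta

# Hypothetical univariate "index" series; the paper pairs it with a second series.
rng = np.random.default_rng(3)
s = np.sin(np.linspace(0, 30, 400)) + 0.05 * rng.normal(size=400)
print(local_linear_predict(s))
```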


2019 ◽  
Vol 101 (3) ◽  
pp. 442-451 ◽  
Author(s):  
Sebastian Calonico ◽  
Matias D. Cattaneo ◽  
Max H. Farrell ◽  
Rocío Titiunik

We study regression discontinuity designs when covariates are included in the estimation. We examine local polynomial estimators that include discrete or continuous covariates in an additive separable way, but without imposing any parametric restrictions on the underlying population regression functions. We recommend a covariate-adjustment approach that retains consistency under intuitive conditions and characterize the potential for estimation and inference improvements. We also present new covariate-adjusted mean-squared error expansions and robust bias-corrected inference procedures, with heteroskedasticity-consistent and cluster-robust standard errors. We provide an empirical illustration and an extensive simulation study. All methods are implemented in R and Stata software packages.
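A minimal sketch of additive covariate adjustment in a local linear RD regression with a triangular kernel (not the authors' packaged R/Stata implementation, and without the bias-corrected robust inference the paper develops); the bandwidth and data are hypothetical:

```python
import numpy as np

def rd_covariate_adjusted(y, x, z, h, cutoff=0.0):
    """Local linear RD estimate with additive covariate adjustment.

    Within bandwidth h of the cutoff, fit by triangular-kernel weighted least
    squares: y ~ 1 + D + (x - c) + D*(x - c) + z, with D = 1{x >= c}.
    The coefficient on D is the covariate-adjusted RD effect.
    """
    xc = x - cutoff
    d = (xc >= 0).astype(float)
    w = np.clip(1.0 - np.abs(xc) / h, 0.0, None)        # triangular kernel weights
    X = np.column_stack([np.ones_like(xc), d, xc, d * xc, z])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[1]

# Simulated example with a true jump of 0.5 at the cutoff and one covariate.
rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, 1000)
z = rng.normal(size=(1000, 1))
y = 0.5 * (x >= 0) + x + 0.3 * z[:, 0] + rng.normal(scale=0.2, size=1000)
print(rd_covariate_adjusted(y, x, z, h=0.4))
```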


Methodology ◽  
2010 ◽  
Vol 6 (2) ◽  
pp. 71-82 ◽  
Author(s):  
Byron J. Gajewski ◽  
Diane K. Boyle ◽  
Sarah Thompson

We demonstrate the utility of a Bayesian-based approach for calculating intervals of Cronbach's alpha from a psychological instrument having ordinal responses with a dynamic scale. A small number of response options on an instrument will bias traditional interval estimates. Ordinal-based solutions are problematic because there is no clear mechanism for handling the dynamic scale. One way to remedy the bias is to adjust with a Bayesian approach, which corrects the bias and allows theoretically simple calculations of Cronbach's alpha and its intervals. We demonstrate the calculations of the Bayesian approach and compare it with more traditional methods using both credible (or confidence) intervals and mean squared error. Practical advice is offered.
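For comparison purposes only, a plain-frequentist sketch of Cronbach's alpha with a percentile-bootstrap interval is shown below; the paper's Bayesian interval instead comes from a posterior over an ordinal-response model, which is not reproduced here, and the simulated scores are illustrative:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_subjects, n_items) matrix of item scores."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var_sum / total_var)

def bootstrap_alpha_interval(items, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for alpha, shown for comparison only."""
    rng = np.random.default_rng(seed)
    n = items.shape[0]
    draws = [cronbach_alpha(items[rng.integers(0, n, n)]) for _ in range(n_boot)]
    tail = 100 * (1.0 - level) / 2.0
    return tuple(np.percentile(draws, [tail, 100.0 - tail]))

# Simulated ordinal responses (5 items, 5-point scale) driven by one latent trait.
rng = np.random.default_rng(7)
latent = rng.normal(size=(200, 1))
scores = np.clip(np.rint(2.0 + latent + rng.normal(scale=0.7, size=(200, 5))), 0, 4)
print(cronbach_alpha(scores), bootstrap_alpha_interval(scores))
```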

