Ensemble Estimation of Information Divergence †

Kevin Moon; Kumar Sricharan; Kristjan Greenewald; Alfred Hero

doi:10.3390/e20080560

Entropy ◽

10.3390/e20080560 ◽

2018 ◽

Vol 20 (8) ◽

pp. 560 ◽

Cited By ~ 5

Author(s):

Kevin Moon ◽

Kumar Sricharan ◽

Kristjan Greenewald ◽

Alfred Hero

Keyword(s):

Mean Squared Error ◽

A Priori ◽

Kernel Density ◽

Classification Problem ◽

Tuning Parameter ◽

Support Set ◽

Squared Error ◽

Ensemble Estimation ◽

Information Divergence ◽

Leave One Out

Recent work has focused on the problem of nonparametric estimation of information divergence functionals between two continuous random variables. Many existing approaches require either restrictive assumptions about the density support set or difficult calculations at the support set boundary, which must be known a priori. The mean squared error (MSE) convergence rate of a leave-one-out kernel density plug-in divergence functional estimator is derived for general bounded density support sets, where knowledge of the support boundary, and therefore boundary correction, is not required. The theory of optimally weighted ensemble estimation is generalized to derive a divergence estimator that achieves the parametric rate when the densities are sufficiently smooth. Guidelines for tuning parameter selection and the asymptotic distribution of this estimator are provided. Based on the theory, an empirical estimator of Rényi-α divergence is proposed that greatly outperforms the standard kernel density plug-in estimator in terms of mean squared error, especially in high dimensions. The estimator is shown to be robust to the choice of tuning parameters. We show extensive simulation results that verify the theoretical results of our paper. Finally, we apply the proposed estimator to estimate bounds on the Bayes error rate of a cell classification problem.
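
To make the "leave-one-out kernel density plug-in" idea concrete, the following is a minimal one-dimensional sketch of a plug-in Rényi-α divergence estimate. It is not the authors' ensemble estimator: the Gaussian kernel, the fixed bandwidth `h`, the choice to apply leave-one-out to the second density, and all function names are assumptions made for illustration. It relies only on the identity D_α(f1‖f2) = log(E_f2[(f1/f2)^α]) / (α − 1).

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, sample, h, skip=None):
    """Kernel density estimate at x; `skip` omits one index (leave-one-out)."""
    total = 0.0
    count = 0
    for j, s in enumerate(sample):
        if j == skip:
            continue
        total += gaussian_kernel((x - s) / h)
        count += 1
    return total / (count * h)


def renyi_plugin(xs, ys, alpha, h):
    """Plug-in estimate of D_alpha(f1 || f2) from xs ~ f1 and ys ~ f2.

    The expectation over f2 is replaced by an average over ys, and f2 is
    estimated leave-one-out so a point does not contribute to its own density.
    """
    tiny = 1e-300  # guard against division by zero in empty regions
    acc = 0.0
    for i, y in enumerate(ys):
        f1_hat = kde(y, xs, h)
        f2_hat = max(kde(y, ys, h, skip=i), tiny)
        acc += (f1_hat / f2_hat) ** alpha
    return math.log(acc / len(ys)) / (alpha - 1.0)


# Synthetic data (hypothetical): two samples from N(0, 1) and one from N(3, 1).
rng = random.Random(0)
same_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
same_b = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [rng.gauss(3.0, 1.0) for _ in range(200)]

d_same = renyi_plugin(same_a, same_b, alpha=0.5, h=0.4)
d_shift = renyi_plugin(shifted, same_b, alpha=0.5, h=0.4)
print(f"same distribution: {d_same:.3f}, shifted mean: {d_shift:.3f}")
```

The estimate for two samples from the same distribution should sit near zero, while the shifted sample should give a clearly larger value. The abstract's ensemble estimator goes further by combining such estimates over several bandwidths with optimized weights, which this sketch does not attempt.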

On the Use of a Modified Intersection of Confidence Intervals (MICIH) Kernel Density Estimation Approach

Athens Journal of Sciences ◽

10.30958/ajs.8-4-4 ◽

2021 ◽

Vol 8 (4) ◽

pp. 309-332

Author(s):

Efosa Michael Ogbeide ◽

Joseph Erunmwosa Osemwenkhae

Keyword(s):

Confidence Intervals ◽

Density Estimation ◽

Kernel Density Estimation ◽

Mean Squared Error ◽

Kernel Density ◽

Window Size ◽

National Bureau ◽

Squared Error ◽

Data Density ◽

Adaptive Kernel

Density estimation is an important aspect of statistics, and statistical inference often requires knowledge of the density of the observed data. A common method is kernel density estimation (KDE), a nonparametric approach that requires a kernel function and a window size (smoothing parameter H) and that aids both density estimation and pattern recognition. This work focuses on the use of a modified intersection of confidence intervals (MICIH) approach to density estimation. Nigerian crime rate data reported to the police, as published by the National Bureau of Statistics, were used to demonstrate the new approach, which in multivariate kernel density estimation is driven by the data. The main way to improve density estimation is to reduce the mean squared error (MSE), so the errors of this approach were evaluated, and some improvements were seen. The aim, adaptive kernel density estimation, was achieved under sufficient smoothing, with the adaptation based on bandwidth selection. The estimates obtained with the MICIH approach showed some improvement in quality over existing methods: the approach has a reduced mean squared error and a relatively faster rate of convergence than some other approaches. It also reduces the points of discontinuity in the graphical densities of the datasets, which helps to correct such points and to display an adaptive density. Keywords: approach, bandwidth, estimate, error, kernel density.
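
As a reminder of the role the window size plays, here is a minimal sketch of a fixed-bandwidth Gaussian kernel density estimate. It is not the MICIH method, which adapts the bandwidth to the data; the data values, the grid, and the function name `kde_gaussian` are made up for illustration.

```python
import math


def kde_gaussian(x, data, h):
    """Fixed-bandwidth Gaussian kernel density estimate at the point x."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - d) / h) ** 2) for d in data) / norm


data = [1.0, 1.2, 1.9, 2.1, 2.2, 4.8, 5.0]      # made-up observations
step = 0.01
grid = [i * step for i in range(-500, 1200)]    # evaluation points from -5 to 12

results = {}
for h in (0.2, 1.0):
    area = sum(kde_gaussian(x, data, h) for x in grid) * step  # Riemann sum
    valley = kde_gaussian(3.5, data, h)   # estimate in the gap between clusters
    results[h] = (area, valley)
    print(f"h={h}: area={area:.3f}, density at 3.5={valley:.5f}")
```

Both bandwidths integrate to roughly one, but the small window leaves the gap between the two clusters almost empty while the large window smooths across it. Choosing between such behaviours, possibly point by point, is the problem that bandwidth selection methods address.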

PREDICTION ERROR AS A CRITERION FOR OPERATOR LENGTH

Geophysics ◽

10.1190/1.1440167 ◽

1971 ◽

Vol 36 (2) ◽

pp. 261-265 ◽

Cited By ~ 8

Author(s):

James N. Galbraith

Keyword(s):

Prediction Error ◽

Mean Squared Error ◽

A Priori ◽

Nonincreasing Function ◽

Final Value ◽

Kolmogorov Spectrum ◽

Squared Error ◽

Levinson Algorithm ◽

The Mean ◽

Error Filtering

Prediction error filtering has been widely used for deconvolution. The mean squared error in prediction is a monotonically nonincreasing function of operator length, and the value of the error is readily available from the Wiener‐Levinson algorithm. In general, the value of this error for the infinitely long operator is not known a priori. It is shown that the final value of the error can be obtained by considering the Kolmogorov spectrum factorization. Simple criteria can then be established for operator effectiveness and length.
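
The monotone behaviour of the prediction error can be seen directly from the Levinson recursion. The sketch below is a textbook Levinson-Durbin recursion, not code from the paper, and the autocorrelation sequence is a synthetic assumption: a first-order moving-average process x_t = e_t + 0.5 e_{t-1} with unit innovation variance, for which the infinitely long operator has error 1.

```python
def levinson_prediction_errors(r, max_order):
    """Return prediction error powers E_0..E_max_order from autocorrelation r.

    Each step computes a reflection coefficient k and updates the error as
    E_m = E_{m-1} * (1 - k^2), so the error can never increase with length.
    """
    a = [1.0]        # prediction-error filter coefficients, a[0] = 1
    err = r[0]       # zero-length operator: error equals the signal power
    errors = [err]
    for m in range(1, max_order + 1):
        acc = sum(a[i] * r[m - i] for i in range(m))
        k = -acc / err
        a = [1.0] + [a[i] + k * a[m - i] for i in range(1, m)] + [k]
        err *= 1.0 - k * k
        errors.append(err)
    return errors


# Autocorrelation of the synthetic MA(1) process: r0 = 1.25, r1 = 0.5, else 0.
autocorr = [1.25, 0.5] + [0.0] * 9
errors = levinson_prediction_errors(autocorr, max_order=10)
for length, e in enumerate(errors):
    print(f"operator length {length}: error {e:.6f}")
```

The printed errors fall quickly toward 1 and then barely change, which illustrates why comparing the current error with its final value gives a practical criterion for when a longer operator stops paying off.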

Inequalities for mean squared error of multidimensional kernel density estimations

Moscow University Computational Mathematics and Cybernetics ◽

10.3103/s0278641910010036 ◽

2010 ◽

Vol 34 (1) ◽

pp. 16-21 ◽

Cited By ~ 2

Author(s):

V. G. Ushakov ◽

N. G. Ushakov

Keyword(s):

Mean Squared Error ◽

Kernel Density ◽

Squared Error

QSAR study of amidino bis-benzimidazole derivatives as potent anti-malarial agents against Plasmodium falciparum

Chemical Papers ◽

10.2478/s11696-013-0398-5 ◽

2013 ◽

Vol 67 (11) ◽

Cited By ~ 12

Author(s):

Apilak Worachartcheewan ◽

Chanin Nantasenamat ◽

Chartchalerm Isarankura-Na-Ayudhya ◽

Virapong Prachayasittikul

Keyword(s):

Plasmodium Falciparum ◽

Correlation Coefficient ◽

Cross Validation ◽

Mean Squared Error ◽

Sum Of Squares ◽

Atomic Masses ◽

Root Mean Squared Error ◽

Squared Error ◽

Qsar Models ◽

Leave One Out

A data set of amidino bis-benzimidazoles, in particular 2′-arylsubstituted-1H,1′H-[2,5′]bisbenzimidazolyl-5-carboximidine derivatives with anti-malarial activity against Plasmodium falciparum, was employed in investigating the quantitative structure-activity relationship (QSAR). Quantum chemical and molecular descriptors were obtained from B3LYP/6-31g(d) calculations and Dragon software, respectively. Significant variables, which included total energy ($E_T$), highest occupied molecular orbital (HOMO), Moran autocorrelation-lag3/weighted by atomic masses (MATS3m), Geary autocorrelation-lag8/weighted by atomic masses (GATS8m), and 3D-MoRSE signal 11/weighted by atomic Sanderson electronegativities (Mor11e), were used in the construction of QSAR models using multiple linear regression (MLR) and artificial neural network (ANN). The results indicated that the predictive models for both the MLR and ANN approaches using leave-one-out cross-validation afforded a good performance in modelling the anti-malarial activity against P. falciparum, as observed by correlation coefficients of leave-one-out cross-validation ($R_{\text{LOO-CV}}$) of 0.9760 and 0.9821, respectively, root mean squared error of leave-one-out cross-validation ($\text{RMSE}_{\text{LOO-CV}}$) of 0.1301 and 0.1102, respectively, and predictivity of leave-one-out cross-validation ($Q^2_{\text{LOO-CV}}$) of 0.9526 and 0.9645, respectively. Model validation was performed using an external testing set, and the results suggested that the model provided good predictivity for both MLR and ANN models, with correlation coefficient of the external set ($R_{\text{Ext}}$) values of 0.9978 and 0.9844, respectively, root mean squared error of the external set ($\text{RMSE}_{\text{Ext}}$) of 0.0764 and 0.1302, respectively, and predictivity of the external set ($Q^2_{\text{Ext}}$) of 0.9956 and 0.9690, respectively. Furthermore, the robustness of the QSAR models is corroborated by a number of statistical parameters, comprising adjusted correlation coefficient ($R^2_{\text{Adj}}$), standard deviation (s), predicted residual sum of squares (PRESS), standard error of prediction (SDEP), total sum of squares deviation (SSY), and quality factor (Q). The QSAR models so constructed provide pertinent insights for the future design of anti-malarial agents.
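
The leave-one-out statistics quoted above can be illustrated with a small sketch. It assumes a single-descriptor least squares model and the common definitions RMSE = sqrt(PRESS / n) and Q² = 1 − PRESS / SSY; the abstract does not spell out the exact formulas used, and the data and function names here are invented.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def loo_cv(xs, ys):
    """Leave-one-out cross-validation: returns (PRESS, RMSE_LOO, Q2_LOO)."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without observation i, then predict the held-out point.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    mean_y = sum(ys) / len(ys)
    ssy = sum((y - mean_y) ** 2 for y in ys)
    return press, math.sqrt(press / len(xs)), 1.0 - press / ssy


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]    # made-up descriptor values
ys = [0.1, 0.9, 2.2, 2.9, 4.1, 5.0]    # made-up activities
press, rmse_loo, q2_loo = loo_cv(xs, ys)
print(f"PRESS={press:.4f}, RMSE_LOO={rmse_loo:.4f}, Q2_LOO={q2_loo:.4f}")
```

Because each prediction comes from a model that never saw the held-out compound, Q² is a stricter measure than the fitted R², which is why the abstract reports it alongside an external test set.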

Minimum mean squared error equalization using a priori information

IEEE Transactions on Signal Processing ◽

10.1109/78.984761 ◽

2002 ◽

Vol 50 (3) ◽

pp. 673-683 ◽

Cited By ~ 674

Author(s):

M. Tuchler ◽

A.C. Singer ◽

R. Koetter

Keyword(s):

Mean Squared Error ◽

A Priori ◽

A Priori Information ◽

Minimum Mean Squared Error ◽

Squared Error ◽

Priori Information

Reducing the mean squared error in kernel density estimation

Journal of the Korean Statistical Society ◽

10.1016/j.jkss.2012.12.003 ◽

2013 ◽

Vol 42 (3) ◽

pp. 387-397 ◽

Cited By ~ 3

Author(s):

Jinmi Kim ◽

Choongrak Kim

Keyword(s):

Density Estimation ◽

Kernel Density Estimation ◽

Mean Squared Error ◽

Kernel Density ◽

Squared Error ◽

The Mean

Optimal choice between parametric and non-parametric bootstrap estimates

Mathematical Proceedings of the Cambridge Philosophical Society ◽

10.1017/s0305004100072121 ◽

1994 ◽

Vol 115 (2) ◽

pp. 335-363 ◽

Cited By ~ 7

Author(s):

Stephen Man Sing Lee

Keyword(s):

Mean Squared Error ◽

Parametric Bootstrap ◽

Tuning Parameter ◽

Squared Error ◽

Optimal Estimator ◽

Bootstrap Estimate ◽

The Mean ◽

Bootstrap Estimates ◽

Hybrid Estimator ◽

Non Parametric

A parametric bootstrap estimate (PB) may be more accurate than its non-parametric version (NB) if the parametric model upon which it is based is, at least approximately, correct. Construction of an optimal estimator based on both PB and NB is pursued with the aim of minimizing the mean squared error. Our approach is to pick an empirical estimate of the optimal tuning parameter ε ∈ [0, 1] which minimizes the mean squared error of εNB + (1−ε)PB. The resulting hybrid estimator is shown to be more reliable than either PB or NB uniformly over a rich class of distributions. Theoretical asymptotic results show that the asymptotic error of this hybrid estimator is quite close in distribution to the smaller of the errors of PB and NB. All these errors typically have the same convergence rate of order . A particular example is also presented to illustrate the fact that this hybrid estimate can indeed be strictly better than either of the pure bootstrap estimates in terms of minimizing mean squared error. Two simulation studies were conducted to verify the theoretical results and demonstrate the good practical performance of the hybrid method.
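
For orientation, the population-level minimizer of the mean squared error of εNB + (1−ε)PB follows from a short calculation. The symbols A, B and θ are introduced here for illustration and are not taken from the paper, whose empirical estimate of ε is not described in the abstract.

```latex
% Let \theta be the target quantity and define the errors
%   A = \mathrm{NB} - \theta, \qquad B = \mathrm{PB} - \theta .
\begin{align*}
\mathrm{MSE}(\varepsilon)
  &= \mathbb{E}\bigl[(\varepsilon A + (1-\varepsilon) B)^2\bigr] \\
  &= \varepsilon^2\,\mathbb{E}[A^2]
     + 2\varepsilon(1-\varepsilon)\,\mathbb{E}[AB]
     + (1-\varepsilon)^2\,\mathbb{E}[B^2], \\
\frac{d}{d\varepsilon}\mathrm{MSE}(\varepsilon) = 0
  \;&\Longrightarrow\;
  \varepsilon^{*}
  = \frac{\mathbb{E}[B^2] - \mathbb{E}[AB]}{\mathbb{E}\bigl[(A-B)^2\bigr]} .
\end{align*}
% If \varepsilon^{*} falls outside [0, 1], the constrained minimizer is the
% nearer endpoint, i.e. pure PB (\varepsilon = 0) or pure NB (\varepsilon = 1).
```

Since the expectations are unknown in practice, ε has to be estimated from the data, which is the step the abstract refers to as picking an empirical estimate of the optimal tuning parameter.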

Identification of facet models by means of factor rotation: A simulation study and data analysis of a test for the Berlin Model of Intelligence Structure

10.31234/osf.io/zw3ub ◽

2019 ◽

Author(s):

André Beauducel ◽

Martin Kersting

Keyword(s):

Factor Analysis ◽

Simulation Study ◽

Simple Structure ◽

Mean Squared Error ◽

A Priori ◽

Minimum Entropy ◽

Factor Rotation ◽

Squared Error ◽

Number Of Factors ◽

Target Rotation

Until now there has been no successful exploration of a priori unknown faceted structure by means of exploratory factor analysis (EFA) of the measured variables (items or tasks). For this reason, we investigate by means of a simulation study how well methods for factor rotation can identify a two-facet orthogonal simple structure. Samples were generated from orthogonal two-facet population factor models with 4 (2 factors per facet) to 12 factors (6 factors per facet) and submitted to factor analysis with subsequent Varimax, Equamax, Parsimax, Factor Parsimony, Tandem I, Tandem II, Infomax, and McCammon’s Minimum Entropy rotation. As a benchmark, orthogonal target rotation of the sample loadings towards the corresponding faceted population loadings was also investigated. The conditions were sample size (n = 400, 1,000), number of factors (q = 4-12), and main loading size (l = .40, .50, .60). Mean congruence coefficients of the sample loading matrices with the corresponding population loading matrices and the root mean squared error between sample loading matrices and corresponding population loading matrices were used as dependent measures. For fewer than six factors, Infomax and McCammon’s Minimum Entropy rotation yielded the highest similarity of sample loading matrices with faceted population loading matrices; for six or more factors, Tandem II rotation did. Analysis of data from 393 participants who performed a test for the Berlin Model of Intelligence Structure revealed that the faceted structure of this model could be found by means of target rotation of task aggregates corresponding to the cross-products of the facets. Moreover, McCammon’s Minimum Entropy rotation resulted in a loading pattern corresponding to the model, although the factor for figural intelligence was only weakly represented. Implications for the identification of faceted models by means of factor rotation are discussed.
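
The two dependent measures can be sketched in a few lines. The example assumes Tucker's congruence coefficient computed column by column, which is the usual meaning of the term but is not confirmed by the abstract, and the loading matrices are invented.

```python
import math


def congruence(col_a, col_b):
    """Tucker's congruence coefficient between two loading columns."""
    num = sum(a * b for a, b in zip(col_a, col_b))
    den = math.sqrt(sum(a * a for a in col_a) * sum(b * b for b in col_b))
    return num / den


def loading_rmse(mat_a, mat_b):
    """Root mean squared difference over all entries of two loading matrices."""
    diffs = [(a - b) ** 2 for ra, rb in zip(mat_a, mat_b) for a, b in zip(ra, rb)]
    return math.sqrt(sum(diffs) / len(diffs))


def columns(matrix):
    """Return the columns of a row-major matrix as tuples."""
    return list(zip(*matrix))


# Made-up loadings: four variables (rows) on two factors (columns).
population = [[0.5, 0.0], [0.5, 0.0], [0.0, 0.5], [0.0, 0.5]]
sample = [[0.48, 0.05], [0.55, -0.03], [0.02, 0.46], [-0.04, 0.52]]

phis = [congruence(s, p) for s, p in zip(columns(sample), columns(population))]
mean_phi = sum(phis) / len(phis)
rmse = loading_rmse(sample, population)
print(f"mean congruence={mean_phi:.4f}, RMSE={rmse:.4f}")
```

A congruence near one and a small RMSE both indicate that the rotated sample loadings reproduce the population pattern; the first measure ignores overall scale while the second does not, which is a reason to report both.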

Use of reflectance spectroscopy to estimate the organic carbon and CaCO3 contents of soils

Agrokémia és Talajtan ◽

10.1556/agrokem.60.2012.2.5 ◽

2012 ◽

Vol 61 (2) ◽

pp. 277-290 ◽

Cited By ~ 1

Author(s):

Ádám Csorba ◽

Vince Láng ◽

László Fenyvesi ◽

Erika Michéli

Keyword(s):

Organic Carbon ◽

Least Squares ◽

Partial Least Squares ◽

Partial Least Squares Regression ◽

Mean Squared Error ◽

Reflectance Spectroscopy ◽

Least Squares Regression ◽

Root Mean Squared Error ◽

Squared Error

There is a growing demand for the development and application of technologies and methods that enable fast, cost-effective, and environmentally friendly soil data collection and evaluation. Reflectance spectroscopy, which is based on reflectance measurements in the visible (VIS) and near-infrared (NIR) range (350–2500 nm) of the electromagnetic spectrum, meets these demands. Because the reflectance spectrum recorded from soils is very rich in information, and many soil constituents have a characteristic spectral "fingerprint" in the range studied, a single curve makes it possible to determine a large number of key soil parameters simultaneously. In this paper we present the first steps of a methodological development, founded on reflectance spectroscopy, that aims to determine the composition of soils. We built and tested predictive models, based on a multivariate mathematical-statistical method (partial least squares regression, PLSR), for estimating the organic carbon and CaCO3 contents of soils. Testing of the models showed that the procedure gave high R² values for both soil parameters [R²(organic carbon) = 0.815; R²(CaCO3) = 0.907]. The root mean squared error (RMSE), which indicates the accuracy of the estimate, can be regarded as moderate for both parameters [RMSE(organic carbon) = 0.467; RMSE(CaCO3) = 3.508] and could be improved considerably by standardizing the reflectance measurement protocols. Based on our investigations, we conclude that the combined use of reflectance spectroscopy and multivariate chemometric procedures can provide a fast and cost-effective method of data collection and evaluation.

Minimax Mean-Squared Error Location Estimation Using TOA Measurements

IEICE Transactions on Communications ◽

10.1587/transcom.e93.b.2223 ◽

2010 ◽

Vol E93-B (8) ◽

pp. 2223-2225 ◽

Cited By ~ 2

Author(s):

Chih-Chang SHEN ◽

Ann-Chen CHANG

Keyword(s):

Mean Squared Error ◽

Location Estimation ◽

Squared Error ◽

Error Location
