Upgrading Model Selection Criteria with Goodness of Fit Tests for Practical Applications

Entropy ◽  
2020 ◽  
Vol 22 (4) ◽  
pp. 447
Author(s):  
Riccardo Rossi ◽  
Andrea Murari ◽  
Pasquale Gaudio ◽  
Michela Gelfusa

The Bayesian information criterion (BIC), the Akaike information criterion (AIC), and some other indicators derived from them are widely used for model selection. In their original form, they contain the likelihood of the data given the models. Unfortunately, in many applications it is practically impossible to calculate the likelihood, and the criteria have therefore been reformulated in terms of descriptive statistics of the residual distribution: the variance and the mean-squared error of the residuals. These alternative versions are strictly valid only in the presence of additive Gaussian noise, which is not a fully satisfactory assumption in many applications in science and engineering. Moreover, the variance and the mean-squared error are quite crude statistics of the residual distribution. More sophisticated statistical indicators, capable of better quantifying how close the residual distribution is to the noise distribution, can be profitably used. In particular, specific goodness of fit tests have been included in the expressions of the traditional criteria and have proved very effective in improving their discriminating capability. This improved performance has been demonstrated with a systematic series of simulations using synthetic data for various classes of functions and different noise statistics.
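The residual-based forms mentioned above can be sketched as follows. This is a minimal illustration, assuming the commonly used expressions AIC = n ln(MSE) + 2k and BIC = n ln(σ²) + k ln(n), where n is the number of data points, k the number of model parameters, and σ² the variance of the residuals. The function names are hypothetical, and the goodness-of-fit corrections proposed in the paper are not reproduced here because the abstract does not state them.

```python
import math


def aic_from_residuals(residuals, k):
    """AIC = n * ln(MSE) + 2k, with MSE the mean of the squared residuals.

    Assumes additive Gaussian noise and a non-zero MSE.
    """
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n
    return n * math.log(mse) + 2 * k


def bic_from_residuals(residuals, k):
    """BIC = n * ln(var) + k * ln(n), with var the residual variance.

    Assumes additive Gaussian noise and a non-zero variance.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    var = sum((r - mean) ** 2 for r in residuals) / n
    return n * math.log(var) + k * math.log(n)


if __name__ == "__main__":
    # Hypothetical residuals from two candidate models fitted to the same data.
    simple_model = [0.9, -1.1, 1.0, -0.8, 1.2, -1.0]    # k = 2 parameters
    complex_model = [0.8, -1.0, 0.9, -0.8, 1.1, -0.9]   # k = 5 parameters
    for name, res, k in [("simple", simple_model, 2), ("complex", complex_model, 5)]:
        print(name, aic_from_residuals(res, k), bic_from_residuals(res, k))
```

For both criteria the model with the lower value is preferred. The penalty terms 2k and k ln(n) are what discourage the extra parameters of the more complex model.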

Entropy ◽  
2019 ◽  
Vol 21 (4) ◽  
pp. 394 ◽  
Author(s):  
Andrea Murari ◽  
Emmanuele Peluso ◽  
Francesco Cianfrani ◽  
Pasquale Gaudio ◽  
Michele Lungaroni

The most widely used forms of model selection criteria, the Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC), are expressed in terms of synthetic indicators of the residual distribution: the variance and the mean-squared error of the residuals, respectively. In many applications in science, the noise affecting the data can be expected to have a Gaussian distribution. Therefore, at the same level of variance and mean-squared error, models whose residuals are more uniformly distributed should be favoured. The degree of uniformity of the residuals can be quantified by the Shannon entropy. Including the Shannon entropy in the BIC and AIC expressions significantly improves these criteria. The better performance has been demonstrated empirically with a series of simulations for various classes of functions and for different levels and statistics of the noise. In the presence of outliers, a better treatment of the errors, using the Geodesic Distance, has proved essential.
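As an illustration of how the uniformity of the residuals might be quantified, the sketch below estimates the Shannon entropy of a residual sample from a histogram. The binning scheme, the default number of bins, and the function name are assumptions made for this example. The abstract does not say how the entropy enters the BIC and AIC expressions, so only the entropy estimate itself is shown.

```python
import math


def shannon_entropy(residuals, bins=10):
    """Histogram estimate of the Shannon entropy (in nats) of the residuals.

    Equal-width bins spanning [min, max] are an assumption of this sketch.
    """
    lo, hi = min(residuals), max(residuals)
    if hi == lo:
        return 0.0  # all residuals identical: a single occupied bin
    width = (hi - lo) / bins
    counts = [0] * bins
    for r in residuals:
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((r - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(residuals)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


if __name__ == "__main__":
    evenly_spread = [0.0, 1.0, 2.0, 3.0]
    concentrated = [0.0, 0.0, 0.0, 3.0]
    print(shannon_entropy(evenly_spread, bins=4))
    print(shannon_entropy(concentrated, bins=4))
```

With this estimator, residuals spread evenly over the bins reach the maximum entropy ln(bins), while residuals concentrated in a few bins give a lower value.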


Author(s):  
Hanna Unterauer ◽  
Norbert Brunner ◽  
Manfred Kühleitner

Scientific growth literature often uses the models of Brody, Gompertz, Verhulst, and von Bertalanffy. The versatile five-parameter Bertalanffy-Pütter (BP) model generalizes them. Using the least-squares method, we fitted the BP model to mass-at-age data of 161 calves, cows, bulls, and oxen of cattle breeds that are common in Austria and Southern Germany. We used three measures to assess the goodness of fit: R-squared, normalized root-mean-squared error, and the Akaike information criterion together with a correction for sample size. Although the BP model improved considerably on the fit of the linear growth model in terms of R-squared, the better fit did not, in general, justify the use of its additional parameters, because most of the data had a non-sigmoidal character. In terms of the Akaike criterion, we could identify only a small core of data (15%) where sigmoidal models were indispensable.
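The three goodness-of-fit measures named above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the RMSE is normalized by the range of the observations (other normalizations exist), the least-squares form AIC = n ln(SSE/n) + 2k is used, and the small-sample correction is the standard AICc term 2k(k+1)/(n-k-1). All function names are hypothetical.

```python
import math


def _sse(observed, predicted):
    """Sum of squared errors between observations and model predictions."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - _sse(observed, predicted) / ss_tot


def nrmse(observed, predicted):
    """RMSE divided by the range of the observations (an assumed normalization)."""
    rmse = math.sqrt(_sse(observed, predicted) / len(observed))
    return rmse / (max(observed) - min(observed))


def aicc(observed, predicted, k):
    """Least-squares AIC with the small-sample correction; requires n > k + 1."""
    n = len(observed)
    aic = n * math.log(_sse(observed, predicted) / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)


if __name__ == "__main__":
    # Hypothetical mass-at-age observations and fitted values.
    observed = [1.0, 2.0, 3.0, 4.0, 5.0]
    predicted = [1.1, 1.9, 3.1, 3.9, 5.1]
    print(r_squared(observed, predicted))
    print(nrmse(observed, predicted))
    print(aicc(observed, predicted, k=2))
```

The correction term grows quickly as k approaches n, which is why a model with many parameters can achieve a higher R-squared and still be rejected by the corrected Akaike criterion.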


2020 ◽  
Vol 17 (1(Suppl.)) ◽  
pp. 0361
Author(s):  
Mustafa Ismaeel Naif Alheety

This paper proposes new estimators that depend on both the sample and prior information, covering the cases where the two are equally important in the model and where they are not. The prior information is described as linear stochastic restrictions. We study the properties and the performance of these estimators compared to other common estimators, using the mean squared error as a criterion for the goodness of fit. A numerical example and a simulation study are presented to illustrate the performance of the estimators.


2011 ◽  
Vol 60 (2) ◽  
pp. 248-255 ◽  
Author(s):  
Sangmun Shin ◽  
Funda Samanlioglu ◽  
Byung Rae Cho ◽  
Margaret M. Wiecek

2014 ◽  
Vol 2014 ◽  
pp. 1-13
Author(s):  
Qichang Xie ◽  
Meng Du

The essential task of risk investment is to select an optimal tracking portfolio among various portfolios. Statistically, this process can be achieved by choosing an optimal restricted linear model. This paper develops a statistical procedure to do this, based on selecting appropriate weights for averaging approximately restricted models. The method of weighted average least squares is adopted to estimate the approximately restricted models under a dependent error setting. The optimal weights are selected by minimizing a k-class generalized information criterion (k-GIC), which is an estimate of the average squared error from the model average fit. This model selection procedure is shown to be asymptotically optimal in the sense of obtaining the lowest possible average squared error. Monte Carlo simulations illustrate that the suggested method has comparable efficiency to some alternative model selection techniques.


2018 ◽  
Vol 10 (12) ◽  
pp. 4863 ◽  
Author(s):  
Chao Huang ◽  
Longpeng Cao ◽  
Nanxin Peng ◽  
Sijia Li ◽  
Jing Zhang ◽  
...  

Photovoltaic (PV) modules convert renewable and sustainable solar energy into electricity. However, the uncertainty of PV power production brings challenges for grid operation. To facilitate the management and scheduling of PV power plants, forecasting is an essential technique. In this paper, a robust multilayer perceptron (MLP) neural network was developed for day-ahead forecasting of hourly PV power. A generic MLP is usually trained by minimizing the mean squared loss. The mean squared error is sensitive to a few particularly large errors, which can lead to a poor estimator. To tackle this problem, the pseudo-Huber loss function, which combines the best properties of squared loss and absolute loss, was adopted in this paper. The effectiveness and efficiency of the proposed method were verified by benchmarking against a generic MLP network with real PV data. Numerical experiments illustrated that the proposed method performed better than the generic MLP network in terms of root mean squared error (RMSE) and mean absolute error (MAE).
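The behaviour described above can be illustrated with the standard pseudo-Huber loss, L(e) = δ²(√(1 + (e/δ)²) − 1), which is approximately e²/2 for small errors and approximately δ(|e| − δ) for large ones. The sketch below is a minimal illustration. The function names and the default δ = 1 are assumptions, and the abstract does not state the value of δ or the training setup used in the paper.

```python
import math


def pseudo_huber(error, delta=1.0):
    """Pseudo-Huber loss: quadratic near zero, close to linear for large |error|."""
    return delta ** 2 * (math.sqrt(1.0 + (error / delta) ** 2) - 1.0)


def mean_pseudo_huber(y_true, y_pred, delta=1.0):
    """Average pseudo-Huber loss over paired targets and predictions."""
    losses = [pseudo_huber(t - p, delta) for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


def mean_squared_error(y_true, y_pred):
    """Plain mean squared error, shown for comparison."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical hourly PV power values with one large forecast error.
    y_true = [0.0, 1.2, 3.4, 5.0, 4.1]
    y_pred = [0.1, 1.0, 3.5, 15.0, 4.0]
    print(mean_squared_error(y_true, y_pred))
    print(mean_pseudo_huber(y_true, y_pred))
```

In this example the single large error dominates the mean squared error, while its contribution to the pseudo-Huber loss grows only linearly, which is the robustness property the abstract refers to.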


2016 ◽  
Vol 5 (1) ◽  
pp. 39 ◽  
Author(s):  
Abbas Najim Salman ◽  
Maymona Ameen

This paper is concerned with a minimax shrinkage estimator that uses a double-stage shrinkage technique to lower the mean squared error. It is intended to estimate the shape parameter (α) of the Generalized Rayleigh distribution in a region (R) around available prior knowledge (α₀) about the actual value (α), taken as an initial estimate, when the scale parameter (λ) is known. In situations where experiments are time-consuming or very costly, a double-stage procedure can be used to reduce the expected sample size needed to obtain the estimator. The proposed estimator is shown to have a smaller mean squared error for certain choices of the shrinkage weight factor ψ(·) and a suitable region R. Expressions are derived for the bias, mean squared error (MSE), expected sample size [E(n/α, R)], expected sample size proportion [E(n/α, R)/n], probability of avoiding the second sample, and percentage of overall sample saved for the proposed estimator. Numerical results and conclusions for these expressions are presented for the case where the considered estimator is a testimator of level of significance Δ. Comparisons with the minimax estimator and with the most recent studies were made to show the effectiveness of the proposed estimator.


2020 ◽  
Vol 2020 ◽  
pp. 1-22
Author(s):  
Byung-Kwon Son ◽  
Do-Jin An ◽  
Joon-Ho Lee

In this paper, we consider passive localization of an emitter from noisy angle-of-arrival (AOA) measurements using the Brown DWLS (Distance Weighted Least Squares) algorithm. The accuracy of AOA-based localization is quantified by the mean-squared error (MSE). Various estimates of the AOA-localization algorithm have been derived (Doğançay and Hmam, 2008). The explicit expression for the location estimate from that study is used to derive an analytic expression for the MSE of one of these estimates. To validate the derived expression, we compare the MSE from Monte Carlo simulation with the analytically derived MSE.

