Upgrading Model Selection Criteria with Goodness of Fit Tests for Practical Applications

Riccardo Rossi; Andrea Murari; Pasquale Gaudio; Michela Gelfusa

doi:10.3390/e22040447

Upgrading Model Selection Criteria with Goodness of Fit Tests for Practical Applications

Entropy ◽

10.3390/e22040447 ◽

2020 ◽

Vol 22 (4) ◽

pp. 447

Author(s):

Riccardo Rossi ◽

Andrea Murari ◽

Pasquale Gaudio ◽

Michela Gelfusa

Keyword(s):

Model Selection ◽

Goodness Of Fit ◽

Mean Squared Error ◽

Information Criterion ◽

Original Form ◽

Residual Distribution ◽

Practical Applications ◽

Goodness Of Fit Tests ◽

Squared Error ◽

The Mean

The Bayesian information criterion (BIC), the Akaike information criterion (AIC), and some other indicators derived from them are widely used for model selection. In their original form, they contain the likelihood of the data given the models. Unfortunately, in many applications, it is practically impossible to calculate the likelihood, and, therefore, the criteria have been reformulated in terms of descriptive statistics of the residual distribution: the variance and the mean-squared error of the residuals. These alternative versions are strictly valid only in the presence of additive noise of Gaussian distribution, not a completely satisfactory assumption in many applications in science and engineering. Moreover, the variance and the mean-squared error are quite crude statistics of the residual distributions. More sophisticated statistical indicators, capable of better quantifying how close the residual distribution is to the noise, can be profitably used. In particular, specific goodness of fit tests have been included in the expressions of the traditional criteria and have proved to be very effective in improving their discriminating capability. These improved performances have been demonstrated with a systematic series of simulations using synthetic data for various classes of functions and different noise statistics.
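The residual-based versions of the criteria mentioned in the abstract can be sketched as follows. This is a minimal illustration assuming the commonly quoted Gaussian-noise forms AIC = n·ln(MSE) + 2k and BIC = n·ln(σ²) + k·ln(n), where n is the number of data points and k the number of model parameters. The function names and the toy data are hypothetical. The abstract does not give the expression of the goodness of fit upgrade, so it is not reproduced here.

```python
import math

def residual_stats(y_obs, y_pred):
    """Return (mse, variance) of the residuals y_obs - y_pred."""
    res = [o - p for o, p in zip(y_obs, y_pred)]
    n = len(res)
    mean = sum(res) / n
    mse = sum(r * r for r in res) / n
    var = sum((r - mean) ** 2 for r in res) / n  # population variance
    return mse, var

def aic_from_mse(mse, n, k):
    # Assumed form: AIC = n * ln(MSE) + 2k (additive Gaussian noise)
    return n * math.log(mse) + 2 * k

def bic_from_variance(var, n, k):
    # Assumed form: BIC = n * ln(sigma^2) + k * ln(n)
    return n * math.log(var) + k * math.log(n)

# Toy data (hypothetical): a straight line plus a small fixed perturbation.
x = list(range(10))
noise = [0.1, -0.2, 0.15, -0.05, 0.0, 0.2, -0.1, 0.05, -0.15, 0.1]
y = [2 * xi + 1 + e for xi, e in zip(x, noise)]

# Candidate 1: the line 2x + 1 (k = 2). Candidate 2: a constant (k = 1).
pred_line = [2 * xi + 1 for xi in x]
pred_const = [sum(y) / len(y)] * len(y)

for name, pred, k in [("line", pred_line, 2), ("constant", pred_const, 1)]:
    mse, var = residual_stats(y, pred)
    print(name, aic_from_mse(mse, len(y), k), bic_from_variance(var, len(y), k))
```

The model with the lower criterion value is preferred. In this toy case the line wins on both criteria despite its extra parameter.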

On the Use of Entropy to Improve Model Selection Criteria

Entropy ◽

10.3390/e21040394 ◽

2019 ◽

Vol 21 (4) ◽

pp. 394 ◽

Cited By ~ 6

Author(s):

Andrea Murari ◽

Emmanuele Peluso ◽

Francesco Cianfrani ◽

Pasquale Gaudio ◽

Michele Lungaroni

Keyword(s):

Model Selection ◽

Shannon Entropy ◽

Selection Criteria ◽

Mean Squared Error ◽

Geodesic Distance ◽

Information Criterion ◽

Residual Distribution ◽

Model Selection Criteria ◽

Synthetic Indicators ◽

Squared Error

The most widely used forms of model selection criteria, the Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC), are expressed in terms of synthetic indicators of the residual distribution: the variance and the mean-squared error of the residuals, respectively. In many applications in science, the noise affecting the data can be expected to have a Gaussian distribution. Therefore, at the same level of variance and mean-squared error, models whose residuals are more uniformly distributed should be favoured. The degree of uniformity of the residuals can be quantified by the Shannon entropy. Including the Shannon entropy in the BIC and AIC expressions significantly improves these criteria. The better performance has been demonstrated empirically with a series of simulations for various classes of functions and for different levels and statistics of the noise. In the presence of outliers, a better treatment of the errors, using the Geodesic Distance, has proved essential.
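As an illustration of how the uniformity of residuals might be quantified, the sketch below estimates the Shannon entropy H = −Σ pᵢ ln pᵢ from a histogram of the residuals. The binning scheme (equal-width bins over the residual range) and the function name are assumptions made for this example. The abstract does not state how the entropy enters the BIC and AIC expressions, so only the entropy itself is computed.

```python
import math

def shannon_entropy(residuals, n_bins=10):
    """Histogram estimate of the Shannon entropy (in nats) of the residuals."""
    lo, hi = min(residuals), max(residuals)
    if hi == lo:
        return 0.0  # all residuals identical: a single occupied bin
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for r in residuals:
        # Clamp so that the maximum value falls in the last bin.
        idx = min(int((r - lo) / width), n_bins - 1)
        counts[idx] += 1
    n = len(residuals)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

# Evenly spread residuals occupy every bin equally (maximum entropy).
spread = [i + 0.5 for i in range(10)]
# Residuals piled into one bin, plus a single distant value, give low entropy.
peaked = [0.0] * 9 + [1.0]

print(shannon_entropy(spread), shannon_entropy(peaked))
```

When comparing candidate models this way, using the same bin edges for all of them would be a sensible precaution, since the estimate depends on the binning.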

Modelling the growth of rearing cattle

Czech Journal of Animal Science ◽

10.17221/98/2021-cjas ◽

2021 ◽

Author(s):

Hanna Unterauer ◽

Norbert Brunner ◽

Manfred Kühleitner

Keyword(s):

Goodness Of Fit ◽

Linear Growth ◽

Mean Squared Error ◽

Least Squares Method ◽

Information Criterion ◽

Squared Error ◽

Akaike Criterion ◽

Linear Growth Model ◽

Bp Model ◽

Scientific Growth

Scientific growth literature often uses the models of Brody, Gompertz, Verhulst, and von Bertalanffy. The versatile five-parameter Bertalanffy-Pütter (BP) model generalizes them. Using the least-squares method, we fitted the BP model to mass-at-age data of 161 calves, cows, bulls, and oxen of cattle breeds that are common in Austria and Southern Germany. We used three measures to assess the goodness of fit: R-squared, normalized root-mean-squared error, and the Akaike information criterion together with a correction for sample size. Although the BP model considerably improved on the fit of the linear growth model in terms of R-squared, the better fit did not, in general, justify the use of its additional parameters, because most of the data had a non-sigmoidal character. In terms of the Akaike criterion, we could identify only a small core of data (15%) where sigmoidal models were indispensable.
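The three goodness of fit measures named in the abstract can be sketched as below. The exact conventions used by the authors are not given in the abstract, so several choices here are assumptions: the RMSE is normalized by the mean of the observations, the least-squares AIC is taken as n·ln(SSE/n) + 2k, and the small-sample correction is the usual term 2k(k+1)/(n−k−1). The function names and the data are hypothetical.

```python
import math

def sum_sq_err(y, y_hat):
    return sum((a - b) ** 2 for a, b in zip(y, y_hat))

def r_squared(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - sum_sq_err(y, y_hat) / ss_tot

def nrmse(y, y_hat):
    # Assumption: normalize the RMSE by the mean of the observations.
    rmse = math.sqrt(sum_sq_err(y, y_hat) / len(y))
    return rmse / (sum(y) / len(y))

def aicc(y, y_hat, k):
    # Assumed least-squares AIC plus the usual small-sample correction.
    n = len(y)
    aic = n * math.log(sum_sq_err(y, y_hat) / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)

# Hypothetical observations and fitted values from a two-parameter model.
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_fit = [1.1, 1.9, 3.0, 4.1, 4.9]

print(r_squared(y_obs, y_fit), nrmse(y_obs, y_fit), aicc(y_obs, y_fit, 2))
```

The correction term grows quickly as k approaches n − 1, which is how extra parameters can be penalized more heavily on short data series even when R-squared improves.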

New Versions of Liu-type Estimator in Weighted and non-weighted Mixed Regression Model

Baghdad Science Journal ◽

10.21123/bsj.2020.17.1(suppl.).0361 ◽

2020 ◽

Vol 17 (1(Suppl.)) ◽

pp. 0361

Author(s):

Mustafa Ismaeel Naif Alheety

Keyword(s):

Regression Model ◽

Simulation Study ◽

Goodness Of Fit ◽

Prior Information ◽

Mean Squared Error ◽

Numerical Example ◽

Squared Error ◽

Mixed Regression ◽

The Mean ◽

Stochastic Restrictions

This paper proposes new estimators that depend on both the sample and prior information, covering the case in which the two are equally important in the model and the case in which they are not. The prior information is described as linear stochastic restrictions. We study the properties and performance of these estimators compared with other common estimators, using the mean squared error as the criterion for goodness of fit. A numerical example and a simulation study are presented to illustrate the performance of the estimators.

Model selection criterion based on the prediction mean squared error in generalized estimating equations

Hiroshima Mathematical Journal ◽

10.32917/hmj/1544238030 ◽

2018 ◽

Vol 48 (3) ◽

pp. 307-334

Author(s):

Yu Inatsu ◽

Shinpei Imori

Keyword(s):

Model Selection ◽

Generalized Estimating Equations ◽

Mean Squared Error ◽

Selection Criterion ◽

Estimating Equations ◽

Model Selection Criterion ◽

Squared Error ◽

Generalized Estimating

The Mean Squared Error of the Instrumental Variables Estimator When the Disturbance Has an Elliptical Distribution

Econometric Reviews ◽

10.1080/07474930500545488 ◽

2006 ◽

Vol 25 (1) ◽

pp. 117-138 ◽

Cited By ~ 1

Author(s):

Fernanda P. M. Peixe ◽

Alastair R. Hall ◽

Kostas Kyriakoulis

Keyword(s):

Instrumental Variables ◽

Mean Squared Error ◽

Elliptical Distribution ◽

Squared Error ◽

The Mean

Computing trade-offs in robust design: Perspectives of the mean squared error

Computers & Industrial Engineering ◽

10.1016/j.cie.2010.11.006 ◽

2011 ◽

Vol 60 (2) ◽

pp. 248-255 ◽

Cited By ~ 30

Author(s):

Sangmun Shin ◽

Funda Samanlioglu ◽

Byung Rae Cho ◽

Margaret M. Wiecek

Keyword(s):

Robust Design ◽

Mean Squared Error ◽

Squared Error ◽

Trade Offs ◽

The Mean

The Optimal Selection for Restricted Linear Models with Average Estimator

Abstract and Applied Analysis ◽

10.1155/2014/692472 ◽

2014 ◽

Vol 2014 ◽

pp. 1-13

Author(s):

Qichang Xie ◽

Meng Du

Keyword(s):

Model Selection ◽

Linear Models ◽

Weighted Average ◽

Selection Procedure ◽

Information Criterion ◽

Optimal Weights ◽

Model Average ◽

Squared Error ◽

Generalized Information Criterion ◽

Risk Investment

The essential task of risk investment is to select an optimal tracking portfolio among various portfolios. Statistically, this process can be achieved by choosing an optimal restricted linear model. This paper develops a statistical procedure to do this, based on selecting appropriate weights for averaging approximately restricted models. The method of weighted average least squares is adopted to estimate the approximately restricted models under a dependent error setting. The optimal weights are selected by minimizing a k-class generalized information criterion (k-GIC), which is an estimate of the average squared error from the model average fit. This model selection procedure is shown to be asymptotically optimal in the sense of obtaining the lowest possible average squared error. Monte Carlo simulations illustrate that the suggested method has comparable efficiency to some alternative model selection techniques.

Day-Ahead Forecasting of Hourly Photovoltaic Power Based on Robust Multilayer Perception

Sustainability ◽

10.3390/su10124863 ◽

2018 ◽

Vol 10 (12) ◽

pp. 4863 ◽

Cited By ~ 6

Author(s):

Chao Huang ◽

Longpeng Cao ◽

Nanxin Peng ◽

Sijia Li ◽

Jing Zhang ◽

...

Keyword(s):

Power Plants ◽

Mean Squared Error ◽

Absolute Error ◽

Multilayer Perception ◽

Squared Error ◽

The Mean ◽

Effectiveness And Efficiency ◽

Mlp Network ◽

Grid Operation ◽

Better Than

Photovoltaic (PV) modules convert renewable and sustainable solar energy into electricity. However, the uncertainty of PV power production brings challenges for grid operation. To facilitate the management and scheduling of PV power plants, forecasting is an essential technique. In this paper, a robust multilayer perception (MLP) neural network was developed for day-ahead forecasting of hourly PV power. A generic MLP is usually trained by minimizing the mean squared loss. The mean squared error is sensitive to a few particularly large errors, which can lead to a poor estimator. To tackle this problem, the pseudo-Huber loss function, which combines the best properties of squared loss and absolute loss, was adopted in this paper. The effectiveness and efficiency of the proposed method were verified by benchmarking against a generic MLP network with real PV data. Numerical experiments illustrated that the proposed method performed better than the generic MLP network in terms of root mean squared error (RMSE) and mean absolute error (MAE).
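To make the contrast between the squared loss and the pseudo-Huber loss concrete, the sketch below uses the standard definition L_δ(r) = δ²(√(1 + (r/δ)²) − 1). The value of δ and the function names are assumptions made for illustration. The abstract does not report the setting used in the paper.

```python
import math

def pseudo_huber(residual, delta=1.0):
    """Pseudo-Huber loss: roughly r^2/2 near zero, roughly linear for large |r|."""
    return delta ** 2 * (math.sqrt(1.0 + (residual / delta) ** 2) - 1.0)

def half_squared(residual):
    """Half squared loss, scaled to match the pseudo-Huber loss near zero."""
    return 0.5 * residual ** 2

# Small residuals are penalized almost identically by both losses,
# while a large residual (an outlier) dominates only the squared loss.
for r in (0.01, 1.0, 100.0):
    print(r, pseudo_huber(r), half_squared(r))
```

Because the loss grows only linearly for large residuals, a few outlying hours contribute far less to the training objective than they would under the squared loss.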

On double stage minimax-shrinkage estimator for generalized Rayleigh model

International Journal of Applied Mathematical Research ◽

10.14419/ijamr.v5i1.5553 ◽

2016 ◽

Vol 5 (1) ◽

pp. 39 ◽

Cited By ~ 1

Author(s):

Abbas Najim Salman ◽

Maymona Ameen

Keyword(s):

Sample Size ◽

Shape Parameter ◽

Mean Squared Error ◽

Scale Parameter ◽

Rayleigh Distribution ◽

Shrinkage Estimator ◽

Squared Error ◽

Expected Sample Size ◽

Generalized Rayleigh Distribution ◽

The Mean

This paper is concerned with a minimax shrinkage estimator that uses a double stage shrinkage technique to lower the mean squared error. It is intended to estimate the shape parameter (α) of the Generalized Rayleigh distribution in a region (R) around available prior knowledge (α0) about the actual value (α), used as an initial estimate, in the case when the scale parameter (λ) is known. In situations where the experiments are time consuming or very costly, a double stage procedure can be used to reduce the expected sample size needed to obtain the estimator. The proposed estimator is shown to have a smaller mean squared error for certain choices of the shrinkage weight factor ψ(·) and a suitable region R. Expressions for the bias, mean squared error (MSE), expected sample size [E(n/α, R)], expected sample size proportion [E(n/α, R)/n], probability of avoiding the second sample, and percentage of overall sample saved for the proposed estimator are derived. Numerical results and conclusions for these expressions are presented for the case where the considered estimators are testimators of level of significance Δ. Comparisons with the minimax estimator and with the most recent studies were made to show the effectiveness of the proposed estimator.

Performance Analysis of AOA-Based Localization Using the LS Approach: Explicit Expression of Mean-Squared Error

Journal of Sensors ◽

10.1155/2020/9346142 ◽

2020 ◽

Vol 2020 ◽

pp. 1-22

Author(s):

Byung-Kwon Son ◽

Do-Jin An ◽

Joon-Ho Lee

Keyword(s):

Explicit Expression ◽

Mean Squared Error ◽

Weighted Least Squares ◽

Localization Algorithm ◽

Angle Of Arrival ◽

Squared Error ◽

Distance Weighted ◽

The Mean ◽

Passive Localization ◽

Location Estimate

In this paper, passive localization of an emitter using noisy angle-of-arrival (AOA) measurements is considered, based on the Brown DWLS (Distance Weighted Least Squares) algorithm. The accuracy of AOA-based localization is quantified by the mean-squared error. Various estimates of the AOA-localization algorithm have been derived (Doğançay and Hmam, 2008). The explicit expression of the location estimate from that study is used to obtain an analytic expression of the mean-squared error (MSE) of one of these estimates. To validate the derived expression, we compare the MSE from Monte Carlo simulation with the analytically derived MSE.
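The Monte Carlo side of such a validation can be sketched as follows. Note the substitution: this example uses a plain, unweighted linear least-squares bearing intersection, not the Brown DWLS algorithm of the paper, and it does not reproduce the analytic MSE expression. The sensor layout, emitter position, noise level, and all function names are hypothetical.

```python
import math
import random

def ls_aoa_estimate(sensors, bearings):
    """Unweighted LS intersection of bearing lines (not the Brown DWLS method).

    Each bearing th from sensor (xi, yi) defines the line
    sin(th) * x - cos(th) * y = sin(th) * xi - cos(th) * yi.
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for (xi, yi), th in zip(sensors, bearings):
        a1, a2 = math.sin(th), -math.cos(th)
        b = a1 * xi + a2 * yi
        s11 += a1 * a1
        s12 += a1 * a2
        s22 += a2 * a2
        t1 += a1 * b
        t2 += a2 * b
    det = s11 * s22 - s12 * s12  # zero only if all bearing lines are parallel
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det

def monte_carlo_mse(sensors, emitter, sigma, n_trials=2000, seed=0):
    """Empirical MSE of the position estimate under Gaussian bearing noise."""
    rng = random.Random(seed)
    true_bearings = [math.atan2(emitter[1] - yi, emitter[0] - xi)
                     for xi, yi in sensors]
    total = 0.0
    for _ in range(n_trials):
        noisy = [th + rng.gauss(0.0, sigma) for th in true_bearings]
        x, y = ls_aoa_estimate(sensors, noisy)
        total += (x - emitter[0]) ** 2 + (y - emitter[1]) ** 2
    return total / n_trials

sensors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
emitter = (4.0, 6.0)
print(monte_carlo_mse(sensors, emitter, sigma=0.01))  # sigma in radians
```

An analytic MSE expression would be compared against the value returned by `monte_carlo_mse` over a range of noise levels.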
