A scalable estimate of the out‐of‐sample prediction error via approximate leave‐one‐out cross‐validation

Author(s):  
Kamiar Rahnama Rad ◽  
Arian Maleki
2018 ◽  
Author(s):  
Quentin Frederik Gronau ◽  
Eric-Jan Wagenmakers

Cross-validation (CV) is increasingly popular as a generic method to adjudicate between mathematical models of cognition and behavior. In order to measure model generalizability, CV quantifies out-of-sample predictive performance, and the CV preference goes to the model that predicted the out-of-sample data best. The advantages of CV include theoretical simplicity and practical feasibility. Despite its prominence, however, the limitations of CV are often underappreciated. Here we demonstrate the limitations of a particular form of CV, Bayesian leave-one-out cross-validation (LOO), with three concrete examples. In each example, a data set of infinite size is perfectly in line with the predictions of a simple model (i.e., a general law or invariance). Nevertheless, LOO shows bounded and relatively modest support for the simple model. We conclude that CV is not a panacea for model selection.


2019 ◽  
Vol 56 (4) ◽  
pp. 514-528 ◽  
Author(s):  
Stijn van Weezel

This study exploits a sudden and abrupt decline in precipitation of the long rains season in the Horn of Africa to analyze the possible link between climate change and violent armed conflict. Following the 1998 El Niño there has been an overall reduction in precipitation levels – associated with sea-surface temperature changes in the Indian and Pacific Oceans – resulting in an increase in the number and severity of droughts. Given that the probable cause of this shift is anthropogenic forcing, it provides a unique opportunity to study the effect of climate change on society compared to statistical inference based on weather variation. Focusing on communal conflict in Ethiopia and Kenya between 1999 and 2014, exploiting cross-sectional variation across districts, the regression analysis links the precipitation decline to an additional 1.3 conflict events per district. The main estimates show that there is a negative correlation between precipitation and communal conflict with a probability of 0.90. Changing model specification to consider plausible alternative models and accommodate other identifying assumptions produces broadly similar results. The generalizability of the link between precipitation decline and conflict breaks down when using out-of-sample cross-validation to test the external validity. A leave-one-out cross-validation exercise shows that accounting for climate contributes relatively little to improving the predictive performance of the model. This suggests that there are other more salient factors underlying communal violence in Ethiopia and Kenya. As such, in this case the link between climate and conflict should not be overstated.


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
Krzysztof Wiktorowicz ◽  
Krzysztof Przednowek ◽  
Lesław Lassota ◽  
Tomasz Krzeszowski

This paper presents the use of linear and nonlinear multivariable models as tools to support the training process of race walkers. These models are calculated using data collected from race walkers’ training events, and they are used to predict the result of a 3 km race based on training loads. The material consists of 122 training plans for 21 athletes. The leave-one-out cross-validation method is used to choose the best model. The main contribution of the paper is to propose nonlinear modifications of linear models in order to achieve a smaller prediction error. It is shown that the best model is a modified LASSO regression with quadratic terms in the nonlinear part. This model has the smallest prediction error and a simplified structure obtained by eliminating some of the predictors.


2019 ◽  
Vol 76 (7) ◽  
pp. 2349-2361
Author(s):  
Benjamin Misiuk ◽  
Trevor Bell ◽  
Alec Aitken ◽  
Craig J Brown ◽  
Evan N Edinger

Species distribution models are commonly used in the marine environment as management tools. The high cost of collecting marine data for modelling makes them finite, especially in remote locations. Underwater image datasets from multiple surveys were leveraged to model the presence–absence and abundance of Arctic soft-shell clam (Mya spp.) to support the management of a local small-scale fishery in Qikiqtarjuaq, Nunavut, Canada. These models were combined to predict Mya abundance, conditional on presence throughout the study area. Results suggested that water depth was the primary environmental factor limiting Mya habitat suitability, yet seabed topography and substrate characteristics influence their abundance within suitable habitat. Ten-fold cross-validation and spatial leave-one-out cross-validation (LOO CV) were used to assess the accuracy of combined predictions and to test whether this was inflated by the spatial autocorrelation of transect sample data. Results demonstrated that four different measures of predictive accuracy were substantially inflated due to spatial autocorrelation, and the spatial LOO CV results were therefore adopted as the best estimates of performance.


2014 ◽  
Vol 79 (8) ◽  
pp. 965-975 ◽  
Author(s):  
Long Jiao ◽  
Xiaofei Wang ◽  
LI. Hua ◽  
Yunxia Wang

The quantitative structure-property relationship (QSPR) for the gas/particle partition coefficient, Kp, of polychlorinated biphenyls (PCBs) was investigated. The molecular distance-edge vector (MDEV) index was used as the structural descriptor of PCBs. The quantitative relationship between the MDEV index and log Kp was modeled by multivariate linear regression (MLR) and by an artificial neural network (ANN). Leave-one-out cross-validation and external validation were carried out to assess the prediction ability of the developed models. When the MLR method is used, the root mean square relative error (RMSRE) of prediction for leave-one-out cross-validation and external validation is 4.72 and 8.62, respectively. When the ANN method is employed, the prediction RMSRE for leave-one-out cross-validation and external validation is 3.87 and 7.47, respectively. It is demonstrated that the developed models are practicable for predicting the Kp of PCBs. The MDEV index is shown to be quantitatively related to the Kp of PCBs.
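To make the two ingredients of this evaluation concrete, the following is a minimal sketch of leave-one-out cross-validation scored with a root mean square relative error. It is not the authors' code: the single-predictor least-squares model stands in for their MLR on MDEV descriptors, the data are made up, and the RMSRE definition (relative errors squared, averaged, square-rooted, and expressed in percent) is an assumption, since the abstract does not state the formula or the units.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def rmsre(y_true, y_pred):
    """Root mean square relative error in percent (assumed definition)."""
    rel = [((p - t) / t) ** 2 for t, p in zip(y_true, y_pred)]
    return 100.0 * math.sqrt(sum(rel) / len(rel))

def loo_predictions(xs, ys):
    """Predict each point from a model fitted to all the other points."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        preds.append(intercept + slope * xs[i])
    return preds

# Made-up illustrative data, not taken from the paper.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
loo_rmsre = rmsre(ys, loo_predictions(xs, ys))
print(f"LOO RMSRE: {loo_rmsre:.2f}%")
```

External validation would apply the same `rmsre` function to predictions for compounds that were never used in fitting, which is why it can give a larger error than the leave-one-out figure.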


2016 ◽  
Vol 2016 ◽  
pp. 1-7 ◽  
Author(s):  
Hong-Jhang Chen ◽  
Yii-Jeng Lin ◽  
Pei-Chen Wu ◽  
Wei-Hsiang Hsu ◽  
Wan-Chung Hu ◽  
...  

Traditional Chinese medicine (TCM) formulates treatment according to body constitution (BC) differentiation. Different constitutions have specific metabolic characteristics and different susceptibility to certain diseases. This study aimed to assess the Yang-Xu constitution using a body constitution questionnaire (BCQ) and clinical blood variables. A BCQ was employed to assess the clinical manifestation of Yang-Xu. A logistic regression model was fitted to explore the relationship between BC scores and biomarkers. Leave-one-out cross-validation (LOOCV) and K-fold cross-validation were performed to evaluate the accuracy of the predictive model in practice. Decision trees (DTs) were built to determine the possible relationships between blood biomarkers and BC scores. According to the BCQ analysis, 49% of participants without any BC were classified as healthy subjects. Among them, 130 samples were selected for further analysis and divided into two groups. One group comprised healthy subjects without any BC (68%), while subjects of the other group, named the sub-healthy group, had three BCs (32%). Six biomarkers, CRE, TSH, HB, MONO, RBC, and LH, were found to have the greatest impact on BCQ outcomes in Yang-Xu subjects. This study indicated significant biochemical differences in Yang-Xu subjects, which may provide a connection between blood variables and the Yang-Xu BC.


Author(s):  
Renzhe Xu ◽  
Yudong Chen ◽  
Tenglong Xiao ◽  
Jingli Wang ◽  
Xiong Wang

As an important tool to measure the current situation of the whole stock market, the stock index has always been the focus of researchers, especially for its prediction. This paper uses trend types, which are obtained by clustering price series at multiple time scales, combined with the day-of-the-week effect to construct a categorical feature combination. Based on the historical data of six kinds of Chinese stock indexes, the CatBoost model is used for training and predicting. Experimental results show that the out-of-sample prediction accuracy is 0.55, and the long–short trading strategy can obtain an average annualized return of 34.43%, which is a great improvement compared with other classical classification algorithms. Under rolling back-testing, the model can always obtain stable returns in each period of time from 2012 to 2020. Among them, the SSESC’s long–short strategy has the best performance, with an annualized return of 40.85% and a Sharpe ratio of 1.53. Therefore, the trend information in multiple time-scale features based on feature engineering can be learned well by the CatBoost model, which has a guiding effect on predicting stock index trends.


2018 ◽  
Author(s):  
Παντελής Σταυρούλιας

Valid predictions of financial crises have always safeguarded the stability of the financial system as a whole and of the banking sector in particular. This thesis predicts systemic banking crises for the EU-14 countries several quarters before they become apparent, using the most widely used variables (macroeconomic, banking and market) through two approaches, the binary and the multilevel. Following the binary approach, classification models are derived by applying Discriminant Analysis, Linear Regression, Logistic Regression and Probit Regression, for the early prediction of crises 12 to 7 quarters before their onset. In addition, the performance of the above analysis is compared using the newer and most promising methods of the Classification Tree, Random Forest and C5. At the same time, a new measure for threshold selection and goodness of fit (GoF) of the prediction models and a new combined classification method are proposed. To investigate the performance of the above analysis, out-of-sample testing is used with the method of country-blocked cross-validation. Under this method, the analysis is carried out and the prediction models are derived using thirteen of the fourteen countries in the sample (in-sample); the derived models are applied to the fourteenth country, which had been excluded from the initial sample (out-of-sample); and the predictions are checked against the actual data for that country. This procedure is repeated fourteen times, each time leaving one country out of the sample, and finally the average over the repetitions is taken.
In this thesis, using out-of-sample testing, 82.4% correct classification (Accuracy), a 78.4% True Positive Rate (TPR) and an 80.6% Positive Predictive Value (PPV) are achieved. Under the multilevel approach, two levels (periods) of prediction of systemic banking crises are distinguished. The first level is called early warning and covers the period 12 to 7 quarters before the onset of the crisis, while the second level is called late warning and covers the period 6 to 1 quarters before the onset of the crisis. For this multilevel classification, Neural Networks, Multinomial Logistic Regression and Multinomial Discriminant Analysis are used. Applying the same out-of-sample testing as in the first approach, 85.7% correct classification is achieved with the best method, which proves to be Multinomial Discriminant Analysis. By applying the above analysis, interested policy makers can detect the existence of a crisis up to three years ahead with the proposed models, using only data that are freely accessible to the public, and thereby exercise the macroprudential policy appropriate to each case.
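The country-blocked procedure described above (fit on all countries but one, test on the held-out country, repeat for every country, then average) can be sketched as follows. This is an illustrative sketch only, not the thesis code: the function names, the toy threshold classifier and the three-country data set are all hypothetical, and the averaging of per-country accuracy is one plausible reading of "the average over the repetitions".

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, true positive rate (TPR) and positive predictive value (PPV)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "tpr": tp / (tp + fn) if tp + fn else float("nan"),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
    }

def country_blocked_cv(rows, fit, predict):
    """rows: list of (country, x, y). Hold out one whole country per fold."""
    countries = sorted({country for country, _, _ in rows})
    fold_accuracy = {}
    for held_out in countries:
        train = [(x, y) for c, x, y in rows if c != held_out]
        test = [(x, y) for c, x, y in rows if c == held_out]
        model = fit(train)  # the model never sees the held-out country
        y_pred = [predict(model, x) for x, _ in test]
        y_true = [y for _, y in test]
        fold_accuracy[held_out] = classification_metrics(y_true, y_pred)["accuracy"]
    mean_accuracy = sum(fold_accuracy.values()) / len(fold_accuracy)
    return fold_accuracy, mean_accuracy

# Hypothetical stand-in classifier: threshold halfway between the class means.
def fit_threshold(train):
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict_threshold(threshold, x):
    return 1 if x > threshold else 0

# Made-up data: (country, indicator value, crisis label).
rows = [("A", 0.1, 0), ("A", 0.9, 1), ("B", 0.2, 0),
        ("B", 0.8, 1), ("C", 0.3, 0), ("C", 0.7, 1)]
fold_accuracy, mean_accuracy = country_blocked_cv(rows, fit_threshold, predict_threshold)
print(fold_accuracy, mean_accuracy)
```

Blocking by country rather than by single observation keeps all quarters of the held-out country out of training, so the score reflects how the model transfers to a country it has never seen.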


Author(s):  
Jung-Han Wang ◽  
Mohamed A. Abdel-Aty ◽  
Jaeyoung Lee

The Highway Safety Manual (HSM) Part C provides a series of safety performance functions (SPFs) for different roadway conditions. The SPFs suggested in the HSM are formulated on the basis of exposure variables: the logarithms of the annual average daily traffic (AADT) on the major road and on the minor road under the base condition. In this research, data from 7,802 intersections in Florida were collected and processed. These intersections were categorized into seven types based on area type (rural or urban), number of legs (three or four), and number of approaches controlled by stop signs. Twenty-two SPF formulations, including the one suggested by the HSM, were developed for each intersection type for examination of the goodness-of-fit measures of the SPFs. In addition, the goodness of fit of each model of the 22 SPFs in each category was examined with 10-fold leave-one-out cross-validation (LOOCV). With a comparison of the delta values generated with the LOOCV method, it is suggested that the SPF with the logarithm of the total entering vehicle volume and the ratio of the AADT on the minor road and the AADT on the major road are important. In addition, the SPFs with the AADT on the major road and the AADT on the minor road and their logarithmic transformations are also important. Therefore, it is suggested that the future HSM compare these two SPF formulations—as suggested in the current research, along with the original SPF formulation in the manual—and select the one with the best model fit on the basis of the delta value using LOOCV.

