On calibration data selection: The case of stormwater quality regression models

Input variable selection and calibration data selection for storm water quality regression models

Water Science & Technology ◽

10.2166/wst.2013.222 ◽

2013 ◽

Vol 68 (1) ◽

pp. 50-58 ◽

Cited By ~ 6

Author(s):

Siao Sun ◽

Jean-Luc Bertrand-Krajewski

Keyword(s):

Water Quality ◽

Variable Selection ◽

Model Calibration ◽

Regression Models ◽

Selection Method ◽

Random Selection ◽

Data Selection ◽

Storm Water ◽

Calibration Data ◽

Model Input

Storm water quality models are useful tools in storm water management. Interest has been growing in analyzing existing data to develop models for urban storm water quality evaluation. It is important to select appropriate model inputs when many candidate explanatory variables are available. Model calibration and verification are essential steps in any storm water quality modeling. This study investigates input variable selection and calibration data selection in storm water quality regression models. The two selection problems interact with each other. A procedure is developed to carry out the two selection tasks in sequence. The procedure first selects model input variables using a cross validation method. An appropriate number of variables is identified as model inputs to ensure that a model is neither overfitted nor underfitted. Based on the model input selection results, calibration data selection is studied. Uncertainty in model performance due to calibration data selection is investigated with a random selection method. An approach using the cluster method is applied to enhance model calibration practice, based on the principle of selecting representative data for calibration. The comparison between results from the cluster selection method and random selection shows that the former can significantly improve the performance of calibrated models. It is found that the information content of calibration data is important in addition to the size of the calibration data set.
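The contrast between cluster-based and random calibration data selection described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say which clustering algorithm or event descriptors were used, so plain k-means, the farthest-point seeding, the rule of taking the event nearest each centroid, and the two-feature event description are all assumptions made here.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    centroids = farthest_point_init(points, k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def cluster_selection(points, k):
    """Pick, per cluster, the event closest to the cluster centroid."""
    labels, centroids = kmeans(points, k)
    chosen = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            chosen.append(min(members, key=lambda i: sq_dist(points[i], centroids[c])))
    return sorted(chosen)


def random_selection(n_events, k, seed=0):
    """Baseline: k calibration events drawn uniformly at random."""
    return sorted(random.Random(seed).sample(range(n_events), k))


# Hypothetical events described by (rainfall depth, peak intensity).
events = [(1, 1), (1, 2), (2, 1), (2, 2),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 1), (20, 2), (21, 1), (21, 2)]
print("cluster-based:", cluster_selection(events, 3))
print("random:       ", random_selection(len(events), 3))
```

With features on very different scales, standardizing each column before clustering would likely be needed; that step is omitted here to keep the sketch short.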


Calibration and validation of multiple regression models for stormwater quality prediction: data partitioning, effect of dataset size and characteristics

Water Science & Technology ◽

10.2166/wst.2005.0060 ◽

2005 ◽

Vol 52 (3) ◽

pp. 45-52 ◽

Cited By ~ 31

Author(s):

M. Mourad ◽

J.-L. Bertrand-Krajewski ◽

G. Chebbo

Keyword(s):

Multiple Regression ◽

Regression Models ◽

Data Partitioning ◽

Data Sets ◽

Calibration Data ◽

Stormwater Quality ◽

Calibration And Validation ◽

Multiple Regression Models ◽

Dataset Size ◽

Few Data

Two main issues regarding stormwater quality models have been investigated: i) the effect of calibration dataset size and characteristics on calibration and validation results; ii) the optimal split of available data into calibration and validation subsets. Data from 13 catchments have been used for three pollutants: BOD, COD and SS. Three multiple regression models were calibrated and validated. The use of different data sets and different models makes it possible to identify general trends. The main finding is that multiple regression models are case sensitive to calibration data. Using few data for calibration leads to poor predictions despite good calibration results. It was also found that a random split of the available data into halves for calibration and validation is not optimal: more data should be allocated to calibration. The proportion of data to be used for validation increases with the number of available data (N) and reaches about 35% for N around 55 measured events.
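The data-partitioning question raised in this abstract can be explored with a repeated random split experiment. The sketch below is only an illustration under stated assumptions: it uses a single-predictor least-squares line instead of the multiple regression models of the paper, and the synthetic data, the function names and the candidate calibration fractions are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (one predictor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rmse(a, b, xs, ys):
    """Root mean square error of the fitted line on (xs, ys)."""
    return math.sqrt(sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def split_experiment(xs, ys, cal_fraction, n_repeats, seed=0):
    """Repeat a random calibration/validation split and return a list of
    (calibration RMSE, validation RMSE) pairs, one per repeat."""
    rng = random.Random(seed)
    n = len(xs)
    # At least 2 events to fit a line, at least 1 left for validation.
    n_cal = min(max(2, round(cal_fraction * n)), n - 1)
    results = []
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        cal, val = idx[:n_cal], idx[n_cal:]
        a, b = fit_line([xs[i] for i in cal], [ys[i] for i in cal])
        results.append((rmse(a, b, [xs[i] for i in cal], [ys[i] for i in cal]),
                        rmse(a, b, [xs[i] for i in val], [ys[i] for i in val])))
    return results


# Hypothetical data: 55 events, linear signal plus Gaussian noise.
gen = random.Random(42)
x = [float(i) for i in range(55)]
y = [2.0 + 3.0 * v + gen.gauss(0.0, 5.0) for v in x]
for frac in (0.5, 0.65, 0.8):
    res = split_experiment(x, y, frac, n_repeats=200)
    mean_val = sum(v for _, v in res) / len(res)
    print(f"calibration share {frac:.2f}: mean validation RMSE {mean_val:.2f}")
```

Comparing the spread of validation errors across fractions, rather than only the mean, is closer in spirit to the sensitivity question the abstract describes.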


Evaluation of appearance-based eye tracking calibration data selection

2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA) ◽

10.1109/icaica50127.2020.9181854 ◽

2020 ◽

Author(s):

Yuqing Li ◽

Yinwei Zhan ◽

Zhuo Yang

Keyword(s):

Eye Tracking ◽

Data Selection ◽

Calibration Data


Development of Multiple Linear Regression Models for Predicting the Stormwater Quality of Urban Sub-Watersheds

Bulletin of Environmental Contamination and Toxicology ◽

10.1007/s00128-013-1160-y ◽

2013 ◽

Vol 92 (1) ◽

pp. 36-43 ◽

Cited By ~ 8

Author(s):

Amarpreet S. Arora ◽

Akepati S. Reddy

Keyword(s):

Linear Regression ◽

Multiple Linear Regression ◽

Regression Models ◽

Linear Regression Models ◽

Stormwater Quality ◽

Multiple Linear Regression Models


Indices for calibration data selection of the rainfall-runoff model

Water Resources Research ◽

10.1029/2009wr008668 ◽

2010 ◽

Vol 46 (4) ◽

Cited By ~ 19

Author(s):

Jia Liu ◽

Dawei Han

Keyword(s):

Data Selection ◽

Calibration Data ◽

Rainfall Runoff ◽

Rainfall Runoff Model ◽

Runoff Model ◽

Selection Of


Evaluation of Regression Models of Balance Calibration Data Using an Empirical Criterion

28th Aerodynamic Measurement Technology, Ground Testing, and Flight Testing Conference ◽

10.2514/6.2012-3323 ◽

2012 ◽

Author(s):

Norbert Ulbrich ◽

Thomas Volden

Keyword(s):

Regression Models ◽

Calibration Data ◽

Empirical Criterion ◽

Balance Calibration


Stormwater quality models: sensitivity to calibration data

Water Science & Technology ◽

10.2166/wst.2005.0110 ◽

2005 ◽

Vol 52 (5) ◽

pp. 61-68 ◽

Cited By ~ 10

Author(s):

M. Mourad ◽

J.-L. Bertrand-Krajewski ◽

G. Chebbo

Keyword(s):

Suspended Solids ◽

Calibration Data ◽

Stormwater Quality ◽

Local Data ◽

Sewer System ◽

Error Sources ◽

Quality Models ◽

Sewer Systems ◽

Complex Models

Stormwater quality modelling is a useful tool in sewer systems management. Available models range from simple to detailed complex ones. The models need local data to be calibrated. In practice, calibration data are rather lacking, and only a few measured events are commonly used. In this paper, the effect of the number and the variability of calibration data on models of various levels of complexity is investigated. The study is carried out on the “Le Marais” catchment for suspended solids, where 40 reliable measured events and good knowledge of the sewer system are available. The method used is based on resampling subsets of measured events among the 40 available ones. Three types of models were calibrated using subsets of events of different sizes and characteristics. For each calibration, the model was validated against the remaining events to assess the quality of the model. It was found that the models are quite sensitive to calibration data, a problem neglected in practical studies. The use of more complex models does not necessarily improve modelling results since more problems and error sources are to be expected. The findings are specific to the “Le Marais” catchment and the models used.


Comparison of Artificial Neural Network and Regression Models in the Prediction of Urban Stormwater Quality

Water Environment Research ◽

10.2175/106143007x184591 ◽

2008 ◽

Vol 80 (1) ◽

pp. 4-9 ◽

Cited By ~ 7

Author(s):

D. May ◽

M. Sivakumar

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Regression Models ◽

Stormwater Quality ◽

Urban Stormwater ◽

Artificial Neural ◽

Urban Stormwater Quality


Design of a retention tank: comparison of stormwater quality models with various levels of complexity

Water Science & Technology ◽

10.2166/wst.2006.582 ◽

2006 ◽

Vol 54 (6-7) ◽

pp. 231-238 ◽

Cited By ~ 6

Author(s):

M. Mourad ◽

J.-L. Bertrand-Krajewski ◽

G. Chebbo

Keyword(s):

Simulation Models ◽

Data Sets ◽

Calibration Data ◽

Storm Events ◽

Stormwater Quality ◽

Event Mean Concentration ◽

Transport Modelling ◽

Detention Tank ◽

Systems Modelling ◽

Mean Concentration

Stormwater quality simulation models are useful tools for the design and management of sewer systems. Modelling results are highly sensitive to experimental data used for calibration. This sensitivity is examined for three modelling approaches of various complexities (site mean concentration (SMC) approach, event mean concentration (EMC) approach and build-up, washoff and transport (BWT) modelling approach) applied to a typical case study (design of a dry detention tank), accounting for the variability of calibration data and their effect on simulation results. Models calibrated with different calibration data sets were used to simulate 3 years of rainfall with different retention tank specific volumes. Annual pollutant load interception efficiencies were determined. Simulation results revealed i) that there is no advantage in using the EMC model compared to the SMC model and ii) that the BWT model resulted in higher design ratios than those given by the SMC/hydraulic approach. For both EMC and BWT models, using an increasing number n of events for calibration leads to narrower confidence intervals for the design ratios. It is crucial for design ratios to account for successive storm events in chronological order and to account for the maximum allowable flow to be transferred to the downstream WWTP.


Distribution-Based Calibration of a Stormwater Quality Model

Water ◽

10.3390/w10081027 ◽

2018 ◽

Vol 10 (8) ◽

pp. 1027 ◽

Cited By ~ 2

Author(s):

Dominik Leutnant ◽

Dirk Muschalla ◽

Mathias Uhl

Keyword(s):

Goodness Of Fit ◽

Experimental Models ◽

Model Parameters ◽

Calibration Data ◽

Stormwater Quality ◽

Quality Model ◽

Test Statistic ◽

Quality Models ◽

Parking Lot ◽

Flat Roof

Stormwater quality models are usually calibrated using observed pollutographs. As current models still rely on simplified model concepts for pollutant accumulation and wash-off, calibration results for continuous pollutant concentrations are highly uncertain. In this paper, we introduce an innovative calibration approach based on total suspended solids (TSS) event load distribution. The approach is applied to stormwater quality models for a flat roof and a parking lot for which reliable distributions are available. Exponential functions are employed for both TSS buildup and wash-off. Model parameters are calibrated by means of an evolutionary algorithm to minimize the distance between a parameterized lognormal distribution function and the cumulative distribution of simulated TSS event loads. Since TSS event load characteristics are considered probabilistically, the approach respects the stochasticity of TSS buildup and wash-off and, therefore, improves on conventional stormwater quality calibration concepts. The results show that both experimental models were calibrated with high goodness-of-fit (Kolmogorov–Smirnov test statistic: 0.05). However, it is shown that events with high TSS event loads (>0.8 percentile) are generally underestimated. While this leads to a relative deviation of −28% of total TSS loads for the parking lot, the error is compensated for the flat roof (+5%). Calibrated model parameters generally tend to generate wash-off proportional to runoff, which is indicated by mass-volume curves. The approach itself is, in general, applicable and creates a new opportunity to calibrate stormwater quality models, especially when calibration data is limited.
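The objective described in this abstract, a distance between a lognormal distribution function and the cumulative distribution of simulated event loads, can be sketched as below. The abstract does not state which distance the evolutionary algorithm minimizes; using the Kolmogorov–Smirnov statistic here is an assumption suggested by the reported goodness-of-fit measure, and the lognormal parameters and event loads are hypothetical.

```python
import math


def lognormal_cdf(x, mu, sigma):
    """CDF of a lognormal distribution with log-mean mu and log-std sigma."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def ks_distance(sample, mu, sigma):
    """Kolmogorov-Smirnov statistic: largest gap between the empirical CDF
    of `sample` and the lognormal CDF, checked on both sides of each step."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = lognormal_cdf(x, mu, sigma)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Hypothetical simulated TSS event loads (arbitrary units) and a
# hypothetical target distribution; a calibration loop would adjust the
# model parameters that produce `simulated_loads` to reduce this value.
simulated_loads = [0.4, 0.7, 0.9, 1.1, 1.6, 2.3, 3.0, 4.8]
print("K-S distance:", ks_distance(simulated_loads, mu=0.2, sigma=0.8))
```

In a calibration setting, `ks_distance` would serve as the fitness function passed to whatever optimizer is used; the evolutionary algorithm itself is not reproduced here.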
