From research to applications – Examples of operational ensemble post-processing in France using machine learning

10.5194/npg-2019-65 ◽

2020 ◽

Author(s):

Maxime Taillardat ◽

Olivier Mestre

Keyword(s):

Machine Learning ◽

Heat Waves ◽

Computation Time ◽

Ensemble Prediction ◽

Post Processing ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Practical Applications ◽

Medium Resolution ◽

Hourly Rainfall

Abstract. Statistical post-processing of ensemble forecasts, from simple linear regressions to more sophisticated techniques, is now a well-known procedure for correcting biased and misdispersed ensemble weather predictions. However, practical applications in national weather services are still in their infancy compared to deterministic post-processing. This paper presents two different applications of ensemble post-processing using machine learning at an industrial scale. The first is a station-based post-processing of surface temperature in a medium-resolution ensemble system. The second is a gridded post-processing of hourly rainfall amounts in a high-resolution ensemble prediction system. The techniques used rely on quantile regression forests (QRF) and ensemble copula coupling (ECC), chosen for their robustness and simplicity of training regardless of the variable subject to calibration. Moreover, some variants of classical techniques, such as QRF and ECC, have been developed in order to adjust to operational constraints. A forecast anomaly-based QRF is used for temperature for a better prediction of cold and heat waves. A variant of ECC for hourly rainfall is built, accounting for more realistic longer rainfall accumulations. It is shown that both forecast quality and forecast value are improved compared to the raw ensemble. Finally, comments about model size and computation time are made.

From research to applications – examples of operational ensemble post-processing in France using machine learning

Nonlinear Processes in Geophysics ◽

10.5194/npg-27-329-2020 ◽

2020 ◽

Vol 27 (2) ◽

pp. 329-347 ◽

Cited By ~ 1

Author(s):

Maxime Taillardat ◽

Olivier Mestre

Keyword(s):

Machine Learning ◽

Heat Waves ◽

Computation Time ◽

Ensemble Prediction ◽

Post Processing ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Practical Applications ◽

Medium Resolution ◽

Hourly Rainfall

Abstract. Statistical post-processing of ensemble forecasts, from simple linear regressions to more sophisticated techniques, is now a well-known procedure for correcting biased and poorly dispersed ensemble weather predictions. However, practical applications in national weather services are still in their infancy compared to deterministic post-processing. This paper presents two different applications of ensemble post-processing using machine learning at an industrial scale. The first is a station-based post-processing of surface temperature and subsequent interpolation to a grid in a medium-resolution ensemble system. The second is a gridded post-processing of hourly rainfall amounts in a high-resolution ensemble prediction system. The techniques used rely on quantile regression forests (QRFs) and ensemble copula coupling (ECC), chosen for their robustness and simplicity of training regardless of the variable subject to calibration. Moreover, some variants of classical techniques used, such as QRF and ECC, were developed in order to adjust to operational constraints. A forecast anomaly-based QRF is used for temperature for a better prediction of cold and heat waves. A variant of ECC for hourly rainfall was built, accounting for more realistic longer rainfall accumulations. We show that both forecast quality and forecast value are improved compared to the raw ensemble. Finally, comments about model size and computation time are made.
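To illustrate the ensemble copula coupling (ECC) step named in the abstract, here is a minimal sketch of the standard quantile-reordering form of ECC, in which calibrated values inherit the rank order of the raw ensemble members. It is not the rainfall variant developed in the paper, which the abstract does not describe in detail. The function name `ecc_reorder` and the toy numbers are assumptions made for illustration.

```python
def ecc_reorder(raw_members, calibrated_values):
    """Reorder calibrated values so they inherit the rank order of the raw members.

    raw_members: raw ensemble forecasts for one variable, location and lead time.
    calibrated_values: the same number of values taken from the calibrated
    distribution (for example equidistant quantiles), in any order.
    """
    if len(raw_members) != len(calibrated_values):
        raise ValueError("need one calibrated value per raw member")
    sorted_calibrated = sorted(calibrated_values)
    # order[k] is the index of the member holding the k-th smallest raw value
    # (ties are broken by member index because sorted() is stable).
    order = sorted(range(len(raw_members)), key=lambda i: raw_members[i])
    reordered = [0.0] * len(raw_members)
    for rank, member_index in enumerate(order):
        reordered[member_index] = sorted_calibrated[rank]
    return reordered


# Toy example (made-up values): three members at two consecutive lead times.
raw = [[2.0, 0.5, 1.0], [3.0, 0.0, 1.5]]
calibrated = [[0.2, 0.8, 1.9], [0.1, 1.0, 2.5]]
coupled = [ecc_reorder(r, c) for r, c in zip(raw, calibrated)]
```

Because each member keeps its rank across lead times, the wettest raw member stays the wettest calibrated member in this toy case, so sums over consecutive hours follow the raw ensemble's temporal structure.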

From research to applications – Examples of operational ensemble post-processing in France using machine learning

10.5194/egusphere-egu2020-7804 ◽

2020 ◽

Author(s):

Maxime Taillardat ◽

Olivier Mestre

Keyword(s):

Machine Learning ◽

Heat Waves ◽

Computation Time ◽

Ensemble Prediction ◽

Post Processing ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Practical Applications ◽

Medium Resolution ◽

Hourly Rainfall

Statistical post-processing of ensemble forecasts, from simple linear regressions to more sophisticated techniques, is now a well-known procedure for correcting biased and misdispersed ensemble weather predictions. However, practical applications in national weather services are still in their infancy compared to deterministic post-processing. This paper presents two different applications of ensemble post-processing using machine learning at an industrial scale. The first is a station-based post-processing of surface temperature in a medium-resolution ensemble system. The second is a gridded post-processing of hourly rainfall amounts in a high-resolution ensemble prediction system. The techniques used rely on quantile regression forests (QRF) and ensemble copula coupling (ECC), chosen for their robustness and simplicity of training regardless of the variable subject to calibration. Moreover, some variants of classical techniques, such as QRF and ECC, have been developed in order to adjust to operational constraints. A forecast anomaly-based QRF is used for temperature for a better prediction of cold and heat waves. A variant of ECC for hourly rainfall is built, accounting for more realistic longer rainfall accumulations. It is shown that both forecast quality and forecast value are improved compared to the raw ensemble. Finally, comments about model size and computation time are made.

Calibration of wind speed ensemble forecasts for power generation

Időjárás ◽

10.28974/idojaras.2021.4.4 ◽

2021 ◽

Vol 125 (4) ◽

pp. 609-624

Author(s):

Sándor Baran ◽

Ágnes Baran

Keyword(s):

Machine Learning ◽

Wind Speed ◽

Wind Power ◽

Weather Prediction ◽

Predictive Performance ◽

Ensemble Prediction ◽

Post Processing ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Truncated Normal

In recent decades, wind power has become the second-largest energy source in the EU, covering 16% of its electricity demand. However, due to its volatility, accurate short-range wind power predictions are required for successful integration of wind energy into the electrical grid. Accurate predictions of wind power require accurate hub-height wind speed forecasts, where the state-of-the-art method is the probabilistic approach based on ensemble forecasts obtained from multiple runs of numerical weather prediction models. Nonetheless, ensemble forecasts are often uncalibrated and might also be biased, and thus require some form of post-processing to improve their predictive performance. We propose a novel flexible machine learning approach for calibrating wind speed ensemble forecasts, which results in a truncated normal predictive distribution. In a case study based on 100 m wind speed forecasts produced by the operational ensemble prediction system of the Hungarian Meteorological Service, the forecast skill of this method is compared with the predictive performance of three different ensemble model output statistics approaches and the raw ensemble forecasts. We show that, compared with the raw ensemble, post-processing always improves the calibration of probabilistic forecasts and the accuracy of point forecasts, and that of the four competing methods, the novel machine-learning-based approach results in the best overall performance.
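To make the notion of a truncated normal predictive distribution concrete, the sketch below evaluates the cumulative distribution function of a normal distribution cut off from below at zero, which is a common choice for non-negative wind speed. The truncation point, the function names and the parameter values are assumptions for illustration only; the abstract does not specify them, and this is not the authors' machine learning model.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncnorm0_cdf(x, location, scale):
    """CDF of a normal distribution truncated from below at zero.

    location and scale are the parameters of the untruncated normal.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    if x <= 0:
        return 0.0
    mass_below_zero = normal_cdf(-location / scale)
    return (normal_cdf((x - location) / scale) - mass_below_zero) / (1.0 - mass_below_zero)


# Hypothetical parameters in m/s, not taken from the paper.
prob_exceeds_12 = 1.0 - truncnorm0_cdf(12.0, location=8.0, scale=3.0)
```

In a post-processing model of this kind, `location` and `scale` would be produced from the ensemble forecasts, and the resulting distribution yields exceedance probabilities such as `prob_exceeds_12` directly.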

Statistical and machine learning methods for postprocessing ensemble forecasts of wind gusts

10.5194/egusphere-egu21-1326 ◽

2021 ◽

Author(s):

Benedikt Schulz ◽

Sebastian Lerch

Keyword(s):

Neural Network ◽

Machine Learning ◽

Ensemble Prediction ◽

Gradient Boosting ◽

Learning Methods ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Additional Information ◽

Machine Learning Methods ◽

Wind Gusts

We conduct a systematic and comprehensive comparison of state-of-the-art postprocessing methods for ensemble forecasts of wind gusts. The compared approaches range from well-established techniques to novel neural network-based methods. Our study is based on a 6-year dataset of forecasts from the convection-permitting COSMO-DE ensemble prediction system, with hourly lead times up to 21 hours and forecasts of 57 meteorological variables, and corresponding observations from 175 weather stations over Germany. We find that simpler methods such as ensemble model output statistics (EMOS), member-by-member postprocessing and a novel isotonic distributional regression approach, which utilize ensemble forecasts of wind gusts as sole inputs, already result in improvement in terms of the mean CRPS of up to 40% compared to the raw ensemble predictions. This can be substantially improved upon by more complex machine learning methods such as gradient boosting-based extensions of EMOS, quantile regression forests, and variants of neural network-based approaches that are capable of incorporating additional information from the large variety of available predictor variables.
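The improvement quoted above is measured with the continuous ranked probability score (CRPS). As a minimal sketch, the code below computes the standard sample estimator of the CRPS for a single ensemble forecast and the relative improvement over a reference. The function names are illustrative assumptions, and the study may use a different estimator or closed-form expressions for parametric forecasts.

```python
def ensemble_crps(members, observation):
    """Sample CRPS of an ensemble: E|X - y| - 0.5 * E|X - X'|."""
    m = len(members)
    if m == 0:
        raise ValueError("ensemble must not be empty")
    mean_abs_error = sum(abs(x - observation) for x in members) / m
    # Mean absolute difference over all ordered pairs of members.
    mean_abs_spread = sum(abs(a - b) for a in members for b in members) / (m * m)
    return mean_abs_error - 0.5 * mean_abs_spread


def crps_skill(crps_forecast, crps_reference):
    """Relative improvement over a reference; 0.4 means a 40% lower mean CRPS."""
    return 1.0 - crps_forecast / crps_reference
```

Lower CRPS is better, and for a single-member forecast the score reduces to the absolute error. The mean CRPS passed to `crps_skill` would be the average of `ensemble_crps` over all forecast cases.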

Forest-Based and Semiparametric Methods for the Postprocessing of Rainfall Ensemble Forecasting

Weather and Forecasting ◽

10.1175/waf-d-18-0149.1 ◽

2019 ◽

Vol 34 (3) ◽

pp. 617-634 ◽

Cited By ~ 8

Author(s):

Maxime Taillardat ◽

Anne-Laure Fougères ◽

Philippe Naveau ◽

Olivier Mestre

Keyword(s):

Heavy Rainfall ◽

Hybrid Methods ◽

Ensemble Prediction ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Wide Range ◽

Selection Step ◽

Heavy Tailed ◽

Ensemble Model Output Statistics ◽

Model Output Statistics

Abstract. To satisfy a wide range of end users, rainfall ensemble forecasts have to be skillful for both low precipitation and extreme events. We introduce local statistical postprocessing methods based on quantile regression forests and gradient forests with a semiparametric extension for heavy-tailed distributions. These hybrid methods make use of the forest-based outputs to fit a parametric distribution that is suitable to model jointly low, medium, and heavy rainfall intensities. Our goal is to improve ensemble quality and value for all rainfall intensities. The proposed methods are applied to daily 51-h forecasts of 6-h accumulated precipitation from 2012 to 2015 over France using the Météo-France ensemble prediction system called Prévision d’Ensemble ARPEGE (PEARP). They are verified with a cross-validation strategy and compete favorably with state-of-the-art methods like analog ensemble or ensemble model output statistics. Our methods do not assume any parametric links between the variables to calibrate and possible covariates. They do not require any variable selection step and can make use of more than 60 predictors available such as summary statistics on the raw ensemble, deterministic forecasts of other parameters of interest, or probabilities of convective rainfall. In addition to improvements in overall performance, hybrid forest-based procedures produced the largest skill improvements for forecasting heavy rainfall events.

Increasing the horizontal resolution of ensemble forecasts at CMC

Nonlinear Processes in Geophysics ◽

10.5194/npg-10-463-2003 ◽

2003 ◽

Vol 10 (6) ◽

pp. 463-468 ◽

Cited By ~ 38

Author(s):

G. Pellerin ◽

L. Lefaivre ◽

P. Houtekamer ◽

C. Girard

Keyword(s):

Operating Characteristic ◽

Horizontal Resolution ◽

Ensemble Prediction ◽

Ensemble Size ◽

Ensemble Forecasts ◽

Relative Operating Characteristic ◽

Sea Level Pressure ◽

Ensemble Prediction System ◽

Probability Of Precipitation ◽

Model Approach

Abstract. Ensemble forecasts have been run operationally since February 1998 at the Canadian Meteorological Centre, with outputs up to ten days. The ensemble size was increased from eight to sixteen members in August 1999. The method of producing the perturbed analyses consists of running independent assimilation cycles that use perturbed sets of observations and are driven by eight different models, which differ mainly in their physical parameterizations. Perturbed analyses are doubled by taking opposite pairs. A multi-model approach is then used to obtain the forecasts. The ensemble output has been used to generate several products. In view of increasing computing facilities, the ensemble prediction system horizontal resolution was increased to TL149 in June 2001. Heights at 500 hPa and mean sea-level pressure maps are regularly used. Charts of precipitation with the probability of precipitation being above various thresholds are also produced at each run. The probabilistic forecast of the 24-h accumulated precipitation has shown skill as demonstrated by the relative operating characteristic (ROC). Verifications of the ensemble forecasts will be presented.
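The abstract refers to probabilities of precipitation above thresholds and to verification with the relative operating characteristic (ROC). The sketch below shows, under simple assumptions, how an exceedance probability can be read off an ensemble as the fraction of members above a threshold, and how ROC points (false alarm rate, hit rate) follow from a set of such probabilities and observed outcomes. The function names and conventions (strict exceedance, warning when probability is at or above the level) are illustrative choices, not taken from the paper.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members strictly above the threshold."""
    return sum(1 for value in members if value > threshold) / len(members)


def roc_points(probabilities, events, probability_levels):
    """Return (false alarm rate, hit rate) for each decision level.

    A warning is issued when the forecast probability is >= the level.
    events holds booleans: True where the event was observed.
    """
    n_events = sum(1 for e in events if e)
    n_non_events = len(events) - n_events
    if n_events == 0 or n_non_events == 0:
        raise ValueError("need at least one event and one non-event")
    points = []
    for level in probability_levels:
        hits = sum(1 for p, e in zip(probabilities, events) if e and p >= level)
        false_alarms = sum(1 for p, e in zip(probabilities, events) if not e and p >= level)
        points.append((false_alarms / n_non_events, hits / n_events))
    return points
```

A forecast system shows skill in the ROC sense when these points lie above the diagonal, that is, when the hit rate exceeds the false alarm rate across decision levels.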

Wave Extremes in the Northeast Atlantic from Ensemble Forecasts

Journal of Climate ◽

10.1175/jcli-d-12-00738.1 ◽

2013 ◽

Vol 26 (19) ◽

pp. 7525-7540 ◽

Cited By ~ 25

Author(s):

Øyvind Breivik ◽

Ole Johan Aarnes ◽

Jean-Raymond Bidlot ◽

Ana Carrasco ◽

Øyvind Saetra

Keyword(s):

Ensemble Prediction ◽

Lead Times ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Northeast Atlantic ◽

The North ◽

Time Period ◽

The North Sea ◽

Mean And Variance ◽

Buoy Data

Abstract. A method for estimating return values from ensembles of forecasts at advanced lead times is presented. Return values of significant wave height in the northeast Atlantic, the Norwegian Sea, and the North Sea are computed from archived +240-h forecasts of the ECMWF Ensemble Prediction System (EPS) from 1999 to 2009. Three assumptions are made: First, each forecast is representative of a 6-h interval and collectively the dataset is then comparable to a time period of 226 years. Second, the model climate matches the observed distribution, which is confirmed by comparing with buoy data. Third, the ensemble members are sufficiently uncorrelated to be considered independent realizations of the model climate. Anomaly correlations of 0.20 are found, but peak events (>P97) are entirely uncorrelated. By comparing return values from individual members with return values of subsamples of the dataset it is also found that the estimates follow the same distribution and appear unaffected by correlations in the ensemble. The annual mean and variance over the 11-yr archived period exhibit no significant departures from stationarity compared with a recent reforecast; that is, there is no spurious trend because of model upgrades. The EPS yields significantly higher return values than the 40-yr ECMWF Re-Analysis (ERA-40) and ECMWF Interim Re-Analysis (ERA-Interim) and is in good agreement with the high-resolution 10-km Norwegian Reanalyses (NORA10) hindcast, except in the lee of unresolved islands where EPS overestimates and in enclosed seas where it has low bias. Confidence intervals are half the width of those found for ERA-Interim because of the magnitude of the dataset.

The Soverato flood in Southern Italy: performance of global and limited-area ensemble forecasts

Nonlinear Processes in Geophysics ◽

10.5194/npg-10-261-2003 ◽

2003 ◽

Vol 10 (3) ◽

pp. 261-274 ◽

Cited By ~ 16

Author(s):

A. Montani ◽

C. Marsigli ◽

F. Nerozzi ◽

T. Paccagnella ◽

S. Tibaldi ◽

...

Keyword(s):

High Resolution ◽

Southern Italy ◽

Ensemble Prediction ◽

Limited Area ◽

Prediction System ◽

Probabilistic Prediction ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Limited Area Model ◽

Area Model

Abstract. The predictability of the flood event affecting Soverato (Southern Italy) in September 2000 is investigated by considering three different configurations of the ECMWF ensemble: the operational Ensemble Prediction System (EPS), the targeted EPS and a high-resolution version of EPS. For each configuration, three successive runs of the ECMWF ensemble with the same verification time are grouped together so as to generate a highly-populated "super-ensemble". Then, five members are selected from the super-ensemble and used to provide initial and boundary conditions for the integrations with a limited-area model, whose runs generate a Limited-area Ensemble Prediction System (LEPS). The relative impact of targeting the initial perturbations against increasing the horizontal resolution is assessed for the global ensembles as well as for the properties transferred to LEPS integrations, the attention being focussed on the probabilistic prediction of rainfall over a localised area. At the 108-, 84- and 60-hour forecast ranges, the overall performance of the global ensembles is not particularly accurate and the best results are obtained by the high-resolution version of EPS. The LEPS performance is very satisfactory in all configurations and the rainfall maps show probability peaks in the correct regions. LEPS products would have been of great assistance in issuing flood risk alerts on the basis of limited-area ensemble forecasts. For the 60-hour forecast range, the sensitivity of the results to the LEPS ensemble size is discussed by comparing a 5-member against a 51-member LEPS, where the limited-area model is nested on all EPS members. Little sensitivity is found as concerns the detection of the regions most likely affected by heavy precipitation, the probability peaks being approximately the same in both configurations.

Predictions of 2010’s Tropical Cyclones Using the GFS and Ensemble-Based Data Assimilation Methods

Monthly Weather Review ◽

10.1175/mwr-d-11-00079.1 ◽

2011 ◽

Vol 139 (10) ◽

pp. 3243-3247 ◽

Cited By ~ 55

Author(s):

Thomas M. Hamill ◽

Jeffrey S. Whitaker ◽

Daryl T. Kleist ◽

Michael Fiorino ◽

Stanley G. Benjamin

Keyword(s):

Data Assimilation ◽

Three Dimensional ◽

Ensemble Prediction ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Weather Forecasts ◽

Successful Testing ◽

Gfs Model ◽

Environmental Prediction ◽

Statistical Interpolation

Abstract. Experimental ensemble predictions of tropical cyclone (TC) tracks from the ensemble Kalman filter (EnKF) using the Global Forecast System (GFS) model were recently validated for the 2009 Northern Hemisphere hurricane season by Hamill et al. A similar suite of tests is described here for the 2010 season. Two major changes were made this season: 1) a reduction in the resolution of the GFS model, from 2009’s T384L64 (~31 km at 25°N) to 2010’s T254L64 (~47 km at 25°N), and some changes in model physics; and 2) the addition of a limited test of deterministic forecasts initialized from a hybrid three-dimensional variational data assimilation (3D-Var)/EnKF method. The GFS/EnKF ensembles continued to produce reduced track errors relative to operational ensemble forecasts created by the National Centers for Environmental Prediction (NCEP), the Met Office (UKMO), and the Canadian Meteorological Centre (CMC). The GFS/EnKF was not uniformly as skillful as the European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble prediction system. GFS/EnKF track forecasts had slightly higher error than ECMWF at longer leads, especially in the western North Pacific, and exhibited poorer calibration between spread and error than in 2009, perhaps in part because of lower model resolution. Deterministic forecasts from the hybrid were competitive with deterministic EnKF ensemble-mean forecasts and superior in track error to those initialized from the operational variational algorithm, the Gridpoint Statistical Interpolation (GSI). Pending further successful testing, the National Oceanic and Atmospheric Administration (NOAA) intends to implement the global hybrid system operationally for data assimilation.

The Dynamics of an Extreme Precipitation Event in Northeastern Vietnam in 2015 and Its Predictability in the ECMWF Ensemble Prediction System

Weather and Forecasting ◽

10.1175/waf-d-16-0142.1 ◽

2017 ◽

Vol 32 (3) ◽

pp. 1041-1056 ◽

Cited By ~ 5

Author(s):

Roderick van der Linden ◽

Andreas H. Fink ◽

Joaquim G. Pinto ◽

Tan Phan-Van

Keyword(s):

Large Scale ◽

Extreme Event ◽

Coastal Region ◽

Economic Loss ◽

Precipitation Event ◽

Ensemble Prediction ◽

Ensemble Forecasts ◽

Ensemble Prediction System ◽

Upper Level ◽

Quang Ninh

Abstract. A record-breaking rainfall event occurred in northeastern Vietnam in late July–early August 2015. The coastal region in Quang Ninh Province was hit severely, with station rainfall sums in the range of 1000–1500 mm. The heavy rainfall led to flooding and landslides, which resulted in an estimated economic loss of $108 million (U.S. dollars) and 32 fatalities. Using a multitude of data sources and ECMWF ensemble forecasts, the synoptic–dynamic development and practical predictability of the event are investigated in detail for the 4-day period from 1200 UTC 25 July to 1200 UTC 29 July 2015, during which the major portion of the rainfall was observed. A slowly moving upper-level subtropical trough and the associated surface low in the northern Gulf of Tonkin promoted sustained moisture convergence and convection over northeastern Vietnam. The humidity was advected in a moisture transport band lying across the Indochina Peninsula and emanating from a tropical storm over the Bay of Bengal. Analyses of the ECMWF ensemble forecasts clearly showed a sudden emergence of the predictability of the extreme event at lead times of 3 days that was associated with the correct forecasts of the intensity and location of the subtropical trough in the 51 ensemble members. Thus, the Quang Ninh event is a good example in which the predictability of tropical convection arises from large-scale synoptic forcing; in the present case it was due to a tropical–extratropical interaction that has not been documented before for the region and season.
