Scoring Probabilistic Forecasts: The Importance of Being Proper

Jochen Bröcker; Leonard A. Smith

doi:10.1175/waf966.1

Weather and Forecasting ◽

10.1175/waf966.1 ◽

2007 ◽

Vol 22 (2) ◽

pp. 382-388 ◽

Cited By ~ 100

Author(s):

Jochen Bröcker ◽

Leonard A. Smith

Keyword(s):

Continuous Variable ◽

Forecast Skill ◽

Current Concept ◽

Weather Forecasts ◽

Local Score ◽

Probabilistic Forecasts ◽

Probability Forecasts

Abstract Questions remain regarding how the skill of operational probabilistic forecasts is most usefully evaluated or compared, even though probability forecasts have been a long-standing aim in meteorological forecasting. This paper explains the importance of employing proper scores when selecting between the various measures of forecast skill. It is demonstrated that only proper scores provide internally consistent evaluations of probability forecasts, justifying the focus on proper scores independent of any attempt to influence the behavior of a forecaster. Another property of scores (i.e., locality) is discussed. Several scores are examined in this light. There is, effectively, only one proper, local score for probability forecasts of a continuous variable. It is also noted that operational needs of weather forecasts suggest that the current concept of a score may be too narrow; a possible generalization is motivated and discussed in the context of propriety and locality.
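
The idea of propriety mentioned in this abstract can be illustrated with a small numerical sketch. The two scores used here (the Brier score and an absolute-error "linear" score) and the helper names are choices made for illustration and are not taken from the abstract. A score is proper if a forecaster minimizes the expected score by issuing the probability they actually believe.

```python
def expected_brier(q, p):
    """Expected Brier score of forecast q when the event occurs with probability p."""
    return p * (q - 1.0) ** 2 + (1.0 - p) * q ** 2


def expected_linear(q, p):
    """Expected absolute-error ("linear") score, used here as an improper example."""
    return p * (1.0 - q) + (1.0 - p) * q


def best_forecast(expected_score, p, steps=100):
    """Grid-search the forecast q in [0, 1] that minimizes the expected score."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda q: expected_score(q, p))


p_true = 0.7  # hypothetical true event probability
print(best_forecast(expected_brier, p_true))   # honest forecast is optimal
print(best_forecast(expected_linear, p_true))  # hedging to an extreme is optimal
```

Under these assumptions the Brier score is minimized at the true probability, while the linear score rewards pushing the forecast to 0 or 1, which is the kind of internal inconsistency a proper score avoids.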

Probabilistic Forecasts Using Analogs in the Idealized Lorenz96 Setting

Monthly Weather Review ◽

10.1175/2010mwr3542.1 ◽

2011 ◽

Vol 139 (6) ◽

pp. 1960-1971 ◽

Cited By ~ 12

Author(s):

Jakob W. Messner ◽

Georg J. Mayr

Keyword(s):

Logistic Regression ◽

Systematic Errors ◽

Lead Times ◽

Model Output ◽

Weather Forecasts ◽

Probabilistic Forecasts ◽

Direct Model ◽

Nwp Model

Abstract Three methods to make probabilistic weather forecasts by using analogs are presented and tested. The basic idea of these methods is that finding similar NWP model forecasts to the current one in an archive of past forecasts and taking the corresponding analyses as prediction should remove all systematic errors of the model. Furthermore, this statistical postprocessing can convert NWP forecasts to forecasts for point locations and easily turn deterministic forecasts into probabilistic ones. These methods are tested in the idealized Lorenz96 system and compared to a benchmark bracket formed by ensemble relative frequencies from direct model output and logistic regression. The analog methods excel at longer lead times.
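
The basic analog idea described above can be sketched as follows. This is a minimal, generic illustration and not one of the three methods from the paper: the scalar forecasts, the absolute-difference similarity measure, the choice of `k`, and the function name `analog_probability` are all assumptions.

```python
def analog_probability(current_forecast, archive, threshold, k=3):
    """Estimate P(analysis > threshold) from the k most similar past forecasts.

    archive: list of (past_forecast, verifying_analysis) pairs of scalars.
    """
    # Rank archived forecasts by similarity to the current forecast.
    nearest = sorted(archive, key=lambda pair: abs(pair[0] - current_forecast))[:k]
    # The analyses that verified those analogs form the predictive sample.
    hits = sum(1 for _, analysis in nearest if analysis > threshold)
    return hits / len(nearest)


# Hypothetical archive of (forecast, analysis) pairs.
archive = [(1.0, 0.5), (2.0, 2.5), (2.1, 1.8), (2.2, 2.4), (5.0, 6.0)]
print(analog_probability(2.05, archive, threshold=2.0, k=3))
```

Because the prediction is built from past analyses rather than from the model output itself, any systematic model error shared by the current forecast and its analogs drops out, which is the motivation stated in the abstract.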

Verification of Ensemble-Based Uncertainty Circles around Tropical Cyclone Track Forecasts

Weather and Forecasting ◽

10.1175/waf-d-11-00007.1 ◽

2011 ◽

Vol 26 (5) ◽

pp. 664-676 ◽

Cited By ~ 21

Author(s):

Thierry Dupont ◽

Matthieu Plu ◽

Philippe Caroff ◽

Ghislain Faure

Keyword(s):

Tropical Cyclone ◽

Large Error ◽

Ensemble Prediction ◽

Probabilistic Forecasting ◽

Cyclone Track ◽

Ensemble Prediction System ◽

Weather Forecasts ◽

Probabilistic Forecasts ◽

Medium Range ◽

Uncertainty Information

Abstract Several tropical cyclone forecasting centers issue uncertainty information with regard to their official track forecasts, generally using the climatological distribution of position error. However, such methods are not able to convey information that depends on the situation. The purpose of the present study is to assess the skill of the Ensemble Prediction System (EPS) from the European Centre for Medium-Range Weather Forecasts (ECMWF) at measuring the uncertainty of up to 3-day track forecasts issued by the Regional Specialized Meteorological Centre (RSMC) La Réunion in the southwestern Indian Ocean. The dispersion of cyclone positions in the EPS is extracted and translated at the RSMC forecast position. The verification relies on existing methods for probabilistic forecasts that are presently adapted to a cyclone-position metric. First, the probability distribution of forecast positions is compared to the climatological distribution using Brier scores. The probabilistic forecasts have better scores than the climatology, particularly after applying a simple calibration scheme. Second, uncertainty circles are built by fixing the probability at 75%. Their skill at detecting small and large error values is assessed. The circles have some skill for large errors up to the 3-day forecast (and maybe after); but the detection of small radii is skillful only up to 2-day forecasts. The applied methodology may be used to assess and to compare the skill of different probabilistic forecasting systems of cyclone position.
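
The comparison against climatology with Brier scores can be sketched as a Brier skill score. This is a minimal illustration with made-up numbers; using the sample frequency of the verifying outcomes as the climatological reference is an assumption and not necessarily the reference used in the study.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """Skill relative to a constant climatological forecast (1 is perfect, 0 is no skill)."""
    climatology = sum(outcomes) / len(outcomes)  # assumed in-sample reference
    bs = brier_score(probs, outcomes)
    bs_ref = brier_score([climatology] * len(outcomes), outcomes)
    return 1.0 - bs / bs_ref  # undefined if all outcomes are identical


# Hypothetical forecasts that the position error stays below some radius.
probs = [0.9, 0.1, 0.8, 0.2]
outcomes = [1, 0, 1, 0]
print(brier_skill_score(probs, outcomes))
```

A positive value means the probabilistic forecasts score better than the climatological reference, which is the sense in which the abstract reports "better scores than the climatology".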

Probabilistic forecasting of wind power production losses in cold climates: a case study

Wind Energy Science ◽

10.5194/wes-3-667-2018 ◽

2018 ◽

Vol 3 (2) ◽

pp. 667-680 ◽

Cited By ~ 7

Author(s):

Jennie Molinder ◽

Heiner Körnich ◽

Esbjörn Olsson ◽

Hans Bergström ◽

Anna Sjöblom

Keyword(s):

Wind Power ◽

Energy Production ◽

Spatial Representation ◽

Forecast Skill ◽

Cold Climates ◽

Probabilistic Forecasting ◽

Probabilistic Forecasts ◽

Forecasting System ◽

Production Losses

Abstract. The problem of icing on wind turbines in cold climates is addressed using probabilistic forecasting to improve next-day forecasts of icing and related production losses. A case study of probabilistic forecasts was generated for a 2-week period. Uncertainties in initial and boundary conditions are represented with an ensemble forecasting system, while uncertainties in the spatial representation are included with a neighbourhood method. Using probabilistic forecasting instead of one single forecast was shown to improve the forecast skill of the ice-related production loss forecasts and hence the icing forecasts. The spread of the multiple forecasts can be used as an estimate of the forecast uncertainty and of the likelihood for icing and severe production losses. Best results, both in terms of forecast skill and forecasted uncertainty, were achieved using both the ensemble forecast and the neighbourhood method combined. This demonstrates that the application of probabilistic forecasting for wind power in cold climates can be valuable when planning next-day energy production, in the usage of de-icing systems and for site safety.

Probabilistic forecasting of wind power production losses in cold climates: A case study

10.5194/wes-2017-28 ◽

2017 ◽

Cited By ~ 1

Author(s):

Jennie P. Söderman ◽

Heiner Körnich ◽

Esbjörn Olsson ◽

Hans Bergström ◽

Anna Sjöblom

Keyword(s):

Wind Power ◽

Energy Production ◽

Cold Climate ◽

Forecast Skill ◽

Cold Climates ◽

Probabilistic Forecasting ◽

Probabilistic Forecasts ◽

Forecasting System ◽

Production Losses

Abstract. The problem of icing on wind turbines in cold climates is addressed using probabilistic forecasting to improve next-day forecasts of icing and related production losses. A case study of probabilistic forecasts was generated for a two-week period. Uncertainties in initial and boundary conditions are represented with an ensemble forecasting system, while uncertainties in the spatial representation are included with a neighbourhood method. Using probabilistic forecasting instead of one single forecast was shown to improve the forecast skill of the ice-related production loss forecasts and hence the icing forecasts. The spread of the multiple forecasts can be used as an estimate of the forecast uncertainty and of the likelihood for icing and severe production losses. Best results, both in terms of forecast skill and forecasted uncertainty, were achieved using both the ensemble forecast and the neighbourhood method combined. This demonstrates that the application of probabilistic forecasting for wind power in cold climate can be valuable when planning next-day energy production, in the usage of de-icing systems, and for site safety.

Neural Networks for Postprocessing Ensemble Weather Forecasts

Monthly Weather Review ◽

10.1175/mwr-d-18-0187.1 ◽

2018 ◽

Vol 146 (11) ◽

pp. 3885-3900 ◽

Cited By ~ 43

Author(s):

Stephan Rasp ◽

Sebastian Lerch

Keyword(s):

Neural Network ◽

Neural Networks ◽

Predictive Distribution ◽

Predictor Variables ◽

Specific Information ◽

Observation Data ◽

Neural Network Approach ◽

Weather Forecasts ◽

Probabilistic Forecasts ◽

Trained Neural Network

Abstract Ensemble weather predictions require statistical postprocessing of systematic errors to obtain reliable and accurate probabilistic forecasts. Traditionally, this is accomplished with distributional regression models in which the parameters of a predictive distribution are estimated from a training period. We propose a flexible alternative based on neural networks that can incorporate nonlinear relationships between arbitrary predictor variables and forecast distribution parameters that are automatically learned in a data-driven way rather than requiring prespecified link functions. In a case study of 2-m temperature forecasts at surface stations in Germany, the neural network approach significantly outperforms benchmark postprocessing methods while being computationally more affordable. Key components to this improvement are the use of auxiliary predictor variables and station-specific information with the help of embeddings. Furthermore, the trained neural network can be used to gain insight into the importance of meteorological variables, thereby challenging the notion of neural networks as uninterpretable black boxes. Our approach can easily be extended to other statistical postprocessing and forecasting problems. We anticipate that recent advances in deep learning combined with the ever-increasing amounts of model and observation data will transform the postprocessing of numerical weather forecasts in the coming decade.

The Madden–Julian Oscillation and Boreal Winter Forecast Skill: An Analysis of NCEP CFSv2 Reforecasts

Journal of Climate ◽

10.1175/jcli-d-15-0149.1 ◽

2015 ◽

Vol 28 (15) ◽

pp. 6297-6307 ◽

Cited By ~ 12

Author(s):

Charles Jones ◽

Abheera Hazra ◽

Leila M. V. Carvalho

Keyword(s):

Forecast Skill ◽

Western Pacific ◽

Boreal Winter ◽

Lead Times ◽

Madden Julian Oscillation ◽

Main Mode ◽

Weather Forecasts ◽

Synoptic Variability ◽

The Impact ◽

The Western Pacific

Abstract The Madden–Julian oscillation (MJO) is the main mode of tropical intraseasonal variations and bridges weather and climate. Because the MJO has a slow eastward propagation and longer time scale relative to synoptic variability, significant interest exists in exploring the predictability of the MJO and its influence on extended-range weather forecasts (i.e., 2–4-week lead times). This study investigates the impact of the MJO on the forecast skill in Northern Hemisphere extratropics during boreal winter. Several 45-day forecasts of geopotential height (500 hPa) from NCEP Climate Forecast System version 2 (CFSv2) reforecasts are used (1 November–31 March 1999–2010). The variability of the MJO expressed as different amplitudes, durations, and recurrence (i.e., primary and successive events) and their influence on forecast skill is analyzed and compared against inactive periods (i.e., null cases). In general, forecast skill during enhanced MJO convection over the western Pacific is systematically higher than in inactive days. When the enhanced MJO convection is over the Maritime Continent, forecast skill is lower than in null cases, suggesting potential model deficiencies in accurately forecasting the eastward propagation of the MJO over that region and the associated extratropical response. In contrast, forecasts are more skillful than null cases when the enhanced convection is over the western Pacific and during long, intense, and successive MJO events. These results underscore the importance of the MJO as a potential source of predictability on 2–4-week lead times.

Exploring the links between hydrological forecast skill and multiple flood hazard drivers in southern Africa

10.5194/egusphere-egu2020-17754 ◽

2020 ◽

Author(s):

Andrea Ficchi ◽

Hannah Cloke ◽

Ervin Zsoter ◽

Christel Prudhomme ◽

Liz Stephens

Keyword(s):

Tropical Cyclones ◽

Southern Africa ◽

Flood Hazard ◽

Extreme Rainfall ◽

Probability Of Detection ◽

Forecast Skill ◽

Management Service ◽

Flood Events ◽

Weather Forecasts ◽

Mesoscale Convective

Severe flooding in southern Africa is caused by a variety of meteorological hazards including intense tropical cyclones and depressions, mesoscale convective complexes and persistent lows which bring extreme rainfall and flood events with different characteristics. Little is known about the relative predictability of flooding associated to these different drivers, especially in operational forecasting systems. Understanding the limits of predictability for the different drivers of flooding is important to provide evidence of forecast capabilities to end-users and decision-makers and build trust in the use of the forecasting systems. Here we explore the skill of probabilistic flood forecasts from the operational Copernicus Emergency Management Service Global Flood Awareness System (GloFAS v2) over southern Africa. GloFAS provides real-time hydrological forecasts up to 30 days ahead by coupling ensemble weather forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF) with hydrological modelling. The GloFAS flood forecasts are openly available and can support humanitarians and other international organisations to trigger action before a devastating flood occurs. Using hydrological records of past flood events over the last 20 years, the GloFAS forecast skill is assessed by analysing the probability of detection of the events over different lead-times from 1 to 30 days, as well as the consistency and accuracy of predictions of event-based characteristics such as the flood timing and duration. We stratify the analysis by the multi hazard drivers of flooding with a focus on the distinction between tropical cyclones and other types of meteorological events. We suggest that such a stratified analysis of forecast skill can help modellers better understand the sources of predictability in flood forecasts and can support humanitarians to define specific trigger levels for forecast-based action for different types of flood events.
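
The probability of detection used above is commonly defined as hits divided by the sum of hits and misses. A minimal sketch follows; the boolean event lists and the function name are hypothetical, and how a probabilistic forecast is converted into a yes/no event (for example, by a probability threshold) is left as an assumption outside this snippet.

```python
def probability_of_detection(forecast_events, observed_events):
    """Fraction of observed events that were also forecast: hits / (hits + misses)."""
    pairs = list(zip(forecast_events, observed_events))
    hits = sum(1 for f, o in pairs if f and o)
    misses = sum(1 for f, o in pairs if o and not f)
    return hits / (hits + misses)  # undefined if no event was observed


# Hypothetical yes/no flood forecasts and observations at one lead time.
forecast = [True, False, True, True]
observed = [True, True, False, True]
print(probability_of_detection(forecast, observed))
```

Repeating the calculation for each lead time, and separately for each flood driver, would give the kind of stratified view of skill that the abstract proposes. Note that this measure ignores false alarms, so it is usually read alongside a complementary measure.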

Evaluation of the IRI's “Net Assessment” Seasonal Climate Forecasts: 1997–2001

Bulletin of the American Meteorological Society ◽

10.1175/bams-84-12-1761 ◽

2003 ◽

Vol 84 (12) ◽

pp. 1761-1782 ◽

Cited By ~ 111

Author(s):

L. Goddard ◽

A. G. Barnston ◽

S. J. Mason

Keyword(s):

Skill Score ◽

Forecast Skill ◽

Seasonal Temperature ◽

Seasonal Climate ◽

Prediction Tools ◽

Temperature And Precipitation ◽

Climate Forecasts ◽

Probabilistic Forecasts ◽

International Research Institute ◽

The Individual

The International Research Institute for Climate Prediction (IRI) net assessment seasonal temperature and precipitation forecasts are evaluated for the 4-yr period from October–December 1997 to October–December 2001. These probabilistic forecasts represent the human distillation of seasonal climate predictions from various sources. The ranked probability skill score (RPSS) serves as the verification measure. The evaluation is offered as time-averaged spatial maps of the RPSS as well as area-averaged time series. A key element of this evaluation is the examination of the extent to which the consolidation of several predictions, accomplished here subjectively by the forecasters, contributes to or detracts from the forecast skill possible from any individual prediction tool. Overall, the skills of the net assessment forecasts for both temperature and precipitation are positive throughout the 1997–2001 period. The skill may have been enhanced during the peak of the 1997/98 El Niño, particularly for tropical precipitation, although widespread positive skill exists even at times of weak forcing from the tropical Pacific. The temporally averaged RPSS for the net assessment temperature forecasts appears lower than that for the AGCMs. Over time, however, the IRI forecast skill is more consistently positive than that of the AGCMs. The IRI precipitation forecasts generally have lower skill than the temperature forecasts, but the forecast probabilities for precipitation are found to be appropriate to the frequency of the observed outcomes, and thus reliable. Over many regions where the precipitation variability is known to be potentially predictable, the net assessment precipitation forecasts exhibit more spatially coherent areas of positive skill than most, if not all, prediction tools. On average, the IRI net assessment forecasts appear to perform better than any of the individual objective prediction tools.
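
The ranked probability skill score used as the verification measure can be sketched as below. This is a minimal illustration assuming three ordered categories (for example, below, near, and above normal) and an equal-odds climatological reference; the numbers and function names are hypothetical and not taken from the evaluation.

```python
from itertools import accumulate


def rps(forecast_probs, observed_category):
    """Ranked probability score: squared error between cumulative distributions."""
    cum_forecast = list(accumulate(forecast_probs))
    cum_observed = [1.0 if k >= observed_category else 0.0
                    for k in range(len(forecast_probs))]
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_observed))


def rpss(forecasts, observations, reference):
    """Skill relative to a fixed reference forecast (1 is perfect, 0 is no skill)."""
    rps_forecast = sum(rps(f, o) for f, o in zip(forecasts, observations))
    rps_reference = sum(rps(reference, o) for o in observations)
    return 1.0 - rps_forecast / rps_reference


climatology = [1 / 3, 1 / 3, 1 / 3]                # assumed equal-odds reference
forecasts = [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]     # hypothetical tercile forecasts
observations = [2, 0]                              # index of the observed category
print(rpss(forecasts, observations, climatology))
```

Some definitions divide the RPS by the number of categories minus one; that constant cancels in the skill score, so it is omitted here.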

Assessment of Radio Occultation Observations from the COSMIC-2 Mission with a Simplified Observing System Simulation Experiment Configuration

Monthly Weather Review ◽

10.1175/mwr-d-16-0475.1 ◽

2017 ◽

Vol 145 (9) ◽

pp. 3581-3597 ◽

Cited By ~ 9

Author(s):

L. Cucurull ◽

R. Li ◽

T. R. Peevey

Keyword(s):

Radio Occultation ◽

Weather Forecasting ◽

System Simulation ◽

Weather Prediction ◽

Forecast Model ◽

Forecast Skill ◽

Weather Forecasts ◽

Global Coverage ◽

Prediction Systems ◽

Observing System Simulation Experiment

The mainstay of the global radio occultation (RO) system, the COSMIC constellation of six satellites launched in April 2006, is already past the end of its nominal lifetime and the number of soundings is rapidly declining because the constellation is degrading. For about the last decade, COSMIC profiles have been collected and their retrievals assimilated in numerical weather prediction systems to improve operational weather forecasts. The success of RO in increasing forecast skill and COSMIC’s aging constellation have motivated planning for the COSMIC-2 mission, a 12-satellite constellation to be deployed in two launches. The first six satellites (COSMIC-2A) are expected to be deployed in December 2017 in a low-inclination orbit for dense equatorial coverage, while the second six (COSMIC-2B) are expected to be launched later in a high-inclination orbit for global coverage. To evaluate the potential benefits from COSMIC-2, an earlier version of the NCEP’s operational forecast model and data assimilation system is used to conduct a series of observing system simulation experiments with simulated soundings from the COSMIC-2 mission. In agreement with earlier studies using real RO observations, the benefits from assimilating COSMIC-2 observations are found to be most significant in the Southern Hemisphere. No or very little gain in forecast skill is found by adding COSMIC-2A to COSMIC-2B, making the launch of COSMIC-2B more important for terrestrial global weather forecasting than that of COSMIC-2A. Furthermore, results suggest that further improvement in forecast skill might better be obtained with the addition of more RO observations with global coverage and other types of observations.

Origin and Impact of Initialization Shocks in Coupled Atmosphere–Ocean Forecasts*

Monthly Weather Review ◽

10.1175/mwr-d-15-0076.1 ◽

2015 ◽

Vol 143 (11) ◽

pp. 4631-4644 ◽

Cited By ~ 41

Author(s):

David P. Mulholland ◽

Patrick Laloyaux ◽

Keith Haines ◽

Magdalena Alonso Balmaseda

Keyword(s):

Data Assimilation ◽

Coupled System ◽

Atmospheric Temperature ◽

Ocean Model ◽

Forecast Skill ◽

Weather Forecasts ◽

Model Component ◽

Medium Range ◽

Series Of Experiments ◽

Coupled Data Assimilation

Abstract Current methods for initializing coupled atmosphere–ocean forecasts often rely on the use of separate atmosphere and ocean analyses, the combination of which can leave the coupled system imbalanced at the beginning of the forecast, potentially accelerating the development of errors. Using a series of experiments with the European Centre for Medium-Range Weather Forecasts coupled system, the magnitude and extent of these so-called initialization shocks is quantified, and their impact on forecast skill measured. It is found that forecasts initialized by separate oceanic and atmospheric analyses do exhibit initialization shocks in lower atmospheric temperature, when compared to forecasts initialized using a coupled data assimilation method. These shocks result in as much as a doubling of root-mean-square error on the first day of the forecast in some regions, and in increases that are sustained for the duration of the 10-day forecasts performed here. However, the impacts of this choice of initialization on forecast skill, assessed using independent datasets, were found to be negligible, at least over the limited period studied. Larger initialization shocks are found to follow a change in either the atmosphere or ocean model component between the analysis and forecast phases: changes in the ocean component can lead to sea surface temperature shocks of more than 0.5 K in some equatorial regions during the first day of the forecast. Implications for the development of coupled forecast systems, particularly with respect to coupled data assimilation methods, are discussed.
