Scoring Probabilistic Forecasts: The Importance of Being Proper

2007 ◽  
Vol 22 (2) ◽  
pp. 382-388 ◽  
Author(s):  
Jochen Bröcker ◽  
Leonard A. Smith

Abstract Questions remain regarding how the skill of operational probabilistic forecasts is most usefully evaluated or compared, even though probability forecasts have been a long-standing aim in meteorological forecasting. This paper explains the importance of employing proper scores when selecting between the various measures of forecast skill. It is demonstrated that only proper scores provide internally consistent evaluations of probability forecasts, justifying the focus on proper scores independent of any attempt to influence the behavior of a forecaster. Another property of scores (i.e., locality) is discussed. Several scores are examined in this light. There is, effectively, only one proper, local score for probability forecasts of a continuous variable. It is also noted that operational needs of weather forecasts suggest that the current concept of a score may be too narrow; a possible generalization is motivated and discussed in the context of propriety and locality.
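The notion of propriety in the abstract above can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes a binary event with a hypothetical true probability `q = 0.7` and compares two proper scores (Brier and logarithmic, the latter also being local) with the absolute-error score, chosen here as an example of an improper score. All function names are illustrative.

```python
import math


def brier_score(p, outcome):
    """Quadratic (Brier) score for a binary event; lower is better."""
    return (p - outcome) ** 2


def ignorance_score(p, outcome):
    """Logarithmic (ignorance) score in bits; lower is better."""
    return -math.log2(p if outcome == 1 else 1.0 - p)


def linear_score(p, outcome):
    """Absolute-error score, used here as an example of an improper score."""
    return abs(p - outcome)


def expected_score(score, p, q):
    """Expected score of forecast p when the event truly occurs with probability q."""
    return q * score(p, 1) + (1.0 - q) * score(p, 0)


def best_forecast(score, q):
    """Forecast probability on a 0.01..0.99 grid that minimizes the expected score."""
    grid = [i / 100 for i in range(1, 100)]
    return min(grid, key=lambda p: expected_score(score, p, q))


q = 0.7  # hypothetical true event probability (an assumption for illustration)
for score in (brier_score, ignorance_score, linear_score):
    print(score.__name__, best_forecast(score, q))
```

Under the two proper scores the expected score is minimized by issuing the true probability, whereas the linear score rewards pushing the forecast toward the edge of the grid, so a forecaster who believes `q` would not be scored best by stating it.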

2011 ◽  
Vol 139 (6) ◽  
pp. 1960-1971 ◽  
Author(s):  
Jakob W. Messner ◽  
Georg J. Mayr

Abstract Three methods to make probabilistic weather forecasts by using analogs are presented and tested. The basic idea of these methods is that finding similar NWP model forecasts to the current one in an archive of past forecasts and taking the corresponding analyses as prediction should remove all systematic errors of the model. Furthermore, this statistical postprocessing can convert NWP forecasts to forecasts for point locations and easily turn deterministic forecasts into probabilistic ones. These methods are tested in the idealized Lorenz96 system and compared to a benchmark bracket formed by ensemble relative frequencies from direct model output and logistic regression. The analog methods excel at longer lead times.
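The basic analog idea described in the abstract above can be sketched as follows. This is a generic illustration under stated assumptions, not one of the three methods from the paper: it assumes a scalar forecast variable, absolute difference as the similarity measure, and a hypothetical archive of paired past forecasts and analyses. The function name, the parameter `k`, and the data are all made up.

```python
def analog_probability(current_forecast, past_forecasts, past_analyses, threshold, k=3):
    """Probability that the verifying value exceeds `threshold`, estimated from
    the analyses paired with the k past forecasts closest to the current one."""
    # Rank archive entries by similarity to the current forecast.
    ranked = sorted(
        range(len(past_forecasts)),
        key=lambda i: abs(past_forecasts[i] - current_forecast),
    )
    # The analyses that verified the most similar forecasts form the analog ensemble.
    analogs = [past_analyses[i] for i in ranked[:k]]
    # Relative frequency of the event within the analog ensemble.
    return sum(a > threshold for a in analogs) / len(analogs)


# Hypothetical archive (illustrative values only).
past_forecasts = [1.0, 2.0, 3.0, 4.0, 5.0]
past_analyses = [1.5, 2.5, 2.8, 4.6, 5.2]

print(analog_probability(3.1, past_forecasts, past_analyses, threshold=2.7))
```

Because the probability is built from past analyses rather than from the model output itself, systematic model errors present in both the current and archived forecasts do not carry over into the prediction.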


2011 ◽  
Vol 26 (5) ◽  
pp. 664-676 ◽  
Author(s):  
Thierry Dupont ◽  
Matthieu Plu ◽  
Philippe Caroff ◽  
Ghislain Faure

Abstract Several tropical cyclone forecasting centers issue uncertainty information with regard to their official track forecasts, generally using the climatological distribution of position error. However, such methods are not able to convey information that depends on the situation. The purpose of the present study is to assess the skill of the Ensemble Prediction System (EPS) from the European Centre for Medium-Range Weather Forecasts (ECMWF) at measuring the uncertainty of up to 3-day track forecasts issued by the Regional Specialized Meteorological Centre (RSMC) La Réunion in the southwestern Indian Ocean. The dispersion of cyclone positions in the EPS is extracted and translated at the RSMC forecast position. The verification relies on existing methods for probabilistic forecasts that are presently adapted to a cyclone-position metric. First, the probability distribution of forecast positions is compared to the climatological distribution using Brier scores. The probabilistic forecasts have better scores than the climatology, particularly after applying a simple calibration scheme. Second, uncertainty circles are built by fixing the probability at 75%. Their skill at detecting small and large error values is assessed. The circles have some skill for large errors up to the 3-day forecast (and maybe after); but the detection of small radii is skillful only up to 2-day forecasts. The applied methodology may be used to assess and to compare the skill of different probabilistic forecasting systems of cyclone position.
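The comparison against climatology using Brier scores, mentioned in the abstract above, is commonly summarized as a Brier skill score. The sketch below assumes a binary event (for instance, whether the observed position falls within some radius of the forecast position, which is an assumption about how the event might be defined) and uses the in-sample event frequency as the climatological reference. The data are invented for illustration.

```python
def mean_brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """Skill relative to a constant climatological forecast.
    Positive values mean the forecasts beat climatology; 1 is a perfect score."""
    base_rate = sum(outcomes) / len(outcomes)  # in-sample climatology (assumption)
    reference = mean_brier_score([base_rate] * len(outcomes), outcomes)
    if reference == 0.0:
        raise ValueError("climatological reference score is zero; skill is undefined")
    return 1.0 - mean_brier_score(probs, outcomes) / reference


# Hypothetical forecast probabilities and observed outcomes.
probs = [0.9, 0.2, 0.7, 0.8, 0.3, 0.1, 0.6, 0.4]
outcomes = [1, 0, 1, 1, 0, 0, 1, 0]

print(brier_skill_score(probs, outcomes))
```

A calibration step of the kind the abstract mentions would adjust `probs` before scoring; it is not shown here because the abstract does not specify the scheme.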


2018 ◽  
Vol 3 (2) ◽  
pp. 667-680 ◽  
Author(s):  
Jennie Molinder ◽  
Heiner Körnich ◽  
Esbjörn Olsson ◽  
Hans Bergström ◽  
Anna Sjöblom

Abstract. The problem of icing on wind turbines in cold climates is addressed using probabilistic forecasting to improve next-day forecasts of icing and related production losses. A case study of probabilistic forecasts was generated for a 2-week period. Uncertainties in initial and boundary conditions are represented with an ensemble forecasting system, while uncertainties in the spatial representation are included with a neighbourhood method. Using probabilistic forecasting instead of one single forecast was shown to improve the forecast skill of the ice-related production loss forecasts and hence the icing forecasts. The spread of the multiple forecasts can be used as an estimate of the forecast uncertainty and of the likelihood for icing and severe production losses. Best results, both in terms of forecast skill and forecasted uncertainty, were achieved using both the ensemble forecast and the neighbourhood method combined. This demonstrates that the application of probabilistic forecasting for wind power in cold climates can be valuable when planning next-day energy production, in the usage of de-icing systems and for site safety.


2017 ◽  
Author(s):  
Jennie P. Söderman ◽  
Heiner Körnich ◽  
Esbjörn Olsson ◽  
Hans Bergström ◽  
Anna Sjöblom

Abstract. The problem of icing on wind turbines in cold climates is addressed using probabilistic forecasting to improve next-day forecasts of icing and related production losses. A case study of probabilistic forecasts was generated for a two-week period. Uncertainties in initial and boundary conditions are represented with an ensemble forecasting system, while uncertainties in the spatial representation are included with a neighbourhood method. Using probabilistic forecasting instead of one single forecast was shown to improve the forecast skill of the ice-related production loss forecasts and hence the icing forecasts. The spread of the multiple forecasts can be used as an estimate of the forecast uncertainty and of the likelihood for icing and severe production losses. Best results, both in terms of forecast skill and forecasted uncertainty, were achieved using both the ensemble forecast and the neighbourhood method combined. This demonstrates that the application of probabilistic forecasting for wind power in cold climate can be valuable when planning next-day energy production, in the usage of de-icing systems, and for site safety.


2018 ◽  
Vol 146 (11) ◽  
pp. 3885-3900 ◽  
Author(s):  
Stephan Rasp ◽  
Sebastian Lerch

Abstract Ensemble weather predictions require statistical postprocessing of systematic errors to obtain reliable and accurate probabilistic forecasts. Traditionally, this is accomplished with distributional regression models in which the parameters of a predictive distribution are estimated from a training period. We propose a flexible alternative based on neural networks that can incorporate nonlinear relationships between arbitrary predictor variables and forecast distribution parameters that are automatically learned in a data-driven way rather than requiring prespecified link functions. In a case study of 2-m temperature forecasts at surface stations in Germany, the neural network approach significantly outperforms benchmark postprocessing methods while being computationally more affordable. Key components to this improvement are the use of auxiliary predictor variables and station-specific information with the help of embeddings. Furthermore, the trained neural network can be used to gain insight into the importance of meteorological variables, thereby challenging the notion of neural networks as uninterpretable black boxes. Our approach can easily be extended to other statistical postprocessing and forecasting problems. We anticipate that recent advances in deep learning combined with the ever-increasing amounts of model and observation data will transform the postprocessing of numerical weather forecasts in the coming decade.


2015 ◽  
Vol 28 (15) ◽  
pp. 6297-6307 ◽  
Author(s):  
Charles Jones ◽  
Abheera Hazra ◽  
Leila M. V. Carvalho

Abstract The Madden–Julian oscillation (MJO) is the main mode of tropical intraseasonal variations and bridges weather and climate. Because the MJO has a slow eastward propagation and longer time scale relative to synoptic variability, significant interest exists in exploring the predictability of the MJO and its influence on extended-range weather forecasts (i.e., 2–4-week lead times). This study investigates the impact of the MJO on the forecast skill in Northern Hemisphere extratropics during boreal winter. Several 45-day forecasts of geopotential height (500 hPa) from NCEP Climate Forecast System version 2 (CFSv2) reforecasts are used (1 November–31 March 1999–2010). The variability of the MJO expressed as different amplitudes, durations, and recurrence (i.e., primary and successive events) and their influence on forecast skill is analyzed and compared against inactive periods (i.e., null cases). In general, forecast skill during enhanced MJO convection over the western Pacific is systematically higher than in inactive days. When the enhanced MJO convection is over the Maritime Continent, forecast skill is lower than in null cases, suggesting potential model deficiencies in accurately forecasting the eastward propagation of the MJO over that region and the associated extratropical response. In contrast, forecasts are more skillful than null cases when the enhanced convection is over the western Pacific and during long, intense, and successive MJO events. These results underscore the importance of the MJO as a potential source of predictability on 2–4-week lead times.


2020 ◽  
Author(s):  
Andrea Ficchi ◽  
Hannah Cloke ◽  
Ervin Zsoter ◽  
Christel Prudhomme ◽  
Liz Stephens

Severe flooding in southern Africa is caused by a variety of meteorological hazards including intense tropical cyclones and depressions, mesoscale convective complexes and persistent lows which bring extreme rainfall and flood events with different characteristics. Little is known about the relative predictability of flooding associated with these different drivers, especially in operational forecasting systems. Understanding the limits of predictability for the different drivers of flooding is important to provide evidence of forecast capabilities to end-users and decision-makers and build trust in the use of the forecasting systems.

Here we explore the skill of probabilistic flood forecasts from the operational Copernicus Emergency Management Service Global Flood Awareness System (GloFAS v2) over southern Africa. GloFAS provides real-time hydrological forecasts up to 30 days ahead by coupling ensemble weather forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF) with hydrological modelling. The GloFAS flood forecasts are openly available and can support humanitarians and other international organisations to trigger action before a devastating flood occurs.

Using hydrological records of past flood events over the last 20 years, the GloFAS forecast skill is assessed by analysing the probability of detection of the events over different lead-times from 1 to 30 days, as well as the consistency and accuracy of predictions of event-based characteristics such as the flood timing and duration. We stratify the analysis by the multi-hazard drivers of flooding with a focus on the distinction between tropical cyclones and other types of meteorological events. We suggest that such a stratified analysis of forecast skill can help modellers better understand the sources of predictability in flood forecasts and can support humanitarians to define specific trigger levels for forecast-based action for different types of flood events.


2003 ◽  
Vol 84 (12) ◽  
pp. 1761-1782 ◽  
Author(s):  
L. Goddard ◽  
A. G. Barnston ◽  
S. J. Mason

The International Research Institute for Climate Prediction (IRI) net assessment seasonal temperature and precipitation forecasts are evaluated for the 4-yr period from October–December 1997 to October–December 2001. These probabilistic forecasts represent the human distillation of seasonal climate predictions from various sources. The ranked probability skill score (RPSS) serves as the verification measure. The evaluation is offered as time-averaged spatial maps of the RPSS as well as area-averaged time series. A key element of this evaluation is the examination of the extent to which the consolidation of several predictions, accomplished here subjectively by the forecasters, contributes to or detracts from the forecast skill possible from any individual prediction tool. Overall, the skills of the net assessment forecasts for both temperature and precipitation are positive throughout the 1997–2001 period. The skill may have been enhanced during the peak of the 1997/98 El Niño, particularly for tropical precipitation, although widespread positive skill exists even at times of weak forcing from the tropical Pacific. The temporally averaged RPSS for the net assessment temperature forecasts appears lower than that for the AGCMs. Over time, however, the IRI forecast skill is more consistently positive than that of the AGCMs. The IRI precipitation forecasts generally have lower skill than the temperature forecasts, but the forecast probabilities for precipitation are found to be appropriate to the frequency of the observed outcomes, and thus reliable. Over many regions where the precipitation variability is known to be potentially predictable, the net assessment precipitation forecasts exhibit more spatially coherent areas of positive skill than most, if not all, prediction tools. On average, the IRI net assessment forecasts appear to perform better than any of the individual objective prediction tools.
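The ranked probability skill score used as the verification measure in the abstract above can be sketched as follows. The example assumes three ordered categories (below, near, and above normal) with an equal-odds climatology of 1/3 each; that category setup and all data are assumptions for illustration, and the unnormalized form of the RPS is used (some definitions divide by the number of categories minus one, which cancels in the skill score).

```python
from itertools import accumulate


def rps(probs, obs_category):
    """Ranked probability score: squared distance between the cumulative forecast
    and cumulative observation over ordered categories. `obs_category` is the
    zero-based index of the category that occurred."""
    cum_forecast = list(accumulate(probs))
    cum_observed = [1.0 if k >= obs_category else 0.0 for k in range(len(probs))]
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_observed))


def rpss(forecasts, observations, climatology):
    """Skill of the mean RPS relative to always forecasting `climatology`."""
    n = len(forecasts)
    rps_forecast = sum(rps(f, o) for f, o in zip(forecasts, observations)) / n
    rps_reference = sum(rps(climatology, o) for o in observations) / n
    return 1.0 - rps_forecast / rps_reference


# Hypothetical tercile forecasts and observed categories (0, 1, or 2).
climatology = [1 / 3, 1 / 3, 1 / 3]
forecasts = [[0.2, 0.3, 0.5], [0.5, 0.3, 0.2], [0.25, 0.4, 0.35]]
observations = [2, 0, 1]

print(rpss(forecasts, observations, climatology))
```

An RPSS of zero means no improvement over climatology and positive values indicate skill, which is the sense in which the abstract describes the net assessment skill as positive.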


2017 ◽  
Vol 145 (9) ◽  
pp. 3581-3597 ◽  
Author(s):  
L. Cucurull ◽  
R. Li ◽  
T. R. Peevey

The mainstay of the global radio occultation (RO) system, the COSMIC constellation of six satellites launched in April 2006, is already past the end of its nominal lifetime and the number of soundings is rapidly declining because the constellation is degrading. For about the last decade, COSMIC profiles have been collected and their retrievals assimilated in numerical weather prediction systems to improve operational weather forecasts. The success of RO in increasing forecast skill and COSMIC’s aging constellation have motivated planning for the COSMIC-2 mission, a 12-satellite constellation to be deployed in two launches. The first six satellites (COSMIC-2A) are expected to be deployed in December 2017 in a low-inclination orbit for dense equatorial coverage, while the second six (COSMIC-2B) are expected to be launched later in a high-inclination orbit for global coverage. To evaluate the potential benefits from COSMIC-2, an earlier version of the NCEP’s operational forecast model and data assimilation system is used to conduct a series of observing system simulation experiments with simulated soundings from the COSMIC-2 mission. In agreement with earlier studies using real RO observations, the benefits from assimilating COSMIC-2 observations are found to be most significant in the Southern Hemisphere. No or very little gain in forecast skill is found by adding COSMIC-2A to COSMIC-2B, making the launch of COSMIC-2B more important for terrestrial global weather forecasting than that of COSMIC-2A. Furthermore, results suggest that further improvement in forecast skill might better be obtained with the addition of more RO observations with global coverage and other types of observations.


2015 ◽  
Vol 143 (11) ◽  
pp. 4631-4644 ◽  
Author(s):  
David P. Mulholland ◽  
Patrick Laloyaux ◽  
Keith Haines ◽  
Magdalena Alonso Balmaseda

Abstract Current methods for initializing coupled atmosphere–ocean forecasts often rely on the use of separate atmosphere and ocean analyses, the combination of which can leave the coupled system imbalanced at the beginning of the forecast, potentially accelerating the development of errors. Using a series of experiments with the European Centre for Medium-Range Weather Forecasts coupled system, the magnitude and extent of these so-called initialization shocks is quantified, and their impact on forecast skill measured. It is found that forecasts initialized by separate oceanic and atmospheric analyses do exhibit initialization shocks in lower atmospheric temperature, when compared to forecasts initialized using a coupled data assimilation method. These shocks result in as much as a doubling of root-mean-square error on the first day of the forecast in some regions, and in increases that are sustained for the duration of the 10-day forecasts performed here. However, the impacts of this choice of initialization on forecast skill, assessed using independent datasets, were found to be negligible, at least over the limited period studied. Larger initialization shocks are found to follow a change in either the atmosphere or ocean model component between the analysis and forecast phases: changes in the ocean component can lead to sea surface temperature shocks of more than 0.5 K in some equatorial regions during the first day of the forecast. Implications for the development of coupled forecast systems, particularly with respect to coupled data assimilation methods, are discussed.

