rank histogram
Recently Published Documents


Author(s):  
Jeffrey L. Anderson

An extension to standard ensemble Kalman filter algorithms that can improve performance for non-Gaussian prior distributions, non-Gaussian likelihoods, and bounded state variables is described. The algorithm exploits the capability of the rank histogram filter (RHF) to represent arbitrary prior distributions for observed variables. The rank histogram algorithm can be applied directly to state variables to produce posterior marginal ensembles without the need for regression that is part of standard ensemble filters. These marginals are used to adjust the marginals obtained from a standard ensemble filter that uses regression to update state variables. The final posterior ensemble is obtained by doing an ordered replacement of the posterior marginal ensemble values from a standard ensemble filter with the values obtained from the rank histogram method applied directly to state variables; the algorithm is referred to as the Marginal Adjustment Rank Histogram Filter (MARHF). Applications to idealized bivariate problems and low-order dynamical systems show that the MARHF can produce better results than standard ensemble methods for priors that are non-Gaussian. Like the original RHF, the MARHF can also make use of arbitrary non-Gaussian observation likelihoods. The MARHF also has advantages for problems with bounded state variables, for instance the concentration of an atmospheric tracer. Bounds can be automatically respected in the posterior ensembles. With an efficient implementation of the MARHF, the additional cost has better scaling than the standard RHF.
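The ordered replacement step described in this abstract can be sketched roughly as follows. This is only one reading of the abstract, not the paper's implementation: the function name `ordered_replacement` and the sample values are hypothetical, and ties between ensemble members are assumed not to occur.

```python
import numpy as np

def ordered_replacement(standard_posterior, marginal_posterior):
    """Give each member of the standard posterior the marginal value of equal rank.

    standard_posterior: posterior values of one state variable from a
        regression-based ensemble filter (one value per member).
    marginal_posterior: posterior values for the same variable obtained by
        applying a rank histogram update directly to it.
    """
    standard_posterior = np.asarray(standard_posterior, dtype=float)
    marginal_sorted = np.sort(np.asarray(marginal_posterior, dtype=float))
    # ranks[i] is the position of member i within the sorted standard posterior.
    ranks = np.argsort(np.argsort(standard_posterior, kind="stable"), kind="stable")
    # Member i keeps its rank but takes the marginal value with that rank.
    return marginal_sorted[ranks]

# Made-up values, for illustration only.
standard = np.array([0.4, -0.2, 1.1, 0.7])
marginal = np.array([0.9, 0.1, 0.5, 0.3])
adjusted = ordered_replacement(standard, marginal)
```

Under this reading, the adjusted ensemble has exactly the marginal values of the direct rank histogram update, while the ordering of members (and hence the rank relationship with other state variables) is inherited from the regression-based update.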


2019 ◽  
Vol 147 (8) ◽  
pp. 2847-2860 ◽  
Author(s):  
Jeffrey L. Anderson

Abstract It is possible to describe many variants of ensemble Kalman filters without loss of generality as the impact of a single observation on a single state variable. For most ensemble algorithms commonly applied to Earth system models, the computation of increments for the observation variable ensemble can be treated as a separate step from computing increments for the state variable ensemble. The state variable increments are normally computed from the observation increments by linear regression using the prior bivariate ensemble of the state and observation variable. Here, a new method that replaces the standard regression with a regression using the bivariate rank statistics is described. This rank regression is expected to be most effective when the relation between a state variable and an observation is nonlinear. The performance of standard versus rank regression is compared for both linear and nonlinear forward operators (also known as observation operators) using a low-order model. Rank regression in combination with a rank histogram filter in observation space produces better analyses than standard regression for cases with nonlinear forward operators and relatively large analysis error. Standard regression, in combination with either a rank histogram filter or an ensemble Kalman filter in observation space, produces the best results in other situations.
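For orientation, the standard regression step that this abstract takes as its baseline can be sketched as below: observation increments are mapped onto a state variable using the slope estimated from the prior bivariate ensemble. The function name `regress_increments` and the sample numbers are hypothetical. The rank regression variant is not shown, since the abstract does not give enough detail to reproduce it.

```python
import numpy as np

def regress_increments(state_prior, obs_prior, obs_increments):
    """Map observation-space increments to state increments by linear regression.

    The slope is the prior sample covariance of (state, observation) divided
    by the prior sample variance of the observation variable.
    """
    state_prior = np.asarray(state_prior, dtype=float)
    obs_prior = np.asarray(obs_prior, dtype=float)
    cov = np.cov(state_prior, obs_prior, ddof=1)  # 2x2 sample covariance matrix
    slope = cov[0, 1] / cov[1, 1]
    return slope * np.asarray(obs_increments, dtype=float)

# Made-up prior ensemble in which the state is exactly 2 * observation + 1.
obs_prior = np.array([0.0, 1.0, 2.0, 3.0])
state_prior = 2.0 * obs_prior + 1.0
obs_increments = np.array([0.5, -0.5, 0.25, 0.0])
state_increments = regress_increments(state_prior, obs_prior, obs_increments)
```

With an exactly linear relation the slope is recovered exactly; the abstract's point is that when the relation is nonlinear, a single linear slope can be a poor summary of the prior bivariate ensemble.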


2019 ◽  
Vol 147 (2) ◽  
pp. 763-769 ◽  
Author(s):  
D. S. Wilks

Abstract Quantitative evaluation of the flatness of the verification rank histogram can be approached through formal hypothesis testing. Traditionally, the familiar χ² test has been used for this purpose. Recently, two alternatives, the reliability index (RI) and an entropy statistic (Ω), have been suggested in the literature. This paper presents approximations to the sampling distributions of these latter two rank histogram flatness metrics, and compares the statistical power of tests based on the three statistics, in a controlled setting. The χ² test is generally most powerful (i.e., most sensitive to violations of the null hypothesis of rank uniformity), although for overdispersed ensembles and small sample sizes, the test based on the entropy statistic Ω is more powerful. The RI-based test is preferred only for unbiased forecasts with small ensembles and very small sample sizes.
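The three flatness statistics named in this abstract can be computed from rank histogram counts along the following lines. The definitions used here are assumptions based on forms commonly seen in the literature (Pearson χ² against a uniform expectation, RI as the summed absolute deviation of bin frequencies from uniform, Ω as entropy normalized by its maximum); the paper's exact normalizations may differ, and the function name `flatness_statistics` is hypothetical.

```python
import numpy as np

def flatness_statistics(counts):
    """Return (chi2, ri, omega) for a rank histogram given as bin counts."""
    counts = np.asarray(counts, dtype=float)
    n_bins = counts.size              # ensemble size + 1
    total = counts.sum()
    expected = total / n_bins         # expected count per bin under uniformity
    freq = counts / total
    # Pearson chi-square statistic against the uniform expectation.
    chi2 = np.sum((counts - expected) ** 2) / expected
    # Reliability index (assumed form): total absolute deviation from uniform.
    ri = np.sum(np.abs(freq - 1.0 / n_bins))
    # Entropy statistic (assumed form): equals 1 for a perfectly flat histogram.
    nonzero = freq[freq > 0]
    omega = -np.sum(nonzero * np.log(nonzero)) / np.log(n_bins)
    return chi2, ri, omega

flat = flatness_statistics([10, 10, 10, 10])
u_shaped = flatness_statistics([30, 10, 10, 30])
```

A flat histogram gives χ² = 0, RI = 0, and Ω = 1; the U-shaped example, typical of an underdispersed ensemble, moves all three away from those values. Turning any of them into a test requires the sampling distributions discussed in the abstract, which this sketch does not attempt.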


2017 ◽  
Vol 145 (9) ◽  
pp. 3529-3544 ◽  
Author(s):  
Joseph Bellier ◽  
Isabella Zin ◽  
Guillaume Bontron

In the verification field, stratification is the process of dividing the sample of forecast–observation pairs into quasi-homogeneous subsets, in order to learn more on how forecasts behave under specific conditions. A general framework for stratification is presented for the case of ensemble forecasts of continuous scalar variables. Distinction is made between forecast-based, observation-based, and external-based stratification, depending on the criterion on which the sample is stratified. The formalism is applied to two widely used verification measures: the continuous ranked probability score (CRPS) and the rank histogram. For both, new graphical representations that synthesize the added information are proposed. Based on the definition of calibration, it is shown that the rank histogram should be used within a forecast-based stratification, while an observation-based stratification leads to significantly nonflat histograms for calibrated forecasts. Nevertheless, as previous studies have warned, statistical artifacts created by a forecast-based stratification may still occur, thus a graphical test to detect them is suggested. To illustrate potential insights about forecast behavior that can be gained from stratification, a numerical example with two different datasets of mean areal precipitation forecasts is presented.
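As a reminder of what the CRPS measures for a single ensemble forecast, a minimal sketch is given below. It uses the widely used kernel form of the CRPS for the empirical ensemble distribution, which is an assumption here since the abstract does not state which estimator the paper adopts; the function name `ensemble_crps` is hypothetical. Stratification would then amount to averaging this score over subsets of forecast–observation pairs.

```python
import numpy as np

def ensemble_crps(members, observation):
    """CRPS of the empirical distribution of an ensemble for one observation.

    Kernel form: mean |x_i - y| minus half the mean of |x_i - x_j| over all
    ordered pairs (i, j), including i == j.
    """
    members = np.asarray(members, dtype=float)
    accuracy = np.mean(np.abs(members - observation))
    spread = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return accuracy - spread

single_member = ensemble_crps([1.0], 3.0)     # reduces to the absolute error
two_members = ensemble_crps([0.0, 2.0], 1.0)
```

For a one-member ensemble the score reduces to the absolute error, which is a convenient sanity check on any implementation.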


2017 ◽  
Vol 145 (5) ◽  
pp. 1679-1690 ◽  
Author(s):  
Mahsa Mirzargar ◽  
Jeffrey L. Anderson

Abstract Various generalizations of the univariate rank histogram have been proposed to inspect the reliability of an ensemble forecast or analysis in multidimensional spaces. Multivariate rank histograms provide insightful information about the misspecification of genuinely multivariate features such as the correlation between various variables in a multivariate ensemble. However, the interpretation of patterns in a multivariate rank histogram should be handled with care. The purpose of this paper is to focus on multivariate rank histograms designed based on the concept of data depth and outline some important considerations that should be accounted for when using such multivariate rank histograms. To generate correct multivariate rank histograms using the concept of data depth, the datatype of the ensemble should be taken into account to define a proper preranking function. This paper demonstrates how and why some preranking functions might not be suitable for multivariate or vector-valued ensembles and proposes preranking functions based on the concept of simplicial depth that are applicable to both multivariate points and vector-valued ensembles. In addition, there exists an inherent identifiability issue associated with center-outward preranking functions used to generate multivariate rank histograms. This problem can be alleviated by complementing the multivariate rank histogram with other well-known multivariate statistical inference tools based on rank statistics such as the depth-versus-depth (DD) plot. Using a synthetic example, it is shown that the DD plot is less sensitive to sample size compared to multivariate rank histograms.
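To make the notion of simplicial depth concrete, the sketch below computes the sample simplicial depth of a point in two dimensions: the fraction of triangles formed by triples of sample points that contain the point. This illustrates the underlying depth concept only, not the preranking functions proposed in the paper; the function names are hypothetical, and sample points are assumed to be in general position (no three collinear).

```python
from itertools import combinations
import numpy as np

def _cross(o, a, b):
    """z-component of the cross product of vectors (a - o) and (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def simplicial_depth_2d(point, sample):
    """Fraction of triangles with vertices in `sample` that contain `point`."""
    sample = np.asarray(sample, dtype=float)
    inside = 0
    total = 0
    for a, b, c in combinations(sample, 3):
        d1 = _cross(a, b, point)
        d2 = _cross(b, c, point)
        d3 = _cross(c, a, point)
        # The point lies in the closed triangle if the three signs do not mix.
        mixed = min(d1, d2, d3) < 0 and max(d1, d2, d3) > 0
        inside += not mixed
        total += 1
    return inside / total

square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
depth_interior = simplicial_depth_2d((1.0, 0.5), square)
depth_far = simplicial_depth_2d((5.0, 5.0), square)
```

Points near the center of the sample receive higher depth than outlying points, which is what makes depth usable as a center-outward ordering. The brute-force loop over triples is cubic in the ensemble size, so it is only practical for small ensembles.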


2014 ◽  
Vol 21 (4) ◽  
pp. 869-885 ◽  
Author(s):  
S. Metref ◽  
E. Cosme ◽  
C. Snyder ◽  
P. Brasseur

Abstract. One challenge of geophysical data assimilation is to address the issue of non-Gaussianities in the distributions of the physical variables ensuing, in many cases, from nonlinear dynamical models. Non-Gaussian ensemble analysis methods fall into two categories, those remapping the ensemble particles by approximating the best linear unbiased estimate, for example, the ensemble Kalman filter (EnKF), and those resampling the particles by directly applying Bayes' rule, like particle filters. In this article, it is suggested that the most common remapping methods can only handle weakly non-Gaussian distributions, while the others suffer from sampling issues. In between those two categories, a new remapping method directly applying Bayes' rule, the multivariate rank histogram filter (MRHF), is introduced as an extension of the rank histogram filter (RHF) first introduced by Anderson (2010). Its performance is evaluated and compared with several data assimilation methods, on different levels of non-Gaussianity with the Lorenz 63 model. The method's behavior is then illustrated on a simple density estimation problem using ensemble simulations from a coupled physical–biogeochemical model of the North Atlantic ocean. The MRHF performs well with low-dimensional systems in strongly non-Gaussian regimes.


2012 ◽  
Vol 140 (5) ◽  
pp. 1558-1571 ◽  
Author(s):  
Stefan Siegert ◽  
Jochen Bröcker ◽  
Holger Kantz

Abstract The application of forecast ensembles to probabilistic weather prediction has spurred considerable interest in their evaluation. Such ensembles are commonly interpreted as Monte Carlo ensembles meaning that the ensemble members are perceived as random draws from a distribution. Under this interpretation, a reasonable property to ask for is statistical consistency, which demands that the ensemble members and the verification behave like draws from the same distribution. A widely used technique to assess statistical consistency of a historical dataset is the rank histogram, which uses as a criterion the number of times that the verification falls between pairs of members of the ordered ensemble. Ensemble evaluation is rendered more specific by stratification, which means that ensembles that satisfy a certain condition (e.g., a certain meteorological regime) are evaluated separately. Fundamental relationships between Monte Carlo ensembles, their rank histograms, and random sampling from the probability simplex according to the Dirichlet distribution are pointed out. Furthermore, the possible benefits and complications of ensemble stratification are discussed. The main conclusion is that a stratified Monte Carlo ensemble might appear inconsistent with the verification even though the original (unstratified) ensemble is consistent. The apparent inconsistency is merely a result of stratification. Stratified rank histograms are thus not necessarily flat. This result is demonstrated by perfect ensemble simulations and supplemented by mathematical arguments. Possible methods to avoid or remove artifacts that stratification induces in the rank histogram are suggested.
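The rank histogram construction described in this abstract can be sketched as follows: for each case, count how many ensemble members fall below the verification, then tally those ranks over the dataset. The function name `rank_histogram` is hypothetical, and ties between members and the verification are assumed not to occur (reasonable for continuous variables, but it would need handling for variables such as precipitation).

```python
import numpy as np

def rank_histogram(ensembles, verifications):
    """Counts of the verification's rank within each ordered ensemble.

    ensembles: array of shape (n_cases, n_members).
    verifications: array of shape (n_cases,).
    Returns an array of n_members + 1 counts.
    """
    ensembles = np.asarray(ensembles, dtype=float)
    verifications = np.asarray(verifications, dtype=float)
    # Rank = number of members strictly below the verification (0 .. n_members).
    ranks = np.sum(ensembles < verifications[:, None], axis=1)
    return np.bincount(ranks, minlength=ensembles.shape[1] + 1)

# Tiny hand-checkable example.
small_counts = rank_histogram(
    [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]],
    [0.5, 2.5, 3.5],
)

# "Perfect ensemble" simulation: members and verification share one distribution.
rng = np.random.default_rng(0)
sim_counts = rank_histogram(rng.normal(size=(1000, 9)), rng.normal(size=1000))
```

In the simulated case the counts should be flat up to sampling noise. Restricting the cases to a stratum before calling the function is one way to experiment with the stratification effects the abstract warns about.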


2011 ◽  
Vol 29 (7) ◽  
pp. 1295-1303 ◽  
Author(s):  
I. Soltanzadeh ◽  
M. Azadi ◽  
G. A. Vakili

Abstract. Using Bayesian Model Averaging (BMA), an attempt was made to obtain calibrated probabilistic numerical forecasts of 2-m temperature over Iran. The ensemble employs three limited area models (WRF, MM5 and HRM), with WRF used in five different configurations. Initial and boundary conditions for MM5 and WRF are obtained from the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS), and for HRM the initial and boundary conditions come from analyses of the Global Model Europe (GME) of the German Weather Service. The resulting ensemble of seven members was run for a period of 6 months (from December 2008 to May 2009) over Iran. The 48-h raw ensemble outputs were calibrated using the BMA technique for 120 days, using a 40-day training sample of forecasts and the corresponding verification data. The calibrated probabilistic forecasts were assessed using rank histograms and attribute diagrams. Results showed that application of BMA improved the reliability of the raw ensemble. Using the weighted ensemble mean as a deterministic forecast, it was found that the deterministic-style BMA forecasts usually performed better than the best member's deterministic forecast.


2011 ◽  
Vol 139 (1) ◽  
pp. 295-310 ◽  
Author(s):  
Caren Marzban ◽  
Ranran Wang ◽  
Fanyou Kong ◽  
Stephen Leyton

Abstract The rank histogram (RH) is a visual tool for assessing the reliability of ensemble forecasts (i.e., the degree to which the forecasts and the observations have the same distribution). But it is already known that in certain situations it conveys misleading information. Here, it is shown that a temporal correlation can lead to a misleading RH, but such a correlation contributes only to the sampling variability of the RH, and so it is accounted for by producing an RH that explicitly displays sampling variability. A simulation is employed to show that the variance within each ensemble member (i.e., climatological variance), the correlation between ensemble members, and the correlation between the observations and the forecasts all have a confounding effect on the RH, making it difficult to use the RH for assessing the climatological component of forecast reliability. It is proposed that a “residual” quantile–quantile plot (denoted R-Q-Q plot) is better suited than the RH for assessing the climatological component of forecast reliability. Then, the RH and R-Q-Q plots for temperature and wind speed forecasts at 90 stations across the continental United States are computed. A wide range of forecast reliability is noted. For some stations, the nonreliability of the forecasts can be attributed to bias and/or under- or overclimatological dispersion. For others, the difference between the distributions can be traced to lighter or heavier tails in the distributions, while for other stations the distributions of the forecasts and the observations appear to be completely different. A spatial signature is also noted and discussed briefly.

