The Computational Complexity and Parallel Scalability of Atmospheric Data Assimilation Algorithms

2004 ◽  
Vol 21 (11) ◽  
pp. 1689-1700 ◽  
Author(s):  
P. M. Lyster ◽  
J. Guo ◽  
T. Clune ◽  
J. W. Larson

Abstract This paper quantifies the computational complexity and parallel scalability of two algorithms for four-dimensional data assimilation (4DDA) at NASA's Global Modeling and Assimilation Office (GMAO). The first, the Goddard Earth Observing System Data Assimilation System (GEOS DAS), uses an atmospheric general circulation model (GCM) and an observation-space-based analysis system, the Physical-Space Statistical Analysis System (PSAS). GEOS DAS is very similar to global meteorological weather forecasting data assimilation systems but is used at NASA for climate research. The second, the Kalman filter, uses a more consistent algorithm to determine the forecast error covariance matrix than does GEOS DAS. For atmospheric assimilation, the gridded dynamical fields typically have more than 10⁶ variables; therefore, the full error covariance matrix may be in excess of a teraword. For the Kalman filter, this problem will require petaflop s⁻¹ computing to achieve effective throughput for scientific research.
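The storage and throughput figures above follow from simple scaling arguments. As a back-of-envelope sketch only (the 10⁶-variable state size is taken from the abstract; the 64-bit word size and the roughly 2n tangent-linear/adjoint model integrations per covariance propagation are assumptions):

# Back-of-envelope sizing for a full Kalman filter error covariance matrix,
# assuming (as the abstract states) a state of roughly 10**6 gridded variables.
n = 10**6                        # number of state variables (order of magnitude)
entries = n * n                  # elements in the full n x n covariance matrix
storage_tb = entries * 8 / 1e12  # assuming 8-byte (64-bit) words
print(f"covariance entries: {entries:.1e} (about one teraword)")
print(f"storage at 64-bit precision: {storage_tb:.0f} TB")

# Propagating the covariance, P_f = M P_a M^T + Q, needs on the order of 2*n
# tangent-linear/adjoint model integrations per cycle, which is why the
# abstract points to petaflop s-1 computing for useful throughput.
print(f"model integrations per cycle (order of magnitude): {2 * n:.1e}")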

2011 ◽  
Vol 139 (11) ◽  
pp. 3389-3404 ◽  
Author(s):  
Thomas Milewski ◽  
Michel S. Bourqui

Abstract A new stratospheric chemical–dynamical data assimilation system was developed, based upon an ensemble Kalman filter coupled with a Chemistry–Climate Model [i.e., the intermediate-complexity general circulation model Fast Stratospheric Ozone Chemistry (IGCM-FASTOC)], with the aim of exploring the potential of chemical–dynamical coupling in stratospheric data assimilation. The system is introduced here in the context of a perfect-model Observing System Simulation Experiment. The system is found to be sensitive to localization parameters, and in the case of temperature (ozone), assimilation yields its best performance with horizontal and vertical decorrelation lengths of 14 000 km (5600 km) and 70 km (14 km). With these localization parameters, the observation-space background-error covariance matrix is underinflated by only 5.9% (overinflated by 2.1%) and the observation-error covariance matrix by only 1.6% (0.5%), which makes artificial inflation unnecessary. Using optimal localization parameters, the skill of the system in constraining the ensemble-average analysis error with respect to the true state is tested when assimilating synthetic Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) retrievals of temperature alone and ozone alone. It is found that in most cases background-error covariances produced from ensemble statistics are able to usefully propagate information from the observed variable to other ones. Chemical–dynamical covariances, and in particular ozone–wind covariances, are essential in constraining the dynamical fields when assimilating ozone only, as radiation in the stratosphere is too slow to transfer ozone analysis increments to the temperature field over the 24-h forecast window. Conversely, when assimilating temperature, the chemical–dynamical covariances also help constrain the ozone field, though to a much lesser extent. The uncertainty in the forecast/analysis, as defined by the variability in the ensemble, is large compared to the analysis error, which likely indicates some amount of noise in the covariance terms, while also reducing the risk of filter divergence.
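The central mechanism described above, ensemble-derived cross covariances carrying an increment from an observed variable into unobserved ones, can be illustrated with a minimal single-point sketch. This is not the IGCM-FASTOC system; the ensemble size, error variances, and innovation are arbitrary placeholders.

import numpy as np

# Minimal single-point sketch: ensemble statistics let one ozone observation
# update the unobserved temperature and wind through the ozone-T and
# ozone-wind background-error covariances.
rng = np.random.default_rng(0)
n_ens = 20
# Synthetic ensemble perturbations for [ozone, temperature, u-wind] at one point.
X = rng.standard_normal((3, n_ens))
X -= X.mean(axis=1, keepdims=True)

P = X @ X.T / (n_ens - 1)          # ensemble background-error covariance (3 x 3)
H = np.array([[1.0, 0.0, 0.0]])    # observation operator: ozone only
R = np.array([[0.1]])              # assumed ozone observation-error variance

K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain (3 x 1)
innovation = np.array([0.5])                   # observation-minus-background ozone
increment = (K @ innovation).ravel()
print("analysis increments [ozone, T, u]:", np.round(increment, 3))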


2011 ◽  
Vol 139 (7) ◽  
pp. 2046-2060 ◽  
Author(s):  
Tijana Janjić ◽  
Lars Nerger ◽  
Alberta Albertella ◽  
Jens Schröter ◽  
Sergey Skachko

Abstract Ensemble Kalman filter methods are typically used in combination with one of two localization techniques. The first is covariance localization, or direct forecast error localization, in which the ensemble-derived forecast error covariance matrix is Schur multiplied with a chosen correlation matrix. The second is domain localization, in which the assimilation is split into local domains and the assimilation update is performed independently in each. Domain localization is frequently used in combination with filter algorithms that use the analysis error covariance matrix for the calculation of the gain, such as the ensemble transform Kalman filter (ETKF) and the singular evolutive interpolated Kalman filter (SEIK). However, since the local assimilations are performed independently, smoothness of the analysis fields across the subdomain boundaries becomes a concern. To address the problem of smoothness, an algorithm is introduced that uses domain localization in combination with a Schur product localization of the forecast error covariance matrix for each local subdomain. In a simple example using the Lorenz-40 system, it is demonstrated that this modification can produce results comparable to those obtained with direct forecast error localization. In addition, these results are compared to the method that uses domain localization in combination with weighting of observations. In the simple example, the method using weighting of observations is less accurate than the new method, particularly if the observation errors are small. Domain localization with weighting of observations is further examined in the case of assimilation of satellite data into the global finite-element ocean circulation model (FEOM) using the local SEIK filter. In this example, the use of observational weighting improves the accuracy of the analysis. In addition, depending on the correlation function used for weighting, the spectral properties of the solution can be improved.
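A minimal sketch of the direct forecast error (Schur product) localization discussed above, on a periodic one-dimensional grid of Lorenz-40 size. The Gaspari-Cohn taper is the customary compactly supported correlation function, but the ensemble size and localization radius here are illustrative assumptions rather than values from the study.

import numpy as np

def gaspari_cohn(r):
    """Gaspari-Cohn fifth-order piecewise rational taper; r = distance / c."""
    r = np.abs(r)
    f = np.zeros_like(r)
    m = r <= 1.0
    f[m] = -0.25*r[m]**5 + 0.5*r[m]**4 + 0.625*r[m]**3 - (5/3)*r[m]**2 + 1.0
    m = (r > 1.0) & (r <= 2.0)
    f[m] = ((1/12)*r[m]**5 - 0.5*r[m]**4 + 0.625*r[m]**3 + (5/3)*r[m]**2
            - 5.0*r[m] + 4.0 - (2/3)/r[m])
    return f

n, n_ens, c = 40, 10, 5.0                  # Lorenz-40-sized grid, small ensemble
rng = np.random.default_rng(1)
X = rng.standard_normal((n, n_ens))
X -= X.mean(axis=1, keepdims=True)
P_ens = X @ X.T / (n_ens - 1)              # raw (noisy) ensemble covariance

dist = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
dist = np.minimum(dist, n - dist)          # periodic domain, as in Lorenz-40
C = gaspari_cohn(dist / c)                 # compactly supported correlation matrix
P_loc = P_ens * C                          # Schur (elementwise) product localization
print("spurious long-range covariance removed:", np.allclose(P_loc[0, 20], 0.0))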


2011 ◽  
Vol 139 (2) ◽  
pp. 511-522 ◽  
Author(s):  
Steven J. Greybush ◽  
Eugenia Kalnay ◽  
Takemasa Miyoshi ◽  
Kayo Ide ◽  
Brian R. Hunt

Abstract In ensemble Kalman filter (EnKF) data assimilation, localization modifies the error covariance matrices to suppress the influence of distant observations, removing spurious long-distance correlations. In addition to allowing efficient parallel implementation, this takes advantage of the atmosphere’s lower dimensionality in local regions. There are two primary methods for localization. In B localization, the background error covariance matrix elements are reduced by a Schur product so that correlations between grid points that are far apart are removed. In R localization, the observation error covariance matrix is multiplied by a distance-dependent function, so that far away observations are considered to have infinite error. Successful numerical weather prediction depends upon well-balanced initial conditions to avoid spurious propagation of inertial-gravity waves. Previous studies note that B localization can disrupt the relationship between the height gradient and the wind speed of the analysis increments, resulting in an analysis that can be significantly ageostrophic. This study begins with a comparison of the accuracy and geostrophic balance of EnKF analyses using no localization, B localization, and R localization with simple one-dimensional balanced waves derived from the shallow-water equations, indicating that the optimal length scale for R localization is shorter than for B localization, and that for the same length scale R localization is more balanced. The comparison of localization techniques is then expanded to the Simplified Parameterizations, Primitive Equation Dynamics (SPEEDY) global atmospheric model. Here, natural imbalance of the slow manifold must be contrasted with undesired imbalance introduced by data assimilation. Performance of the two techniques is comparable, also with a shorter optimal localization distance for R localization than for B localization.
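The contrast between the two localization routes can be sketched for a single observation on a toy one-dimensional grid. A simple Gaussian taper stands in for the localization function, and the grid size, ensemble size, length scale, and observation error are assumptions, not values from the study.

import numpy as np

def taper(dist, L):
    return np.exp(-0.5 * (dist / L) ** 2)       # simple Gaussian taper, for brevity

n, n_ens, L = 50, 15, 4.0
rng = np.random.default_rng(2)
X = rng.standard_normal((n, n_ens))
X -= X.mean(axis=1, keepdims=True)
Pb = X @ X.T / (n_ens - 1)                      # raw ensemble background covariance
obs_loc, r_var = 25, 0.2                        # single observation and its error variance
H = np.zeros((1, n)); H[0, obs_loc] = 1.0
d = np.abs(np.arange(n) - obs_loc)

# B localization: taper the background error covariance before forming the gain.
rho = taper(np.abs(np.arange(n)[:, None] - np.arange(n)[None, :]), L)
K_B = (Pb * rho) @ H.T / (H @ (Pb * rho) @ H.T + r_var)

# R localization: leave Pb alone but inflate the observation error variance with
# distance, so remote grid points effectively see an infinite-error observation.
K_R = np.empty((n, 1))
for i in range(n):
    r_eff = r_var / max(taper(d[i], L), 1e-12)
    K_R[i] = Pb[i, obs_loc] / (Pb[obs_loc, obs_loc] + r_eff)

print(np.round(np.c_[K_B, K_R], 3))             # the two gains, column by column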


2013 ◽  
Vol 30 (5) ◽  
pp. 1303-1312 ◽  
Author(s):  
Xiaogu Zheng ◽  
Guocan Wu ◽  
Shupeng Zhang ◽  
Xiao Liang ◽  
Yongjiu Dai ◽  
...  

2010 ◽  
Vol 138 (3) ◽  
pp. 932-950 ◽  
Author(s):  
Jean-Michel Brankart ◽  
Emmanuel Cosme ◽  
Charles-Emmanuel Testut ◽  
Pierre Brasseur ◽  
Jacques Verron

Abstract In Kalman filter applications, an adaptive parameterization of the error statistics is often necessary to avoid filter divergence, and prevent error estimates from becoming grossly inconsistent with the real error. With the classic formulation of the Kalman filter observational update, optimal estimates of general adaptive parameters can only be obtained at a numerical cost that is several times larger than the cost of the state observational update. In this paper, it is shown that there exist a few types of important parameters for which optimal estimates can be computed at a negligible numerical cost, provided that the computation is performed using a transformed algorithm that works in the reduced control space defined by the square root or ensemble representation of the forecast error covariance matrix. The set of parameters that can be efficiently controlled includes scaling factors for the forecast error covariance matrix, scaling factors for the observation error covariance matrix, or even a scaling factor for the observation error correlation length scale. As an application, the resulting adaptive filter is used to estimate the time evolution of ocean mesoscale signals using observations of the ocean dynamic topography. To check the behavior of the adaptive mechanism, this is done in the context of idealized experiments, in which model error and observation error statistics are known. This ideal framework is particularly appropriate to explore the ill-conditioned situations (inadequate prior assumptions or uncontrollability of the parameters) in which adaptivity can be misleading. Overall, the experiments show that, if used correctly, the efficient optimal adaptive algorithm proposed in this paper introduces useful supplementary degrees of freedom in the estimation problem, and that the direct control of these statistical parameters by the observations increases the robustness of the error estimates and thus the optimality of the resulting Kalman filter.
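As a rough illustration of one of the adaptive parameters mentioned above (a scaling factor for the forecast error covariance matrix), the sketch below uses the innovation-consistency relation E[d d^T] = alpha H P_f H^T + R to estimate alpha by a simple method of moments. This is a deliberately simplified stand-in, not the reduced-control-space algorithm of the paper, and all numbers are synthetic.

import numpy as np

# Synthetic innovations d = y - H x_f whose variance reflects a forecast error
# covariance that is underestimated by a factor alpha_true.
rng = np.random.default_rng(3)
p = 200                                    # number of observations
HPfHt_diag = np.full(p, 1.0)               # prior forecast-error variance in obs space
R_diag = np.full(p, 0.5)                   # observation-error variance
alpha_true = 2.0
d = rng.normal(0.0, np.sqrt(alpha_true * HPfHt_diag + R_diag))

# Moment-based estimate of the scaling factor from E[d**2] = alpha*HPfHt + R.
alpha_hat = (np.sum(d**2) - np.sum(R_diag)) / np.sum(HPfHt_diag)
print(f"estimated forecast-error scaling factor: {alpha_hat:.2f} (truth {alpha_true})")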


2020 ◽  
Author(s):  
Lewis Sampson ◽  
Jose M. Gonzalez-Ondina ◽  
Georgy Shapiro

Data assimilation (DA) is a critical component of most state-of-the-art ocean prediction systems: it optimally combines model data and observational measurements to obtain an improved estimate of the modelled variables by minimizing a cost function. The calculation requires knowledge of the background error covariance matrix (BECM), which weights the quality of the model results, and of the observation error covariance matrix (OECM), which weights the observational data.

Computing the BECM exactly would require knowing the true values of the physical variables, which is not feasible. Instead, the BECM is estimated from model results and observations using methods such as the National Meteorological Centre (NMC) method or the Hollingsworth and Lönnberg (H-L) approach. These methods have shortcomings that make them unsuitable in some situations: they are fundamentally one-dimensional and make suboptimal use of observations.

We have produced a novel method for error estimation, based on an analysis of observation-minus-background data (innovations), which attempts to improve on some of these shortcomings. In particular, our method infers information from observations more effectively, requiring less data to produce statistically robust results. We do this by fitting a linear combination of functions to the data using a specifically tailored inner product, an approach we refer to as inner product analysis (IPA).

We are able to produce good-quality BECM estimates even in data-sparse domains, with notably better results when observational data are scarce. By subsampling the observations with decreasing sample size, we show that the stability and efficiency of our method deteriorate far less than those of the H-L approach as the number of data points decreases. We find that we can continue to produce error estimates with a reduced set of data, whereas the H-L method begins to produce spurious values for smaller samples.

Our method works very well in combination with standard tools such as NEMOVar, providing the required standard deviations and length-scale ratios. We have successfully run it in the Arabian Sea for multiple seasons and compared the results with H-L in optimal conditions, when plenty of data are available; spatially, the two methods perform equally well. Looking at the root mean square error (RMSE), we see very similar performance, with each method giving better results for some seasons and worse for others.
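For context, the innovation-statistics idea underlying H-L-type estimates, which the IPA method aims to improve on, can be sketched as follows. The Gaussian correlation shape, bin spacing, and variances are illustrative assumptions, and this is not the authors' inner-product formulation.

import numpy as np

rng = np.random.default_rng(4)
sep = np.linspace(0.0, 500.0, 26)                # separation-distance bins (km)
sigma_b2, sigma_o2, L = 0.8, 0.3, 150.0          # assumed "true" variances and length scale

# Synthetic binned innovation covariances: the background-error part decays with
# separation; the observation error contributes only at zero separation.
cov = sigma_b2 * np.exp(-0.5 * (sep / L) ** 2)
cov[0] += sigma_o2
cov += rng.normal(0.0, 0.02, sep.size)           # sampling noise

# Least-squares fit of a * exp(-0.5 (r/L)^2) to the nonzero-separation bins for a
# fixed trial length scale (a full analysis would also search over L).
basis = np.exp(-0.5 * (sep[1:] / L) ** 2)
a_hat = basis @ cov[1:] / (basis @ basis)
print(f"background-error variance ~ {a_hat:.2f}, observation-error variance ~ {cov[0] - a_hat:.2f}")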


2015 ◽  
Vol 8 (3) ◽  
pp. 669-696 ◽  
Author(s):  
G. Descombes ◽  
T. Auligné ◽  
F. Vandenberghe ◽  
D. M. Barker ◽  
J. Barré

Abstract. The specification of state background error statistics is a key component of data assimilation, since it affects the impact observations will have on the analysis. In the variational data assimilation approach applied in the geophysical sciences, the dimensions of the background error covariance matrix (B) are usually too large for it to be determined explicitly, and B needs to be modeled. Recent efforts to include new variables in the analysis, such as cloud parameters and chemical species, have required the development of the code to GENerate the Background Errors (GEN_BE) version 2.0 for the Weather Research and Forecasting (WRF) community model. GEN_BE provides a simpler, flexible, robust, and community-oriented framework that gathers methods used by meteorological operational centers and researchers. We present the advantages of this new design for the data assimilation community by benchmarking different models of B and showing some of the new features in data assimilation test cases. As data assimilation for clouds remains a challenge, we present a multivariate approach that includes hydrometeors in the control variables and new correlated errors. In addition, the GEN_BE v2.0 code is employed to diagnose error parameter statistics for chemical species, which shows that it is flexible enough to implement new control variables. While the background error statistics generation code was first developed for atmospheric research, the new version (GEN_BE v2.0) can easily be applied to other domains of science to diagnose and model B. Initially developed for variational data assimilation, the model of the B matrix may be useful for ensemble-variational hybrid methods as well.
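A minimal sketch of the kind of statistics such a tool digests: NMC-style forecast differences are used as background error proxies, from which a vertical covariance and its leading modes are diagnosed for one control variable. The arrays are random stand-ins for real WRF forecast pairs; this is not the GEN_BE v2.0 code.

import numpy as np

rng = np.random.default_rng(5)
n_levels, n_cases = 40, 100
# Random stand-ins for pairs of forecasts of different lead times valid at the same hour.
fcst_24h = rng.standard_normal((n_cases, n_levels))
fcst_12h = fcst_24h + 0.3 * rng.standard_normal((n_cases, n_levels))

pert = fcst_24h - fcst_12h                       # NMC-method background error proxies
pert -= pert.mean(axis=0, keepdims=True)
B_vert = pert.T @ pert / (n_cases - 1)           # vertical covariance of one variable
eigvals = np.linalg.eigvalsh(B_vert)             # vertical mode (EOF) spectrum used to model B
print(f"leading vertical mode explains {eigvals[-1] / eigvals.sum():.1%} of the variance")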


2020 ◽  
Author(s):  
Ross Noel Bannister

Abstract. Following the development of the simplified atmospheric convective-scale "toy" model (the ABC model, named after its three key parameters: the pure gravity wave frequency, A, the controller of the acoustic wave speed, B, and the constant of proportionality between pressure and density perturbations, C), this paper introduces its associated variational data assimilation system, ABC-DA. The purpose of ABC-DA is to permit quick and efficient research into data assimilation methods suitable for convective-scale systems. The system can also be used as an aid to teach and demonstrate data assimilation principles. ABC-DA is flexible, configurable and efficient enough to be run on a personal computer. The system can run a number of assimilation methods (currently 3DVar and 3DFGAT have been implemented), with user-configurable observation networks. Observation operators for direct observations and wind speeds are part of the system, although these can be expanded relatively easily. A key feature of any data assimilation system is how it specifies the background error covariance matrix. ABC-DA uses a control variable transform method to do this efficiently. This version of ABC-DA mirrors many operational configurations by modelling multivariate error covariances with uncorrelated control parameters, and spatial error covariances with uncorrelated spatial patterns specified separately for each parameter. The software performs (amongst other things) model runs, calibration tasks associated with the background error covariance matrix, testing and diagnostic tasks, single data assimilation runs, and multi-cycle assimilation/forecast experiments, and it includes associated visualisation software. As a demonstration, the system is used to tackle a scientific question concerning the role of geostrophic balance (GB) in modelling background error covariances between the mass and wind fields. This question arises because, although GB is a very useful mechanism that is successfully exploited in larger-scale assimilation systems, its use is questionable at convective scales, where Rossby numbers are typically larger and GB is less relevant. A series of identical twin experiments is done in cycled assimilation configurations. One experiment exploits GB to represent mass-wind covariances in a mirror of an operational set-up (with use of an additional vertical regression (VR) step, as used operationally). This experiment performs badly, with assimilation error accumulating over time. Two further experiments are done: one that does not use GB, and another that does but without the VR step. Turning off GB impairs the performance, and turning off VR improves the performance in general. It is concluded that there is scope to further improve the way that the background error covariance matrices are calibrated, with some directions discussed.
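The control variable transform mentioned above can be sketched in a few lines: with B modelled as U U^T, the increment is written x' = U v and the quadratic cost J(v) = 1/2 v^T v + 1/2 (d - H U v)^T R^-1 (d - H U v) is minimised in control space. The dimensions, U, H, R, and innovation below are small random placeholders, not ABC-DA quantities.

import numpy as np

rng = np.random.default_rng(6)
n, m, k = 30, 10, 8                              # state, observation, control-space sizes
U = rng.standard_normal((n, k)) / np.sqrt(k)     # square-root model of B (B = U U^T)
H = rng.standard_normal((m, n)) / np.sqrt(n)     # linear observation operator
R_inv = np.eye(m) / 0.2                          # observation-error precision
d = rng.standard_normal(m)                       # innovation y - H(x_b)

# Minimise J(v) exactly (linear case): (I + (HU)^T R^-1 HU) v = (HU)^T R^-1 d.
HU = H @ U
A = np.eye(k) + HU.T @ R_inv @ HU                # Hessian of J in control space
v = np.linalg.solve(A, HU.T @ R_inv @ d)
x_inc = U @ v                                    # analysis increment in state space
print("increment norm:", round(float(np.linalg.norm(x_inc)), 3))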


2020 ◽  
Vol 13 (7) ◽  
pp. 3145-3177
Author(s):  
Dai Koshin ◽  
Kaoru Sato ◽  
Kazuyuki Miyazaki ◽  
Shingo Watanabe

Abstract. A data assimilation system based on a four-dimensional local ensemble transform Kalman filter (4D-LETKF) is developed to produce a new analysis dataset for the atmosphere up to the lower thermosphere using the Japanese Atmospheric General Circulation model for Upper Atmosphere Research. The study focuses on the period from 10 January to 20 February 2017, when an international radar network observation campaign was performed. The model resolution is T42L124, which can resolve phenomena at synoptic and larger scales. A conventional observation dataset provided by the National Centers for Environmental Prediction (PREPBUFR) and satellite temperature data from the Aura Microwave Limb Sounder (MLS) for the stratosphere and mesosphere are assimilated. First, the performance of the forecast model is improved by modifying the vertical profile of the horizontal diffusion coefficient and the source intensity in the non-orographic gravity wave parameterization, guided by comparison with radar wind observations in the mesosphere. Second, the MLS observational bias is estimated as a function of month and latitude and removed before the data assimilation. Third, data assimilation parameters, such as the gross error check criterion, localization length, inflation factor, and assimilation window, are optimized on the basis of a series of sensitivity tests. The effect of increasing the ensemble size is also examined. The resulting global data are evaluated by comparison with the Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA-2) reanalysis, which covers pressure levels up to 0.1 hPa, and with the mesospheric radar observations, which are not assimilated.
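The latitude-dependent bias removal described above can be sketched simply: mean observation-minus-background departures are computed per latitude bin (for a single month) and subtracted from the satellite temperatures before assimilation. The departures, bin width, and temperatures below are synthetic placeholders rather than MLS statistics.

import numpy as np

rng = np.random.default_rng(7)
lat = rng.uniform(-90.0, 90.0, 5000)                     # observation latitudes
# Synthetic observation-minus-background departures with a latitude-dependent bias.
departures = 0.5 * np.sin(np.radians(lat)) + rng.normal(0.0, 1.0, lat.size)

bins = np.arange(-90, 91, 10)                            # 10-degree latitude bins
idx = np.digitize(lat, bins) - 1
bias = np.array([departures[idx == b].mean() for b in range(len(bins) - 1)])

obs_corrected = (220.0 + departures) - bias[idx]         # bias-removed placeholder temperatures
print("estimated bias (K) per bin:", np.round(bias, 2))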

