Correcting observation model error in data assimilation

Franz Hamilton; Tyrus Berry; Timothy Sauer

doi:10.1063/1.5087151

Correcting Biased Observation Model Error in Data Assimilation

Monthly Weather Review ◽

10.1175/mwr-d-16-0428.1 ◽

2017 ◽

Vol 145 (7) ◽

pp. 2833-2853 ◽

Cited By ~ 13

Author(s):

Tyrus Berry ◽

John Harlim

Keyword(s):

Radiative Transfer ◽

Data Assimilation ◽

Learning Community ◽

Model Error ◽

Radiative Transfer Model ◽

Transfer Model ◽

Error Estimator ◽

Observation Model ◽

Nonparametric Likelihood ◽

Forecasting System

While the formulation of most data assimilation schemes assumes an unbiased observation model error, in real applications model error with nontrivial biases is unavoidable. A practical example is errors in the radiative transfer model (which is used to assimilate satellite measurements) in the presence of clouds. Together with the dynamical model error, the result is that many (in fact 99%) of the cloudy observed measurements are not being used although they may contain useful information. This paper presents a novel nonparametric Bayesian scheme that is able to learn the observation model error distribution and correct the bias in incoming observations. This scheme can be used in tandem with any data assimilation forecasting system. The proposed model error estimator uses nonparametric likelihood functions constructed with data-driven basis functions based on the theory of kernel embeddings of conditional distributions developed in the machine learning community. Numerically, positive results are shown with two examples. The first example is designed to produce a bimodality in the observation model error (typical of “cloudy” observations) by introducing obstructions to the observations that occur randomly in space and time. The second example, which is physically more realistic, is to assimilate cloudy satellite brightness temperature–like quantities, generated from a stochastic multicloud model for tropical convection and a simple radiative transfer model.
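As a rough illustration of the first example described in the abstract above, the sketch below generates observations that are randomly obstructed, which makes the observation model error bimodal and biased, and then removes a bias learned from training pairs. The bias estimator here is a plain Nadaraya-Watson kernel regression. It stands in for the paper's kernel-embedding estimator and is not the authors' method. The function names (`simulate`, `kernel_bias`), the toy observation operator h(x) = x, the availability of training pairs with known h(x), and all numerical values are assumptions made for illustration.

```python
import numpy as np

def simulate(n, rng, p_obstructed=0.3, shift=5.0, noise_std=0.5):
    """Return (h(x), y): clean model output and observations with random obstructions."""
    hx = rng.normal(0.0, 1.0, size=n)             # toy observation operator: h(x) = x
    obstructed = rng.random(n) < p_obstructed      # "cloudy" cases, random in time
    y = hx - shift * obstructed + rng.normal(0.0, noise_std, size=n)
    return hx, y

def kernel_bias(y_train, err_train, y_query, bandwidth=0.3):
    """Nadaraya-Watson estimate of E[error | y] with a Gaussian kernel."""
    d = (y_query[:, None] - y_train[None, :]) / bandwidth
    w = np.exp(-0.5 * d**2)
    return (w @ err_train) / (w.sum(axis=1) + 1e-300)

rng = np.random.default_rng(0)
hx_tr, y_tr = simulate(2000, rng)                  # training pairs (assumed available)
hx_te, y_te = simulate(2000, rng)                  # independent test set

err_tr = y_tr - hx_tr                              # bimodal observation model error
bias_hat = kernel_bias(y_tr, err_tr, y_te)         # learned conditional bias
y_corrected = y_te - bias_hat

rmse_raw = np.sqrt(np.mean((y_te - hx_te) ** 2))
rmse_corrected = np.sqrt(np.mean((y_corrected - hx_te) ** 2))
print(f"mean error: {err_tr.mean():.2f}, RMSE raw: {rmse_raw:.2f}, "
      f"RMSE corrected: {rmse_corrected:.2f}")
```

In this toy setting the two error modes are well separated, so conditioning on the observed value alone is enough to identify most obstructed cases. A realistic application would need richer conditioning variables.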

Bayesian Hierarchical Model Characterization of Model Error in Ocean Data Assimilation and Forecasts

10.21236/ada568491 ◽

2012 ◽

Author(s):

Ralph F. Milliff ◽

Christopher K. Wikle ◽

L. M. Berliner ◽

Radu Herbei

Keyword(s):

Data Assimilation ◽

Hierarchical Model ◽

Bayesian Hierarchical Model ◽

Model Error ◽

Bayesian Hierarchical ◽

Ocean Data Assimilation

Using machine learning to correct model error in data assimilation and forecast applications

Quarterly Journal of the Royal Meteorological Society ◽

10.1002/qj.4116 ◽

2021 ◽

Author(s):

Alban Farchi ◽

Patrick Laloyaux ◽

Massimo Bonavita ◽

Marc Bocquet

Keyword(s):

Machine Learning ◽

Data Assimilation ◽

Model Error ◽

Correct Model

Parameter Estimation Using Ensemble-Based Data Assimilation in the Presence of Model Error

Monthly Weather Review ◽

10.1175/mwr-d-14-00017.1 ◽

2015 ◽

Vol 143 (5) ◽

pp. 1568-1582 ◽

Cited By ~ 21

Author(s):

Juan Ruiz ◽

Manuel Pulido

Keyword(s):

Parameter Estimation ◽

Data Assimilation ◽

Model Error ◽

Forecast Skill ◽

Moist Convection ◽

Treatment Techniques ◽

Correction Technique ◽

Error Treatment ◽

Data Assimilation System ◽

Assimilation System

Abstract. This work explores the potential of online parameter estimation as a technique for model error treatment under an imperfect model scenario, in an ensemble-based data assimilation system, using a simple atmospheric general circulation model, and an observing system simulation experiment (OSSE) approach. Model error is introduced in the imperfect model scenario by changing the value of the parameters associated with different schemes. The parameters of the moist convection scheme are the only ones to be estimated in the data assimilation system. In this work, parameter estimation is compared and combined with techniques that account for the lack of ensemble spread and for the systematic model error. The OSSEs show that when parameter estimation is combined with model error treatment techniques, multiplicative and additive inflation or a bias correction technique, parameter estimation produces a further improvement of analysis quality and medium-range forecast skill with respect to the OSSEs with model error treatment techniques without parameter estimation. The improvement produced by parameter estimation is mainly a consequence of the optimization of the parameter values. The estimated parameters do not converge to the value used to generate the observations in the imperfect model scenario; however, the analysis error is reduced and the forecast skill is improved.
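The multiplicative and additive inflation mentioned in the abstract above can be sketched with their standard textbook definitions. The example applies them to an augmented ensemble whose last column holds an estimated parameter, as in state-augmentation approaches to online parameter estimation. The function names, ensemble size, inflation factor, and covariance are hypothetical and are not taken from the paper.

```python
import numpy as np

def multiplicative_inflation(ensemble, rho):
    """Scale deviations from the ensemble mean by rho (rho > 1 increases spread)."""
    mean = ensemble.mean(axis=0, keepdims=True)
    return mean + rho * (ensemble - mean)

def additive_inflation(ensemble, q_cov, rng):
    """Add an independent N(0, q_cov) perturbation to every ensemble member."""
    n_members, n_vars = ensemble.shape
    noise = rng.multivariate_normal(np.zeros(n_vars), q_cov, size=n_members)
    return ensemble + noise

rng = np.random.default_rng(1)
# Hypothetical augmented ensemble: 20 members, 3 state variables + 1 parameter.
augmented = rng.normal(size=(20, 4))

inflated = multiplicative_inflation(augmented, rho=1.1)
perturbed = additive_inflation(inflated, q_cov=0.01 * np.eye(4), rng=rng)

# Multiplicative inflation keeps the mean and scales the variance by rho**2.
print(inflated.var(axis=0) / augmented.var(axis=0))
```

Multiplicative inflation leaves the ensemble mean unchanged, while additive inflation also injects spread in directions the ensemble does not already span.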

TREATMENT OF THE ERROR DUE TO UNRESOLVED SCALES IN SEQUENTIAL DATA ASSIMILATION

International Journal of Bifurcation and Chaos ◽

10.1142/s0218127411030775 ◽

2011 ◽

Vol 21 (12) ◽

pp. 3619-3626 ◽

Cited By ~ 7

Author(s):

ALBERTO CARRASSI ◽

STÉPHANE VANNITSEM

Keyword(s):

Dynamical System ◽

Kalman Filter ◽

Data Assimilation ◽

Large Scale ◽

Model Error ◽

Environmental Modeling ◽

Sequential Data ◽

Chaotic Dynamical System ◽

Error Covariance ◽

Sequential Data Assimilation

In this paper, a method to account for model error due to unresolved scales in sequential data assimilation is proposed. An equation for the model error covariance required in the extended Kalman filter update is derived, along with an approximation suitable for application with the large-scale dynamics typical of environmental modeling. This approach is tested in the context of a low-order chaotic dynamical system. The results show that the filter skill is significantly improved by implementing the proposed scheme for the treatment of the unresolved scales.
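For context, the place where a model error covariance enters the extended Kalman filter is the forecast step, shown here in its standard textbook form. The symbols are assumptions of this sketch: $\mathbf{M}_k$ is the tangent linear model, $\mathbf{P}^a_{k-1}$ the previous analysis error covariance, and $\mathbf{Q}_k$ the model error covariance. The paper's specific expression for $\mathbf{Q}_k$ due to unresolved scales is not reproduced here.

```latex
\mathbf{P}^f_k = \mathbf{M}_k \, \mathbf{P}^a_{k-1} \, \mathbf{M}_k^{\mathsf T} + \mathbf{Q}_k
```

Setting $\mathbf{Q}_k = \mathbf{0}$ recovers the perfect-model filter, which tends to underestimate forecast uncertainty when part of the dynamics is unresolved.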

Deterministic Treatment of Model Error in Geophysical Data Assimilation

Mathematical Paradigms of Climate Science - Springer INdAM Series ◽

10.1007/978-3-319-39092-5_9 ◽

2016 ◽

pp. 175-213 ◽

Cited By ~ 5

Author(s):

Alberto Carrassi ◽

Stéphane Vannitsem

Keyword(s):

Data Assimilation ◽

Model Error ◽

Geophysical Data

Using machine learning to correct model error in data assimilation and forecast applications

10.5194/egusphere-egu21-4007 ◽

2021 ◽

Cited By ~ 2

Author(s):

Alban Farchi ◽

Patrick Laloyaux ◽

Massimo Bonavita ◽

Marc Bocquet

Keyword(s):

Machine Learning ◽

Data Assimilation ◽

Data Science ◽

Weather Prediction ◽

Realistic Model ◽

Model Error ◽

Underlying Assumption ◽

Correct Model ◽

Recent Developments ◽

Spatiotemporal Processes

Recent developments in machine learning (ML) have demonstrated impressive skills in reproducing complex spatiotemporal processes. However, contrary to data assimilation (DA), the underlying assumption behind ML methods is that the system is fully observed and without noise, which is rarely the case in numerical weather prediction. In order to circumvent this issue, it is possible to embed the ML problem into a DA formalism characterised by a cost function similar to that of the weak-constraint 4D-Var (Bocquet et al., 2019; Bocquet et al., 2020). In practice ML and DA are combined to solve the problem: DA is used to estimate the state of the system while ML is used to estimate the full model. In realistic systems, the model dynamics can be very complex and it may not be possible to reconstruct them from scratch. An alternative could be to learn the model error of an already existing model using the same approach combining DA and ML. In this presentation, we test the feasibility of this method using a quasi-geostrophic (QG) model. After a brief description of the QG model, we introduce a realistic model error to be learnt. We then assess the potential of ML methods to reconstruct this model error, first with perfect (full and noiseless) observations and then with sparse and noisy observations. We show in either case to what extent the trained ML models correct the mid-term forecasts. Finally, we show how the trained ML models can be used in a DA system and to what extent they correct the analysis.

Bocquet, M., Brajard, J., Carrassi, A., and Bertino, L.: Data assimilation as a learning tool to infer ordinary differential equation representations of dynamical models, Nonlin. Processes Geophys., 26, 143–162, 2019.

Bocquet, M., Brajard, J., Carrassi, A., and Bertino, L.: Bayesian inference of chaotic dynamics by merging data assimilation, machine learning and expectation-maximization, Foundations of Data Science, 2 (1), 55–80, 2020.

Farchi, A., Laloyaux, P., Bonavita, M., and Bocquet, M.: Using machine learning to correct model error in data assimilation and forecast applications, arXiv:2010.12605, submitted.
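The abstract above refers to a cost function similar to that of weak-constraint 4D-Var. A generic textbook form of that cost function is given below as a reminder. The notation is an assumption of this sketch and may differ from the cited papers: $\mathbf{x}_k$ is the state at time $k$, $\mathbf{x}^b$ the background, $\mathcal{H}_k$ the observation operator, $\mathcal{M}_k$ the model, and $\mathbf{B}$, $\mathbf{R}_k$, $\mathbf{Q}_k$ the background, observation, and model error covariances.

```latex
J(\mathbf{x}_0, \dots, \mathbf{x}_K)
  = \tfrac{1}{2} \bigl\| \mathbf{x}_0 - \mathbf{x}^b \bigr\|^2_{\mathbf{B}^{-1}}
  + \tfrac{1}{2} \sum_{k=0}^{K} \bigl\| \mathbf{y}_k - \mathcal{H}_k(\mathbf{x}_k) \bigr\|^2_{\mathbf{R}_k^{-1}}
  + \tfrac{1}{2} \sum_{k=1}^{K} \bigl\| \mathbf{x}_k - \mathcal{M}_k(\mathbf{x}_{k-1}) \bigr\|^2_{\mathbf{Q}_k^{-1}},
\qquad \| \mathbf{v} \|^2_{\mathbf{A}} = \mathbf{v}^{\mathsf T} \mathbf{A} \, \mathbf{v}
```

In the combined DA and ML setting described above, one would presumably let $\mathcal{M}_k$ depend on trainable parameters and minimise over both the states and those parameters.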

Ensemble Data Assimilation Using a Unified Representation of Model Error

Monthly Weather Review ◽

10.1175/mwr-d-15-0270.1 ◽

2015 ◽

Vol 144 (1) ◽

pp. 213-224 ◽

Cited By ~ 12

Author(s):

Chiara Piccolo ◽

Mike Cullen

Keyword(s):

Data Assimilation ◽

Covariance Structure ◽

Model Error ◽

Model Errors ◽

Stochastic Forcing ◽

Ensemble Forecasts ◽

Verification Methods ◽

Physically Based ◽

Set Up ◽

Initial Uncertainty

Abstract. A natural way to set up an ensemble forecasting system is to use a model with additional stochastic forcing representing the model error and to derive the initial uncertainty by using an ensemble of analyses generated with this model. Current operational practice has tended to separate the problems of generating initial uncertainty and forecast uncertainty. Thus, in ensemble forecasts, it is normal to use physically based stochastic forcing terms to represent model errors, while in generating analysis uncertainties, artificial inflation methods are used to ensure that the analysis spread is sufficient given the observations. In this paper a more unified approach is tested that uses the same stochastic forcing in the analyses and forecasts and estimates the model error forcing from data assimilation diagnostics. This is shown to be successful if there are sufficient observations. Ensembles used in data assimilation have to be reliable in a broader sense than the usual forecast verification methods; in particular, they need to have the correct covariance structure, which is demonstrated.

EnKF and 4D-Var data assimilation with chemical transport model BASCOE (version 05.06)

Geoscientific Model Development ◽

10.5194/gmd-9-2893-2016 ◽

2016 ◽

Vol 9 (8) ◽

pp. 2893-2908 ◽

Cited By ~ 10

Author(s):

Sergey Skachko ◽

Richard Ménard ◽

Quentin Errera ◽

Yves Christophe ◽

Simon Chabrillat

Keyword(s):

Data Assimilation ◽

Transport Model ◽

Chemical Transport ◽

Error Variance ◽

Model Error ◽

Chemical Processes ◽

Observation Error ◽

Chemical Transport Model ◽

Error Covariance ◽

Chemical Tracer

Abstract. We compare two optimized chemical data assimilation systems, one based on the ensemble Kalman filter (EnKF) and the other based on four-dimensional variational (4D-Var) data assimilation, using a comprehensive stratospheric chemistry transport model (CTM). This work is an extension of the Belgian Assimilation System for Chemical ObsErvations (BASCOE), initially designed to work with a 4D-Var data assimilation. A strict comparison of both methods in the case of chemical tracer transport was done in a previous study and indicated that both methods provide essentially similar results. In the present work, we assimilate observations of ozone, HCl, HNO3, H2O and N2O from EOS Aura-MLS data into the BASCOE CTM with a full description of stratospheric chemistry. Two new issues related to the use of the full chemistry model with EnKF are taken into account. One issue is a large number of error variance parameters that need to be optimized. We estimate an observation error variance parameter as a function of pressure level for each observed species using the Desroziers method. For comparison purposes, we apply the same estimate procedure in the 4D-Var data assimilation, where both scale factors of the background and observation error covariance matrices are estimated using the Desroziers method. However, in EnKF the background error covariance is modelled using the full chemistry model and a model error term which is tuned using an adjustable parameter. We found that it is adequate to have the same value of this parameter based on the chemical tracer formulation that is applied for all observed species. This is an indication that the main source of model error in chemical transport model is due to the transport. The second issue in EnKF with comprehensive atmospheric chemistry models is the noise in the cross-covariance between species that occurs when species are weakly chemically related at the same location. These errors need to be filtered out in addition to a localization based on distance. The performance of two data assimilation methods was assessed through an 8-month long assimilation of limb sounding observations from EOS Aura MLS. This paper discusses the differences in results and their relation to stratospheric chemical processes. Generally speaking, EnKF and 4D-Var provide results of comparable quality but differ substantially in the presence of model error or observation biases. If the erroneous chemical modelling is associated with moderately fast chemical processes, but whose lifetimes are longer than the model time step, then EnKF performs better, while 4D-Var develops spurious increments in the chemically related species. If, however, the observation biases are significant, then 4D-Var is more robust and is able to reject erroneous observations while EnKF does not.
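The Desroziers method named in the abstract above estimates error variances from products of observation-minus-background, observation-minus-analysis, and analysis-minus-background residuals. The sketch below checks the underlying identities in a scalar toy problem with an identity observation operator and an optimal gain. The variable names, sample size, and variances are assumptions for illustration. The BASCOE setup, which estimates variances per species and pressure level, is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
b_var, r_var = 2.0, 0.5     # assumed true background / observation error variances

truth = rng.normal(size=n)
background = truth + rng.normal(0.0, np.sqrt(b_var), size=n)
obs = truth + rng.normal(0.0, np.sqrt(r_var), size=n)    # observation operator H = identity

# Scalar analysis with the optimal gain K = B / (B + R).
gain = b_var / (b_var + r_var)
analysis = background + gain * (obs - background)

d_ob = obs - background      # observation minus background (innovation)
d_oa = obs - analysis        # observation minus analysis
d_ab = analysis - background # analysis minus background (increment)

# Desroziers diagnostics: E[d_oa * d_ob] = R and E[d_ab * d_ob] = H B H^T.
r_est = np.mean(d_oa * d_ob)
b_est = np.mean(d_ab * d_ob)
print(f"estimated R: {r_est:.3f}, estimated B: {b_est:.3f}")
```

The identities hold exactly only when the gain is consistent with the true error statistics. In practice the diagnostics are typically iterated, using each estimate to retune the assumed variances.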

The Onsager–Machlup functional for data assimilation

Nonlinear Processes in Geophysics ◽

10.5194/npg-24-701-2017 ◽

2017 ◽

Vol 24 (4) ◽

pp. 701-712 ◽

Cited By ~ 2

Author(s):

Nozomi Sugiura

Keyword(s):

Data Assimilation ◽

Prior Distribution ◽

Numerical Experiments ◽

Time Estimation ◽

Model Error ◽

Theoretical Studies ◽

Drift Term ◽

Estimation Problems ◽

Large Systems ◽

A New Technique

Abstract. When taking the model error into account in data assimilation, one needs to evaluate the prior distribution represented by the Onsager–Machlup functional. Through numerical experiments, this study clarifies how the prior distribution should be incorporated into cost functions for discrete-time estimation problems. Consistent with previous theoretical studies, the divergence of the drift term is essential in weak-constraint 4D-Var (w4D-Var), but it is not necessary in Markov chain Monte Carlo with the Euler scheme. Although the former property may cause difficulties when implementing w4D-Var in large systems, this paper proposes a new technique for estimating the divergence term and its derivative.
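For orientation, the Onsager–Machlup functional discussed in the abstract above has the following standard form for a diffusion with additive noise. The notation is an assumption of this sketch and may differ from the paper: the process is $d\mathbf{x} = \mathbf{f}(\mathbf{x})\,dt + \sigma\, d\mathbf{W}$, and a smooth path $\mathbf{x}(t)$ on $[0, T]$ has probability weight proportional to $\exp(-\mathrm{OM}[\mathbf{x}])$.

```latex
\mathrm{OM}[\mathbf{x}]
  = \int_0^T \left[ \frac{1}{2\sigma^2} \bigl| \dot{\mathbf{x}}(t) - \mathbf{f}(\mathbf{x}(t)) \bigr|^2
  + \frac{1}{2} \, \nabla \cdot \mathbf{f}(\mathbf{x}(t)) \right] dt
```

By contrast, the negative log density of an Euler-discretised path with time step $\Delta t$ contains only the quadratic term, with no divergence term, which matches the abstract's statement about Markov chain Monte Carlo with the Euler scheme:

```latex
J_{\text{Euler}}(\mathbf{x}_0, \dots, \mathbf{x}_N)
  = \sum_{k=0}^{N-1} \frac{1}{2\sigma^2 \Delta t}
    \bigl| \mathbf{x}_{k+1} - \mathbf{x}_k - \mathbf{f}(\mathbf{x}_k)\, \Delta t \bigr|^2
```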
