Estimation of Bayesian a posteriori probabilities with an autonomously learning neural network

Author(s):  
C.P. Lim
2018 ◽  
Vol 11 (8) ◽  
pp. 4627-4643 ◽  
Author(s):  
Simon Pfreundschuh ◽  
Patrick Eriksson ◽  
David Duncan ◽  
Bengt Rydberg ◽  
Nina Håkansson ◽  
...  

Abstract. A neural-network-based method, quantile regression neural networks (QRNNs), is proposed as a novel approach to estimating the a posteriori distribution of Bayesian remote sensing retrievals. The advantage of QRNNs over conventional neural network retrievals is that they learn to predict not only a single retrieval value but also the associated, case-specific uncertainties. In this study, the retrieval performance of QRNNs is characterized and compared to that of other state-of-the-art retrieval methods. A synthetic retrieval scenario is presented and used as a validation case for the application of QRNNs to Bayesian retrieval problems. The QRNN retrieval performance is evaluated against Markov chain Monte Carlo simulation and another Bayesian method based on Monte Carlo integration over a retrieval database. The scenario is also used to investigate how different hyperparameter configurations and training set sizes affect the retrieval performance. In the second part of the study, QRNNs are applied to the retrieval of cloud top pressure from observations by the Moderate Resolution Imaging Spectroradiometer (MODIS). It is shown that QRNNs are not only capable of achieving similar accuracy to standard neural network retrievals but also provide statistically consistent uncertainty estimates for non-Gaussian retrieval errors. The results presented in this work show that QRNNs are able to combine the flexibility and computational efficiency of the machine learning approach with the theoretically sound handling of uncertainties of the Bayesian framework. Together with this article, a Python implementation of QRNNs is released through a public repository to make the method available to the scientific community.
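Quantile regression is commonly trained by minimizing the pinball (quantile) loss. The sketch below is not taken from the released QRNN implementation: the function names and the synthetic exponential data are assumptions chosen for illustration. It shows the loss itself and checks that the constant minimizing it for level tau lies close to the tau-quantile of the data.

```python
import math
import random


def pinball_loss(y_true, y_pred, tau):
    """Pinball (quantile) loss for one sample and one quantile level tau."""
    err = y_true - y_pred
    return max(tau * err, (tau - 1.0) * err)


def mean_quantile_loss(targets, predictions, taus):
    """Average pinball loss over all samples and quantile levels.

    predictions[i][k] is the predicted taus[k]-quantile for sample i.
    """
    total = 0.0
    for y, preds in zip(targets, predictions):
        for tau, q in zip(taus, preds):
            total += pinball_loss(y, q, tau)
    return total / (len(targets) * len(taus))


# Hypothetical check on synthetic data: draw from an exponential
# distribution and search a grid for the constant prediction that
# minimizes the loss at tau = 0.9.
rng = random.Random(0)
samples = [rng.expovariate(1.0) for _ in range(5000)]
tau = 0.9
candidates = [0.02 * i for i in range(251)]  # grid from 0.00 to 5.00
losses = [
    mean_quantile_loss(samples, [[c]] * len(samples), [tau])
    for c in candidates
]
best = candidates[losses.index(min(losses))]
empirical_quantile = sorted(samples)[int(tau * len(samples))]
```

Here `best` should sit within about one grid step of `empirical_quantile`, and both should be near the analytic 0.9-quantile of the unit exponential distribution, ln 10. A network with one output per quantile level and this loss summed over levels would, under the same reasoning, learn conditional quantiles of the retrieval quantity.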


2021 ◽  
Author(s):  
Yifei Guan ◽  
Ashesh Chattopadhyay ◽  
Adam Subel ◽  
Pedram Hassanzadeh

In large eddy simulations (LES), the subgrid-scale effects are modeled by physics-based or data-driven methods. This work develops a convolutional neural network (CNN) to model the subgrid-scale effects of a two-dimensional turbulent flow. The model is able to capture both the inter-scale forward energy transfer and backscatter in both a priori and a posteriori analyses. The LES-CNN model outperforms the physics-based eddy-viscosity models and the previously proposed local artificial neural network (ANN) models in both short-term prediction and long-term statistics. Transfer learning is implemented to generalize the method for turbulence modeling at higher Reynolds numbers. An encoder-decoder network architecture is proposed to generalize the model to a higher computational grid resolution.


2018 ◽  
Author(s):  
Simon Pfreundschuh ◽  
Patrick Eriksson ◽  
David Duncan ◽  
Bengt Rydberg ◽  
Nina Håkansson ◽  
...  

Abstract. This work is concerned with the retrieval of physical quantities from remote sensing measurements. A neural-network-based method, quantile regression neural networks (QRNNs), is proposed as a novel approach to estimating the a posteriori distribution of Bayesian remote sensing retrievals. The advantage of QRNNs over conventional neural network retrievals is that they learn to predict not only a single retrieval value but also the associated, case-specific uncertainties. In this study, the retrieval performance of QRNNs is characterized and compared to that of other state-of-the-art retrieval methods. A synthetic retrieval scenario is presented and used as a validation case for the application of QRNNs to Bayesian retrieval problems. The QRNN retrieval performance is evaluated against Markov chain Monte Carlo simulation and another Bayesian method based on Monte Carlo integration over a retrieval database. The scenario is also used to investigate how different hyperparameter configurations and training set sizes affect the retrieval performance. In the second part of the study, QRNNs are applied to the retrieval of cloud top pressure from observations by the Moderate Resolution Imaging Spectroradiometer (MODIS). It is shown that QRNNs are not only capable of achieving similar accuracy to standard neural network retrievals but also provide statistically consistent uncertainty estimates for non-Gaussian retrieval errors. The results presented in this work show that QRNNs are able to combine the flexibility and computational efficiency of the machine learning approach with the theoretically sound handling of uncertainties of the Bayesian framework. Together with this article, a Python implementation of QRNNs is released through a public repository to make the method available to the scientific community.
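The reference method mentioned in the abstract, Monte Carlo integration over a retrieval database, can be illustrated as importance weighting: each database entry is weighted by the likelihood of the observation given that entry's simulated observation, and the posterior mean is the weighted average of the stored states. The sketch below is a minimal illustration and not the authors' code. It assumes independent Gaussian observation noise and a database whose states were drawn from the prior; the function name, the identity forward model and all numbers are hypothetical.

```python
import math
import random


def bmci_posterior_mean(y_obs, database, noise_std):
    """Posterior mean of the state by weighting a retrieval database.

    database:  list of (x, y_sim) pairs, with x a scalar state drawn
               from the prior and y_sim its simulated observation vector.
    noise_std: per-channel standard deviation of the (assumed
               independent, Gaussian) observation noise.
    """
    log_weights = []
    for _, y_sim in database:
        chi2 = sum(
            ((yo - ys) / s) ** 2
            for yo, ys, s in zip(y_obs, y_sim, noise_std)
        )
        log_weights.append(-0.5 * chi2)
    shift = max(log_weights)  # subtract the maximum to avoid underflow
    weights = [math.exp(lw - shift) for lw in log_weights]
    norm = sum(weights)
    return sum(w * x for w, (x, _) in zip(weights, database)) / norm


# Hypothetical linear-Gaussian check: prior x ~ N(0, 1), identity
# forward model, noise standard deviation 0.5, observation y = 1.0.
# The analytic posterior mean is y / (1 + 0.5**2) = 0.8.
rng = random.Random(1)
database = []
for _ in range(50000):
    x = rng.gauss(0.0, 1.0)
    database.append((x, [x]))
posterior_mean = bmci_posterior_mean([1.0], database, [0.5])
```

The same weights can be reused to estimate other posterior statistics, such as the posterior cumulative distribution, which is what makes the method a natural benchmark for predicted quantiles.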


Geophysics ◽  
2011 ◽  
Vol 76 (2) ◽  
pp. E45-E58 ◽  
Author(s):  
Mohammad S. Shahraeeni ◽  
Andrew Curtis

We have developed an extension of the mixture-density neural network as a computationally efficient probabilistic method to solve nonlinear inverse problems. In this method, any postinversion (a posteriori) joint probability density function (PDF) over the model parameters is represented by a weighted sum of multivariate Gaussian PDFs. A mixture-density neural network estimates the weights, mean vector, and covariance matrix of the Gaussians given any measured data set. In one study, we have jointly inverted compressional- and shear-wave velocity for the joint PDF of porosity, clay content, and water saturation in a synthetic, fluid-saturated, dispersed sand-shale system. Results show that if the method is applied appropriately, the joint PDF estimated by the neural network is comparable to the Monte Carlo sampled a posteriori solution of the inverse problem. However, the computational cost of training and using the neural network is much lower than inversion by sampling (more than a factor of 10^4 in this case and potentially a much larger factor for 3D seismic inversion). To analyze the performance of the method on real exploration geophysical data, we have jointly inverted P-wave impedance and Poisson’s ratio logs for the joint PDF of porosity and clay content. Results show that the posterior model PDF of porosity and clay content is a good estimate of actual porosity and clay-content log values. Although the results may vary from one field to another, this fast, probabilistic method of solving nonlinear inverse problems can be applied to invert well logs and large seismic data sets for petrophysical parameters in any field.
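The representation described above, a posterior PDF written as a weighted sum of multivariate Gaussians, can be evaluated as in the following sketch. It covers only the evaluation of the mixture once weights, means and covariance matrices are available; the network that would predict them is not shown. All function names and numerical values are assumptions for illustration, not parameters from the paper.

```python
import math


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def gaussian_pdf(x, mean, cov):
    """Density of a multivariate Gaussian at point x."""
    n = len(x)
    low = cholesky(cov)
    diff = [xi - mi for xi, mi in zip(x, mean)]
    # Forward substitution solves low * z = diff, so that the squared
    # Mahalanobis distance is the squared norm of z.
    z = []
    for i in range(n):
        z.append((diff[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    maha = sum(zi * zi for zi in z)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return math.exp(-0.5 * (maha + log_det + n * math.log(2.0 * math.pi)))


def mixture_pdf(x, weights, means, covs):
    """Weighted sum of Gaussian densities; weights are assumed to sum to one."""
    return sum(w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs))


# Hypothetical two-component mixture over two model parameters.
weights = [0.3, 0.7]
means = [[0.10, 0.20], [0.25, 0.05]]
covs = [[[0.01, 0.002], [0.002, 0.02]], [[0.02, 0.0], [0.0, 0.01]]]
density = mixture_pdf([0.2, 0.1], weights, means, covs)
```

In a mixture-density network the three lists would be outputs of the network for a given data vector, typically with a softmax on the weights and a parameterization that keeps each covariance matrix positive definite.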


1991 ◽  
Vol 3 (4) ◽  
pp. 461-483 ◽  
Author(s):  
Michael D. Richard ◽  
Richard P. Lippmann

Many neural network classifiers provide outputs which estimate Bayesian a posteriori probabilities. When the estimation is accurate, network outputs can be treated as probabilities and sum to one. Simple proofs show that Bayesian probabilities are estimated when desired network outputs are 1 of M (one output unity, all others zero) and a squared-error or cross-entropy cost function is used. Results of Monte Carlo simulations performed using multilayer perceptron (MLP) networks trained with backpropagation, radial basis function (RBF) networks, and high-order polynomial networks graphically demonstrate that network outputs provide good estimates of Bayesian probabilities. Estimation accuracy depends on network complexity, the amount of training data, and the degree to which training data reflect true likelihood distributions and a priori class probabilities. Interpretation of network outputs as Bayesian probabilities allows outputs from multiple networks to be combined for higher level decision making, simplifies creation of rejection thresholds, makes it possible to compensate for differences between pattern class probabilities in training and test data, allows outputs to be used to minimize alternative risk functions, and suggests alternative measures of network performance.
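The central claim can be checked numerically in a toy setting. The sketch below replaces the network with the most flexible possible model, a lookup table with one free output vector per input value, so that the squared-error minimizer has a closed form: the mean of the 1-of-M targets observed for that input. The class priors and likelihoods are hypothetical numbers chosen for illustration, and the table is a stand-in for, not an instance of, the MLP, RBF and polynomial networks studied in the paper.

```python
import random

# Hypothetical two-class problem with a discrete feature x in {0, 1, 2}.
priors = [0.3, 0.7]                 # a priori class probabilities
likelihoods = [[0.6, 0.3, 0.1],     # P(x | class 0)
               [0.1, 0.3, 0.6]]     # P(x | class 1)


def bayes_posterior(x):
    """Exact a posteriori class probabilities from Bayes' rule."""
    joint = [priors[c] * likelihoods[c][x] for c in range(2)]
    total = sum(joint)
    return [j / total for j in joint]


# Draw training pairs and accumulate the 1-of-M targets per input value.
rng = random.Random(0)
target_sums = [[0.0, 0.0] for _ in range(3)]
counts = [0, 0, 0]
for _ in range(100000):
    c = rng.choices([0, 1], weights=priors)[0]
    x = rng.choices([0, 1, 2], weights=likelihoods[c])[0]
    counts[x] += 1
    target_sums[x][c] += 1.0        # one output unity, the other zero

# For each x, the output that minimizes the summed squared error is the
# mean target vector.
ls_outputs = [[s / counts[x] for s in target_sums[x]] for x in range(3)]

max_error = max(
    abs(ls_outputs[x][c] - bayes_posterior(x)[c])
    for x in range(3)
    for c in range(2)
)
```

With this many samples `max_error` is small, and each row of `ls_outputs` sums to one, in line with the abstract. A finite network only approximates this minimizer, which is why the estimation accuracy depends on network complexity and on the amount of training data.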


1990 ◽  
Vol 2 (2) ◽  
pp. 216-225 ◽  
Author(s):  
Reza Shadmehr ◽  
David Z. D'Argenio

The feasibility of developing a neural network to perform nonlinear Bayesian estimation from sparse data is explored using an example from clinical pharmacology. The problem involves estimating parameters of a dynamic model describing the pharmacokinetics of the bronchodilator theophylline from limited plasma concentration measurements of the drug obtained from a patient. The estimation performance of a backpropagation-trained network is compared to that of the maximum likelihood estimator as well as the maximum a posteriori probability estimator. In the example considered, the estimator prediction errors (model parameters and outputs) obtained from the trained neural network were similar to those obtained using the nonlinear Bayesian estimator.


Author(s):  
Daniela Danciu ◽  
Vladimir Rasvan

All neural networks, both natural and artificial, are characterized by two kinds of dynamics. The first is what we would call the “learning dynamics”: the sequential (discrete-time) dynamics of the choice of synaptic weights. The second is the intrinsic dynamics of the neural network viewed as a dynamical system after the weights have been established via learning. Regarding the second, the emergent computational capabilities of a recurrent neural network can be achieved provided it has many equilibria, and the network task is achieved provided it approaches these equilibria. However, the dynamics of this system is induced a posteriori by the learning process that established the synaptic weights. This a posteriori dynamics does not necessarily have the required properties, so they have to be checked separately. The standard stability properties (Lyapunov, asymptotic and exponential stability) are defined for a single equilibrium. Their counterparts for several equilibria are mutability, global asymptotics and gradient behavior. For the definitions of these general concepts the reader is referred to Gelig et al. (1978) and Leonov et al. (1992). In recent decades the number of applications of recurrent neural networks has increased; they have been designed for classification, identification and complex image, visual and spatio-temporal processing in fields such as engineering, chemistry, biology and medicine (see, for instance, Fortuna et al., 2001; Fink, 2004; Atencia et al., 2004; Iwahori et al., 2005; Maurer et al., 2005; Guirguis & Ghoneimy, 2007). All these applications rely mainly on the existence of several equilibria for such networks, which requires the “good behavior” properties discussed above. Another aspect of the qualitative analysis is the so-called synchronization problem, in which an external stimulus, in most cases periodic or almost periodic, has to be tracked (Gelig, 1982; Danciu, 2002). From the mathematical point of view, this problem is nothing more than the existence, uniqueness and global stability of forced oscillations.

