Missing data simulation inside flow rate time-series using multiple-point statistics

2016 ◽  
Vol 86 ◽  
pp. 264-276 ◽  
Author(s):  
Fabio Oriani ◽  
Andrea Borghi ◽  
Julien Straubhaar ◽  
Grégoire Mariethoz ◽  
Philippe Renard
Geomorphology ◽  
2014 ◽  
Vol 214 ◽  
pp. 148-156 ◽  
Author(s):  
Guillaume Pirot ◽  
Julien Straubhaar ◽  
Philippe Renard

2021 ◽  
Author(s):  
Fabio Oriani ◽  
Gregoire Mariethoz

In the early 2000s [1], multiple-point statistics (MPS) was introduced as a novel geostatistical approach to explore the variability of natural phenomena in a realistic way by observing and simulating data patterns, appreciably improving the preservation of the connectivity and shape of the modeled structures.
MPS usually requires complete and representative training images (TIs) that show clear and possibly redundant examples of the studied structures. In everyday practice, however, this information is often only partially or scarcely available, which strongly limits the use of MPS.
In this presentation we start with an overview of MPS strategies proposed to overcome training data limitations. We consider different examples of multisite rain-gauge networks containing sparse data gaps, with the goal of estimating the missing data using the same incomplete dataset as the TI [2]. Another case study concerns the use of 2D training images of geological outcrops to reconstruct a 3D volume of fluvioglacial deposits [3].
We then consider a common problem in hydroclimatological studies: the bias correction of weather radar images with ground rainfall measurements. This is a typical no-TI problem, in which no example of an unbiased grid image is available to train MPS. In this case, we propose a novel pattern-to-point approach, in which we create a catalog of local grid patterns, each one associated with a rainfall measurement. The MPS algorithm then 1) selects ungauged locations, 2) searches for similar grid patterns in the catalog, and 3) projects the linked historical ground measurements onto the ungauged locations.
From early results, this technique seems to recover hidden spatial patterns, which correct the highly non-linear bias by extracting information from the pattern-to-point catalog. This is a first step for MPS towards the use of TIs integrating variables of different dimensionality, opening a new methodological path for future research.
BIBLIOGRAPHY
[1] Strebelle, S. "Conditional simulation of complex geological structures using multiple-point statistics." Mathematical Geology 34.1 (2002): 1-21.
[2] Oriani, F. et al. "Missing data imputation for multisite rainfall networks: a comparison between geostatistical interpolation and pattern-based estimation on different terrain types." Journal of Hydrometeorology 21.10 (2020): 2325-2341.
[3] Kessler, T. et al. "Modeling fine-scale geological heterogeneity—examples of sand lenses in tills." Groundwater 51.5 (2013): 692-705.
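The three steps of the pattern-to-point idea described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the square patch shape, the Euclidean distance, and the nearest-neighbour selection are all assumptions made for illustration (an actual MPS algorithm would typically sample among acceptable patterns rather than take the single closest one).

```python
from math import sqrt

def extract_patch(grid, row, col, half):
    """Return the square neighbourhood around (row, col) as a flat list, or None at the border."""
    n_rows, n_cols = len(grid), len(grid[0])
    if row - half < 0 or col - half < 0 or row + half >= n_rows or col + half >= n_cols:
        return None
    return [grid[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]

def build_catalog(radar_images, gauge_records, half=1):
    """Pair each radar patch centred on a gauge with the rainfall measured there.

    gauge_records (hypothetical layout): one dict {(row, col): rainfall} per radar image.
    """
    catalog = []
    for grid, gauges in zip(radar_images, gauge_records):
        for (row, col), rain in gauges.items():
            patch = extract_patch(grid, row, col, half)
            if patch is not None:
                catalog.append((patch, rain))
    return catalog

def estimate_at(grid, row, col, catalog, half=1, k=1):
    """Estimate rainfall at an ungauged cell from the k most similar catalog patterns."""
    patch = extract_patch(grid, row, col, half)
    if patch is None or not catalog:
        return None
    # Step 2: rank catalog patterns by Euclidean distance to the local pattern.
    ranked = sorted(
        (sqrt(sum((a - b) ** 2 for a, b in zip(patch, cat_patch))), rain)
        for cat_patch, rain in catalog
    )
    # Step 3: project the linked ground measurements (here, their average).
    nearest = ranked[:k]
    return sum(rain for _, rain in nearest) / len(nearest)

# Toy history: two radar images, each with one gauge at the centre cell.
history = [
    [[1, 1, 1], [1, 5, 1], [1, 1, 1]],
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
]
gauges = [{(1, 1): 3.0}, {(1, 1): 0.0}]
catalog = build_catalog(history, gauges)

# Step 1: pick an ungauged location in a new radar image.
new_image = [[1, 1, 1], [1, 4, 1], [1, 1, 1]]
estimate = estimate_at(new_image, 1, 1, catalog)
```

In this toy case the new pattern is closest to the first catalog entry, so the estimate takes that entry's gauge value.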


Author(s):  
Andrew Q. Philips

In cross-sectional time-series data with a dichotomous dependent variable, failing to account for duration dependence when it exists can lead to faulty inferences. A common solution is to include duration dummies, polynomials, or splines to proxy for duration dependence. Because creating these is not easy for the common practitioner, I introduce a new command, mkduration, that is a straightforward way to generate a duration variable for binary cross-sectional time-series data in Stata. mkduration can handle various forms of missing data and allows the duration variable to easily be turned into common parametric and nonparametric approximations.
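As a rough illustration of what a duration variable looks like, the sketch below counts the periods elapsed since the last event within one unit and derives cubic polynomial terms from it. It is written in Python rather than Stata and does not reproduce mkduration: the function names, the reset convention (the counter restarts in the period after an event), and the absence of missing-data handling are assumptions.

```python
def duration_since_event(events):
    """events: 0/1 outcomes for one unit, ordered in time.

    Returns, for each period, the number of periods elapsed since the last
    event (or since the start of the series if no event has occurred yet).
    """
    durations = []
    elapsed = 0
    for y in events:
        durations.append(elapsed)
        # Assumed convention: the counter resets in the period after an event.
        elapsed = 0 if y == 1 else elapsed + 1
    return durations

def panel_durations(rows):
    """rows: iterable of (unit, time, event). Returns {(unit, time): duration}."""
    by_unit = {}
    for unit, time, event in sorted(rows):
        by_unit.setdefault(unit, []).append((time, event))
    result = {}
    for unit, series in by_unit.items():
        durations = duration_since_event([event for _, event in series])
        for (time, _), d in zip(series, durations):
            result[(unit, time)] = d
    return result

def cubic_terms(duration):
    """Polynomial approximation terms t, t^2, t^3 for a duration value."""
    return duration, duration ** 2, duration ** 3

example = duration_since_event([0, 0, 1, 0, 0, 0, 1, 1, 0])
```

The resulting terms would then enter the binary-outcome model as additional regressors.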


Author(s):  
Yingying Ren ◽  
Hu Wang ◽  
Lizhen Lian ◽  
Jiexian Wang ◽  
Yingyan Cheng ◽  
...  

2013 ◽  
Vol 10 (6) ◽  
pp. 4055-4071 ◽  
Author(s):  
S. Kandasamy ◽  
F. Baret ◽  
A. Verger ◽  
P. Neveux ◽  
M. Weiss

Abstract. Moderate resolution satellite sensors, including MODIS (Moderate Resolution Imaging Spectroradiometer), already provide more than 10 years of observations well suited to describing and understanding the dynamics of the Earth's surface. However, these time series are associated with significant uncertainties and are incomplete because of cloud cover. This study compares eight methods designed to improve continuity by filling gaps and consistency by smoothing the time course. It includes methods exploiting the time series as a whole (iterative caterpillar singular spectrum analysis (ICSSA), empirical mode decomposition (EMD), low pass filtering (LPF) and Whittaker smoother (Whit)) as well as methods working on limited temporal windows of a few weeks to a few months (adaptive Savitzky–Golay filter (SGF), temporal smoothing and gap filling (TSGF), and asymmetric Gaussian function (AGF)), in addition to the simple climatological LAI yearly profile (Clim). The methods were applied to the MODIS leaf area index (LAI) product for the period 2000–2008 over 25 sites showing a wide range of seasonal patterns. Performance is discussed with emphasis on the balance achieved by each method between accuracy and roughness, depending on the fraction of missing observations and the length of the gaps. Results demonstrate that the EMD, LPF and AGF methods failed when the fraction of gaps was significant (more than 20%), while ICSSA, Whit and SGF always provided estimates for dates with missing data. TSGF (Clim) was able to fill more than 50% of the gaps for sites with a gap fraction above 60% (80%). However, the accuracy of the reconstructed values degrades rapidly for sites with more than 20% missing data, particularly for ICSSA, Whit and SGF. In these conditions, TSGF provides the best performance, significantly better than the simple Clim for gaps shorter than about 100 days. The roughness of the reconstructed temporal profiles differs widely between methods and decreases as the fraction of missing data increases, except for ICSSA. TSGF provides the smoothest temporal profiles for sites with a gap fraction above 30%. Conversely, ICSSA, LPF, Whit, AGF and Clim provide smoother profiles than TSGF for sites with a gap fraction below 30%. The impact of the accuracy and smoothness of the reconstructed time series was evaluated on the timing of phenological stages. The dates of the start, maximum and end of the season are estimated with an accuracy of about 10 days for sites with a gap fraction below 10%, and the error increases rapidly with the gap fraction. TSGF provides more accurate estimates of phenological timing for gap fractions below 60%.
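Of the eight methods, the climatological yearly profile (Clim) is the simplest to illustrate. The sketch below assumes a regularly sampled series with missing values stored as `None` and a fixed number of samples per year; the function name and data layout are hypothetical and not taken from the study.

```python
def climatology_fill(series, period):
    """Fill gaps with the mean of all valid values at the same position in the year.

    series: list of floats or None, sampled at a fixed time step.
    period: number of samples per year (assumed constant).
    Returns (filled_series, yearly_profile).
    """
    sums = [0.0] * period
    counts = [0] * period
    for i, value in enumerate(series):
        if value is not None:
            sums[i % period] += value
            counts[i % period] += 1
    # A position never observed in any year stays None in the profile.
    profile = [sums[p] / counts[p] if counts[p] else None for p in range(period)]
    filled = [value if value is not None else profile[i % period]
              for i, value in enumerate(series)]
    return filled, profile

# Two "years" of three samples each, with two gaps.
lai = [1.0, 2.0, None, 3.0, None, 5.0]
filled, profile = climatology_fill(lai, period=3)
```

Because the profile ignores year-to-year variability, it serves as a baseline against which the other methods are compared.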


2010 ◽  
Vol 19 (01) ◽  
pp. 107-121 ◽  
Author(s):  
JUAN CARLOS FIGUEROA GARCÍA ◽  
DUSKO KALENATIC ◽  
CESAR AMILCAR LÓPEZ BELLO

This paper proposes an evolutionary algorithm for imputing missing observations in time series. It presents a genetic algorithm that minimizes an error function derived from the autocorrelation function, mean, and variance of the series. All methodological aspects of the genetic structure are presented, and the design of the fitness function is described in detail. Four application examples are solved with the proposed method.
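An error function of the kind described above might look like the following sketch. The exact form used in the paper is not given in the abstract, so the unweighted sum of squared deviations, the way target statistics are supplied, and all function names are assumptions. Only the fitness evaluation is shown, not the genetic operators.

```python
def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def autocorrelation(xs, lag):
    """Sample autocorrelation at the given lag (0.0 for a constant series)."""
    m = mean(xs)
    denom = sum((x - m) ** 2 for x in xs)
    if denom == 0:
        return 0.0
    num = sum((xs[t] - m) * (xs[t + lag] - m) for t in range(len(xs) - lag))
    return num / denom

def complete(series, candidate):
    """Fill the None entries of series, in order, with the candidate's values."""
    values = iter(candidate)
    return [next(values) if x is None else x for x in series]

def fitness(candidate, series, target_mean, target_var, target_acf):
    """Squared deviation of the completed series' statistics from the targets.

    target_acf lists the target autocorrelations for lags 1, 2, ...
    Lower is better; a genetic algorithm would minimise this value.
    """
    filled = complete(series, candidate)
    error = (mean(filled) - target_mean) ** 2 + (variance(filled) - target_var) ** 2
    for lag, rho in enumerate(target_acf, start=1):
        error += (autocorrelation(filled, lag) - rho) ** 2
    return error

observed = [1.0, None, 3.0, None, 5.0]
good = fitness([2.0, 4.0], observed, target_mean=3.0, target_var=2.0, target_acf=[0.4])
poor = fitness([0.0, 0.0], observed, target_mean=3.0, target_var=2.0, target_acf=[0.4])
```

A candidate chromosome here is simply the vector of values proposed for the gaps, so each individual in the population can be scored with a single call to `fitness`.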


2021 ◽  
Vol 68 (1) ◽  
pp. 17-46
Author(s):  
Adam Korczyński

Statistical practice requires various imperfections resulting from the nature of data to be addressed. Data containing different types of measurement errors and irregularities, such as missing observations, have to be modelled. The study presented in the paper concerns the application of the expectation-maximisation (EM) algorithm to calculate maximum likelihood estimates, using an autoregressive model as an example. The model describes a process observed only through measurements with a certain level of precision and through more than one data series. The studied series are affected by measurement error and interrupted in some time periods, which makes the information available for parameter estimation, and later for prediction, less precise. The presented technique aims to compensate for missing data in time series. The missing data appear in the form of breaks in the source of the signal. The EM algorithm has been adjusted to a hybrid version, supplemented by the Newton-Raphson method. This technique allows the estimation of more complex models. The formulation of the substantive model of an autoregressive process affected by noise is outlined, as well as the adjustment introduced to overcome the issue of missing data. The extended version of the algorithm has been verified using sampled data from a model serving as an example for the examined process. The verification demonstrated that the joint EM and Newton-Raphson algorithm converged in a relatively small number of iterations and restored the information lost due to missing data, providing more accurate predictions than the original algorithm. The study also features an application of the supplemented algorithm to empirical data (the calculation of forecasted demand for newspapers).
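To make the setting concrete, the sketch below evaluates the Gaussian log-likelihood of a first-order autoregressive process observed with noise, skipping the measurement update wherever an observation is missing. This is a standard scalar Kalman filter, shown only as one common way to handle such gaps. It is not the paper's hybrid EM and Newton-Raphson algorithm, it uses a single measurement series rather than several, and the parameter names (`phi`, `q`, `r`) are assumptions.

```python
from math import log, pi

def ar1_noise_loglik(y, phi, q, r):
    """Log-likelihood of y_t = x_t + v_t with x_t = phi * x_{t-1} + w_t.

    Var(w_t) = q, Var(v_t) = r, and |phi| < 1 is assumed.
    Missing observations are given as None and contribute nothing
    to the likelihood: the filter only propagates the prediction.
    Returns (log_likelihood, filtered_state_means).
    """
    x_pred = 0.0
    p_pred = q / (1.0 - phi ** 2)  # stationary variance of the state
    loglik = 0.0
    filtered = []
    for obs in y:
        if obs is None:
            # No measurement: keep the prediction as the filtered estimate.
            x_filt, p_filt = x_pred, p_pred
        else:
            s = p_pred + r                  # innovation variance
            innovation = obs - x_pred
            gain = p_pred / s               # Kalman gain
            x_filt = x_pred + gain * innovation
            p_filt = (1.0 - gain) * p_pred
            loglik += -0.5 * (log(2.0 * pi * s) + innovation ** 2 / s)
        filtered.append(x_filt)
        # Time update for the next step.
        x_pred = phi * x_filt
        p_pred = phi ** 2 * p_filt + q
    return loglik, filtered

loglik, states = ar1_noise_loglik([1.0, None, 1.0], phi=0.5, q=0.75, r=1.0)
```

A likelihood of this form could be maximised over `phi`, `q` and `r` by any numerical optimiser; EM and Newton-Raphson are two such options.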


2018 ◽  
Vol 84 ◽  
pp. 106-118 ◽  
Author(s):  
Laura Neri ◽  
Luca Coscieme ◽  
Biagio F. Giannetti ◽  
Federico M. Pulselli
