A Comparison of Three Gap Filling Techniques for Eddy Covariance Net Carbon Fluxes in Short Vegetation Ecosystems

Advances in Meteorology ◽

10.1155/2015/260580 ◽

2015 ◽

Vol 2015 ◽

pp. 1-12 ◽

Cited By ~ 8

Author(s):

Xiaosong Zhao ◽

Yao Huang

Keyword(s):

Time Series ◽

Standard Deviation ◽

Diurnal Variation ◽

Eddy Covariance ◽

Nonlinear Regression ◽

Missing Values ◽

The Other ◽

Gap Filling ◽

Benchmark Datasets ◽

Filling Method

Missing data are an inevitable problem when measuring CO2, water, and energy fluxes between the biosphere and the atmosphere with eddy covariance systems. To find the optimum gap-filling method for short vegetation, we review three methods, mean diurnal variation (MDV), look-up tables (LUT), and nonlinear regression (NLR), for estimating missing values of net ecosystem CO2 exchange (NEE) in eddy covariance time series, and we evaluate their performance for different artificial gap scenarios based on benchmark datasets from marsh and cropland sites in China. The cumulative errors of the three methods showed no consistent bias trend and ranged between −30 and +30 mgCO2 m−2 from May to October at the three sites. To minimize the cumulative bias, combined gap-filling methods were selected for short vegetation: the NLR or LUT method was used from the onset of rapid plant growth in spring until the end of the growing period, and the MDV method was used for the other stages. The sum relative error (SRE) of the optimum method ranged between −2 and +4% for the four gap levels at the three sites, except for 55% gaps at the soybean site, and the optimum method also clearly reduced the standard deviation of the error.
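
As a rough illustration of the MDV idea named in the abstract, the sketch below fills a missing value with the mean of the valid values observed at the same time of day on neighbouring days. The function name `fill_mdv`, the 48 half-hourly steps per day, and the 7-day window are assumptions chosen for illustration, not settings taken from the study.

```python
from statistics import mean

def fill_mdv(flux, steps_per_day=48, window_days=7):
    """Fill None entries with the mean of valid values at the same time of day.

    Only original observations within +/- window_days are averaged, so
    previously filled values never feed into later fills.
    """
    filled = list(flux)
    for i, value in enumerate(flux):
        if value is not None:
            continue
        neighbours = []
        for d in range(-window_days, window_days + 1):
            j = i + d * steps_per_day
            if d != 0 and 0 <= j < len(flux) and flux[j] is not None:
                neighbours.append(flux[j])
        if neighbours:  # leave the gap open if no neighbour is available
            filled[i] = mean(neighbours)
    return filled

# Toy series with two steps per day; the gap sits at the first step of day 2.
toy = [1.0, 10.0, None, 12.0, 3.0, 14.0]
toy_filled = fill_mdv(toy, steps_per_day=2, window_days=1)
```

In this toy case the gap is bracketed by the first-step values of the previous and following day (1.0 and 3.0), so the fill is their mean. A wider window smooths more but follows seasonal change less closely.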

A comparison of gap-filling algorithms for eddy covariance fluxes and their drivers

Geoscientific Instrumentation Methods and Data Systems ◽

10.5194/gi-10-123-2021 ◽

2021 ◽

Vol 10 (1) ◽

pp. 123-140

Author(s):

Atbin Mahabbati ◽

Jason Beringer ◽

Matthias Leopold ◽

Ian McHugh ◽

James Cleverly ◽

...

Keyword(s):

Random Forest ◽

Eddy Covariance ◽

Missing Values ◽

The Other ◽

Global Network ◽

Environmental Drivers ◽

Gap Filling ◽

Ground Heat Flux ◽

The Impact ◽

Mean Square Errors

Abstract. The errors and uncertainties associated with gap-filling algorithms of water, carbon, and energy fluxes data have always been one of the main challenges of the global network of microclimatological tower sites that use the eddy covariance (EC) technique. To address these concerns and find more efficient gap-filling algorithms, we reviewed eight algorithms to estimate missing values of environmental drivers and nine algorithms for the three major fluxes typically found in EC time series. We then examined the algorithms' performance for different gap-filling scenarios utilising the data from five EC towers during 2013. This research's objectives were (a) to evaluate the impact of the gap lengths on the performance of each algorithm and (b) to compare the performance of traditional and new gap-filling techniques for the EC data, for fluxes, and separately for their corresponding meteorological drivers. The algorithms' performance was evaluated by generating nine gap windows with different lengths, ranging from a day to 365 d. In each scenario, a gap period was chosen randomly, and the data were removed from the dataset accordingly. After running each scenario, a variety of statistical metrics were used to evaluate the algorithms' performance. The algorithms showed different levels of sensitivity to the gap lengths; the Prophet Forecast Model (FBP) revealed the most sensitivity, whilst the performance of artificial neural networks (ANNs), for instance, did not vary as much by changing the gap length. The algorithms' performance generally decreased with increasing gap length, yet the differences were not significant for windows smaller than 30 d. No significant differences between the algorithms were recognised for the meteorological and environmental drivers. However, the linear algorithms showed slight superiority over those of machine learning (ML), except the random forest (RF) algorithm estimating the ground heat flux (root mean square errors – RMSEs – of 28.91 and 33.92 for RF and classic linear regression – CLR, respectively). However, for the major fluxes, ML algorithms and marginal distribution sampling (MDS) showed superiority over the other algorithms. Even though ANNs, random forest (RF), and eXtreme Gradient Boost (XGB) showed comparable performance in gap-filling of the major fluxes, RF provided more consistent results with slightly less bias than the other ML algorithms. The results indicated that no single algorithm outperforms the others in all situations, but the RF is a potential alternative for the MDS and ANNs as regards flux gap-filling.
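
The evaluation protocol described in this abstract (remove a randomly placed window, fill it, score the result) can be sketched as follows. This is only a minimal outline under stated assumptions: `fill_linear` is a stand-in filler and not one of the algorithms compared in the study, and the helper names and gap lengths are hypothetical.

```python
import math
import random

def make_gap(series, gap_len, rng):
    """Remove one randomly placed window, keeping both end points valid."""
    start = rng.randrange(1, len(series) - gap_len)
    gapped = list(series)
    gapped[start:start + gap_len] = [None] * gap_len
    return gapped, start

def fill_linear(gapped, start, gap_len):
    """Stand-in filler: straight line between the values bracketing the gap."""
    left, right = gapped[start - 1], gapped[start + gap_len]
    filled = list(gapped)
    for k in range(gap_len):
        frac = (k + 1) / (gap_len + 1)
        filled[start + k] = left + frac * (right - left)
    return filled

def rmse(truth, estimate):
    """Root mean square error between withheld and estimated values."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth))

def score_gap_lengths(series, gap_lengths, seed=0):
    """Return {gap length: RMSE over the artificial gap} for one random draw each."""
    rng = random.Random(seed)
    scores = {}
    for gap_len in gap_lengths:
        gapped, start = make_gap(series, gap_len, rng)
        filled = fill_linear(gapped, start, gap_len)
        window = slice(start, start + gap_len)
        scores[gap_len] = rmse(series[window], filled[window])
    return scores

# A sinusoid standing in for a flux record with a daily cycle.
wave = [math.sin(2 * math.pi * i / 48) for i in range(480)]
wave_scores = score_gap_lengths(wave, [2, 24, 96])
```

Swapping `fill_linear` for another filler with the same signature lets the same loop compare algorithms across gap lengths. A real comparison would repeat each scenario many times rather than rely on a single random draw.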

Evaluating four gap-filling methods for eddy covariance measurements of evapotranspiration over hilly crop fields

10.5194/gi-2017-44 ◽

2017 ◽

Author(s):

Nissaf Boudhina ◽

Rim Zitouna-Chebbi ◽

Insaf Mekki ◽

Frédéric Jacob ◽

Nétij Ben Mechlia ◽

...

Keyword(s):

Time Series ◽

Linear Regression ◽

Eddy Covariance ◽

Water Status ◽

Growth Cycle ◽

Filling Rate ◽

Gap Filling ◽

Crop Fields ◽

Downslope Winds ◽

Aerodynamic Properties

Abstract. Estimating evapotranspiration in hilly watersheds is paramount for managing water resources, especially in semi-arid regions. The eddy covariance (EC) technique allows continuous measurements of latent heat flux LE. However, time series of EC measurements often experience large portions of missing data, because of instrumental malfunctions or quality filtering. Existing gap-filling methods are questionable over hilly crop fields, because of changes in airflow inclination and subsequent aerodynamic properties. We evaluated the performances of different gap-filling methods before and after tailoring to conditions of hilly crop fields. The tailoring consisted of splitting the LE time series beforehand on the basis of upslope and downslope winds. The experiment was set up within an agricultural hilly watershed in northeastern Tunisia. EC measurements were collected throughout the growth cycle of three wheat crops, two of them located in adjacent fields on opposite hillslopes, and the third one located in a flat field. We considered four gap-filling methods: the REddyProc method, the linear regression between LE and net radiation Rn, the multi-linear regression of LE against the other energy fluxes, and the use of evaporative fraction EF. Regardless of method, the splitting of the LE time series did not impact the gap-filling rate, and it might improve the accuracies on LE retrievals in some cases. Regardless of method, the obtained accuracies on LE estimates after gap filling were close to instrumental accuracies, and were comparable to those reported in previous studies over flat and mountainous terrains. Overall, REddyProc was the most appropriate method, for both gap-filling rate and retrieval accuracy. Thus, it seems possible to conduct gap filling for LE time series collected over hilly crop fields, provided the LE time series are split beforehand on the basis of upslope/downslope winds. Future work should address consecutive vegetation growth cycles for a larger panel of conditions in terms of climate, vegetation and water status.

Evaluating four gap-filling methods for eddy covariance measurements of evapotranspiration over hilly crop fields

Geoscientific Instrumentation Methods and Data Systems ◽

10.5194/gi-7-151-2018 ◽

2018 ◽

Vol 7 (2) ◽

pp. 151-167 ◽

Cited By ~ 3

Author(s):

Nissaf Boudhina ◽

Rim Zitouna-Chebbi ◽

Insaf Mekki ◽

Frédéric Jacob ◽

Nétij Ben Mechlia ◽

...

Keyword(s):

Time Series ◽

Linear Regression ◽

Eddy Covariance ◽

Water Status ◽

Growth Cycle ◽

Filling Rate ◽

Gap Filling ◽

Crop Fields ◽

Downslope Winds ◽

Aerodynamic Properties

Abstract. Estimating evapotranspiration in hilly watersheds is paramount for managing water resources, especially in semiarid/subhumid regions. The eddy covariance (EC) technique allows continuous measurements of latent heat flux (LE). However, time series of EC measurements often experience large portions of missing data because of instrumental malfunctions or quality filtering. Existing gap-filling methods are questionable over hilly crop fields because of changes in airflow inclination and subsequent aerodynamic properties. We evaluated the performances of different gap-filling methods before and after tailoring to conditions of hilly crop fields. The tailoring consisted of splitting the LE time series beforehand on the basis of upslope and downslope winds. The experiment was set up within an agricultural hilly watershed in northeastern Tunisia. EC measurements were collected throughout the growth cycle of three wheat crops, two of them located in adjacent fields on opposite hillslopes, and the third one located in a flat field. We considered four gap-filling methods: the REddyProc method, the linear regression between LE and net radiation (Rn), the multi-linear regression of LE against the other energy fluxes, and the use of evaporative fraction (EF). Regardless of the method, the splitting of the LE time series did not impact the gap-filling rate, and it might improve the accuracies on LE retrievals in some cases. Regardless of the method, the obtained accuracies on LE estimates after gap filling were close to instrumental accuracies, and they were comparable to those reported in previous studies over flat and mountainous terrains. Overall, REddyProc was the most appropriate method, for both gap-filling rate and retrieval accuracy. Thus, it seems possible to conduct gap filling for LE time series collected over hilly crop fields, provided the LE time series are split beforehand on the basis of upslope–downslope winds. Future work should address consecutive vegetation growth cycles for a larger panel of conditions in terms of climate, vegetation, and water status.
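
To make the second of the four methods concrete, the sketch below fits LE against Rn by ordinary least squares on the valid pairs and uses the fitted line to fill missing LE values. A second helper repeats the fit separately per wind class, which is one plausible reading of the "splitting" step. The function names, the `"up"`/`"down"` labels, and the toy numbers are assumptions for illustration and do not come from the paper.

```python
def fill_le_from_rn(le, rn):
    """Fill None entries of LE using a least-squares line LE = a * Rn + b."""
    pairs = [(r, l) for r, l in zip(rn, le) if r is not None and l is not None]
    # A line needs at least two valid pairs with distinct Rn values.
    if len(pairs) < 2 or len({r for r, _ in pairs}) < 2:
        return list(le)
    n = len(pairs)
    mean_r = sum(r for r, _ in pairs) / n
    mean_l = sum(l for _, l in pairs) / n
    sxx = sum((r - mean_r) ** 2 for r, _ in pairs)
    sxy = sum((r - mean_r) * (l - mean_l) for r, l in pairs)
    slope = sxy / sxx
    intercept = mean_l - slope * mean_r
    return [
        l if l is not None else (slope * r + intercept if r is not None else None)
        for l, r in zip(le, rn)
    ]

def fill_split_by_wind(le, rn, wind_class):
    """Fit and fill separately for each wind class (e.g. upslope, downslope)."""
    filled = list(le)
    for cls in set(wind_class):
        idx = [i for i, c in enumerate(wind_class) if c == cls]
        sub = fill_le_from_rn([le[i] for i in idx], [rn[i] for i in idx])
        for i, value in zip(idx, sub):
            filled[i] = value
    return filled

# Toy data in W m-2: the two wind classes follow different LE/Rn slopes.
rn_obs = [100.0, 200.0, 300.0, 400.0, 100.0, 200.0, 300.0, 400.0]
le_obs = [50.0, 100.0, None, 200.0, 10.0, 20.0, None, 40.0]
wind = ["up"] * 4 + ["down"] * 4
le_split = fill_split_by_wind(le_obs, rn_obs, wind)
```

With the toy data, a single pooled fit would blend the two slopes, whereas the split fit recovers each class's own relationship.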

A Hybrid Validity Index to Determine K Parameter Value of k-Means Algorithm for Time Series Clustering

International Journal of Information Technology & Decision Making ◽

10.1142/s0219622021500449 ◽

2021 ◽

pp. 1-22

Author(s):

Fatma Ozge Ozkok ◽

Mete Celik

Keyword(s):

Time Series ◽

Clustering Algorithms ◽

Real Life ◽

Internal Validity ◽

The Other ◽

Sequential Data ◽

Data Mining Technique ◽

Validity Index ◽

Number Of Clusters ◽

Benchmark Datasets

A time series is a set of sequential data points in time order. The sizes and dimensions of time series datasets are increasing day by day. Clustering is an unsupervised data mining technique that groups objects based on their similarities. It is used to analyze various datasets, such as finance, climate, and bioinformatics datasets. k-means is one of the most used clustering algorithms. However, it is challenging to determine the value of the k parameter, which is the number of clusters. One of the most used methods to determine the number of clusters (such as k) is cluster validity indexes. Several internal and external validity indexes are used to find suitable cluster numbers based on characteristics of datasets. In this study, we propose a hybrid validity index to determine the value of the k parameter of the k-means algorithm. The proposed hybrid validity index comprises four internal validity indexes: the Dunn, Silhouette, C, and Davies–Bouldin indexes. The proposed method was applied to nine real-life finance and benchmark time series datasets. The financial dataset was obtained from Yahoo Finance, consisting of daily closing data of stocks. The other eight benchmark datasets were obtained from the UCR time series classification archive. Experimental results showed that the proposed hybrid validity index is promising for finding the suitable number of clusters with respect to the other indexes for clustering time-series datasets.
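
The abstract does not say how the four indexes are combined, so the sketch below covers only one ingredient: scoring candidate clusterings with the Silhouette index and keeping the k with the highest score. It uses 1-D points and absolute differences for brevity, where a time-series application would use a distance between whole series. The function name and the candidate labelings are hypothetical.

```python
def silhouette(points, labels):
    """Mean silhouette width for 1-D points; needs at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(abs(p - q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(abs(p - q) for q in other) / len(other)
            for name, other in clusters.items()
            if name != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Candidate labelings, e.g. from running k-means with k = 2 and k = 3.
points = [0.0, 1.0, 10.0, 11.0]
candidates = {2: [0, 0, 1, 1], 3: [0, 0, 1, 2]}
index_by_k = {k: silhouette(points, labs) for k, labs in candidates.items()}
best_k = max(index_by_k, key=index_by_k.get)
```

A hybrid index in the spirit of the abstract would compute several such scores per k and combine them, but the combination rule is left open here because the abstract does not state it.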

Evaluation of Seven Gap-Filling Techniques for Daily Station-Based Rainfall Datasets in South Ethiopia

Advances in Meteorology ◽

10.1155/2021/9657460 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15

Author(s):

Alefu Chinasho ◽

Bobe Bedadi ◽

Tesfaye Lemma ◽

Tamado Tana ◽

Tilahun Hordofa ◽

...

Keyword(s):

Time Series ◽

Missing Values ◽

Daily Rainfall ◽

Substantial Contribution ◽

Ratio Method ◽

Rainfall Time Series ◽

Gap Filling ◽

Quantile Mapping ◽

South Ethiopia ◽

Skill Scores

Meteorological stations, mainly those located in developing countries, have large amounts of missing values in their climate datasets (rainfall and temperature). Ignoring the missing values in analyses has been used as a technique to manage them. However, it leads to partial and biased results in data analyses. Instead, filling the data gaps using reference datasets is a better and widely used approach. Thus, this study was initiated to evaluate seven gap-filling techniques for daily rainfall datasets at five meteorological stations of Wolaita Zone and the surroundings in South Ethiopia. The gap-filling techniques considered in this study were simple arithmetic means (SAM), normal ratio method (NRM), correlation coefficient weighting (CCW), inverse distance weighting (IDW), multiple linear regression (MLR), empirical quantile mapping (EQM), and empirical quantile mapping plus (EQM+). The techniques were preferred because of their computational simplicity and appreciable accuracies. Their performance was evaluated against mean absolute error (MAE), root mean square error (RMSE), skill scores (SS), and Pearson’s correlation coefficients (R). The results indicated that MLR outperformed the other techniques at all five meteorological stations. It showed the lowest RMSE and the highest SS and R at all stations. Four techniques (SAM, NRM, CCW, and IDW) showed similar performance and were second-ranked at all of the stations, with few exceptions in the time series. EQM+ improved the performance levels of gap-filling techniques at some stations, though not substantially. In general, MLR is suggested to fill in the missing values of the daily rainfall time series. However, the second-ranked techniques could also be used depending on the required time series (period) of each station. The techniques perform better at stations located at higher altitudes. The authors expect a substantial contribution of this paper to the achievement of sustainable development goal thirteen (climate action) through the provision of gap-filling techniques with better accuracy.
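
For orientation, three of the simpler techniques listed above can be written in a few lines each, using their common textbook forms. The exact variants used in the study may differ (for example in the IDW exponent), and the function names and station values below are invented for illustration.

```python
def simple_arithmetic_mean(neighbour_rain):
    """SAM: average of the same-day rainfall at the neighbouring stations."""
    return sum(neighbour_rain) / len(neighbour_rain)

def normal_ratio(neighbour_rain, neighbour_normals, target_normal):
    """NRM: scale each neighbour by the ratio of long-term mean rainfalls."""
    scaled = [
        target_normal / normal * rain
        for rain, normal in zip(neighbour_rain, neighbour_normals)
    ]
    return sum(scaled) / len(scaled)

def inverse_distance_weighting(neighbour_rain, distances_km, power=2):
    """IDW: closer stations get larger weights (power 2 is a common default)."""
    weights = [1.0 / d ** power for d in distances_km]
    weighted = sum(w * rain for w, rain in zip(weights, neighbour_rain))
    return weighted / sum(weights)

# Hypothetical same-day rainfall (mm) at two neighbouring stations.
rain = [10.0, 20.0]
sam = simple_arithmetic_mean(rain)
nrm = normal_ratio(rain, neighbour_normals=[1000.0, 2000.0], target_normal=1500.0)
idw = inverse_distance_weighting(rain, distances_km=[1.0, 2.0])
```

SAM treats all neighbours alike, NRM corrects for stations that are wetter or drier on average than the target, and IDW favours nearby stations.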

Using Deep Learning to Fill Spatio-Temporal Data Gaps in Hydrological Monitoring Networks

10.5194/hess-2019-196 ◽

2019 ◽

Cited By ~ 4

Author(s):

Huiying Ren ◽

Erol Cromwell ◽

Ben Kravitz ◽

Xingyuan Chen

Keyword(s):

Nonlinear Dynamics ◽

Time Series ◽

Series Data ◽

Percentage Error ◽

Gap Filling ◽

Error Statistics ◽

Groundwater Aquifer ◽

Data Gaps ◽

Filling Method ◽

Spatio Temporal

Abstract. Long-term spatio-temporal changes in subsurface hydrological flow are usually quantified through a network of wells; however, such observations often are spatially sparse and temporal gaps exist due to poor quality or instrument failure. In this study, we explore the ability of deep neural networks to fill in gaps in spatially distributed time-series data. We selected a location at the U.S. Department of Energy's Hanford site to demonstrate and evaluate the new method, using a 10-year spatio-temporal hydrological dataset of temperature, specific conductance, and groundwater table elevation from 42 wells that monitor the dynamic and heterogeneous hydrologic exchanges between the Columbia River and its adjacent groundwater aquifer. We employ a long short-term memory (LSTM)-based architecture, which is specially designed to address both spatial and temporal variations in the property fields. The performance of gap filling using an LSTM framework is evaluated using test datasets with synthetic data gaps created by assuming the observations were missing for a given time window (i.e., gap length), such that the mean absolute percentage error can be calculated against true observations. Such test datasets also allow us to examine how well the original nonlinear dynamics are captured in gap-filled time series beyond the error statistics. The performance of the LSTM-based gap-filling method is compared to that of a traditional, popular gap-filling method: autoregressive integrated moving average (ARIMA). Although ARIMA appears to perform slightly better than LSTM on average error statistics, LSTM is better able to capture nonlinear dynamics that are present in time series. Thus, LSTMs show promising potential to outperform ARIMA for gap filling in highly dynamic time-series observations characterized by multiple dominant modes of variability. Capturing such dynamics is essential to generate the most valuable observations to advance our understanding of dynamic complex systems.
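
The error statistic named in this abstract, the mean absolute percentage error over a synthetic gap, can be computed as in the short sketch below. Skipping zero-valued observations is an assumption made here to avoid division by zero. The numbers are invented and stand in for withheld observations and any model's estimates for the same time steps.

```python
def mape(truth, estimate):
    """Mean absolute percentage error, skipping zero-valued observations."""
    terms = [abs((t - e) / t) for t, e in zip(truth, estimate) if t != 0]
    return 100.0 * sum(terms) / len(terms)

# Values withheld to create a synthetic gap, and the gap-filled estimates.
withheld = [110.0, 120.0, 125.0]
estimated = [100.0, 126.0, 125.0]
gap_mape = mape(withheld, estimated)
```

Because MAPE averages pointwise errors, two fills with similar MAPE can still differ in how well they reproduce the dynamics of the series, which is the distinction the abstract draws between ARIMA and LSTM.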

Estimating population variability of aphids (Hemiptera: Aphididae): how many years are required?

The Canadian Entomologist ◽

10.4039/tce.2016.34 ◽

2016 ◽

Vol 149 (1) ◽

pp. 48-55 ◽

Cited By ~ 5

Author(s):

Robert J. Lamb ◽

Patricia A. MacKay ◽

Andrei Alyokhin

Keyword(s):

Time Series ◽

Population Dynamics ◽

Standard Deviation ◽

Coefficient Of Variation ◽

Population Variability ◽

The Other ◽

Macrosiphum Euphorbiae ◽

Mean Values ◽

Robust Estimates ◽

Aphis Nasturtii

Abstract. Variability is an important characteristic of population dynamics, but the length of the time series required to estimate population variability is poorly understood. To this end, population variability of Macrosiphum euphorbiae (Thomas), Myzus persicae (Sulzer), and Aphis nasturtii (Kaltenbach) (Hemiptera: Aphididae) was investigated. Population variability (measured as PV, a proportion between 0 and 1) was estimated for time series of 3–62 years, giving replicate estimates for time series of 3–20 years that were normally distributed. Mean values for PV were more uniform for time series of 12 years or longer than for shorter ones. The standard deviation of PV declined to a minimum at 12–15 years as the length of the time series increased. Discrimination of estimates of PV was reliable for 15-year time series and longer, but not necessarily for shorter ones. Although M. euphorbiae had a relatively low PV, the coefficient of variation of that PV (12.5) was higher than for the other two species (3.5, 4.5). For robust estimates of PV, a time series of 15 years is recommended, because it minimises the standard deviation of PV and discriminates values of PV that differ by 0.06 on a 0–1 scale.

Testing the applicability of neural networks as a gap-filling method using CH4 flux data from high latitude wetlands

Biogeosciences ◽

10.5194/bg-10-8185-2013 ◽

2013 ◽

Vol 10 (12) ◽

pp. 8185-8200 ◽

Cited By ~ 54

Author(s):

S. Dengel ◽

D. Zona ◽

T. Sachs ◽

M. Aurela ◽

M. Jammet ◽

...

Keyword(s):

Neural Networks ◽

Time Series ◽

Pearson Correlation ◽

Time Of Day ◽

Gap Filling ◽

Ch4 Flux ◽

Advantages And Disadvantages ◽

Flux Data ◽

Free Data ◽

Filling Method

Abstract. Since the advancement in CH4 gas analyser technology and its applicability to eddy covariance flux measurements, monitoring of CH4 emissions is becoming more widespread. In order to accurately determine the greenhouse gas balance, high-quality, gap-free data are required. Currently there is still no consensus on CH4 gap-filling methods, and the methods applied are still study-dependent and often carried out on low-resolution, daily data. In the current study, we applied artificial neural networks to six distinctively different CH4 time series from high latitudes, explained the method, and tested its functionality. We discuss the applicability of neural networks in CH4 flux studies, the advantages and disadvantages of this method, and what information we were able to extract from such models. Three different approaches were tested by including drivers such as air and soil temperature, barometric air pressure, solar radiation, and wind direction (an indicator of source location), and in addition the lagged effect of water table depth and precipitation. In keeping with the principle of parsimony, we included up to five of these variables traditionally measured at CH4 flux measurement sites. Fuzzy sets were included to represent the seasonal change and time of day. The high Pearson correlation coefficients (r) of up to 0.97 achieved in the final analysis are indicative of the high performance of neural networks and their applicability as a gap-filling method for CH4 flux data time series. This novel approach, which we show to be appropriate for CH4 fluxes, is a step towards standardising CH4 gap-filling protocols.
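
The abstract mentions fuzzy sets for time of day without giving their shape. One simple possibility, shown purely as an assumption, is a set of overlapping triangular memberships on a 24-hour cycle, which turns clock time into smooth inputs for a neural network. The set names, centres, and widths below are hypothetical.

```python
def cyclic_triangular(x, centre, half_width, period):
    """Triangular membership on a circle: 1 at the centre, 0 beyond half_width."""
    d = abs(x - centre) % period
    d = min(d, period - d)  # shortest distance around the cycle
    return max(0.0, 1.0 - d / half_width)

def time_of_day_features(hour):
    """Encode an hour (0-24) as memberships in four overlapping fuzzy sets."""
    centres = {"night": 0.0, "morning": 6.0, "noon": 12.0, "evening": 18.0}
    return {
        name: cyclic_triangular(hour, centre, half_width=6.0, period=24.0)
        for name, centre in centres.items()
    }

features_9h = time_of_day_features(9.0)
features_23h = time_of_day_features(23.0)
```

With centres 6 h apart and a 6 h half-width, the memberships always sum to one, and 23:00 is treated as close to midnight rather than far from it. The same construction with a 365-day period could encode season.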

A MULTI-SCALE SEGMENTATION APPROACH TO FILLING GAPS IN LANDSAT ETM+ SLC-OFF IMAGES THROUGH PIXEL WEIGHTING

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-3-w11-79-2020 ◽

2020 ◽

Vol XLII-3/W11 ◽

pp. 79-84

Author(s):

R. F. B. Marujo ◽

L. M. G. Fonseca ◽

T. S. Körting ◽

H. N. Bendini

Keyword(s):

Missing Values ◽

Original Data ◽

Image Texture ◽

Gap Filling ◽

Optical Images ◽

Object Based ◽

Small Streams ◽

Segmentation Approach ◽

Filling Method ◽

Most Frequent Value

Abstract. Monitoring changes on Earth’s surface is a difficult task commonly performed using multi-spectral remote sensing images. The absence of surface information in optical images due to clouds, low temporal resolution, and sensor defects interferes with analyses. In this context, we present an approach for filling gaps in imagery mainly caused by small clouds and sensor defects. Our method is an adaptation of an existing method that uses the spatial context of close-in-time images through the most frequent value obtained using multiscale segmentation. Our method uses the pixel proportion contained in each segment to fill missing values. We applied the gap-filling methodology on three dates using Landsat 7 images simulated from Landsat 8 images. We validated the method by introducing and filling artificial gaps, and comparing the original data with model predictions. The developed approach surpassed the Maxwell et al. (2007) gap-filling method for all bands, with a minimum R2 of 0.78. Our method proved to enhance the Maxwell et al. (2007) gap-filling method while maintaining the same asymptotic computational cost. It also allowed image texture to be conserved in reconstructed images. This characteristic enables narrow features, e.g., roads, riparian areas, and small streams, to be detected in the filled images. Based on that, further object-based approaches can be used on images filled using this methodology, demonstrating its capacity to estimate Earth’s surface data.
