Green LAI Mapping and Cloud Gap-Filling Using Gaussian Process Regression in Google Earth Engine

Luca Pipia; Eatidal Amin; Santiago Belda; Matías Salinero-Delgado; Jochem Verrelst

doi:10.3390/rs13030403

Monitoring Cropland Phenology on Google Earth Engine Using Gaussian Process Regression

Remote Sensing ◽

10.3390/rs14010146 ◽

2021 ◽

Vol 14 (1) ◽

pp. 146

Author(s):

Matías Salinero-Delgado ◽

José Estévez ◽

Luca Pipia ◽

Santiago Belda ◽

Katja Berger ◽

...

Keyword(s):

Time Series ◽

Gaussian Process ◽

Satellite Data ◽

Gaussian Process Regression ◽

Google Earth ◽

Gap Filling ◽

Area Index ◽

Filling Time ◽

Google Earth Engine ◽

Mean Square Errors

Monitoring cropland phenology from optical satellite data remains a challenging task due to the influence of clouds and atmospheric artifacts. Therefore, measures need to be taken to overcome these challenges and gain better knowledge of crop dynamics. The arrival of cloud computing platforms such as Google Earth Engine (GEE) has enabled us to propose an end-to-end Sentinel-2 (S2) phenology processing chain. To achieve this, the following pipeline was implemented: (1) building hybrid Gaussian Process Regression (GPR) retrieval models of crop traits optimized with active learning, (2) implementing these models on GEE, (3) generating spatiotemporally continuous maps and time series of these crop traits, with gaps filled through GPR fitting, and finally, (4) calculating land surface phenology (LSP) metrics such as the start of season (SOS) or end of season (EOS). Overall, good to high performance was achieved, in particular for the estimation of canopy-level traits such as leaf area index (LAI) and canopy chlorophyll content, with normalized root mean square errors (NRMSE) of 9% and 10%, respectively. Using the GPR gap-filled S2 time series, entire tiles were reconstructed, and the resulting maps were demonstrated over an agricultural area in Castile and Leon, Spain, where crop calendar data were available to assess the validity of LSP metrics derived from crop traits. In addition, phenology derived from the normalized difference vegetation index (NDVI) was used as reference. NDVI not only proved to be a robust indicator for the calculation of LSP metrics, but also served to demonstrate the good phenology quality of the quantitative trait products. Thanks to the GEE framework, the proposed workflow can be run anywhere in the world and for any time window, thus representing a shift in the satellite data processing paradigm.
We anticipate that the produced LSP metrics can provide meaningful insights into crop seasonal patterns in a changing environment that demands adaptive agricultural production.
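Step (4) of the pipeline above can be illustrated with a minimal Python sketch. It assumes an amplitude-threshold rule (SOS and EOS are the first and last days on which the gap-filled curve reaches a given fraction of its seasonal amplitude); the abstract does not state which rule the authors used, so the function name `sos_eos`, the 50% fraction and the synthetic LAI curve are all illustrative assumptions.

```python
import numpy as np

def sos_eos(doy, lai, fraction=0.5):
    """Return (SOS, EOS): first and last day on which the curve is at or
    above `fraction` of its seasonal amplitude (hypothetical rule)."""
    doy = np.asarray(doy, dtype=float)
    lai = np.asarray(lai, dtype=float)
    threshold = lai.min() + fraction * (lai.max() - lai.min())
    above = np.flatnonzero(lai >= threshold)
    return doy[above[0]], doy[above[-1]]

# Synthetic single-season curve standing in for a gap-filled LAI series:
# a Gaussian bump peaking on day 150.
doy = np.arange(1, 366)
lai = 0.2 + 4.0 * np.exp(-0.5 * ((doy - 150) / 30.0) ** 2)

sos, eos = sos_eos(doy, lai)
print(f"SOS = day {sos:.0f}, EOS = day {eos:.0f}")
```

A threshold rule like this only makes sense on a smooth, continuous curve, which is why the gap-filling step comes first in the pipeline.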


Gaussian Process Regression hyperparameter optimization for image time series gap-filling of Earth observation data and crop monitoring

10.5194/egusphere-egu21-14322 ◽

2021 ◽

Author(s):

Santiago Belda ◽

Matías Salinero ◽

Eatidal Amin ◽

Luca Pipia ◽

Pablo Morcillo-Pallarés ◽

...

Keyword(s):

Time Series ◽

Gaussian Process ◽

Computational Cost ◽

Gaussian Process Regression ◽

Superior Performance ◽

Observation Data ◽

Gap Filling ◽

Crop Monitoring ◽

Exponential Kernel ◽

Uncertainty Estimates

In general, modeling phenological evolution is a challenging task, mainly because of time series gaps and noisy data caused by different viewing and illumination geometries, cloud cover, seasonal snow, and the interval needed to revisit and acquire data for the exact same location. For that reason, reliable gap-filling fitting functions and smoothing filters are frequently required for retrievals at the highest feasible accuracy. Of specific interest for filling gaps in time series is the emergence of machine learning regression algorithms (MLRAs), which can serve as fitting functions. Among the multiple MLRA approaches currently available, kernel-based methods developed in a Bayesian framework, such as Gaussian Process Regression (GPR), deserve special attention because they are adaptive and provide associated uncertainty estimates. Recent studies demonstrated the effectiveness of GPR for gap-filling of biophysical parameter time series because the hyperparameters can be optimally set for each time series (one for each pixel in the area) with a single optimization procedure. The entire procedure of learning a GPR model relies only on an appropriate selection of the type of kernel and the hyperparameters involved in the estimation of input data covariance. Despite its clear strategic advantage, the most important shortcomings of this technique are the (1) high computational cost and (2) memory requirements of training, which grow cubically and quadratically with the number of samples in the model, respectively. This can become problematic when processing large amounts of data, such as Sentinel-2 (S2) time series tiles.
Hence, optimization strategies need to be developed to speed up GPR processing while maintaining its superior performance in terms of accuracy. To mitigate this computational burden and avoid repeating the optimization for every pixel, we evaluated whether the GPR hyperparameters can be preoptimized over a reduced set of representative pixels and kept fixed over a more extended crop area. We used S2 LAI time series over an agricultural region in Castile and Leon (North-West Spain) and tested different covariance functions: the exponential kernel, the squared exponential kernel, and the Matérn kernel with parameter 3/2 or 5/2. The performance of image reconstructions was compared against the standard per-pixel GPR time series training process. Results showed that accuracies were on the same order (12% RMSE degradation) whereas processing time accelerated up to 90 times. Crop phenology indicators were also calculated and compared, revealing similar temporal patterns with differences in start and end of growing season of no more than five days. To the benefit of crop monitoring applications, all the gap-filling and phenology indicator retrieval techniques have been implemented in the freely downloadable GUI toolbox DATimeS (Decomposition and Analysis of Time Series Software - https://artmotoolbox.com/).
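The idea of keeping hyperparameters fixed can be sketched in Python with NumPy. This is not the DATimeS implementation: it is a textbook GPR fit in which the length-scale, signal and noise standard deviations are passed in rather than optimized, using the four kernel families named above. The function names and the hyperparameter values are assumptions chosen for illustration.

```python
import numpy as np

def kernel(t1, t2, length, kind="sqexp"):
    """Stationary covariance between two sets of dates (unit signal variance)."""
    r = np.abs(t1[:, None] - t2[None, :]) / length
    if kind == "exp":
        return np.exp(-r)
    if kind == "sqexp":
        return np.exp(-0.5 * r ** 2)
    if kind == "matern32":
        a = np.sqrt(3.0) * r
        return (1.0 + a) * np.exp(-a)
    if kind == "matern52":
        a = np.sqrt(5.0) * r
        return (1.0 + a + a ** 2 / 3.0) * np.exp(-a)
    raise ValueError(f"unknown kernel: {kind}")

def gpr_fill(t_obs, y_obs, t_new, length, sigma_f, sigma_n, kind="sqexp"):
    """GPR posterior mean and standard deviation at t_new, hyperparameters fixed."""
    mean = y_obs.mean()
    n = len(t_obs)
    # n x n covariance matrix: memory grows quadratically with n.
    K = sigma_f ** 2 * kernel(t_obs, t_obs, length, kind) + sigma_n ** 2 * np.eye(n)
    Ks = sigma_f ** 2 * kernel(t_new, t_obs, length, kind)
    L = np.linalg.cholesky(K)  # factorization cost grows cubically with n
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs - mean))
    mu = mean + Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = sigma_f ** 2 - np.sum(v ** 2, axis=0)
    return mu, np.sqrt(np.maximum(var, 0.0))

def truth(t):
    """Synthetic LAI-like seasonal curve used only for this demonstration."""
    return 3.0 * np.exp(-0.5 * ((t - 100.0) / 40.0) ** 2)

# Acquisitions every 5 days, with a cloud gap between day 80 and day 120.
t_all = np.arange(0.0, 201.0, 5.0)
t_obs = t_all[(t_all < 80) | (t_all > 120)]
y_obs = truth(t_obs)
t_new = np.arange(0.0, 201.0, 10.0)

# Illustrative hyperparameters, imagined as preoptimized on other pixels.
mu, std = gpr_fill(t_obs, y_obs, t_new, length=30.0, sigma_f=1.5, sigma_n=0.05)
```

With the hyperparameters supplied, each pixel costs one Cholesky factorization instead of a full iterative optimization, which is where a speed-up of the kind reported would come from. The returned standard deviation is larger inside the gap than at well-observed dates.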


Mapping vegetation variables in Google Earth Engine using Gaussian Process Regression models.

10.5194/egusphere-egu21-12359 ◽

2021 ◽

Author(s):

Matías Salinero Delgado ◽

Luca Pipia ◽

Eatidal Amin ◽

Santiago Belda ◽

Jochem Verrelst

Keyword(s):

Gaussian Process ◽

Computational Efficiency ◽

Gaussian Process Regression ◽

Google Earth ◽

Time Interval ◽

Fluorescence Signal ◽

Retrieval Models ◽

Area Index ◽

Google Earth Engine ◽

Sentinel 2

The aim of ESA's forthcoming FLuorescence EXplorer (FLEX) is to achieve global monitoring of vegetation chlorophyll fluorescence by means of an imaging spectrometer, FLORIS. For the retrieval of the fluorescence signal measured from space, other vegetation variables need to be retrieved simultaneously, such as (1) Leaf Area Index (LAI), (2) Leaf Chlorophyll content (Cab), and (3) Fractional Vegetation cover (FCover), among others. The ongoing SENTIFLEX ERC project has already demonstrated the feasibility of operationally inferring these variables by hybrid retrieval approaches, which combine the generalization capabilities offered by radiative transfer models (RTMs) and the computational efficiency of machine learning methods. Reflectance spectra corresponding to a large variety of canopy realizations served as input to train a Gaussian Process Regression (GPR) algorithm for each targeted variable. Following this approach, sets of GPR retrieval models have been trained for Sentinel-2 and -3 reflectance images. In that direction, we started to explore the potential of Google Earth Engine (GEE) to facilitate regional to global mapping. GEE is a platform with a multi-petabyte catalog of satellite imagery and geospatial datasets and planetary-scale analysis capabilities, which is freely available for scientific purposes. Among the different EO archives, it is possible to access the whole collection of Sentinel-2 ground reflectance data. In this work, we present the results of an efficient implementation in GEE of the GPR-based vegetation models developed for Sentinel-2 in the framework of the SENSAGRI H2020 project. By taking advantage of GEE cloud-computing power, we are able to avoid the typical bottleneck of downloading and processing large amounts of data locally and generate results of GPR-based retrieval models developed for Sentinel-2 in a fast and efficient way, covering large areas in a matter of seconds.
As a first step in that direction, we present here an open web-based GEE application able to generate LAI Green and LAI Brown maps from Sentinel-2 imagery at 20 m in a tile-wise manner all over the world, and time series of selected pixels during a user-defined time interval. To illustrate these functionalities and gain a better understanding of the phenology, we targeted a region in Castilla y León (Spain), for which we will present results for 2018 classified per crop type. This land cover classification was generated by the ITACYL (Instituto Tecnológico Agrario de Castilla y León) during SENSAGRI. Future development will tackle the possibility of extending our analysis capability to additional variables, such as FCover and Cab, maintaining computational efficiency as the main driver to ensure that the GEE application continues to be an agile and easy tool for spatiotemporal Earth observation studies.
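One reason a trained GPR model lends itself to array-oriented platforms is that its predictive mean is a kernel-weighted sum over the training spectra, so a whole image can be mapped with a few matrix operations. The NumPy sketch below shows that idea only; it does not use the GEE API and is not the authors' implementation. The squared exponential kernel with one length-scale per band, the toy training data and all names are assumptions.

```python
import numpy as np

def sqexp_kernel(A, B, lengths, sigma_f):
    """Squared exponential covariance between rows of A and rows of B,
    with one length-scale per spectral band."""
    A = A / lengths
    B = B / lengths
    d2 = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return sigma_f ** 2 * np.exp(-0.5 * np.maximum(d2, 0.0))

def gpr_predict_mean(pixels, X_train, alpha, lengths, sigma_f, bias):
    """Predictive mean for every pixel: kernel row times the weight vector."""
    return sqexp_kernel(pixels, X_train, lengths, sigma_f) @ alpha + bias

# Toy "training" stage: 20 simulated spectra with 3 bands and a made-up target.
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(20, 3))
y_train = X_train @ np.array([1.0, 2.0, 3.0])
lengths = np.array([0.5, 0.5, 0.5])
sigma_f, sigma_n = 1.0, 0.05
bias = y_train.mean()
K_train = sqexp_kernel(X_train, X_train, lengths, sigma_f)
alpha = np.linalg.solve(K_train + sigma_n ** 2 * np.eye(20), y_train - bias)

# "Mapping" stage: apply the stored weights to a 4 x 5 pixel image at once.
image = rng.uniform(0.0, 1.0, size=(4, 5, 3))
trait_map = gpr_predict_mean(image.reshape(-1, 3), X_train, alpha,
                             lengths, sigma_f, bias).reshape(4, 5)
```

Only `X_train`, `alpha`, the length-scales and two scalars need to be stored to apply the model, which keeps the mapping stage independent of how the model was trained.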


Down-scaling MODIS operational vegetation products with machine learning and fused gap-free high resolution reflectance data in Google Earth Engine

10.5194/egusphere-egu2020-19616 ◽

2020 ◽

Author(s):

Álvaro Moreno Martínez ◽

Emma Izquierdo Verdiguier ◽

Gustau Camps Valls ◽

Marco Maneta ◽

Jordi Muñoz Marí ◽

...

Keyword(s):

Machine Learning ◽

Spatial Resolution ◽

High Spatial Resolution ◽

Google Earth ◽

Radiative Transfer Model ◽

Transfer Model ◽

Learning Approaches ◽

Area Index ◽

Uncertainty Estimates ◽

Google Earth Engine

Among Essential Climate Variables (ECVs) for global climate observation, the Leaf Area Index (LAI) and the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) are the most widely used to study land vegetated surfaces. NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) is a key instrument aboard the Terra and Aqua platforms and allows both biophysical variables to be estimated at coarse resolution (500 m) and global scales. The MODIS operational algorithm to retrieve LAI and FAPAR (MOD15/MYD15/MCD15) uses a physically-based radiative transfer model (RTM) to compute their estimates with corrected surface spectral information content. This algorithm has been heavily validated and compared with field measurements and other sensors but, so far, no equivalent products at high spatial resolution and continental or global scales are routinely produced. Here, we introduce and validate a methodology to create a set of high spatial resolution LAI/FAPAR products by learning the MODIS RTM using advanced machine learning approaches and gap-filled Landsat surface reflectances. The latter are smoothed and gap-filled by the HIghly Scalable Temporal Adaptive Reflectance Fusion Model (HISTARFM). HISTARFM has great potential to improve the original Landsat reflectances by reducing their noise and recovering data missing due to cloud contamination. In addition, HISTARFM runs very fast on cloud computing platforms such as Google Earth Engine (GEE) and provides uncertainty estimates which can be propagated through the models. These estimates make it possible to compute numerical uncertainties beyond the typical qualitative control information layers provided in operational products such as the MODIS LAI/FAPAR. The high spatial resolution biophysical products introduced here could help users achieve the levels of spatial detail needed to adequately monitor croplands and heterogeneously vegetated landscapes.


Optimizing Gaussian Process Regression for Image Time Series Gap-Filling and Crop Monitoring

Agronomy ◽

10.3390/agronomy10050618 ◽

2020 ◽

Vol 10 (5) ◽

pp. 618

Author(s):

Santiago Belda ◽

Luca Pipia ◽

Pablo Morcillo-Pallarés ◽

Jochem Verrelst

Keyword(s):

Time Series ◽

Gaussian Process ◽

Gaussian Process Regression ◽

Machine Learning Algorithms ◽

Series Data ◽

Gap Filling ◽

Crop Phenology ◽

Crop Monitoring ◽

Computational Performance ◽

Uncertainty Estimates

Image processing has entered the era of artificial intelligence, and machine learning algorithms have emerged as attractive alternatives for time series data processing. Satellite image time series processing enables crop phenology monitoring, such as the calculation of start and end of season. Among the promising algorithms, Gaussian process regression (GPR) has proved to be a competitive time series gap-filling algorithm with the advantage that, being developed within a Bayesian framework, it provides associated uncertainty estimates. Nevertheless, the processing of time series images becomes computationally inefficient in its standard per-pixel usage, mainly because of GPR training rather than the fitting step. To mitigate this computational burden, we propose to substitute the per-pixel optimization step with a cropland-based precalculation of the GPR hyperparameters θ. To demonstrate that our approach hardly affects the fitting accuracy, we used Sentinel-2 LAI time series over an agricultural region in Castile and Leon, North-West Spain. The performance of image reconstructions was compared against the standard per-pixel GPR time series processing. Results showed that accuracies were on the same order (RMSE 0.1767 vs. 0.1564 m²/m², 12% RMSE degradation) whereas processing time accelerated about 90 times. We further evaluated the alternative option of using the same hyperparameters for all the pixels within the complete scene. It led to similar overall accuracies over crop areas and similar computational performance. Crop phenology indicators were also calculated for the three different approaches and compared. Results showed analogous crop temporal patterns, with differences in start and end of growing season of no more than five days. To the benefit of crop monitoring applications, all the gap-filling and phenology indicator retrieval techniques have been implemented in the freely downloadable GUI toolbox DATimeS.


A machine learning software framework for extraction of phenology indicators from multi-temporal sentinel-2 images

10.5194/egusphere-egu2020-21952 ◽

2020 ◽

Author(s):

Dounia Arezki ◽

Hadria Fizazi ◽

Santiago Belda ◽

Charlotte De Grave ◽

Luca Pipia ◽

...

Keyword(s):

Machine Learning ◽

Time Series ◽

Radiative Transfer ◽

Gaussian Process ◽

Vegetation Dynamics ◽

Gaussian Process Regression ◽

Software Framework ◽

Gap Filling ◽

Radiative Transfer Models ◽

Sentinel 2

Optical Earth observation satellites provide spatially explicit data that are necessary to study trends in vegetation dynamics. However, more often than not optical data are discontinuous in time, due to persistent cloud cover and instrumental noise. Hence, the operating constraints of these data require several essential pre-processing steps, especially when aiming to monitor vegetation seasonal trends. To facilitate this task, we present an end-to-end processing software framework applied to Sentinel-2 images. First, biophysical retrieval models were generated by training a machine learning regression algorithm (MLRA) on simulated data from radiative transfer models. Among the various MLRAs tested, variational heteroscedastic Gaussian process regression (VHGPR) was evaluated as the best performing and was used to train the retrieval model. The training and retrieval were conducted in the Automated Radiative Transfer Models Operator (ARTMO) software framework. Subsequently, to retrieve the phenological parameters from the obtained vegetation products, a novel time series toolbox within the ARTMO framework was used: the Decomposition and Analysis of Time Series software (DATimeS). DATimeS provides temporal interpolation, among other functionalities, with several advanced MLRAs for gap filling, smoothing functions and subsequent calculation of phenology indicators.
Various MLRAs were tested for gap filling to reconstruct cloud-free maps of biophysical variables at a step of 10 days. A demonstration case is presented involving the retrieval of leaf area index (LAI) and fraction of Absorbed Photosynthetically Active Radiation (FAPAR) from Sentinel-2 time series. A large Algerian agricultural site of 143.75 km² including Oued Rhiou, Ouarizane and Djidioua (1,345,075 pixels) was chosen for this study. A reference image was excluded from the time series in order to evaluate the reconstruction accuracy over a 40-day artificial gap. The reference and reconstructed maps produced by the gap-filling methods were compared with statistical goodness-of-fit metrics. Considering both accuracy and processing speed, Gaussian process regression (GPR) and Next neighbour interpolation (R² = 0.90 at 0.081 s per pixel and R² = 0.88 at 0.001 s per pixel, respectively) reconstructed the vegetation products most efficiently, with GPR more accurate but Next faster by a factor of 70. Finally, we evaluated phenology indicators such as start-of-season and end-of-season based on LAI and FAPAR. The obtained maps provide valid information on the vegetation dynamics. Altogether, the ARTMO-DATimeS software framework enabled seamless processing of all essential steps: (1) L2A Sentinel-2 images converted to vegetation products, (2) cloud-free composite products, and finally (3) vegetation phenology indicators.
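The leave-one-image-out evaluation described above can be sketched in Python for a single pixel. Two assumptions are made and should be read as such: "Next" is taken to mean that each query date receives the value of the next available acquisition (as in the `'next'` option of MATLAB's `interp1`), and R² is computed as the coefficient of determination, since the abstract does not say which definition was used. The numbers in the example are invented.

```python
import numpy as np

def next_neighbour(t_obs, y_obs, t_new):
    """Value of the first observation at or after each query date
    (NaN when no later observation exists). t_obs must be sorted."""
    idx = np.searchsorted(t_obs, t_new, side="left")
    out = np.full(len(t_new), np.nan)
    valid = idx < len(t_obs)
    out[valid] = y_obs[idx[valid]]
    return out

def r_squared(reference, reconstructed):
    """Coefficient of determination between reference and reconstruction."""
    ss_res = np.sum((reference - reconstructed) ** 2)
    ss_tot = np.sum((reference - reference.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# One pixel with a 40-day gap between day 20 and day 60 (invented values).
t_obs = np.array([0.0, 10.0, 20.0, 60.0, 70.0])
y_obs = np.array([1.0, 1.5, 2.0, 3.0, 2.5])
t_new = np.array([5.0, 20.0, 40.0, 65.0, 80.0])

filled = next_neighbour(t_obs, y_obs, t_new)
print(filled)
```

In a map-level evaluation, `r_squared` would be applied to the withheld reference image and its reconstruction, flattened to one value per pixel.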


Exchange Spin Coupling from Gaussian Process Regression

10.26434/chemrxiv.12589541.v3 ◽

2020 ◽

Author(s):

Marc Philipp Bahlke ◽

Natnael Mogos ◽

Jonny Proppe ◽

Carmen Herrmann

Keyword(s):

Machine Learning ◽

Gaussian Process ◽

Gaussian Process Regression ◽

Molecular Magnets ◽

Molecular Structures ◽

Spin Coupling ◽

Structure Property ◽

Data Set ◽

Uncertainty Estimates

Heisenberg exchange spin coupling between metal centers is essential for describing and understanding the electronic structure of many molecular catalysts, metalloenzymes, and molecular magnets for potential application in information technology. We explore the machine-learnability of exchange spin coupling, which has not been studied yet. We employ Gaussian process regression since it can potentially deal with small training sets (as likely associated with the rather complex molecular structures required for exploring spin coupling) and since it provides uncertainty estimates (“error bars”) along with predicted values. We compare a range of descriptors and kernels for 257 small dicopper complexes and find that a simple descriptor based on chemical intuition, consisting only of copper-bridge angles and copper-copper distances, clearly outperforms several more sophisticated descriptors when it comes to extrapolating towards larger experimentally relevant complexes. Exchange spin coupling is similarly easy to learn as the polarizability, while learning dipole moments is much harder. The strength of the sophisticated descriptors lies in their ability to linearize structure-property relationships, to the point that a simple linear ridge regression performs just as well as the kernel-based machine-learning model for our small dicopper data set. The superior extrapolation performance of the simple descriptor is unique to exchange spin coupling, reinforcing the crucial role of choosing a suitable descriptor, and highlighting the interesting question of the role of chemical intuition vs. systematic or automated selection of features for machine learning in chemistry and material science.


Exchange Spin Coupling from Gaussian Process Regression

10.26434/chemrxiv.12589541 ◽

2020 ◽

Author(s):

Marc Philipp Bahlke ◽

Natnael Mogos ◽

Jonny Proppe ◽

Carmen Herrmann

Keyword(s):

Machine Learning ◽

Gaussian Process ◽

Gaussian Process Regression ◽

Molecular Magnets ◽

Molecular Structures ◽

Spin Coupling ◽

Structure Property ◽

Data Set ◽

Uncertainty Estimates

Heisenberg exchange spin coupling between metal centers is essential for describing and understanding the electronic structure of many molecular catalysts, metalloenzymes, and molecular magnets for potential application in information technology. We explore the machine-learnability of exchange spin coupling, which has not been studied yet. We employ Gaussian process regression since it can potentially deal with small training sets (as likely associated with the rather complex molecular structures required for exploring spin coupling) and since it provides uncertainty estimates (“error bars”) along with predicted values. We compare a range of descriptors and kernels for 257 small dicopper complexes and find that a simple descriptor based on chemical intuition, consisting only of copper-bridge angles and copper-copper distances, clearly outperforms several more sophisticated descriptors when it comes to extrapolating towards larger experimentally relevant complexes. Exchange spin coupling is similarly easy to learn as the polarizability, while learning dipole moments is much harder. The strength of the sophisticated descriptors lies in their ability to linearize structure-property relationships, to the point that a simple linear ridge regression performs just as well as the kernel-based machine-learning model for our small dicopper data set. The superior extrapolation performance of the simple descriptor is unique to exchange spin coupling, reinforcing the crucial role of choosing a suitable descriptor, and highlighting the interesting question of the role of chemical intuition vs. systematic or automated selection of features for machine learning in chemistry and material science.


Machine Learning for Multiple Yield Curve Markets: Fast Calibration in the Gaussian Affine Framework

Risks ◽

10.3390/risks8020050 ◽

2020 ◽

Vol 8 (2) ◽

pp. 50

Author(s):

Sandrine Gümbel ◽

Thorsten Schmidt

Keyword(s):

Machine Learning ◽

Gaussian Process ◽

Term Structure ◽

Kalman Filtering ◽

Yield Curve ◽

Gaussian Process Regression ◽

Machine Learning Techniques ◽

Future Research ◽

Extended Kalman Filtering ◽

Learning Techniques

Calibration is a highly challenging task, in particular in multiple yield curve markets. This paper is a first attempt to study the chances and challenges of the application of machine learning techniques for this. We employ Gaussian process regression, a machine learning methodology having many similarities with extended Kálmán filtering, which has been applied many times to interest rate markets and term structure models. We find very good results for the single-curve markets and many challenges for the multi-curve markets in a Vasiček framework. The Gaussian process regression is implemented with the Adam optimizer and the non-linear conjugate gradient method, where the latter performs best. We also point towards future research.


Agricultural cropland extent and areas of South Asia derived using Landsat satellite 30-m time-series big-data using random forest machine learning algorithms on the Google Earth Engine cloud

GIScience & Remote Sensing ◽

10.1080/15481603.2019.1690780 ◽

2019 ◽

Vol 57 (3) ◽

pp. 302-322 ◽

Cited By ~ 9

Author(s):

Murali Krishna Gumma ◽

Prasad S. Thenkabail ◽

Pardhasaradhi G. Teluguntla ◽

Adam Oliphant ◽

Jun Xiong ◽

...

Keyword(s):

Machine Learning ◽

Time Series ◽

Big Data ◽

Random Forest ◽

South Asia ◽

Learning Algorithms ◽

Google Earth ◽

Machine Learning Algorithms ◽

Google Earth Engine ◽

Landsat Satellite
