Green LAI Mapping and Cloud Gap-Filling Using Gaussian Process Regression in Google Earth Engine

2021 ◽  
Vol 13 (3) ◽  
pp. 403
Author(s):  
Luca Pipia ◽  
Eatidal Amin ◽  
Santiago Belda ◽  
Matías Salinero-Delgado ◽  
Jochem Verrelst

Over the last decade, Gaussian process regression (GPR) has proved to be a competitive machine learning regression algorithm for Earth observation applications, with attractive properties such as band relevance ranking and uncertainty estimates. More recently, GPR has also proved to be a proficient time series processor for filling gaps in optical imagery, typically caused by cloud cover. This makes GPR well suited for large-scale spatiotemporal processing of satellite imagery into cloud-free products of biophysical variables. With the advent of the Google Earth Engine (GEE) cloud platform, new opportunities have emerged to process local-to-planetary scale satellite data using advanced machine learning techniques and convert them into gap-filled vegetation property products. However, GPR is not yet part of the GEE ecosystem. To circumvent this limitation, this work proposes a general adaptation of the GPR formulation to a parallel processing framework and its integration into GEE. To demonstrate the functioning and utility of the developed workflow, a GPR model predicting green leaf area index (LAIG) from Sentinel-2 imagery was imported. Although running this GPR model in GEE allows any corner of the world to be mapped into LAIG at a resolution of 20 m, here we show demonstration cases over western Europe with zoom-ins over Spain. Thanks to the computational power of GEE, the mapping takes place on the fly. Additionally, a GPR-based gap-filling strategy based on pre-optimized kernel hyperparameters is put forward for the generation of multi-orbit cloud-free LAIG maps with an unprecedented level of detail, and for the extraction of regularly sampled LAIG time series at the pixel level. The ability to plug a locally trained GPR model into the GEE framework, together with its instant processing, opens up a new paradigm of remote sensing image processing.
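As a rough illustration of why a trained GPR model lends itself to per-pixel parallel processing: once training is done, the predictive mean and variance reduce to matrix products against fixed training data, so every pixel can be evaluated independently. The sketch below assumes a zero-mean GP with a squared-exponential kernel and per-band length scales. The function names, hyperparameter values, and toy data are all hypothetical, and this is not the authors' GEE implementation.

```python
import numpy as np

def sq_exp_kernel(A, B, length_scales, signal_var):
    """Squared-exponential kernel between rows of A (n, d) and B (m, d)."""
    A_s = A / length_scales
    B_s = B / length_scales
    # Pairwise squared distances via broadcasting, shape (n, m)
    d2 = ((A_s[:, None, :] - B_s[None, :, :]) ** 2).sum(axis=-1)
    return signal_var * np.exp(-0.5 * d2)

def gpr_fit(X_train, y_train, length_scales, signal_var, noise_var):
    """Precompute the Cholesky factor and weight vector alpha (done once)."""
    K = sq_exp_kernel(X_train, X_train, length_scales, signal_var)
    L = np.linalg.cholesky(K + noise_var * np.eye(len(X_train)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    return L, alpha

def gpr_predict(X_pixels, X_train, L, alpha, length_scales, signal_var):
    """Predictive mean and variance for a batch of pixels (rows of X_pixels)."""
    K_star = sq_exp_kernel(X_pixels, X_train, length_scales, signal_var)
    mean = K_star @ alpha                      # one dot product per pixel
    v = np.linalg.solve(L, K_star.T)           # shape (n_train, n_pixels)
    var = signal_var - (v ** 2).sum(axis=0)    # variance of the latent function
    return mean, np.maximum(var, 0.0)

# Toy data: 50 training spectra with 4 hypothetical bands
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(50, 4))
y_train = X_train.sum(axis=1)
length_scales = np.ones(4)
signal_var, noise_var = 1.0, 1e-4

L, alpha = gpr_fit(X_train, y_train, length_scales, signal_var, noise_var)
mean_train, var_train = gpr_predict(X_train, X_train, L, alpha,
                                    length_scales, signal_var)
# A "pixel" far from all training spectra falls back to the prior variance
far_pixel = np.full((1, 4), 100.0)
mean_far, var_far = gpr_predict(far_pixel, X_train, L, alpha,
                                length_scales, signal_var)
```

Because `gpr_predict` only touches fixed arrays (`X_train`, `L`, `alpha`), the same computation can be mapped over image pixels without any communication between them.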

2021 ◽  
Vol 14 (1) ◽  
pp. 146
Author(s):  
Matías Salinero-Delgado ◽  
José Estévez ◽  
Luca Pipia ◽  
Santiago Belda ◽  
Katja Berger ◽  
...  

Monitoring cropland phenology from optical satellite data remains a challenging task due to the influence of clouds and atmospheric artifacts. Therefore, measures need to be taken to overcome these challenges and gain better knowledge of crop dynamics. The arrival of cloud computing platforms such as Google Earth Engine (GEE) has enabled us to propose an end-to-end Sentinel-2 (S2) phenology processing chain. To achieve this, the following pipeline was implemented: (1) building hybrid Gaussian process regression (GPR) retrieval models of crop traits optimized with active learning, (2) implementing these models on GEE, (3) generating spatiotemporally continuous maps and time series of these crop traits by gap-filling through GPR fitting, and finally, (4) calculating land surface phenology (LSP) metrics such as the start of season (SOS) or end of season (EOS). Overall, good to high performance was achieved, in particular for the estimation of canopy-level traits such as leaf area index (LAI) and canopy chlorophyll content, with normalized root mean square errors (NRMSE) of 9% and 10%, respectively. By means of GPR gap-filling of S2 time series, entire tiles were reconstructed, and the resulting maps were demonstrated over an agricultural area in Castile and Leon, Spain, where crop calendar data were available to assess the validity of LSP metrics derived from crop traits. In addition, phenology derived from the normalized difference vegetation index (NDVI) was used as a reference. NDVI not only proved to be a robust indicator for the calculation of LSP metrics, but also served to demonstrate the good phenology quality of the quantitative trait products. Thanks to the GEE framework, the proposed workflow can be applied anywhere in the world and for any time window, thus representing a shift in the satellite data processing paradigm.
We anticipate that the produced LSP metrics can provide meaningful insights into crop seasonal patterns in a changing environment that demands adaptive agricultural production.
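To make step (4) of the pipeline above more concrete, here is a minimal sketch of how SOS and EOS could be read off a gap-filled trait curve. It assumes one common convention, not necessarily the one used in the paper: SOS and EOS are the dates at which the curve crosses a fixed fraction (here 50%) of its seasonal amplitude before and after the peak. The function name `season_metrics` and the synthetic LAI curve are hypothetical.

```python
import numpy as np

def season_metrics(doy, values, fraction=0.5):
    """Return (SOS, EOS) as day-of-year threshold crossings.

    Assumes a single growing season in a gap-free, smoothed time series.
    Returns (None, None) if the curve never drops below the threshold
    on one side of the peak.
    """
    doy = np.asarray(doy, dtype=float)
    values = np.asarray(values, dtype=float)
    peak = int(np.argmax(values))
    base = values.min()
    threshold = base + fraction * (values[peak] - base)

    def crossing(i0, i1):
        # Linear interpolation of the threshold crossing between two samples
        t0, t1 = doy[i0], doy[i1]
        v0, v1 = values[i0], values[i1]
        return t0 + (threshold - v0) * (t1 - t0) / (v1 - v0)

    below_before = np.where(values[:peak + 1] < threshold)[0]
    below_after = np.where(values[peak:] < threshold)[0] + peak
    if len(below_before) == 0 or len(below_after) == 0:
        return None, None

    i = below_before[-1]       # last sample below threshold before the peak
    j = below_after[0]         # first sample below threshold after the peak
    return crossing(i, i + 1), crossing(j - 1, j)

# Synthetic LAI curve sampled every 10 days, peaking near day 180
doy = np.arange(1, 366, 10)
lai = 0.2 + 4.0 * np.exp(-0.5 * ((doy - 180) / 40.0) ** 2)
sos, eos = season_metrics(doy, lai)
```

Other definitions (for example, maximum slope or absolute thresholds) would give different dates from the same curve, which is why the threshold fraction is exposed as a parameter.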


2021 ◽  
Author(s):  
Santiago Belda ◽  
Matías Salinero ◽  
Eatidal Amin ◽  
Luca Pipia ◽  
Pablo Morcillo-Pallarés ◽  
...  

In general, modeling phenological evolution is a challenging task, mainly because of time series gaps and noisy data arising from different viewing and illumination geometries, cloud cover, seasonal snow, and the interval needed to revisit and acquire data for exactly the same location. For that reason, reliable gap-filling fitting functions and smoothing filters are frequently required for retrievals at the highest feasible accuracy. Of specific interest for filling gaps in time series is the emergence of machine learning regression algorithms (MLRAs), which can serve as fitting functions. Among the multiple MLRA approaches currently available, kernel-based methods developed in a Bayesian framework, such as Gaussian process regression (GPR), deserve special attention because they are adaptive and provide associated uncertainty estimates.
Recent studies demonstrated the effectiveness of GPR for gap-filling of biophysical parameter time series because the hyperparameters can be optimally set for each time series (one for each pixel in the area) with a single optimization procedure. The entire procedure of learning a GPR model relies only on an appropriate selection of the type of kernel and of the hyperparameters involved in the estimation of the input data covariance. Despite its clear strategic advantage, the most important shortcomings of this technique are the (1) high computational cost and (2) memory requirements of its training, which grow cubically and quadratically, respectively, with the number of training samples. This can become problematic when processing large amounts of data, such as Sentinel-2 (S2) time series tiles. Hence, optimization strategies need to be developed to speed up GPR processing while maintaining its superior accuracy.
To mitigate this computational burden and avoid repeating the optimization for every pixel, we evaluated whether the GPR hyperparameters can be pre-optimized over a reduced set of representative pixels and kept fixed over a more extended crop area. We used S2 LAI time series over an agricultural region in Castile and Leon (North-West Spain) and tested different covariance functions: the exponential kernel, the squared exponential kernel, and the Matérn kernel with parameter 3/2 or 5/2. The performance of image reconstructions was compared against the standard per-pixel GPR time series training process. Results showed that accuracies were of the same order (12% RMSE degradation), whereas processing was accelerated by up to 90 times. Crop phenology indicators were also calculated and compared, revealing similar temporal patterns with differences in start and end of growing season of no more than five days. To the benefit of crop monitoring applications, all the gap-filling and phenology indicator retrieval techniques have been implemented in the freely downloadable GUI toolbox DATimeS (Decomposition and Analysis of Time Series Software - https://artmotoolbox.com/).
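The four covariance functions named above, and the idea of gap-filling with hyperparameters held fixed, can be sketched as follows. This is a minimal illustration under stated assumptions, not the DATimeS implementation: the kernels are written in their standard textbook forms as functions of the time lag `r` and a length scale `ell`, the observations are mean-centred before fitting (a choice made here for simplicity), and the hyperparameter values and toy series are hypothetical.

```python
import numpy as np

def exponential(r, ell):
    return np.exp(-r / ell)

def squared_exponential(r, ell):
    return np.exp(-0.5 * (r / ell) ** 2)

def matern32(r, ell):
    s = np.sqrt(3.0) * r / ell
    return (1.0 + s) * np.exp(-s)

def matern52(r, ell):
    s = np.sqrt(5.0) * r / ell
    return (1.0 + s + s ** 2 / 3.0) * np.exp(-s)

def gap_fill(t_obs, y_obs, t_out, kernel, ell, signal_var, noise_var):
    """GP posterior mean on t_out with FIXED hyperparameters.

    No per-pixel hyperparameter optimisation is run; the only per-pixel
    cost is one linear solve on the cloud-free observations.
    """
    t_obs = np.asarray(t_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    t_out = np.asarray(t_out, dtype=float)
    y_mean = y_obs.mean()
    K = signal_var * kernel(np.abs(t_obs[:, None] - t_obs[None, :]), ell)
    K_star = signal_var * kernel(np.abs(t_out[:, None] - t_obs[None, :]), ell)
    alpha = np.linalg.solve(K + noise_var * np.eye(len(t_obs)), y_obs - y_mean)
    return y_mean + K_star @ alpha

# Toy pixel: irregular cloud-free dates with a gap between day 150 and 200
t_obs = np.array([5, 20, 45, 70, 95, 120, 150, 200, 230, 260, 300, 340], float)
y_obs = 0.2 + 4.0 * np.exp(-0.5 * ((t_obs - 180) / 40.0) ** 2)
t_out = np.arange(1, 366, 10, dtype=float)   # regular 10-day output grid

kernels = {
    "exponential": exponential,
    "squared_exponential": squared_exponential,
    "matern32": matern32,
    "matern52": matern52,
}
# Hypothetical hyperparameters, assumed pre-optimised on representative pixels
filled = {name: gap_fill(t_obs, y_obs, t_out, k, ell=30.0,
                         signal_var=1.0, noise_var=0.01)
          for name, k in kernels.items()}
```

Swapping the kernel changes only how smoothly the curve is interpolated across the gap. The exponential kernel gives the roughest reconstruction and the squared exponential the smoothest, with the two Matérn kernels in between.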


2021 ◽  
Author(s):  
Matías Salinero Delgado ◽  
Luca Pipia ◽  
Eatidal Amin ◽  
Santiago Belda ◽  
Jochem Verrelst

The aim of ESA's forthcoming FLuorescence EXplorer (FLEX) is to achieve global monitoring of vegetation chlorophyll fluorescence by means of an imaging spectrometer, FLORIS. For the retrieval of the fluorescence signal measured from space, other vegetation variables need to be retrieved simultaneously, such as (1) leaf area index (LAI), (2) leaf chlorophyll content (Cab), and (3) fractional vegetation cover (FCover), among others. The ongoing SENTIFLEX ERC project has already demonstrated the feasibility of operationally inferring these variables by hybrid retrieval approaches, which combine the generalization capabilities offered by radiative transfer models (RTMs) with the computational efficiency of machine learning methods. Reflectance spectra corresponding to a large variety of canopy realizations served as input to train a Gaussian process regression (GPR) algorithm for each targeted variable. Following this approach, sets of GPR retrieval models have been trained for Sentinel-2 and Sentinel-3 reflectance images.
In that direction, we started to explore the potential of Google Earth Engine (GEE) to facilitate regional to global mapping. GEE is a platform with a multi-petabyte catalog of satellite imagery and geospatial datasets and planetary-scale analysis capabilities, which is freely available for scientific purposes. Among the different EO archives, it is possible to access the whole collection of Sentinel-2 ground reflectance data. In this work, we present the results of an efficient GEE implementation of the GPR-based vegetation models developed for Sentinel-2 in the framework of the SENSAGRI H2020 project. By taking advantage of GEE's cloud-computing power, we avoid the typical bottleneck of downloading and processing large amounts of data locally, and generate the results of these retrieval models quickly and efficiently, covering large areas in a matter of seconds. As a first step in that direction, we present here an open web-based GEE application able to generate LAI Green and LAI Brown maps from Sentinel-2 imagery at 20 m in a tile-wise manner all over the world, as well as time series of selected pixels over a user-defined time interval.
To illustrate these functionalities and gain a better understanding of the phenology, we targeted a region in Castilla y León (Spain), for which we will present results for 2018 classified per crop type. This land cover classification was generated by ITACYL (Instituto Tecnológico Agrario de Castilla y León) during SENSAGRI.
Future development will address the possibility of extending our analysis capability to additional variables, such as FCover and Cab, maintaining computational efficiency as the main driver to ensure that the GEE application continues to be an agile and easy tool for spatiotemporal Earth observation studies.


2020 ◽  
Author(s):  
Álvaro Moreno Martínez ◽  
Emma Izquierdo Verdiguier ◽  
Gustau Camps Valls ◽  
Marco Maneta ◽  
Jordi Muñoz Marí ◽  
...  

Among the Essential Climate Variables (ECVs) for global climate observation, the Leaf Area Index (LAI) and the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) are the most widely used to study vegetated land surfaces. NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) is a key instrument aboard the Terra and Aqua platforms and allows both biophysical variables to be estimated at coarse resolution (500 m) and global scale. The MODIS operational algorithm to retrieve LAI and FAPAR (MOD15/MYD15/MCD15) uses a physically based radiative transfer model (RTM) to compute its estimates from corrected surface spectral information. This algorithm has been extensively validated and compared with field measurements and other sensors but, so far, no equivalent products at high spatial resolution and continental or global scales are routinely produced.
Here, we introduce and validate a methodology to create a set of high spatial resolution LAI/FAPAR products by learning the MODIS RTM using advanced machine learning approaches and gap-filled Landsat surface reflectances. The latter are smoothed and gap-filled by the HIghly Scalable Temporal Adaptive Reflectance Fusion Model (HISTARFM). HISTARFM has great potential to improve the original Landsat reflectances by reducing their noise and recovering data missing due to cloud contamination. In addition, HISTARFM runs very fast on cloud computing platforms such as Google Earth Engine (GEE) and provides uncertainty estimates that can be propagated through the models. These estimates allow numerical uncertainties to be computed, going beyond the typical qualitative control information layers provided in operational products such as the MODIS LAI/FAPAR. The high spatial resolution biophysical products introduced here could be of interest to users who need this level of spatial detail to adequately monitor croplands and heterogeneously vegetated landscapes.


Agronomy ◽  
2020 ◽  
Vol 10 (5) ◽  
pp. 618
Author(s):  
Santiago Belda ◽  
Luca Pipia ◽  
Pablo Morcillo-Pallarés ◽  
Jochem Verrelst

Image processing has entered the era of artificial intelligence, and machine learning algorithms have emerged as attractive alternatives for time series data processing. Satellite image time series processing enables crop phenology monitoring, such as the calculation of start and end of season. Among the promising algorithms, Gaussian process regression (GPR) has proved to be a competitive time series gap-filling algorithm with the advantage that, being developed within a Bayesian framework, it provides associated uncertainty estimates. Nevertheless, the processing of time series images becomes computationally inefficient in its standard per-pixel usage, mainly because of GPR training rather than the fitting step. To mitigate this computational burden, we propose to substitute the per-pixel optimization step with a cropland-based precalculation of the GPR hyperparameters θ. To demonstrate that our approach hardly affects the fitting accuracy, we used Sentinel-2 LAI time series over an agricultural region in Castile and Leon, North-West Spain. The performance of image reconstructions was compared against the standard per-pixel GPR time series processing. Results showed that accuracies were of the same order (RMSE 0.1767 vs. 0.1564 m²/m², 12% RMSE degradation), whereas processing was accelerated about 90 times. We further evaluated the alternative option of using the same hyperparameters for all the pixels within the complete scene. It led to similar overall accuracies over crop areas and similar computational performance. Crop phenology indicators were also calculated for the three different approaches and compared. Results showed analogous crop temporal patterns, with differences in start and end of growing season of no more than five days. To the benefit of crop monitoring applications, all the gap-filling and phenology indicator retrieval techniques have been implemented in the freely downloadable GUI toolbox DATimeS.


2020 ◽  
Author(s):  
Dounia Arezki ◽  
Hadria Fizazi ◽  
Santiago Belda ◽  
Charlotte De Grave ◽  
Luca Pipia ◽  
...  

Optical Earth observation satellites provide spatially explicit data that are necessary to study trends in vegetation dynamics. However, more often than not, optical data are discontinuous in time due to persistent cloud cover and instrument noise. Hence, the operating constraints of these data require several essential pre-processing steps, especially when the aim is to monitor seasonal vegetation trends. To facilitate this task, here we present an end-to-end processing software framework applied to Sentinel-2 images.
To do so, biophysical retrieval models were first generated by training a machine learning regression algorithm (MLRA) on simulated data coming from radiative transfer models. Among the various MLRAs tested, variational heteroscedastic Gaussian process regression (VHGPR) was evaluated as the best performing for training the retrieval model. The training and retrieval were conducted in the Automated Radiative Transfer Models Operator (ARTMO) software framework.
Subsequently, in view of retrieving the phenological parameters from the obtained vegetation products, a novel time series toolbox that is part of the ARTMO framework was used: the Decomposition and Analysis of Time Series software (DATimeS). DATimeS provides temporal interpolation, among other functionalities, with several advanced MLRAs for gap filling, smoothing functions, and subsequent calculation of phenology indicators. Various MLRAs were tested for gap filling to reconstruct cloud-free maps of biophysical variables at a step of 10 days.
A demonstration case is presented involving the retrieval of leaf area index (LAI) and the fraction of Absorbed Photosynthetically Active Radiation (FAPAR) from Sentinel-2 time series. A large agricultural Algerian site of 143,75 km² including Oued Rhiou, Ouarizane, and Djidioua (1,345,075 pixels) was chosen for this study. A reference image was excluded from the time series in order to evaluate the reconstruction accuracy over a 40-day artificial gap.
The reference map and the reconstructed maps produced by the gap-filling methods were compared using statistical goodness-of-fit metrics. Considering both accuracy and processing speed, the fitting algorithms Gaussian process regression (GPR) and Next neighbour interpolation (R² = 0.90 at 0.081 s per pixel and R² = 0.88 at 0.001 s per pixel, respectively) proved to reconstruct the vegetation products most efficiently, with GPR being more accurate but Next faster by a factor of 70.
Finally, we evaluated phenology indicators such as start of season and end of season based on LAI and FAPAR. The obtained maps provide valid information on the vegetation dynamics. Altogether, the ARTMO-DATimeS software framework enabled seamless processing of all essential steps: (1) from L2A Sentinel-2 images converted to vegetation products, (2) to cloud-free composite products, and finally (3) converted into vegetation phenology indicators.


2020 ◽  
Author(s):  
Marc Philipp Bahlke ◽  
Natnael Mogos ◽  
Jonny Proppe ◽  
Carmen Herrmann

Heisenberg exchange spin coupling between metal centers is essential for describing and understanding the electronic structure of many molecular catalysts, metalloenzymes, and molecular magnets for potential application in information technology. We explore the machine-learnability of exchange spin coupling, which has not been studied yet. We employ Gaussian process regression since it can potentially deal with small training sets (as likely associated with the rather complex molecular structures required for exploring spin coupling) and since it provides uncertainty estimates (“error bars”) along with predicted values. We compare a range of descriptors and kernels for 257 small dicopper complexes and find that a simple descriptor based on chemical intuition, consisting only of copper-bridge angles and copper-copper distances, clearly outperforms several more sophisticated descriptors when it comes to extrapolating towards larger, experimentally relevant complexes. Exchange spin coupling is about as easy to learn as the polarizability, while learning dipole moments is much harder. The strength of the sophisticated descriptors lies in their ability to linearize structure-property relationships, to the point that a simple linear ridge regression performs just as well as the kernel-based machine-learning model for our small dicopper data set. The superior extrapolation performance of the simple descriptor is unique to exchange spin coupling, reinforcing the crucial role of choosing a suitable descriptor, and highlighting the interesting question of the role of chemical intuition vs. systematic or automated selection of features for machine learning in chemistry and materials science.


Risks ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 50
Author(s):  
Sandrine Gümbel ◽  
Thorsten Schmidt

Calibration is a highly challenging task, in particular in multiple yield curve markets. This paper is a first attempt to study the opportunities and challenges of applying machine learning techniques to this task. We employ Gaussian process regression, a machine learning methodology that has many similarities with extended Kálmán filtering, which has been applied many times to interest rate markets and term structure models. We find very good results for the single-curve markets and many challenges for the multi-curve markets in a Vasiček framework. The Gaussian process regression is implemented with the Adam optimizer and the non-linear conjugate gradient method, where the latter performs best. We also point towards future research.

