Estimation of Salinity Content in Different Saline-Alkali Zones Based on Machine Learning Model Using FOD Pretreatment Method

2021 ◽  
Vol 13 (24) ◽  
pp. 5140
Author(s):  
Chengbiao Fu ◽  
Anhong Tian ◽  
Daming Zhu ◽  
Junsan Zhao ◽  
Heigang Xiong

Soil salinization is a global ecological and environmental problem in arid and semi-arid areas that can be ameliorated via soil management, and visible-near infrared-shortwave infrared (VNIR-SWIR) spectroscopy can be adapted to rapidly monitor soil salinity content. This study explored the potential of the Grünwald–Letnikov fractional-order derivative (FOD), feature band selection methods, nonlinear partial least squares regression (PLSR), and four machine learning models to estimate soil salinity content from VNIR-SWIR spectra. Ninety sample points were scanned in the field with VNIR-SWIR, and soil samples (0–20 cm) were collected at the time of scanning. The sample points came from three zones representing different intensities of human interference (Zones I, II, and III) in Fukang, Xinjiang, China. Each zone contained thirty sample points. For modeling, we first adopted FOD (orders from 0 to 2 in steps of 0.1) as a preprocessing method for the soil hyperspectral data. Then, four sets of spectral bands (R-FOD-FULL, the full band range; R-FOD-CC5, bands that passed a 0.05 significance test; R-FOD-CC1, bands that passed a 0.01 significance test; and R-FOD-CC1-CARS, CC1 combined with competitive adaptive reweighted sampling) were selected as spectral input variables to develop the estimation model. Finally, four machine learning models, namely, generalized regression neural network (GRNN), extreme learning machine (ELM), random forest (RF), and PLSR, were used to estimate soil salinity. The results showed that (1) the heat map of the correlation coefficient matrix between hyperspectral data and salinity indicated that FOD significantly improved the correlation. (2) The characteristic band variables extracted by R-FOD-CC1 were fewer in number, and the redundancy between bands was smaller, than for R-FOD-FULL and R-FOD-CC5; thus, the estimation accuracy of R-FOD-CC1 was higher than that of R-FOD-CC5 or R-FOD-FULL. High prediction accuracy was achieved with a less complex calculation.
(3) Overall, the GRNN model yielded the best salinity estimation in all three zones compared to ELM, BPNN, RF, and PLSR, whereas the RF model performed worst. The R-FOD-CC1-CARS-GRNN model yielded the best salinity estimation in Zone I, with R², RMSE, and RPD of 0.7784, 1.8762, and 2.0568, respectively. The fractional order was 1.5, and the estimation performance was great. The optimal model for predicting soil salinity in Zones II and III was also R-FOD-CC1-CARS-GRNN (R² = 0.7912, RMSE = 3.4001, and RPD = 1.8985 in Zone II; R² = 0.8192, RMSE = 6.6260, and RPD = 1.8190 in Zone III), with fractional orders of 1.7 and 1.6, respectively, and the estimation performance was fine in both zones. (4) The best models in Zones I, II, and III selected 8, 9, and 11 characteristic bands, respectively, which account for 0.45%, 0.51%, and 0.63% of the full band set. This approach reduces the number of modeled band variables and simplifies the model structure.
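The Grünwald–Letnikov FOD preprocessing named in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes unit band spacing (h = 1), uses all preceding bands as the memory window, and the function names (`gl_weights`, `gl_fod`) and reflectance values are made up for the example. The weights follow the standard recurrence w_0 = 1, w_k = w_{k-1} (1 - (α + 1)/k).

```python
def gl_weights(alpha, n):
    """First n Grünwald-Letnikov coefficients, w_k = (-1)^k * C(alpha, k)."""
    w = [1.0]
    for k in range(1, n):
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w


def gl_fod(spectrum, alpha, h=1.0):
    """Fractional-order derivative of a 1-D spectrum (hypothetical helper).

    For band i, sums w_k * spectrum[i - k] over all preceding bands,
    then scales by h ** alpha. h is the band spacing (assumed 1 here).
    """
    n = len(spectrum)
    w = gl_weights(alpha, n)
    out = []
    for i in range(n):
        acc = sum(w[k] * spectrum[i - k] for k in range(i + 1))
        out.append(acc / h ** alpha)
    return out


# Made-up reflectance values for five adjacent bands (illustration only).
reflectance = [0.10, 0.12, 0.15, 0.19, 0.24]

# Orders 0.0, 0.1, ..., 2.0, matching the 0-2 range in steps of 0.1.
orders = [round(0.1 * j, 1) for j in range(21)]
derivatives = {alpha: gl_fod(reflectance, alpha) for alpha in orders}
```

At order 0 the transform returns the spectrum unchanged, and at orders 1 and 2 it reduces to the ordinary first and second backward differences (apart from the first bands, where the window is truncated), so the fractional orders interpolate between these cases.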

Toxics ◽  
2021 ◽  
Vol 9 (12) ◽  
pp. 333
Author(s):  
Ayush Agrawal ◽  
Mark R. Petersen

Arsenic, a potent carcinogen and neurotoxin, affects over 200 million people globally. Current detection methods are laborious, expensive, and unscalable, making them difficult to implement in developing regions and during crises such as COVID-19. This study attempts to determine whether a relationship exists between soil’s hyperspectral data and arsenic concentration using NASA’s Hyperion satellite. It is the first arsenic study to use satellite-based hyperspectral data and apply a classification approach. Four regression machine learning models are tested to determine this correlation in soil with bare land cover. Raw data are converted to reflectance, problematic atmospheric influences are removed, characteristic wavelengths are selected, and four noise reduction algorithms are tested. The combination of data augmentation, Genetic Algorithm, Second Derivative Transformation, and Random Forest regression (R² = 0.840 and normalized root mean squared error, re-scaled to [0,1], of 0.122) shows strong correlation, performing better than past models despite using noisier satellite data (versus lab-processed samples). Three binary classification machine learning models are then applied to identify high-risk shrub-covered regions in ten U.S. states, achieving an accuracy of 0.693 and an F1-score of 0.728. Overall, these results suggest that such a methodology is practical and can provide a sustainable alternative to arsenic contamination detection.
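For readers unfamiliar with the two classification metrics reported here, the sketch below shows how accuracy and F1-score are conventionally computed from binary labels. The function name `binary_metrics` and the label vectors are hypothetical and serve only as an illustration; they are not the study's data or code.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy and F1-score for binary labels (1 = high risk, 0 = low risk)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "f1": f1}


# Made-up labels for eight regions (illustration only).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Because F1 ignores true negatives, it can differ noticeably from accuracy when the two classes are imbalanced, which is one reason to report both.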


2021 ◽  
pp. 1-9
Author(s):  
Tao Yang ◽  
Gulnar Yerkinkyzy ◽  
Knut Uleberg ◽  
Ibnu Hafidz Arief

Summary. In a recent paper, we published a machine learning method to quantitatively predict reservoir fluid gas/oil ratio (GOR) from advanced mud gas (AMG) data. The significant increase in model accuracy compared to traditional modeling approaches makes it possible to estimate reservoir fluid GOR from AMG data while drilling, before the wireline operation. This approach has clear advantages because of early access, low cost, and a continuous reservoir fluid GOR for all reservoir zones. This paper presents further results on predicting other reservoir fluid properties in addition to GOR, which are essential for geo-operations, field development plans, and production optimization. Two approaches were selected to predict other reservoir fluid properties. In the first approach, illustrated by the reservoir fluid density example, we developed machine learning models for individual reservoir fluid properties, similar to the GOR prediction approach in the previous paper. In the second approach, instead of developing many machine learning models for individual reservoir fluid properties, we investigated the essential properties for equation of state (EOS) fluid characterization: C6 and C7+ composition and the molecular weight and density of the C7+ fraction. Once these properties are in place, the entire spectrum of reservoir fluid properties can be calculated with the EOS model. The results of reservoir fluid property prediction are satisfactory with both approaches. The reservoir oil density prediction has a mean absolute error (MAE) of 0.039 g/cm³. The accuracy is similar to that of the typical density derived from the pressure gradient in wireline logging data. For the essential fluid properties required for EOS model prediction, the overall accuracy is lower than that of laboratory measurements but acceptable for early-phase estimates.
The reservoir fluid properties predicted from the EOS model are similar to the predictions from individual machine learning models. We applied field-measured AMG data to the reservoir fluid property models and achieved good results, as illustrated by the reservoir fluid density example. The previous paper completed the methodology to predict all reservoir fluid properties based on AMG data. This work paves the way to generate a complete reservoir fluid log for all relevant reservoir fluid properties while drilling. The method has a significant business impact, providing full coverage of reservoir fluid properties along the well path in the early drilling phase. The advantage of providing reservoir fluid properties in all reservoir zones while drilling far outweighs the limitation of somewhat reduced reservoir fluid property accuracy.


Author(s):  
Bin Zheng ◽  
Yuchen Qiu ◽  
Faranak Aghaei ◽  
Seyedehnafiseh Mirniaharikandehei ◽  
Morteza Heidari ◽  
...  

Abstract. To develop precision or personalized medicine, identifying new quantitative imaging markers and building machine learning models to predict cancer risk and prognosis have recently attracted broad research interest. Most of these approaches use concepts similar to those of conventional computer-aided detection schemes for medical images, which include detecting and segmenting suspicious regions or tumors, followed by training machine learning models on a fusion of multiple image features computed from the segmented regions or tumors. However, due to the heterogeneity and boundary fuzziness of the suspicious regions or tumors, segmenting subtle regions is often difficult and unreliable. Additionally, ignoring global and/or background parenchymal tissue characteristics may also be a limitation of the conventional approaches. In our recent studies, we investigated the feasibility of developing new computer-aided schemes implemented with machine learning models that are trained on global image features to predict cancer risk and prognosis. We trained and tested several models using images obtained from full-field digital mammography, magnetic resonance imaging, and computed tomography of breast, lung, and ovarian cancers. Study results showed that many of these new models yielded higher performance than other approaches used in current clinical practice. Furthermore, the computed global image features also contain information complementary to the features computed from the segmented regions or tumors in predicting cancer prognosis. Therefore, the global image features can be used alone to develop new case-based prediction models or can be added to current tumor-based models to increase their discriminatory power.


2019 ◽  
Vol 11 (2) ◽  
pp. 128
Author(s):  
Pham Hoa ◽  
Nguyen Giang ◽  
Nguyen Binh ◽  
Le Hai ◽  
Tien-Dat Pham ◽  
...  

Soil salinity caused by climate change and the associated sea level rise is considered one of the most severe natural hazards, with a negative effect on agricultural activities in coastal areas in most tropical climates. This issue has become more severe and more frequent in the Mekong River Delta of Vietnam. The main objective of this work is to map soil salinity intrusion in Ben Tre province, located in the Mekong River Delta of Vietnam, using Sentinel-1 Synthetic Aperture Radar (SAR) C-band data combined with five state-of-the-art machine learning models: Multilayer Perceptron Neural Networks (MLP-NN), Radial Basis Function Neural Networks (RBF-NN), Gaussian Processes (GP), Support Vector Regression (SVR), and Random Forests (RF). For this purpose, 63 soil samples were collected during a field survey conducted on 4–6 April 2018, corresponding to the Sentinel-1 SAR imagery. The performance of the five models was assessed and compared using the root-mean-square error (RMSE), the mean absolute error (MAE), and the correlation coefficient (r). The results revealed that the GP model yielded the highest prediction performance (RMSE = 2.885, MAE = 1.897, and r = 0.808) and outperformed the other machine learning models. We conclude that advanced machine learning models can be used for mapping soil salinity in delta areas, thus providing a useful tool for assisting farmers and policy makers in choosing better crop types in the context of climate change.
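The three evaluation measures named in this abstract (RMSE, MAE, and the correlation coefficient r) have standard definitions, sketched below in plain Python. The function names and the observed/predicted values are made up for illustration, and r is assumed to be the Pearson correlation coefficient; none of this is taken from the study itself.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between paired observations and predictions."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def pearson_r(observed, predicted):
    """Pearson correlation coefficient (assumed meaning of r here)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Made-up salinity values (illustration only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]

scores = {
    "rmse": rmse(observed, predicted),
    "mae": mae(observed, predicted),
    "r": pearson_r(observed, predicted),
}
```

RMSE penalizes large errors more heavily than MAE, while r measures only how well predictions track the observations linearly, so a model can have a high r and still be biased.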


Author(s):  
Fatemeh Abedi ◽  
Alireza Amirian‐Chakan ◽  
Mohammad Faraji ◽  
Ruhollah Taghizadeh‐Mehrjardi ◽  
Ruth Kerry ◽  
...  

Author(s):  
P. M. Maier ◽  
S. Keller

Abstract. Water is a key component of life, the natural environment and human health. For monitoring the conditions of a water body, the chlorophyll a concentration can serve as a proxy for nutrients and oxygen supply. In situ measurements of water quality parameters are often time-consuming, expensive and limited in areal validity. Therefore, we apply remote sensing techniques. During field campaigns, we collected hyperspectral data with a spectrometer and measured chlorophyll a concentrations in situ in 13 inland water bodies with different spectral characteristics. One objective of this study is to estimate chlorophyll a concentrations of these inland waters by applying three machine learning regression models: Random Forest, Support Vector Machine and an Artificial Neural Network. Additionally, we simulate four different hyperspectral resolutions of the spectrometer data to investigate the effects on the estimation performance. Furthermore, the application of first-order derivatives of the spectra is evaluated with respect to the regression performance. This study reveals the potential of combining machine learning approaches and remote sensing data for inland waters. Each machine learning model achieves an R² score between 80% and 90% for the regression on chlorophyll a concentrations. The random forest model benefits clearly from the applied derivatives of the spectra. In further studies, we will focus on the application of machine learning models to spectral satellite data to enhance the area-wide estimation of chlorophyll a concentration for inland waters.
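The two spectral preprocessing ideas mentioned in this abstract, first-order derivatives and simulated coarser spectral resolutions, can be illustrated with a small sketch. The abstract does not say how either step was implemented, so the forward finite difference and the simple band averaging below are assumptions, and the function names and sample values are hypothetical.

```python
def first_derivative(wavelengths, reflectance):
    """Forward finite difference d(reflectance)/d(wavelength) between bands.

    Returns one value fewer than the number of input bands.
    """
    return [
        (reflectance[i + 1] - reflectance[i]) / (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(reflectance) - 1)
    ]


def bin_average(values, factor):
    """Simulate a coarser resolution by averaging groups of `factor` bands.

    Leftover bands that do not fill a complete group are dropped
    (an assumption made for simplicity).
    """
    n_groups = len(values) // factor
    return [
        sum(values[g * factor:(g + 1) * factor]) / factor
        for g in range(n_groups)
    ]


# Made-up spectrum: six bands at 2 nm spacing (illustration only).
wavelengths = [400.0, 402.0, 404.0, 406.0, 408.0, 410.0]
reflectance = [0.10, 0.12, 0.16, 0.18, 0.18, 0.14]

derivative = first_derivative(wavelengths, reflectance)
coarse = bin_average(reflectance, 2)  # three bands at half the resolution
```

Either output can then be fed to a regression model in place of the raw spectrum; derivatives emphasize changes in spectral shape, while binning trades spectral detail for fewer, less noisy input variables.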


2020 ◽  
Vol 2 (1) ◽  
pp. 3-6
Author(s):  
Eric Holloway

Imagination Sampling is the usage of a person as an oracle for generating or improving machine learning models. Previous work demonstrated a general system for using Imagination Sampling for obtaining multibox models. Here, the possibility of importing such models as the starting point for further automatic enhancement is explored.

