Estimation of Salinity Content in Different Saline-Alkali Zones Based on Machine Learning Model Using FOD Pretreatment Method

Chengbiao Fu; Anhong Tian; Daming Zhu; Junsan Zhao; Heigang Xiong

doi:10.3390/rs13245140

Remote Sensing ◽

10.3390/rs13245140 ◽

2021 ◽

Vol 13 (24) ◽

pp. 5140

Author(s):

Chengbiao Fu ◽

Anhong Tian ◽

Daming Zhu ◽

Junsan Zhao ◽

Heigang Xiong

Keyword(s):

Machine Learning ◽

Fractional Order ◽

Soil Salinity ◽

Significance Test ◽

Hyperspectral Data ◽

Learning Models ◽

Estimation Performance ◽

Study Results ◽

Sample Points ◽

Machine Learning Models

Soil salinization is a global ecological and environmental problem in arid and semi-arid areas that can be ameliorated via soil management, and visible-near infrared-shortwave infrared (VNIR-SWIR) spectroscopy can be adapted to rapidly monitor soil salinity content. This study explored the potential of the Grünwald–Letnikov fractional-order derivative (FOD), feature band selection methods, nonlinear partial least squares regression (PLSR), and four machine learning models to estimate soil salinity content from VNIR-SWIR spectra. Ninety sample points were field scanned with VNIR-SWIR, and soil samples (0–20 cm) were collected at the time of scanning. The sample points came from three zones representing different intensities of human interference (Zones I, II, and III) in Fukang, Xinjiang, China. Each zone contained thirty sample points. For modeling, we first adopted FOD (orders from 0 to 2 at intervals of 0.1) as a preprocessing method for the soil hyperspectral data. Then, four sets of spectral bands (R-FOD-FULL, the full band range; R-FOD-CC5, bands that passed a 0.05 significance test; R-FOD-CC1, bands that passed a 0.01 significance test; and R-FOD-CC1-CARS, CC1 combined with competitive adaptive reweighted sampling) were selected as spectral input variables to develop the estimation model. Finally, four machine learning models, namely generalized regression neural network (GRNN), extreme learning machine (ELM), random forest (RF), and PLSR, were used to estimate soil salinity. Study results showed that (1) the heat map of the correlation coefficient matrix between hyperspectral data and salinity indicated that FOD significantly improved the correlation. (2) The characteristic band variables extracted by R-FOD-CC1 were fewer in number, and the redundancy between bands was smaller, than for R-FOD-FULL and R-FOD-CC5; thus the estimation accuracy of R-FOD-CC1 was higher than that of R-FOD-CC5 or R-FOD-FULL. High prediction accuracy was achieved with a less complex calculation.
(3) Overall, the GRNN model yielded the best salinity estimation in all three zones compared with ELM, BPNN, RF, and PLSR, whereas the RF model performed worst. The R-FOD-CC1-CARS-GRNN model yielded the best salinity estimation in Zone I, with R², RMSE, and RPD of 0.7784, 1.8762, and 2.0568, respectively. The fractional order was 1.5, and the estimation performance was great. The optimal model for predicting soil salinity in Zones II and III was also R-FOD-CC1-CARS-GRNN (R² = 0.7912, RMSE = 3.4001, and RPD = 1.8985 in Zone II; R² = 0.8192, RMSE = 6.6260, and RPD = 1.8190 in Zone III), with fractional orders of 1.7 and 1.6, respectively, and the estimation performance was fine in both zones. (4) The best models in Zones I, II, and III selected 8, 9, and 11 characteristic bands, respectively, which account for 0.45%, 0.51%, and 0.63% of the full bands. This approach reduces the number of modeled band variables and simplifies the model structure.
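
As a rough illustration of the FOD preprocessing step described in the abstract above, the following is a minimal sketch of a Grünwald–Letnikov fractional-order derivative applied to a single spectrum. It is not the authors' implementation: the function names, the unit band spacing, and the toy reflectance values are all assumptions made for the example. The weights follow the standard recurrence w_0 = 1, w_k = w_(k-1) * (1 - (order + 1) / k), so orders 0, 1, and 2 reduce to the identity, the first difference, and the second difference.

```python
from typing import Dict, List, Sequence


def gl_coefficients(order: float, n_terms: int) -> List[float]:
    """Grünwald-Letnikov weights w_k = (-1)^k * binom(order, k), built recursively."""
    weights = [1.0]
    for k in range(1, n_terms):
        weights.append(weights[-1] * (1.0 - (order + 1.0) / k))
    return weights


def gl_fractional_derivative(
    spectrum: Sequence[float], order: float, step: float = 1.0
) -> List[float]:
    """Apply a GL derivative of the given order along the band axis.

    `step` is the band spacing (assumed to be 1 here); each output value
    uses all bands at or before its own position.
    """
    weights = gl_coefficients(order, len(spectrum))
    scale = step ** (-order)
    result = []
    for i in range(len(spectrum)):
        acc = 0.0
        for k in range(i + 1):
            acc += weights[k] * spectrum[i - k]
        result.append(scale * acc)
    return result


# Hypothetical reflectance values for five adjacent bands.
reflectance = [0.10, 0.12, 0.15, 0.19, 0.24]

# Orders 0.0, 0.1, ..., 2.0, matching the 0-2 range and 0.1 interval in the abstract.
orders = [round(0.1 * j, 1) for j in range(21)]
derivatives: Dict[float, List[float]] = {
    order: gl_fractional_derivative(reflectance, order) for order in orders
}
```

Each entry of `derivatives` holds one transformed spectrum per order; in a workflow like the one described, each of these would then be correlated with measured salinity before band selection.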

Detecting Arsenic Contamination Using Satellite Imagery and Machine Learning

Toxics ◽

10.3390/toxics9120333 ◽

2021 ◽

Vol 9 (12) ◽

pp. 333

Author(s):

Ayush Agrawal ◽

Mark R. Petersen

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Mean Squared Error ◽

Binary Classification ◽

Arsenic Concentration ◽

Arsenic Contamination ◽

Hyperspectral Data ◽

Detection Methods ◽

Learning Models ◽

Machine Learning Models

Arsenic, a potent carcinogen and neurotoxin, affects over 200 million people globally. Current detection methods are laborious, expensive, and unscalable, and they are difficult to implement in developing regions and during crises such as COVID-19. This study attempts to determine whether a relationship exists between soil hyperspectral data and arsenic concentration using NASA’s Hyperion satellite. It is the first arsenic study to use satellite-based hyperspectral data and to apply a classification approach. Four regression machine learning models are tested to determine this correlation in soil with bare land cover. Raw data are converted to reflectance, problematic atmospheric influences are removed, characteristic wavelengths are selected, and four noise reduction algorithms are tested. The combination of data augmentation, Genetic Algorithm, Second Derivative Transformation, and Random Forest regression (R² = 0.840; normalized root mean squared error, re-scaled to [0,1], = 0.122) shows strong correlation, performing better than past models despite using noisier satellite data (versus lab-processed samples). Three binary classification machine learning models are then applied to identify high-risk shrub-covered regions in ten U.S. states, achieving an accuracy of 0.693 and an F1-score of 0.728. Overall, these results suggest that such a methodology is practical and can provide a sustainable alternative for arsenic contamination detection.
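
For readers unfamiliar with the two classification scores quoted above, here is a minimal sketch of how accuracy and F1-score are computed for a binary task. The function name and the toy labels are hypothetical and are not taken from the study; label 1 stands for a "high-risk" sample purely for illustration.

```python
from typing import Dict, Sequence


def binary_scores(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, precision, recall, and F1 for labels coded as 0/1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = high-risk, 0 = low-risk.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
scores = binary_scores(y_true, y_pred)
```

Because F1 ignores true negatives, it can sit above or below accuracy depending on class balance, which is why abstracts such as this one report both.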

Predicting Reservoir Fluid Properties from Advanced Mud Gas Data

SPE Reservoir Evaluation & Engineering ◽

10.2118/201635-pa ◽

2021 ◽

pp. 1-9

Author(s):

Tao Yang ◽

Gulnar Yerkinkyzy ◽

Knut Uleberg ◽

Ibnu Hafidz Arief

Keyword(s):

Machine Learning ◽

Production Optimization ◽

Fluid Density ◽

Learning Models ◽

Fluid Property ◽

Reservoir Fluid ◽

Study Results ◽

Fluid Properties ◽

Prediction Approach ◽

Machine Learning Models

Summary: In a recent paper, we published a machine learning method to quantitatively predict reservoir fluid gas/oil ratio (GOR) from advanced mud gas (AMG) data. The significant increase in model accuracy compared with traditional modeling approaches makes it possible to estimate reservoir fluid GOR from AMG data while drilling, before the wireline operation. This approach has clear advantages because of early access, low cost, and a continuous reservoir fluid GOR for all reservoir zones. This paper releases further study results to predict other reservoir fluid properties in addition to GOR, which is essential for geo-operations, field development plans, and production optimization. Two approaches were selected to predict other reservoir fluid properties. For the first approach, as illustrated by the reservoir fluid density example, we developed machine learning models for individual reservoir fluid properties, similar to the GOR prediction approach in the previous paper. For the second approach, instead of developing many machine learning models for individual reservoir fluid properties, we investigated the essential properties for equation of state (EOS) fluid characterization: C6 and C7+ composition and the molecular weight and density of the C7+ fraction. Once these properties are in place, the entire spectrum of reservoir fluid properties can be calculated with the EOS model. The results of reservoir fluid property prediction are satisfactory with both approaches. The reservoir oil density prediction has a mean absolute error (MAE) of 0.039 g/cm³. This accuracy is similar to that of the typical density derived from the pressure gradient in wireline logging data. For the essential fluid properties required for EOS model prediction, the overall accuracy is lower than that of laboratory measurements but acceptable for early-phase estimation.
The reservoir fluid properties predicted from the EOS model are similar to the predictions from individual machine learning models. We applied the field measured AMG data into the reservoir fluid property models and achieved good results, as illustrated by the reservoir fluid density example. The previous paper completed the methodology to predict all reservoir fluid properties based on AMG data. This work paves the way to generate a complete reservoir fluid log for all relevant reservoir fluid properties while drilling. The method has a significant business impact, providing full coverage of reservoir fluid properties along the well path in the early drilling phase. The advantage of providing reservoir fluid properties in all reservoir zones while drilling far outweighs the limitation of somewhat reduced reservoir fluid property accuracy.

Developing global image feature analysis models to predict cancer risk and prognosis

Visual Computing for Industry Biomedicine and Art ◽

10.1186/s42492-019-0026-5 ◽

2019 ◽

Vol 2 (1) ◽

Author(s):

Bin Zheng ◽

Yuchen Qiu ◽

Faranak Aghaei ◽

Seyedehnafiseh Mirniaharikandehei ◽

Morteza Heidari ◽

...

Keyword(s):

Machine Learning ◽

Cancer Risk ◽

Prediction Models ◽

Image Features ◽

Image Feature ◽

Learning Models ◽

Study Results ◽

Computer Aided ◽

Predict Cancer Risk ◽

Machine Learning Models

Abstract: To develop precision or personalized medicine, identifying new quantitative imaging markers and building machine learning models to predict cancer risk and prognosis has recently attracted broad research interest. Most of these research approaches use concepts similar to those of conventional computer-aided detection schemes for medical images, which include detecting and segmenting suspicious regions or tumors, followed by training machine learning models based on the fusion of multiple image features computed from the segmented regions or tumors. However, due to the heterogeneity and boundary fuzziness of the suspicious regions or tumors, segmenting subtle regions is often difficult and unreliable. Additionally, ignoring global and/or background parenchymal tissue characteristics may also be a limitation of the conventional approaches. In our recent studies, we investigated the feasibility of developing new computer-aided schemes implemented with machine learning models that are trained on global image features to predict cancer risk and prognosis. We trained and tested several models using images obtained from full-field digital mammography, magnetic resonance imaging, and computed tomography of breast, lung, and ovarian cancers. Study results showed that many of these new models yielded higher performance than other approaches used in current clinical practice. Furthermore, the computed global image features also contain information complementary to the features computed from the segmented regions or tumors in predicting cancer prognosis. Therefore, the global image features can be used alone to develop new case-based prediction models or can be added to current tumor-based models to increase their discriminatory power.

Soil Salinity Mapping Using SAR Sentinel-1 Data and Advanced Machine Learning Algorithms: A Case Study at Ben Tre Province of the Mekong River Delta (Vietnam)

Remote Sensing ◽

10.3390/rs11020128 ◽

2019 ◽

Vol 11 (2) ◽

pp. 128 ◽

Cited By ~ 16

Author(s):

Pham Hoa ◽

Nguyen Giang ◽

Nguyen Binh ◽

Le Hai ◽

Tien-Dat Pham ◽

...

Keyword(s):

Climate Change ◽

Machine Learning ◽

Neural Networks ◽

Soil Salinity ◽

River Delta ◽

Mekong River ◽

Support Vector ◽

Learning Models ◽

Mekong River Delta ◽

Machine Learning Models

Soil salinity caused by climate change and the associated rise in sea level is considered one of the most severe natural hazards, with a negative effect on agricultural activities in the coastal areas of most tropical climates. This issue has become more severe and more frequent in the Mekong River Delta of Vietnam. The main objective of this work is to map soil salinity intrusion in Ben Tre province, located in the Mekong River Delta of Vietnam, using Sentinel-1 Synthetic Aperture Radar (SAR) C-band data combined with five state-of-the-art machine learning models: Multilayer Perceptron Neural Networks (MLP-NN), Radial Basis Function Neural Networks (RBF-NN), Gaussian Processes (GP), Support Vector Regression (SVR), and Random Forests (RF). For this purpose, 63 soil samples were collected during a field survey conducted on 4–6 April 2018, corresponding to the Sentinel-1 SAR imagery. The performance of the five models was assessed and compared using the root-mean-square error (RMSE), the mean absolute error (MAE), and the correlation coefficient (r). The results revealed that the GP model yielded the highest prediction performance (RMSE = 2.885, MAE = 1.897, and r = 0.808) and outperformed the other machine learning models. We conclude that advanced machine learning models can be used for mapping soil salinity in delta areas, thus providing a useful tool for assisting farmers and policy makers in choosing better crop types in the context of climate change.
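
The three evaluation measures named in the abstract above (RMSE, MAE, and the correlation coefficient r) can be sketched as follows. This is only an illustration of the standard definitions; the function names and the observed and predicted values are invented for the example and do not come from the study.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-square error between paired observations and predictions."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between paired observations and predictions."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


def pearson_r(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Pearson correlation coefficient; raises if either series is constant."""
    mean_o = sum(observed) / len(observed)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0.0 or var_p == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_o * var_p)


# Hypothetical salinity values (same units for both series).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

model_rmse = rmse(observed, predicted)
model_mae = mae(observed, predicted)
model_r = pearson_r(observed, predicted)
```

RMSE penalizes large errors more heavily than MAE, so reporting both, as the abstract does, gives a sense of whether a model's error is dominated by a few poorly predicted samples.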

Salt dome related soil salinity in southern Iran: Prediction and mapping with averaging machine learning models

Land Degradation and Development ◽

10.1002/ldr.3811 ◽

2020 ◽

Author(s):

Fatemeh Abedi ◽

Alireza Amirian‐Chakan ◽

Mohammad Faraji ◽

Ruhollah Taghizadeh‐Mehrjardi ◽

Ruth Kerry ◽

...

Keyword(s):

Machine Learning ◽

Soil Salinity ◽

Salt Dome ◽

Learning Models ◽

Southern Iran ◽

Machine Learning Models

ESTIMATING CHLOROPHYLL A CONCENTRATIONS OF SEVERAL INLAND WATERS WITH HYPERSPECTRAL DATA AND MACHINE LEARNING MODELS

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-iv-2-w5-609-2019 ◽

2019 ◽

Vol IV-2/W5 ◽

pp. 609-614 ◽

Cited By ~ 3

Author(s):

P. M. Maier ◽

S. Keller

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Random Forest ◽

Chlorophyll A ◽

Hyperspectral Data ◽

Inland Waters ◽

Learning Models ◽

Derivatives Of ◽

Machine Learning Models

Abstract. Water is a key component of life, the natural environment and human health. For monitoring the conditions of a water body, the chlorophyll a concentration can serve as a proxy for nutrients and oxygen supply. In situ measurements of water quality parameters are often time-consuming, expensive and limited in areal validity. Therefore, we apply remote sensing techniques. During field campaigns, we collected hyperspectral data with a spectrometer and measured chlorophyll a concentrations in situ for 13 inland water bodies with different spectral characteristics. One objective of this study is to estimate chlorophyll a concentrations of these inland waters by applying three machine learning regression models: Random Forest, Support Vector Machine and an Artificial Neural Network. Additionally, we simulate four different hyperspectral resolutions of the spectrometer data to investigate the effects on the estimation performance. Furthermore, the effect of applying first-order derivatives of the spectra on the regression performance is evaluated. This study reveals the potential of combining machine learning approaches and remote sensing data for inland waters. Each machine learning model achieves an R² score between 80% and 90% for the regression on chlorophyll a concentrations. The random forest model benefits clearly from the applied derivatives of the spectra. In further studies, we will focus on the application of machine learning models to spectral satellite data to enhance the area-wide estimation of chlorophyll a concentration for inland waters.
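
The two spectral preprocessing ideas mentioned in the abstract above, simulating a coarser spectral resolution and taking first-order derivatives of the spectra, can be sketched as below. The abstract does not say how the resolutions were simulated, so the block-averaging used here is an assumption, as are the function names and the toy spectrum; the derivative is a plain finite difference between adjacent bands.

```python
from typing import List, Sequence, Tuple


def resample_by_averaging(
    wavelengths: Sequence[float], reflectance: Sequence[float], factor: int
) -> Tuple[List[float], List[float]]:
    """Average every `factor` adjacent bands (assumed resampling scheme).

    A trailing group with fewer than `factor` bands is dropped.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    coarse_wl: List[float] = []
    coarse_refl: List[float] = []
    for start in range(0, len(wavelengths) - factor + 1, factor):
        stop = start + factor
        coarse_wl.append(sum(wavelengths[start:stop]) / factor)
        coarse_refl.append(sum(reflectance[start:stop]) / factor)
    return coarse_wl, coarse_refl


def first_derivative(
    wavelengths: Sequence[float], reflectance: Sequence[float]
) -> List[float]:
    """Finite-difference slope between each pair of adjacent bands."""
    return [
        (reflectance[i + 1] - reflectance[i]) / (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(reflectance) - 1)
    ]


# Hypothetical spectrum sampled every 2 nm.
wavelengths = [400.0, 402.0, 404.0, 406.0, 408.0, 410.0, 412.0, 414.0]
reflectance = [0.10, 0.12, 0.14, 0.16, 0.20, 0.24, 0.22, 0.20]

coarse_wl, coarse_refl = resample_by_averaging(wavelengths, reflectance, factor=2)
coarse_slope = first_derivative(coarse_wl, coarse_refl)
```

Running the same regression on `reflectance`, `coarse_refl`, and `coarse_slope` is one simple way to compare how resolution and derivative features affect estimation performance.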

Atmospheric Correction of Hyperspectral Data Over Coastal Waters Based on Machine Learning Models

2021 11th Workshop on Hyperspectral Imaging and Signal Processing: Evolution in Remote Sensing (WHISPERS) ◽

10.1109/whispers52202.2021.9483999 ◽

2021 ◽

Author(s):

Ole Martin Borge ◽

Sivert Bakken ◽

Tor A. Johansen

Keyword(s):

Machine Learning ◽

Coastal Waters ◽

Atmospheric Correction ◽

Hyperspectral Data ◽

Learning Models ◽

Machine Learning Models

Improving XGBoost with Imagination Sampling

Communications of the Blyth Institute ◽

10.33014/issn.2640-5652.2.1.holloway.1 ◽

2020 ◽

Vol 2 (1) ◽

pp. 3-6

Author(s):

Eric Holloway

Keyword(s):

Machine Learning ◽

General System ◽

Learning Models ◽

Starting Point ◽

Machine Learning Models

Imagination Sampling is the usage of a person as an oracle for generating or improving machine learning models. Previous work demonstrated a general system for using Imagination Sampling for obtaining multibox models. Here, the possibility of importing such models as the starting point for further automatic enhancement is explored.

Development of Machine Learning Models to Predict Student Performance in Computer Literacy Courses

International Review on Computers and Software (IRECOS) ◽

10.15866/irecos.v13i1.16863 ◽

2018 ◽

Vol 13 (1) ◽

pp. 21

Author(s):

George Anderson ◽

Oduronke T. Eyitayo

Keyword(s):

Machine Learning ◽

Student Performance ◽

Computer Literacy ◽

Learning Models ◽

Machine Learning Models

Experimental Comparison of Machine Learning Models in Malware Packing Detection

2020 21st Asia-Pacific Network Operations and Management Symposium (APNOMS) ◽

10.23919/apnoms50412.2020.9237007 ◽

2020 ◽

Author(s):

Jong-Wouk Kim ◽

Juhong Namgung ◽

Yang-Sae Moon ◽

Mi-Jung Choi

Keyword(s):

Machine Learning ◽

Experimental Comparison ◽

Learning Models ◽

Machine Learning Models
