Detection of Potassium Deficiency and Momentary Transpiration Rate Estimation at Early Growth Stages Using Proximal Hyperspectral Imaging and Extreme Gradient Boosting

Shahar Weksler; Offer Rozenstein; Nadav Haish; Menachem Moshelion; Rony Wallach; Eyal Ben-Dor

doi:10.3390/s21030958

Sensors ◽

10.3390/s21030958 ◽

2021 ◽

Vol 21 (3) ◽

pp. 958

Author(s):

Shahar Weksler ◽

Offer Rozenstein ◽

Nadav Haish ◽

Menachem Moshelion ◽

Rony Wallach ◽

...

Keyword(s):

Crop Yield ◽

Transpiration Rate ◽

Learning Algorithm ◽

Irrigation Management ◽

Stress Factors ◽

Spectral Information ◽

Gradient Boosting ◽

Growth Stages ◽

Ambient Conditions ◽

Extreme Gradient Boosting

Potassium is a macro element in plants that is typically supplied to crops in excess throughout the season to avoid a deficit leading to reduced crop yield. Transpiration rate is a momentary physiological attribute that is indicative of soil water content, the plant’s water requirements, and abiotic stress factors. In this study, two systems were combined to create a hyperspectral–physiological plant database for classification of potassium treatments (low, medium, and high) and estimation of momentary transpiration rate from hyperspectral images. PlantArray 3.0 was used to control fertigation, log ambient conditions, and calculate transpiration rates. In addition, a semi-automated platform carrying a hyperspectral camera was triggered every hour to capture images of a large array of pepper plants. The combined attributes and spectral information on an hourly basis were used to classify plants into their given potassium treatments (average accuracy = 80%) and to estimate transpiration rate (RMSE = 0.025 g/min, R² = 0.75) using the advanced ensemble learning algorithm XGBoost (extreme gradient boosting). Although potassium has no direct spectral absorption features, the classification results demonstrated the ability to label plants according to potassium treatments based on a remotely measured hyperspectral signal. The ability to estimate transpiration rates for different potassium applications using spectral information can aid in irrigation management and crop yield optimization. These combined results are important for decision-making during the growing season, and particularly at the early stages when potassium levels can still be corrected to prevent yield loss.
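
The two regression metrics reported above (RMSE and R²) can be illustrated with a minimal sketch. The function names and the sample values below are hypothetical and are not taken from the study; the code only shows how the two figures are conventionally computed.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and estimated values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical transpiration rates in g/min (illustrative, not study data).
measured = [0.10, 0.15, 0.20, 0.25, 0.30]
estimated = [0.12, 0.14, 0.21, 0.22, 0.31]

print(f"RMSE = {rmse(measured, estimated):.3f} g/min")
print(f"R^2  = {r_squared(measured, estimated):.2f}")
```

RMSE carries the unit of the target (here g/min), whereas R² is unitless, which is why abstracts of this kind usually report both.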

Detection and Identification of Organic Pollutants in Drinking Water from Fluorescence Spectra Based on Deep Learning Using Convolutional Autoencoder

Water ◽

10.3390/w13192633 ◽

2021 ◽

Vol 13 (19) ◽

pp. 2633

Author(s):

Jie Yu ◽

Yitong Cao ◽

Fei Shi ◽

Jiegen Shi ◽

Dibo Hou ◽

...

Keyword(s):

Drinking Water ◽

Deep Learning ◽

Fluorescence Spectroscopy ◽

Organic Pollutants ◽

Learning Algorithm ◽

Three Dimensional ◽

Gradient Boosting ◽

Spectral Processing ◽

Extreme Gradient Boosting ◽

Convolutional Autoencoder

Three-dimensional fluorescence spectroscopy has become increasingly useful in the detection of organic pollutants. However, this approach is limited by decreased accuracy in identifying low-concentration pollutants. In this research, a new identification method for organic pollutants in drinking water is therefore proposed, using three-dimensional fluorescence spectroscopy data and a deep learning algorithm. A novel application of a convolutional autoencoder was designed to process high-dimensional fluorescence data and extract multi-scale features from the spectra of drinking water samples containing organic pollutants. Extreme Gradient Boosting (XGBoost), an implementation of gradient-boosted decision trees, was used to identify the organic pollutants based on the obtained features. The identification performance of the method was validated on three typical organic pollutants at different concentrations in an accidental pollution scenario. Results showed that the proposed method achieved increased accuracy for both high-concentration (>10 μg/L) and low-concentration (≤10 μg/L) pollutant samples. Compared to traditional spectrum processing techniques, the convolutional autoencoder-based approach extracted more detailed features from fluorescence spectral data. Moreover, evidence indicated that the proposed method maintained its detection ability when the background water changed. It can effectively reduce the rate of misjudgments associated with fluctuations in drinking water quality. This study demonstrates the possibility of using deep learning algorithms for spectral processing and contamination detection in drinking water.

Prediction of West Nile Virus using Ensemble Classifiers

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a9810.109119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 3744-3749

Keyword(s):

West Nile Virus ◽

Random Forest ◽

Learning Algorithm ◽

Traditional Approach ◽

The United States ◽

Gradient Boosting ◽

Ensemble Classifiers ◽

Human Beings ◽

West Nile ◽

Extreme Gradient Boosting

West Nile Virus (WNV) is a mosquito-borne disease that infects human beings through the mosquito’s bite. The disease is considered a serious threat to society, especially in the United States, where it is frequently found in localities with water bodies. The traditional approach is to collect mosquito traps from a locality and check whether the mosquitoes are infected with the virus. If the virus is found, the locality is sprayed with pesticides. This process is very time-consuming and requires substantial financial support. Machine learning methods can provide an efficient way to predict the presence of the virus in a locality using data on location and weather. This paper uses a dataset available on Kaggle, which includes information on the traps found in the locality and on the locality’s weather. The dataset is imbalanced, so the Synthetic Minority Over-sampling Technique (SMOTE), an upsampling method, is used to balance it. Ensemble learning classifiers such as random forest, gradient boosting and Extreme Gradient Boosting (XGB) are then applied. Their performance is compared with that of SVM, taken as the best supervised learning algorithm. Among the models, XGB gave the highest F-1 score of 92.93, performing marginally better than random forest (92.78) and SVM (91.16).
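
The core idea behind SMOTE, as mentioned in the abstract above, is to create synthetic minority-class samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below is a simplified standard-library illustration of that idea; the function name, the parameter `k`, and the sample points are assumptions made for this example and do not reproduce the paper's pipeline or any particular library's implementation.

```python
import math
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Hypothetical 2-D minority-class samples (e.g. scaled weather features).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=4)
print(new_points)
```

Because each synthetic point lies on a line segment between two real minority samples, the new points stay inside the region already occupied by the minority class rather than being exact duplicates.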

Evaluation of Total Nitrogen in Water via Airborne Hyperspectral Data: Potential of Fractional Order Discretization Algorithm and Discrete Wavelet Transform Analysis

Remote Sensing ◽

10.3390/rs13224643 ◽

2021 ◽

Vol 13 (22) ◽

pp. 4643

Author(s):

Jinhua Liu ◽

Jianli Ding ◽

Xiangyu Ge ◽

Jingzhe Wang

Keyword(s):

Water Quality ◽

Wavelet Transform ◽

Discrete Wavelet Transform ◽

Fractional Order ◽

Total Nitrogen ◽

Hyperspectral Data ◽

Spectral Information ◽

Gradient Boosting ◽

Discrete Wavelet ◽

Extreme Gradient Boosting

Controlling and managing surface source pollution depends on the rapid monitoring of total nitrogen (TN) in water. However, the complex factors affecting water quality (plant shading and suspended matter in water) make direct estimation extremely challenging. Considering the spectral response mechanisms of emergent plants, we coupled discrete wavelet transform (DWT) and fractional order discretization (FOD) techniques with three machine learning models (random forest (RF), bagging algorithm (bagging), and eXtreme Gradient Boosting (XGBoost)) to mine this potential spectral information. A total of 567 models were developed, and airborne hyperspectral data processed with various DWT scales and FOD techniques were compared. The effective information in the hyperspectral reflectance data was better emphasized after DWT processing. After DWT processing of the original spectrum (OR), its sensitivity to TN in water was maximally improved by 0.22, and the correlation between FOD and TN in water was optimally increased by 0.57. The transformed spectral information enhanced the TN model accuracy, especially for FOD after DWT. For RF, 82% of the model R² values improved by 0.02~0.72 compared to the model using FOD spectra; 78.8% of the bagging values improved by 0.01~0.53 and 65.0% of the XGBoost values improved by 0.01~0.64. The XGBoost model with DWT coupled with grey relation analysis (GRA) yielded the best estimation accuracy, with the highest precision of R² = 0.91 for L6. In conclusion, appropriately scaled DWT analysis can substantially improve the accuracy of extracting TN from UAV hyperspectral images. These outcomes may facilitate the further development of accurate water quality monitoring in sophisticated global waters from drone or satellite hyperspectral data.
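
As a rough illustration of what one DWT decomposition level does to a spectrum, the sketch below applies a single level of the Haar wavelet and then inverts it. The abstract does not state which wavelet family was used, so the Haar choice, the function names, and the reflectance values are all assumptions made for this example.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / s for i in pairs]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the original signal."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

# Hypothetical reflectance values across eight adjacent bands.
reflectance = [0.10, 0.12, 0.30, 0.28, 0.45, 0.47, 0.20, 0.18]
approx, detail = haar_dwt(reflectance)
restored = haar_idwt(approx, detail)
print(approx)
print(detail)
```

The approximation coefficients keep the smooth shape of the spectrum at half the resolution, while the detail coefficients capture band-to-band fluctuations; repeating the step on the approximation gives the deeper scales.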

The extraction of early warning features for the predicting financial distress based on XGboost model and shap framework

International Journal of Financial Engineering ◽

10.1142/s2424786321410048 ◽

2021 ◽

pp. 2141004

Author(s):

He Yang ◽

Emma Li ◽

Yi Fang Cai ◽

Jiapei Li ◽

George X. Yuan

Keyword(s):

Machine Learning ◽

Early Warning ◽

Financial Distress ◽

Prediction Accuracy ◽

Financial Risk ◽

Learning Algorithm ◽

Listed Companies ◽

Gradient Boosting ◽

Distress Risk ◽

Extreme Gradient Boosting

The purpose of this paper is to establish a framework for extracting early warning risk features for predicting financial distress, based on the XGBoost model and SHAP. How early warning risk features are constructed is well known to be very important for predicting the financial distress of companies. Compared with traditional statistical methods, data-driven machine learning achieves better prediction accuracy in financial early warning modelling, but the resulting models can be difficult to explain. Recently, eXtreme Gradient Boosting (XGBoost), an ensemble learning algorithm based on gradient boosting, has become a hot topic in machine learning research due to its strong ability to recognize nonlinear information and its high prediction accuracy in practice. In this study, the XGBoost algorithm is used to extract early warning features for predicting the financial distress of listed companies; 76 financial risk features from seven categories and 14 non-financial risk features from four categories are collected to establish an early warning system for the prediction of financial distress. In empirical tests with respect to AUC, KS and Kappa, the numerical results show that, compared with the Logistic model, the XGBoost-based method established in this paper is much better at predicting the financial distress risk of listed companies. Moreover, under the framework of SHAP (SHapley Additive exPlanations), we are able to give a reasonable, visual explanation of the important risk features and the ways in which they affect financial distress. The results of this paper show that the XGBoost approach to modelling early warning features for financial distress not only achieves better prediction accuracy but is also explainable, which is significant in practice for identifying early warnings of financial distress risk for listed companies.
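
SHAP attributions are grounded in Shapley values, which assign each feature its average marginal contribution to the model output. The sketch below computes exact Shapley values by brute force over all feature orderings for a tiny, hypothetical scoring function; the feature names, the zero baseline, and the model itself are assumptions for illustration only. Practical SHAP tooling for tree ensembles relies on far more efficient algorithms than this enumeration.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over all orderings in which features switch from baseline to x."""
    n = len(x)
    contributions = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_output = model(current)
        for i in order:
            current[i] = x[i]
            output = model(current)
            contributions[i] += output - previous_output
            previous_output = output
    return [c / len(orderings) for c in contributions]

# Hypothetical risk score with an interaction term (not the paper's model).
def toy_model(features):
    debt_ratio, loss_years, audit_flag = features
    return 2.0 * debt_ratio + 1.0 * loss_years + 3.0 * debt_ratio * audit_flag

x = [0.5, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions always sum to the difference between the model output at `x` and at the baseline, and the interaction term is split evenly between the two features involved, which is the property that makes this kind of explanation additive.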

Using Machine Learning to Predict Invasive Bacterial Infections in Young Febrile Infants Visiting the Emergency Department

Journal of Clinical Medicine ◽

10.3390/jcm10091875 ◽

2021 ◽

Vol 10 (9) ◽

pp. 1875

Author(s):

I-Min Chiu ◽

Chi-Yung Cheng ◽

Wun-Huei Zeng ◽

Ying-Hsien Huang ◽

Chun-Hung Richard Lin

Keyword(s):

Machine Learning ◽

Emergency Department ◽

Bacterial Infections ◽

Clinical Symptoms ◽

Learning Algorithm ◽

Gradient Boosting ◽

P Value ◽

Young Infants ◽

Extreme Gradient Boosting ◽

Sensitivity Level

Background: The aim of this study was to develop and evaluate a machine learning (ML) model to predict invasive bacterial infections (IBIs) in young febrile infants visiting the emergency department (ED). Methods: This retrospective study was conducted in the EDs of three medical centers across Taiwan from 2011 to 2018. We included patients aged 0–60 days who visited the ED with clinical symptoms of fever. We developed three different ML algorithms, namely logistic regression (LR), support vector machine (SVM), and extreme gradient boosting (XGBoost), and compared their performance at predicting IBIs with that of a previously validated scoring system (IBI score). Results: During the study period, 4211 patients were included, of whom 126 (3.1%) had IBI. Through the feature selection process, eight, five, and seven features were used in the LR, SVM, and XGBoost models, respectively. The ML models achieved better AUROC values when predicting IBIs in young infants than the IBI score (LR: 0.85 vs. SVM: 0.84 vs. XGBoost: 0.85 vs. IBI score: 0.70, p-value < 0.001). Using a cost-sensitive learning algorithm, all ML models showed better specificity in predicting IBIs at a 90% sensitivity level than an IBI score > 2 (LR: 0.59 vs. SVM: 0.60 vs. XGBoost: 0.57 vs. IBI score > 2: 0.43, p-value < 0.001). Conclusions: All ML models developed in this study outperformed the traditional scoring system in stratifying low-risk febrile infants after standardization of the sensitivity level.
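
Comparing specificity at a fixed sensitivity level, as in the results above, amounts to picking the decision threshold at which the model first reaches the target sensitivity and then reading off the specificity. The following is a minimal sketch under that assumption; the function name, scores, and labels are hypothetical and are not data from the study.

```python
def specificity_at_sensitivity(scores, labels, target_sensitivity=0.90):
    """Return (threshold, sensitivity, specificity) for the highest
    threshold whose sensitivity reaches the target. A case is called
    positive when score >= threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
        sensitivity = tp / positives
        if sensitivity >= target_sensitivity:
            return threshold, sensitivity, tn / negatives
    raise ValueError("target sensitivity cannot be reached")

# Hypothetical predicted risks and true outcomes (1 = infection present).
scores = [0.95, 0.80, 0.70, 0.60, 0.40, 0.35, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

threshold, sens, spec = specificity_at_sensitivity(scores, labels)
print(threshold, sens, round(spec, 2))
```

Fixing sensitivity first puts all models on an equal footing for a screening task, where missed infections are the costlier error, so the remaining difference in specificity reflects how many low-risk infants each model can safely rule out.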

Extreme Gradient Boosting Machine Learning Algorithm For Safe Auto Insurance Operations

2019 IEEE International Conference on Vehicular Electronics and Safety (ICVES) ◽

10.1109/icves.2019.8906396 ◽

2019 ◽

Cited By ~ 4

Author(s):

Najmeddine Dhieb ◽

Hakim Ghazzai ◽

Hichem Besbes ◽

Yehia Massoud

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Gradient Boosting ◽

Machine Learning Algorithm ◽

Auto Insurance ◽

Gradient Boosting Machine ◽

Extreme Gradient Boosting

Machine learning for predictions of cervical cancer identification – preliminary investigation based on refractive index

10.21203/rs.3.rs-948525/v1 ◽

2021 ◽

Author(s):

Michał Kruczkowski ◽

Anna Drabik-Kruczkowska ◽

Anna Marciniak ◽

Martyna Tarczewska ◽

Monika Kosowska ◽

...

Keyword(s):

Machine Learning ◽

Cervical Cancer ◽

Refractive Index ◽

Early Diagnosis ◽

Learning Algorithm ◽

Optical Measurements ◽

Prediction Algorithm ◽

Gradient Boosting ◽

Extreme Gradient Boosting

Cervical cancer is one of the most common cancers, and its early diagnosis is of the greatest importance. Unfortunately, many diagnoses are based on the subjective opinions of doctors; to date, there is no general measurement method with a calibrated standard. The problem can be addressed with a measurement system that fuses an optoelectronic sensor with a machine learning algorithm to provide reliable assistance for doctors at the early diagnosis stage of cervical cancer. We present preliminary research on cervical cancer assessment using an optical sensor and a prediction algorithm. Since every material is characterized by its refractive index, measuring its value and detecting changes gives information about the state of the tissue. The optical measurements provided datasets for training and validating the analysis software. We present the data preprocessing, the machine learning results obtained with three algorithms (Random Forest, eXtreme Gradient Boosting, Naïve Bayes), and an assessment of their performance in classifying tissue as healthy or sick. All three achieved high values (>89%) on the performance measures. Our solution allows rapid sample measurement and automatic classification of the results, constituting a potential support tool for doctors.

Augmented Data and XGBoost Improvement for Sales Forecasting in the Large-Scale Retail Sector

Applied Sciences ◽

10.3390/app11177793 ◽

2021 ◽

Vol 11 (17) ◽

pp. 7793

Author(s):

Alessandro Massaro ◽

Antonio Panarese ◽

Daniele Giannone ◽

Angelo Galiano

Keyword(s):

Mean Square Error ◽

Large Scale ◽

Learning Algorithm ◽

Training Model ◽

Gradient Boosting ◽

Retail Sector ◽

Mean Square ◽

Extreme Gradient Boosting ◽

Initial Dataset ◽

Order Of Magnitude

The organized large-scale retail sector has been gradually establishing itself around the world, and its activities increased exponentially during the pandemic period. This modern sales system uses data mining technologies to process valuable information and increase profit. To this end, the extreme gradient boosting (XGBoost) algorithm was applied in an industrial project as a supervised learning algorithm to predict product sales, taking into account promotion conditions and a multiparametric analysis. The implemented XGBoost model was trained and tested using the Augmented Data (AD) technique for cases in which the available data are not sufficient to achieve the desired accuracy, as in many practical artificial intelligence applications where a large dataset is not available. The prediction was applied to a grid of segmented customers, allowing personalized services according to their purchasing behavior. The AD technique yielded good accuracy compared with results obtained from the initial dataset with few records. The prediction error, measured by the Root Mean Square Error (RMSE) and Mean Square Error (MSE), decreased by about an order of magnitude. The AD technique formulated for the large-scale retail sector also represents a good way to calibrate the training model.

Explainable Artificial Intelligence for Sarcasm Detection in Dialogues

Wireless Communications and Mobile Computing ◽

10.1155/2021/2939334 ◽

2021 ◽

Vol 2021 ◽

pp. 1-13

Author(s):

Akshi Kumar ◽

Shubham Dikshit ◽

Victor Hugo C. Albuquerque

Keyword(s):

Language Processing ◽

Learning Algorithm ◽

Real Life ◽

Decision Makers ◽

Gradient Boosting ◽

Trained Classifier ◽

Extreme Gradient Boosting ◽

Interpretable Model ◽

The Right ◽

Post Hoc

Sarcasm detection in dialogues has been gaining popularity among natural language processing (NLP) researchers with the increased use of conversational threads on social media. Capturing the knowledge of the domain of discourse, context propagation during the course of dialogue, and situational context and tone of the speaker are some important features to train the machine learning models for detecting sarcasm in real time. As situational comedies vibrantly represent human mannerism and behaviour in everyday real-life situations, this research demonstrates the use of an ensemble supervised learning algorithm to detect sarcasm in the benchmark dialogue dataset, MUStARD. The punch-line utterance and its associated context are taken as features to train the eXtreme Gradient Boosting (XGBoost) method. The primary goal is to predict sarcasm in each utterance of the speaker using the chronological nature of a scene. Further, it is vital to prevent model bias and help decision makers understand how to use the models in the right way. Therefore, as a twin goal of this research, we make the learning model used for conversational sarcasm detection interpretable. This is done using two post hoc interpretability approaches, Local Interpretable Model-agnostic Explanations (LIME) and Shapley Additive exPlanations (SHAP), to generate explanations for the output of a trained classifier. The classification results clearly depict the importance of capturing the intersentence context to detect sarcasm in conversational threads. The interpretability methods show the words (features) that influence the decision of the model the most and help the user understand how the model is making the decision for detecting sarcasm in dialogues.

Evaluation of AquaCrop model of cucumber under greenhouse cultivation

The Journal of Agricultural Science ◽

10.1017/s0021859621000472 ◽

2021 ◽

pp. 1-10

Author(s):

H. Khafajeh ◽

A. Banakar ◽

S. Minaei ◽

M. Delavar

Keyword(s):

Control System ◽

Relative Humidity ◽

Crop Yield ◽

Water Consumption ◽

Crop Production ◽

Environmental Control ◽

Irrigation Management ◽

Growth Stages ◽

Area Index ◽

Aquacrop Model

Water consumption in agriculture cannot be assessed without considering the relations between water, soil and plant. Various models and software packages have been developed to evaluate the relation between soil, water and crop growth stages. These models can be used for irrigation planning if properly optimized and applied. AquaCrop is a well-known crop model developed by the Food and Agriculture Organization of the United Nations. In order to optimize this model for crop production and irrigation management, an experiment was conducted in a hydroponic cucumber greenhouse. Various parameters, including water consumption volume, crop yield and leaf area index, were measured during a season. A fuzzy control system was used to control temperature, relative humidity, planting bed moisture, light intensity and carbon dioxide values. The main purpose of designing a control system in the greenhouse is to achieve the desired values of temperature and relative humidity. In the model, evapotranspiration, irrigation requirements and crop yield were simulated. The results show that the AquaCrop model can estimate evapotranspiration with the least error in the greenhouse environment controlled by a fuzzy controller. The model also estimated the crop yield and biomass with a good degree of precision, and it may support crop production in a greenhouse, including crop management and environmental control.
