Employing Machine Learning for Detection of Invasive Species using Sentinel-2 and AVIRIS Data: The Case of Kudzu in the United States

2020 ◽  
Vol 12 (9) ◽  
pp. 3544 ◽  
Author(s):  
Tobias Jensen ◽  
Frederik Seerup Hass ◽  
Mohammad Seam Akbar ◽  
Philip Holm Petersen ◽  
Jamal Jokar Arsanjani

Invasive plants are causing massive economic and environmental damage to societies worldwide. The aim of this study is to employ a set of machine learning classifiers for detecting invasive plant species using remote sensing data. The target species is Kudzu vine, which mostly grows in the south-eastern states of the US and quickly outcompetes other plants, making it a relevant and threatening species to consider. Our study area is Atlanta, Georgia and the surrounding area. Five different algorithms, Boosted Logistic Regression (BLR), Naive Bayes (NB), Neural Network (NN), Random Forest (RF) and Support Vector Machine (SVM), were tested with the aim of evaluating their performance and identifying the best-performing one. Furthermore, the influence of temporal, spectral and spatial resolution in detecting Kudzu was also tested and reviewed. Our findings show that the random forest, neural network and support vector machine classifiers outperformed the other classifiers. While the achieved internal accuracies were about 97%, an external validation conducted over an expanded area of interest resulted in 79.5% accuracy. Furthermore, the study indicates that high-accuracy classification can be achieved using multispectral Sentinel-2 imagery and can be improved by integrating it with airborne visible/infrared imaging spectrometer (AVIRIS) hyperspectral data. Finally, this study indicates that dimensionality reduction methods such as principal component analysis (PCA) should be applied cautiously to the hyperspectral AVIRIS data to preserve its utility. The applied approach and the utilized set of methods can be of interest for detecting other kinds of invasive species as part of fulfilling the UN sustainable development goals, particularly number 12: responsible consumption and production, 13: climate action, and 15: life on land.
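As a minimal sketch of this kind of classifier comparison, the snippet below benchmarks three of the five families (RF, SVM, NN) with five-fold cross-validation in scikit-learn. The synthetic dataset and the hyperparameters are placeholders, not the study's Sentinel-2/AVIRIS setup.

```python
# Hedged sketch of the classifier comparison: synthetic features stand
# in for Sentinel-2 band values; hyperparameters are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

models = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(kernel="rbf"),
    "NN": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
}
# Mean five-fold cross-validation accuracy per model.
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

In the study itself, such internal cross-validation scores would then be checked against an external validation area, as the gap between 97% and 79.5% above illustrates why.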

2021 ◽  
Vol 13 (16) ◽  
pp. 3203
Author(s):  
Won-Kyung Baek ◽  
Hyung-Sup Jung

It is well known that the polarization characteristics in X-band synthetic aperture radar (SAR) image analysis can provide additional information for marine target classification and detection. Normally, dual- and single-polarized SAR images are acquired by SAR satellites, so we must determine how accurate the marine mapping performance from dual-polarized (pol) images is compared with that from single-pol images in a given machine learning model. The purpose of this study is to compare the performance of single- and dual-pol SAR image classification achieved by support vector machine (SVM), random forest (RF), and deep neural network (DNN) models. The test image is a TerraSAR-X dual-pol image acquired from the 2007 Kerch Strait oil spill event. For this, 824,026 pixels and 1,648,051 pixels were extracted from the image for training and testing, respectively, and sea, ship, oil, and land objects were classified from the image by using the three machine learning methods. The mean F1-scores of the SVM, RF, and DNN models resulting from the single-pol image were approximately 0.822, 0.882, and 0.889, respectively, and those from the dual-pol image were about 0.852, 0.908, and 0.898, respectively. The performance improvement achieved by dual-pol was about 3.6%, 2.9%, and 1% for SVM, RF, and DNN, respectively. The DNN model had the best performance (0.889) in the single-pol test, while the RF model was best (0.908) in the dual-pol test. The improvement from the best single-pol to the best dual-pol result was approximately 2.1%, which is not substantial. Considering that dual-pol images have two-times lower spatial resolution than single-pol images in the azimuth direction, such a small improvement may not be valuable. Therefore, the results show that the performance improvement from X-band dual-pol images may not be remarkable when classifying sea, ship, oil spill, and land surfaces.
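The mean F1-score used to rank the models can be computed per class and averaged; the sketch below does this for the four classes in the study. The confusion counts are made up for illustration, not the TerraSAR-X results.

```python
# Hedged sketch: macro-averaged F1 over the four classes (sea, ship,
# oil, land). The per-class counts below are illustrative only.
def f1(tp, fp, fn):
    """F1-score from true positives, false positives, false negatives."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

counts = {  # class -> (true positives, false positives, false negatives)
    "sea":  (900, 40, 30),
    "ship": (120, 15, 20),
    "oil":  (200, 25, 35),
    "land": (400, 10, 12),
}
mean_f1 = sum(f1(*c) for c in counts.values()) / len(counts)
print(f"mean F1 = {mean_f1:.3f}")
```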


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257069
Author(s):  
Jae-Geum Shim ◽  
Kyoung-Ho Ryu ◽  
Sung Hyun Lee ◽  
Eun-Ah Cho ◽  
Sungho Lee ◽  
...  

Objective To construct a prediction model for optimal tracheal tube depth in pediatric patients using machine learning. Methods Pediatric patients aged <7 years who received post-operative ventilation after undergoing surgery between January 2015 and December 2018 were investigated in this retrospective study. The optimal location of the tracheal tube was defined as the median of the distance between the upper margin of the first thoracic (T1) vertebral body and the lower margin of the third thoracic (T3) vertebral body. We applied four machine learning models: random forest, elastic net, support vector machine, and artificial neural network, and compared their prediction accuracy to three formula-based methods based on age, height, and tracheal tube internal diameter (ID). Results For each method, the percentage of optimal tracheal tube depth predictions in the test set was calculated as follows: 79.0 (95% confidence interval [CI], 73.5 to 83.6) for random forest, 77.4 (95% CI, 71.8 to 82.2; P = 0.719) for elastic net, 77.0 (95% CI, 71.4 to 81.8; P = 0.486) for support vector machine, 76.6 (95% CI, 71.0 to 81.5; P = 1.0) for artificial neural network, 66.9 (95% CI, 60.9 to 72.5; P < 0.001) for the age-based formula, 58.5 (95% CI, 52.3 to 64.4; P < 0.001) for the tube ID-based formula, and 44.4 (95% CI, 38.3 to 50.6; P < 0.001) for the height-based formula. Conclusions In this study, the machine learning models predicted the optimal tracheal tube tip location for pediatric patients more accurately than the formula-based methods. Machine learning models using biometric variables may help clinicians make decisions regarding optimal tracheal tube depth in pediatric patients.
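Each figure above is a success rate with a 95% CI; the sketch below reproduces this shape from raw counts. The counts are invented, and since the paper does not state which interval method it used, the normal approximation here is an assumption.

```python
# Hedged sketch: percentage of predictions falling inside the optimal
# depth window, with a normal-approximation 95% CI. Counts are invented.
import math

def optimal_rate_ci(n_optimal, n_total, z=1.96):
    """Return (rate %, lower bound %, upper bound %) for a proportion."""
    p = n_optimal / n_total
    se = math.sqrt(p * (1 - p) / n_total)  # standard error of a proportion
    return 100 * p, 100 * (p - z * se), 100 * (p + z * se)

rate, lo, hi = optimal_rate_ci(n_optimal=237, n_total=300)
print(f"{rate:.1f}% (95% CI, {lo:.1f} to {hi:.1f})")
```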


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2119
Author(s):  
Victor Flores ◽  
Claudio Leiva

The copper mining industry is increasingly using artificial intelligence methods to improve copper production processes. Recent studies reveal the use of algorithms, such as Artificial Neural Network, Support Vector Machine, and Random Forest, among others, to develop models for predicting product quality. Other studies compare the predictive models developed with these machine learning algorithms in the mining industry as a whole. However, not many published copper mining studies compare the results of machine learning techniques for copper recovery prediction. This study makes a detailed comparison between three models for predicting copper recovery by leaching, using four datasets resulting from mining operations in Northern Chile. The algorithms used for developing the models were Random Forest, Support Vector Machine, and Artificial Neural Network. To validate these models, four indicators or figures of merit were used: accuracy (acc), precision (p), recall (r), and the Matthews correlation coefficient (mcc). This paper describes the dataset preparation and the refinement of the threshold values used for the predictive variable most influential on the class (the copper recovery). Results show a precision above 98.50% and identify the model whose predictions agree best with the real values. Finally, the obtained models have the following mean values: acc = 0.943, p = 88.47, r = 0.995, and mcc = 0.232. These values are highly competitive when compared with those obtained in similar studies using other approaches in this context.
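For reference, the four figures of merit can all be computed from a binary confusion matrix, as sketched below; the counts are illustrative, not the Chilean leaching datasets.

```python
# Hedged sketch: accuracy, precision, recall, and Matthews correlation
# coefficient from binary confusion counts. Counts are illustrative.
import math

def merit(tp, tn, fp, fn):
    """Return (accuracy, precision, recall, MCC) for binary counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, p, r, mcc

acc, p, r, mcc = merit(tp=885, tn=58, fp=103, fn=4)
print(f"acc={acc:.3f} p={p:.3f} r={r:.3f} mcc={mcc:.3f}")
```

Note that MCC lies in [-1, 1] and, unlike accuracy, stays informative when one class (e.g. successful recovery) dominates the data.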


2016 ◽  
Author(s):  
J. Huttunen ◽  
H. Kokkola ◽  
T. Mielonen ◽  
M. Mononen ◽  
A. Lipponen ◽  
...  

Abstract. In order to have a good estimate of the current forcing by anthropogenic aerosols, knowledge of past aerosol levels is needed. Aerosol optical depth (AOD) is a good measure of aerosol loading. However, dedicated measurements of AOD are only available from the 1990s onward. One option to lengthen the AOD time series beyond the 1990s is to retrieve AOD from surface solar radiation (SSR) measurements made with pyranometers. In this work, we have evaluated several inversion methods designed for this task. We compared a look-up table method based on radiative transfer modelling, a nonlinear regression method and four machine learning methods (Gaussian process, neural network, random forest and support vector machine) with AOD observations made with a sun photometer at an Aerosol Robotic Network (AERONET) site in Thessaloniki, Greece. Our results show that most of the machine learning methods produce AOD estimates comparable to the look-up table and nonlinear regression methods. All of the applied methods produced AOD values that corresponded well to the AERONET observations, with the lowest correlation coefficient value being 0.87 for the random forest method. While many of the methods tended to slightly overestimate low AODs and underestimate high AODs, the neural network and support vector machine showed overall better correspondence for the whole AOD range. The differences in reproducing both ends of the AOD range seem to be caused by differences in aerosol composition. High AODs were in most cases those with high water vapour content, which might affect the aerosol single scattering albedo (SSA) through uptake of water into aerosols. Our study indicates that the machine learning methods benefit from the fact that they do not constrain the aerosol SSA in the retrieval, whereas the LUT method assumes a constant value for it. This would also mean that machine learning methods could have potential for reproducing AOD from SSR even if the SSA changed during the observation period.
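The look-up table (LUT) inversion mentioned above can be sketched as a one-dimensional interpolation from modelled SSR back to AOD. The table values below are invented for illustration; a real LUT would also depend on solar zenith angle, surface albedo, and the assumed SSA.

```python
# Hedged sketch of the LUT inversion idea: precompute SSR for a grid
# of AOD values with a radiative transfer model, then invert a
# measured SSR by linear interpolation. Table values are made up.
lut = [  # (AOD, modelled SSR in W/m^2); SSR decreases as AOD increases
    (0.0, 820.0), (0.2, 780.0), (0.4, 745.0), (0.8, 690.0), (1.6, 610.0),
]

def aod_from_ssr(ssr):
    """Invert a measured SSR to AOD by piecewise-linear interpolation."""
    for (a0, s0), (a1, s1) in zip(lut, lut[1:]):
        if s1 <= ssr <= s0:  # SSR is monotonically decreasing in AOD
            return a0 + (a1 - a0) * (s0 - ssr) / (s0 - s1)
    raise ValueError("SSR outside table range")

print(round(aod_from_ssr(762.5), 3))  # → 0.3
```

The machine learning methods in the study replace this fixed table with a regression learned from collocated SSR and AERONET AOD data, which is why they need not fix the SSA in advance.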


Traffic incidents not only cause various levels of traffic congestion but often contribute to traffic accidents and secondary accidents, resulting in substantial losses of life, economic output, and productivity in terms of injuries and deaths, increased travel times and delays, excessive energy consumption, and air pollution. Therefore, it is essential to accurately estimate the duration of an incident to mitigate these effects. Traffic management center incident logs and traffic sensor data from eastbound Interstate 70 (I-70) in Missouri, United States, collected from January 2015 to January 2017, with a total of 352 incident records, were used to develop incident duration estimation models. This paper investigated different machine learning (ML) methods for traffic incident duration prediction. The attempted ML techniques include Support Vector Machine (SVM), Random Forest (RF), and Neural Network Multi-Layer Perceptron (MLP). Root mean squared error (RMSE) and mean absolute error (MAE) were used to evaluate the performance of these models. The results showed that the performance of the models was comparable, with the SVM models slightly outperforming the RF and MLP models in terms of the MAE index, where MAE was 14.23 min for the best-performing SVM model. In terms of the RMSE index, the RF models slightly outperformed the other two models, with an RMSE of 18.91 min for the best-performing RF model. Index Terms— Incident Duration, Neural Network Multi-Layer Perceptron, Random Forest, Support Vector Machine.
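The two evaluation indices are straightforward to compute from predicted and observed durations; the sketch below uses invented values, not the I-70 records. Because RMSE squares the errors, it penalizes large misses more than MAE does, which is why the two indices can rank models differently, as in this paper.

```python
# Hedged sketch: RMSE and MAE over predicted vs. observed incident
# durations in minutes. The durations below are illustrative.
import math

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

observed  = [35.0, 60.0, 22.0, 90.0, 45.0]
predicted = [40.0, 48.0, 30.0, 75.0, 50.0]
print(f"RMSE={rmse(observed, predicted):.2f} min, MAE={mae(observed, predicted):.2f} min")
```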


Author(s):  
Victor Flores ◽  
Claudio Leiva

The copper mining industry is increasingly using artificial intelligence methods to improve copper production processes. Recent studies reveal the use of algorithms such as Artificial Neural Network, Support Vector Machine, and Random Forest, among others, to develop models for predicting product quality. Other studies compare the predictive models developed with these machine learning algorithms in the mining industry as a whole. However, not many published copper mining studies compare the results of machine learning techniques for copper recovery prediction. This study makes a detailed comparison between three models for predicting copper recovery by leaching, using four datasets resulting from mining operations in northern Chile. The algorithms used for developing the models were Random Forest, Support Vector Machine, and Artificial Neural Network. To validate these models, four indicators or figures of merit were used: accuracy (acc), precision (p), recall (r), and the Matthews correlation coefficient (mcc). This paper describes dataset preparation and the refinement of the threshold values used for the predictive variable most influential on the class (the copper recovery). Results show a precision above 98.50% and identify the model whose predictions agree best with the real values. Finally, the models obtained show the following mean values: acc = 94.32, p = 88.47, r = 99.59, and mcc = 2.31. These values are highly competitive compared with those obtained in similar studies using other approaches in this context.


Animals ◽  
2020 ◽  
Vol 10 (5) ◽  
pp. 771
Author(s):  
Toshiya Arakawa

Mammalian behavior is typically monitored by observation. However, direct observation requires a substantial amount of effort and time if the number of mammals to be observed is sufficiently large or if the observation is conducted for a prolonged period. In this study, machine learning methods such as hidden Markov models (HMMs), random forests, support vector machines (SVMs), and neural networks were applied to detect and estimate whether a goat is in estrus based on the goat's behavior, and the adequacy of the methods was verified. Goat tracking data were obtained using a video tracking system and used to estimate whether goats, in either "estrus" or "non-estrus", were in one of two states: "approaching the male" or "standing near the male". Overall, the percentage concordance (PC) of the random forest appears to be the highest. However, the PC for goats whose data were not included in the training sets is relatively low, suggesting that random forests tend to overfit the training data. Apart from the random forest, the PC of the HMMs and SVMs is high. However, considering the calculation time and the HMM's advantage of being a time series model, the HMM is the better method. The PC of the neural network is low overall; however, if more goat data were acquired, a neural network could become an adequate method for estimation.
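The percentage concordance (PC) used to compare the methods is simply the share of time steps where the estimated behavior state matches the observed one; the sequences below are invented for illustration.

```python
# Hedged sketch: percentage concordance between observed and estimated
# behavior state sequences. The sequences are illustrative only.
def percentage_concordance(observed, estimated):
    """Share (%) of time steps where the estimated state matches."""
    matches = sum(o == e for o, e in zip(observed, estimated))
    return 100 * matches / len(observed)

obs = ["approach", "approach", "stand", "stand", "stand", "approach"]
est = ["approach", "stand", "stand", "stand", "stand", "approach"]
pc = percentage_concordance(obs, est)
print(f"PC = {pc:.1f}%")
```

Comparing PC on held-out goats rather than on the training goats is what exposes the random forest's overfitting noted above.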


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 7853
Author(s):  
Aleksej Logacjov ◽  
Kerstin Bach ◽  
Atle Kongsvold ◽  
Hilde Bremseth Bårdstu ◽  
Paul Jarle Mork

Existing accelerometer-based human activity recognition (HAR) benchmark datasets that were recorded during free living suffer from non-fixed sensor placement, the usage of only one sensor, and unreliable annotations. We make two contributions in this work. First, we present the publicly available Human Activity Recognition Trondheim dataset (HARTH). Twenty-two participants were recorded for 90 to 120 min during their regular working hours using two three-axial accelerometers, attached to the thigh and lower back, and a chest-mounted camera. Experts annotated the data independently using the camera's video signal and achieved high inter-rater agreement (Fleiss' kappa = 0.96). They labeled twelve activities. The second contribution of this paper is the training of seven different baseline machine learning models for HAR on our dataset. We used a support vector machine, k-nearest neighbor, random forest, extreme gradient boost, convolutional neural network, bidirectional long short-term memory, and convolutional neural network with multi-resolution blocks. The support vector machine achieved the best results, with an F1-score of 0.81 (standard deviation: ±0.18), recall of 0.85 ± 0.13, and precision of 0.79 ± 0.22 in a leave-one-subject-out cross-validation. Our high-quality recordings and annotations provide a promising benchmark dataset for researchers to develop innovative machine learning approaches for precise HAR in free living.
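The leave-one-subject-out (LOSO) protocol used for these scores holds out each participant in turn, trains on the rest, and averages the per-subject results. The sketch below shows the split logic only; the subject IDs are placeholders, not the authors' code.

```python
# Hedged sketch of leave-one-subject-out splits: each subject's
# samples form one test fold; all other subjects form the train fold.
def loso_splits(subject_ids):
    """Yield (held-out subject, train indices, test indices)."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# One subject ID per sample; three subjects with two samples each.
subject_ids = ["s1", "s1", "s2", "s2", "s3", "s3"]
splits = list(loso_splits(subject_ids))
for held_out, train, test in splits:
    print(held_out, train, test)
```

LOSO matters for HAR because samples from the same person are highly correlated; a random split would leak subject-specific motion patterns into the test set and inflate the scores.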


2021 ◽  
Vol 9 ◽  
Author(s):  
Ashwini K ◽  
P. M. Durai Raj Vincent ◽  
Kathiravan Srinivasan ◽  
Chuan-Yu Chang

Neonatal infants communicate with us through cries. Infant cry signals have distinct patterns depending on the purpose of the cry. Preprocessing, feature extraction, and feature selection for audio signals require expert attention and considerable effort. Deep learning techniques automatically extract and select the most important features, but they require an enormous amount of data for effective classification. This work discriminates neonatal cries into pain, hunger, and sleepiness. The neonatal cry audio signals are transformed into spectrogram images using the short-time Fourier transform (STFT). A deep convolutional neural network (DCNN) takes the spectrogram images as input. The features obtained from the convolutional neural network are passed to a support vector machine (SVM) classifier, which classifies the neonatal cries. This work combines the advantages of machine learning and deep learning techniques to get the best results even with a moderate number of data samples. The experimental results show that CNN-based feature extraction with an SVM classifier provides promising results. Comparing the SVM kernels, namely radial basis function (RBF), linear, and polynomial, SVM-RBF provides the highest accuracy, with the kernel-based infant cry classification system achieving 88.89% accuracy.
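The STFT step that turns a cry recording into a spectrogram can be sketched as below with a naive per-frame DFT for self-containment; a real pipeline would use an FFT library such as scipy or librosa, and the pure test tone here stands in for a cry signal.

```python
# Hedged sketch: magnitude spectrogram via a naive DFT per frame.
# Frame and hop sizes are illustrative, not the paper's settings.
import cmath
import math

def stft_magnitude(signal, frame=64, hop=32):
    """Return a list of frames, each a list of frequency-bin magnitudes."""
    frames = []
    for start in range(0, len(signal) - frame + 1, hop):
        chunk = signal[start:start + frame]
        spectrum = []
        for k in range(frame // 2 + 1):  # non-negative frequency bins
            s = sum(x * cmath.exp(-2j * math.pi * k * n / frame)
                    for n, x in enumerate(chunk))
            spectrum.append(abs(s))
        frames.append(spectrum)
    return frames

# A pure tone at 8 cycles per 64-sample frame should peak in bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = stft_magnitude(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(peak_bin)  # → 8
```

The resulting time-by-frequency magnitude matrix, rendered as an image, is what the DCNN consumes before its features are handed to the SVM.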

