A Machine Learning Approach to Determine Oyster Vessel Behavior

Devin Joseph Frey; Avdesh Mishra; Md Tamjidul Hoque; Mahdi Abdelguerfi; Thomas Soniat

doi:10.3390/make1010004

Machine Learning and Knowledge Extraction ◽

10.3390/make1010004 ◽

2018 ◽

Vol 1 (1) ◽

pp. 64-74 ◽

Cited By ~ 2

Author(s):

Devin Joseph Frey ◽

Avdesh Mishra ◽

Md Tamjidul Hoque ◽

Mahdi Abdelguerfi ◽

Thomas Soniat

Keyword(s):

Machine Learning ◽

Satellite Communication ◽

Ground Truth ◽

Optimization Techniques ◽

Support Vector ◽

Svm Classifier ◽

Test Accuracy ◽

Trajectory Data ◽

Multi Class Classification ◽

Rule Based Classifier

In this work, we address the multi-class classification task of determining oyster vessel behavior, assigning each observation to one of four classes: fishing, traveling, poling (exploring), and docked (anchored). The main purpose of this work is to automate this task using machine learning and to explore techniques that improve the accuracy of oyster vessel behavior prediction. To apply machine learning, two descriptors, speed and net speed, are calculated from trajectory data recorded by a satellite communication system (Vessel Management System, VMS) attached to vessels fishing on the public oyster grounds of Louisiana. We constructed a support vector machine (SVM) based method with a Radial Basis Function (RBF) kernel to predict the behavior of oyster vessels. Several validation and parameter optimization techniques were used to improve the accuracy of the SVM classifier. A total of 93% of the trajectory data from a July 2013 to August 2014 dataset, consisting of 612,700 samples for which the ground truth can be obtained using a rule-based classifier, is used for validation and independent testing of our method. The results show that the proposed SVM-based method correctly classifies 99.99% of the 612,700 samples under 10-fold cross-validation. Furthermore, we achieved a precision of 1.00, recall of 1.00, F1-score of 1.00, and test accuracy of 99.99% in an independent test on a subset of this 93% of the dataset consisting of 31,418 points.
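
As a rough illustration of the descriptor step, the sketch below derives a per-segment speed and a "net speed" from timestamped positions. The abstract does not define net speed, so it is assumed here to be the first-to-last straight-line displacement divided by elapsed time. The function names, the `(t_seconds, lat, lon)` tuple layout, the haversine distance, and the sample track are all hypothetical choices, not the authors' method.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (an approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speeds(track):
    """Speed in m/s for each consecutive pair of (t_seconds, lat, lon) fixes.

    Assumes strictly increasing timestamps.
    """
    out = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(track, track[1:]):
        out.append(haversine_m(la0, lo0, la1, lo1) / (t1 - t0))
    return out


def net_speed(track):
    """ASSUMED definition: first-to-last displacement divided by elapsed time."""
    (t0, la0, lo0), (t1, la1, lo1) = track[0], track[-1]
    return haversine_m(la0, lo0, la1, lo1) / (t1 - t0)


# Hypothetical out-and-back track: the vessel moves but ends where it started.
track = [(0, 29.500, -89.800), (60, 29.501, -89.800), (120, 29.500, -89.800)]
segment_speeds = speeds(track)
overall_net_speed = net_speed(track)
```

Under this assumed definition, a vessel that works back and forth over a small area has a noticeable segment speed but a net speed near zero, which is the kind of contrast a classifier could exploit.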


Improving Machine Learning Identification of Unsafe Driver Behavior by Means of Sensor Fusion

Applied Sciences ◽

10.3390/app10186417 ◽

2020 ◽

Vol 10 (18) ◽

pp. 6417 ◽

Cited By ~ 1

Author(s):

Emanuele Lattanzi ◽

Giacomo Castellucci ◽

Valerio Freschi

Keyword(s):

Neural Network ◽

Machine Learning ◽

Driver Behavior ◽

Ground Truth ◽

Support Vector ◽

Svm Classifier ◽

Learning Technology ◽

Average Accuracy ◽

Unsafe Behaviors ◽

Vehicle Sensors

Most road accidents occur due to human fatigue, inattention, or drowsiness. Recently, machine learning technology has been successfully applied to identifying driving styles and recognizing unsafe behaviors from in-vehicle sensor signals such as vehicle and engine speed, throttle position, and engine load. In this work, we investigated the fusion of external sensors, such as a gyroscope and a magnetometer, with in-vehicle sensors to improve machine learning identification of unsafe driver behavior. Starting from those signals, we computed a set of features capable of accurately describing the behavior of the driver. A support vector machine and an artificial neural network were then trained and tested using several features calculated over more than 200 km of travel. The ground truth used to evaluate classification performance was obtained by means of an objective methodology based on the relationship between the speed and the lateral and longitudinal acceleration of the vehicle. The classification results showed an average accuracy of about 88% using the SVM classifier and about 90% using the neural network, demonstrating the potential of the proposed methodology to identify unsafe driver behaviors.


Classification of Crop Residue Cover in High-Resolution RGB Images Using Machine Learning

Journal of the ASABE ◽

10.13031/ja.14572 ◽

2022 ◽

Vol 65 (1) ◽

pp. 75-86

Author(s):

Parth C. Upadhyay ◽

John A. Lory ◽

Guilherme N. DeSouza ◽

Timotius A. P. Lagaunne ◽

Christine M. Spinka

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Feature Selection Method ◽

Texture Features ◽

Ground Truth ◽

Selection Method ◽

Support Vector ◽

Svm Classifier ◽

Aerial Vehicle ◽

Rgb Images

Highlights:

A machine learning framework estimated residue cover in RGB images taken at three resolutions from 88 locations.

The best results primarily used texture features, the RFE-SVM feature selection method, and the SVM classifier.

Accounting for shadows and plants plus modifying and optimizing the texture features may improve performance.

An automated system developed using machine learning is a viable strategy to estimate residue cover from RGB images obtained with handheld or UAV platforms.

Abstract. Maintaining plant residue on the soil surface contributes to sustainable cultivation of arable land. Applying machine learning methods to RGB images of residue could overcome the subjectivity of manual methods. The objectives of this study were to use supervised machine learning while identifying the best feature selection method, the best classifier, and the most effective image feature types for classifying residue levels in RGB imagery. Imagery was collected from 88 locations in 40 row-crop fields in five Missouri counties between early May and late June in 2018 and 2019 using a tripod-mounted camera (0.014 cm pixel⁻¹ ground sampling distance, GSD) and an unmanned aerial vehicle (UAV, 0.05 and 0.14 GSD). At each field location, 50 contiguous 0.3 × 0.2 m region-of-interest (ROI) images were extracted from the imagery, resulting in a dataset of 4,400 ROI images at each GSD. Residue percentages for ground truth were estimated using a bullseye grid method (n = 100 points) based on the 0.014 GSD images. Representative color, texture, and shape features were extracted and evaluated using four feature selection methods and two classifiers. Recursive feature elimination using support vector machine (RFE-SVM) was the best feature selection method, and the SVM classifier performed best for classifying the amount of residue as a three-class problem. The best features for this application were associated with texture, with local binary pattern (LBP) features being the most prevalent for all three GSDs. Shape features were irrelevant. The three residue classes were correctly identified with 88%, 84%, and 81% 10-fold cross-validation scores for the 2018 training data and 81%, 69%, and 65% accuracy for the 2019 testing data in decreasing resolution order. Converting image-wise data (0.014 GSD) to location residue estimates using a Bayesian model showed good agreement with the location-based ground truth (r² = 0.90). This initial assessment documents the use of RGB images to match other methods of estimating residue, with potential to replace or be used as a quality control for line-transect assessments.

Keywords: Feature selection, Soil erosion, Support vector machine, Texture features, Unmanned aerial vehicle.
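
To make the local binary pattern (LBP) texture feature mentioned above more concrete, here is a minimal sketch of the basic 3 × 3 LBP operator on a grayscale image stored as a list of lists. It is a textbook variant, not necessarily the configuration used in the study. The neighbor ordering, the `>=` comparison, and the function names are assumptions for illustration.

```python
def lbp_codes(img):
    """Basic 3x3 LBP code (0..255) for every interior pixel of a 2-D grayscale image."""
    # Neighbour offsets, clockwise from top-left; the bit order is only a convention.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(img), len(img[0])
    codes = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            center = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the center.
                if img[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


def lbp_histogram(img):
    """256-bin histogram of LBP codes, usable as a texture feature vector."""
    hist = [0] * 256
    for row in lbp_codes(img):
        for code in row:
            hist[code] += 1
    return hist


# Hypothetical 3x3 patches: a flat patch and a patch with one bright corner.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
corner = [[9, 0, 0], [0, 5, 0], [0, 0, 0]]
```

The histogram, rather than the per-pixel codes, is what would typically be passed to a feature selector or classifier.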


A machine learning approach for single cell interphase cell cycle staging

Scientific Reports ◽

10.1038/s41598-021-98489-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Hemaxi Narotamo ◽

Maria Sofia Fernandes ◽

Ana Margarida Moreira ◽

Soraia Melo ◽

Raquel Seruca ◽

...

Keyword(s):

Machine Learning ◽

Cell Cycle ◽

Single Cell ◽

Cell Function ◽

Ground Truth ◽

Supervised Machine Learning ◽

Support Vector ◽

Svm Classifier ◽

Interphase Cell

Abstract. The cell nucleus is a tightly regulated organelle and its architectural structure is dynamically orchestrated to maintain normal cell function. Indeed, fluctuations in nuclear size and shape are known to occur during the cell cycle, and alterations in nuclear morphology are also hallmarks of many diseases, including cancer. Regrettably, reliable automated tools for cell cycle staging at the single-cell level using in situ images are still limited. It is therefore urgent to establish accurate strategies combining bioimaging with high-content image analysis for a bona fide classification. In this study we developed a supervised machine learning method for interphase cell cycle staging of individual adherent cells using in situ fluorescence images of nuclei stained with DAPI. A Support Vector Machine (SVM) classifier operated over normalized nuclear features using more than 3500 DAPI-stained nuclei. Molecular ground truth labels were obtained by automatic image processing using fluorescent ubiquitination-based cell cycle indicator (Fucci) technology. An average F1-score of 87.7% was achieved with this framework. Furthermore, the method was validated on distinct cell types, reaching recall values higher than 89%. Our method is a robust approach to identify cells in G1 or S/G2 at the individual level, with implications for research and clinical applications.
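
For readers unfamiliar with the reported metric, the following sketch computes a per-class F1-score and its unweighted (macro) average for a two-class G1 versus S/G2 problem. The abstract does not say how the average was taken, so macro averaging is an assumption here. The labels and predictions are made up for illustration.

```python
def f1_for_class(y_true, y_pred, label):
    """F1-score treating `label` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (ASSUMED averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical ground-truth and predicted stages for four nuclei.
y_true = ["G1", "G1", "S/G2", "S/G2"]
y_pred = ["G1", "S/G2", "S/G2", "S/G2"]
score = macro_f1(y_true, y_pred)
```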


Total nitrogen estimation in agricultural soils via aerial multispectral imaging and LIBS

Scientific Reports ◽

10.1038/s41598-021-90624-6 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Md Abir Hossen ◽

Prasoon K Diwakar ◽

Shankarachary Ragi

Keyword(s):

Machine Learning ◽

Total Nitrogen ◽

Multispectral Imaging ◽

Ground Truth ◽

Soil Samples ◽

Optimization Techniques ◽

Support Vector ◽

Growth Stages ◽

Calibration Model ◽

Ground Truth Data

Abstract. Measuring soil health indicators (SHIs), particularly soil total nitrogen (TN), is an important and challenging task that affects farmers’ decisions on the timing, placement, and quantity of fertilizers applied on farms. Most existing methods to measure SHIs are in-lab wet chemistry or spectroscopy-based methods, which require significant human input and effort, are time-consuming and costly, and are low-throughput in nature. To address this challenge, we develop an artificial intelligence (AI)-driven, near real-time, unmanned aerial vehicle (UAV)-based multispectral sensing solution (UMS) to estimate soil TN in an agricultural farm. TN is an important macro-nutrient or SHI that directly affects crop health. Accurate prediction of soil TN can significantly increase crop yield through informed decision making on the timing of seed planting and on fertilizer quantity and timing. The ground-truth data required to train the AI approaches is generated via laser-induced breakdown spectroscopy (LIBS), which can be readily used to characterize soil samples, providing rapid chemical analysis of the samples and their constituents (e.g., nitrogen, potassium, phosphorus, calcium). Although LIBS was previously applied for soil nutrient detection, there is no existing study on the integration of LIBS with UAV multispectral imaging and AI. We train two machine learning (ML) models, multi-layer perceptron regression and support vector regression, to predict soil nitrogen using a suite of data classes including multispectral characteristics of the soil and crops in the red (R), near-infrared, and green (G) spectral bands, computed vegetation indices (NDVI), and environmental variables including air temperature and relative humidity (RH).
To generate the ground-truth data, or the training data for the machine learning models, we determine the N spectrum of the soil samples (collected from a farm) using LIBS and develop a calibration model using the correlation between the actual TN of the soil samples and the maximum intensity of the N spectrum. In addition, we extract features from the multispectral images captured while the UAV follows an autonomous flight plan at different growth stages of the crops. The ML models’ performance is tested on a fixed configuration space for the hyper-parameters using various hyper-parameter optimization techniques at three different wavelengths of the N spectrum.
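
Two of the computations named above can be sketched briefly. NDVI has a standard definition, (NIR - R) / (NIR + R). The calibration step is shown as an ordinary least-squares line relating peak N intensity to measured TN; the abstract does not state the functional form, so the linear fit is an assumption. All names and numbers below are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    denom = nir + red
    # Returning 0.0 for an all-dark pixel is a convention chosen for this sketch.
    return 0.0 if denom == 0 else (nir - red) / denom


def fit_line(x, y):
    """Ordinary least-squares slope and intercept (ASSUMED calibration form)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical peak N-line intensities and lab-measured TN values.
peak_intensity = [1.0, 2.0, 3.0]
measured_tn = [2.0, 4.0, 6.0]
slope, intercept = fit_line(peak_intensity, measured_tn)
estimated_tn = slope * 2.5 + intercept  # apply the calibration to a new sample
```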


Binary Spectrum Feature for Improved Classifier Performance

10.36227/techrxiv.12993122 ◽

2020 ◽

Author(s):

Nalika Ulapane ◽

Karthick Thiyagarajan ◽

Sarath Kodagoda

Keyword(s):

Machine Learning ◽

Classification Performance ◽

Feature Reduction ◽

Sensor Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Monitoring Task ◽

Classifier Performance ◽

Spectrum Feature

Classification has become a vital task in modern machine learning and artificial intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed using continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study on Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.


Value of radiomics in differential diagnosis of chromophobe renal cell carcinoma and renal oncocytoma

Abdominal Radiology ◽

10.1007/s00261-019-02269-9 ◽

2019 ◽

Vol 45 (10) ◽

pp. 3193-3201 ◽

Cited By ~ 3

Author(s):

Yajuan Li ◽

Xialing Huang ◽

Yuwei Xia ◽

Liling Long

Keyword(s):

Machine Learning ◽

Differential Diagnosis ◽

Cell Carcinoma ◽

Area Under The Curve ◽

Image Features ◽

Renal Tumors ◽

Support Vector ◽

Svm Classifier ◽

Renal Oncocytoma ◽

Lasso Regression

Abstract. Purpose: To explore the value of CT-enhanced quantitative features combined with machine learning for differential diagnosis of renal chromophobe cell carcinoma (chRCC) and renal oncocytoma (RO). Methods: Sixty-one cases of renal tumors (chRCC = 44; RO = 17) that were pathologically confirmed at our hospital between 2008 and 2018 were retrospectively analyzed. All patients had undergone preoperative enhanced CT scans including the corticomedullary (CMP), nephrographic (NP), and excretory phases (EP) of contrast enhancement. Volumes of interest (VOIs), including lesions on the images, were manually delineated using the RadCloud platform. A LASSO regression algorithm was used to screen the image features extracted from all VOIs. Five machine learning classifiers were trained to distinguish chRCC from RO using a fivefold cross-validation strategy. The performance of the classifiers was mainly evaluated by area under the receiver operating characteristic (ROC) curve and accuracy. Results: In total, 1029 features were extracted from CMP, NP, and EP. The LASSO regression algorithm was used to screen out the four, four, and six best features, respectively, and eight features were selected when CMP and NP were combined. All five classifiers had good diagnostic performance, with area under the curve (AUC) values greater than 0.850, and the support vector machine (SVM) classifier showed the best performance, with a diagnostic accuracy of 0.945 (AUC 0.964 ± 0.054; sensitivity 0.999; specificity 0.800). Conclusions: Accurate preoperative differential diagnosis of chRCC and RO can be facilitated by a combination of CT-enhanced quantitative features and machine learning.
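
The fivefold cross-validation strategy can be pictured with the sketch below, which partitions 61 case indices into five train/test splits. The shuffling, the fixed seed, and the function name are assumptions. The abstract does not say whether folds were stratified by tumor type, so this plain split is only one possible reading.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # seeded shuffle so the split is repeatable
    folds = [idx[i::k] for i in range(k)]  # deal indices round-robin into k folds
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


# 61 cases as in the abstract; every case is tested exactly once.
splits = list(k_fold_indices(61, 5))
```

With a 44 to 17 class ratio, a stratified split that keeps the ratio roughly constant in every fold is a common alternative.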


Risky Driver Recognition with Class Imbalance Data and Automated Machine Learning Framework

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18147534 ◽

2021 ◽

Vol 18 (14) ◽

pp. 7534

Author(s):

Ke Wang ◽

Qingwen Xue ◽

Jian John Lu

Keyword(s):

Machine Learning ◽

High Risk ◽

Loss Function ◽

Class Imbalance ◽

Support Vector ◽

Trajectory Data ◽

Recognition Model ◽

Learning Framework ◽

Sampling Cost ◽

Automated Machine Learning

Identifying high-risk drivers before an accident happens is necessary for traffic accident control and prevention. Due to the class-imbalanced nature of driving data, high-risk samples, as the minority class, are usually poorly handled by standard classification algorithms. Instead of applying preset sampling or cost-sensitive learning, this paper proposes a novel automated machine learning framework that simultaneously and automatically searches for the optimal sampling, cost-sensitive loss function, and probability calibration to handle the class-imbalance problem in recognizing risky drivers. The hyperparameters that control sampling ratio and class weight, along with other hyperparameters, are optimized by Bayesian optimization. To demonstrate the performance of the proposed automated learning framework, we establish a risky driver recognition model as a case study, using video-extracted vehicle trajectory data of 2427 private cars on a German highway. Based on rear-end collision risk evaluation, only 4.29% of all drivers are labeled as risky drivers. The inputs of the recognition model are the discrete Fourier transform coefficients of the target vehicle’s longitudinal speed, lateral speed, and the gap between the target vehicle and its preceding vehicle. Among 12 sampling methods, 2 cost-sensitive loss functions, and 2 probability calibration methods, the result of automated machine learning is consistent with manual searching but much more computationally efficient. We find that the combination of Support Vector Machine-based Synthetic Minority Oversampling TEchnique (SVMSMOTE) sampling, a cost-sensitive cross-entropy loss function, and isotonic regression can significantly improve the recognition ability and reduce the error of the predicted probability.
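
As an illustration of the model inputs, the sketch below computes discrete Fourier transform coefficients of a short speed series directly from the definition. The abstract does not say how many coefficients are kept or whether magnitudes or complex values are used, so keeping the magnitudes of the first few is an assumption. The sample series are hypothetical.

```python
import cmath


def dft(signal):
    """Discrete Fourier transform of a real or complex sequence, O(n^2) by definition."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def dft_magnitudes(signal, n_coeffs):
    """Magnitudes of the first n_coeffs coefficients (ASSUMED feature choice)."""
    return [abs(c) for c in dft(signal)[:n_coeffs]]


# Hypothetical series: a constant speed and a one-cycle oscillation.
constant = [1.0, 1.0, 1.0, 1.0]
oscillating = [0.0, 1.0, 0.0, -1.0]
features = dft_magnitudes(oscillating, 2)
```

For long series, a fast Fourier transform from a numerical library would normally replace this quadratic-time loop.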


Classification of Diffusion Tensor Metrics for the Diagnosis of a Myelopathic Cord Using Machine Learning

International Journal of Neural Systems ◽

10.1142/s0129065717500368 ◽

2018 ◽

Vol 28 (02) ◽

pp. 1750036 ◽

Cited By ~ 8

Author(s):

Shuqiang Wang ◽

Yong Hu ◽

Yanyan Shen ◽

Hanxiong Li

Keyword(s):

Machine Learning ◽

Surgical Planning ◽

Diffusion Tensor ◽

Mean Value ◽

Machine Learning Algorithms ◽

Support Vector ◽

Svm Classifier ◽

Control Groups ◽

Diffusion Tensor Imaging Dti

In this study, we propose an automated framework that combines diffusion tensor imaging (DTI) metrics with machine learning algorithms to accurately classify control groups and groups with cervical spondylotic myelopathy (CSM) in the spinal cord. A comparison between selected voxel-based classification and mean value-based classification was performed. A support vector machine (SVM) classifier using a selected voxel-based dataset produced an accuracy of 95.73%, sensitivity of 93.41%, and specificity of 98.64%. The efficacy of each diffusion index for classification was also evaluated. Using the proposed approach, myelopathic areas in CSM are detected to provide an accurate reference to assist spine surgeons in surgical planning in complicated cases.
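
The three reported measures follow from the counts in a binary confusion matrix, as sketched below. The counts used here are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from binary confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # fraction of true cases that are detected
        "specificity": tn / (tn + fp),  # fraction of controls correctly cleared
    }


# Hypothetical counts: 10 patients (9 detected) and 10 controls (8 cleared).
metrics = binary_metrics(tp=9, fn=1, tn=8, fp=2)
```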


Deep Learning Assisted Neonatal Cry Classification via Support Vector Machine Models

Frontiers in Public Health ◽

10.3389/fpubh.2021.670352 ◽

2021 ◽

Vol 9 ◽

Author(s):

Ashwini K ◽

P. M. Durai Raj Vincent ◽

Kathiravan Srinivasan ◽

Chuan-Yu Chang

Keyword(s):

Neural Network ◽

Machine Learning ◽

Support Vector Machine ◽

Feature Extraction ◽

Deep Learning ◽

Convolutional Neural Network ◽

Support Vector ◽

Svm Classifier ◽

Infant Cry ◽

Learning Techniques

Neonatal infants communicate with us through cries. Infant cry signals have distinct patterns depending on the purpose of the cry. Preprocessing, feature extraction, and feature selection for audio signals require expert attention and considerable effort. Deep learning techniques extract and select the most important features automatically, but they require an enormous amount of data for effective classification. This work discriminates neonatal cries into pain, hunger, and sleepiness. The neonatal cry audio signals are transformed into spectrogram images using the short-time Fourier transform (STFT). A deep convolutional neural network (DCNN) takes the spectrogram images as input. The features obtained from the convolutional neural network are passed to a support vector machine (SVM) classifier, which classifies the neonatal cries. This work combines the advantages of machine learning and deep learning techniques to get the best results even with a moderate number of data samples. The experimental results show that CNN-based feature extraction combined with an SVM classifier provides promising results. Comparing the SVM kernels, namely radial basis function (RBF), linear, and polynomial, SVM-RBF provides the highest accuracy for the kernel-based infant cry classification system, at 88.89%.
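
The three kernels compared above have standard textbook forms, sketched here on plain Python lists. The hyperparameter values (`gamma`, `degree`, `coef0`) are placeholders, since the abstract does not report the settings used.

```python
import math


def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """(x . y + coef0) ** degree; the defaults are placeholders."""
    return (linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """exp(-gamma * ||x - y||^2); gamma controls how quickly similarity decays."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical two-dimensional feature vectors.
u, v = [1.0, 2.0], [3.0, 4.0]
similarities = (linear_kernel(u, v), polynomial_kernel(u, v, degree=2), rbf_kernel(u, v))
```

The RBF kernel equals 1 for identical inputs and approaches 0 as they move apart, which lets an SVM form nonlinear decision boundaries.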


Identification of Diagnostic Markers for Major Depressive Disorder Using Machine Learning Methods

Frontiers in Neuroscience ◽

10.3389/fnins.2021.645998 ◽

2021 ◽

Vol 15 ◽

Author(s):

Shu Zhao ◽

Zhiwei Bao ◽

Xinyi Zhao ◽

Mengxiang Xu ◽

Ming D. Li ◽

...

Keyword(s):

Machine Learning ◽

Major Depressive Disorder ◽

Depressive Disorder ◽

Meta Analysis ◽

Single Gene ◽

Support Vector ◽

Svm Classifier ◽

Diagnostic Model ◽

Roc Curve Analysis ◽

Major Depressive

Background: Major depressive disorder (MDD) is a global health challenge that severely impacts the quality of patients’ lives. The disorder can manifest in many forms with different combinations of symptoms, which makes its clinical diagnosis difficult. Robust biomarkers are greatly needed to improve diagnosis and to understand the etiology of the disease. The main purpose of this study was to create a predictive model for MDD diagnosis based on peripheral blood transcriptomes. Materials and Methods: We collected nine RNA expression datasets for MDD patients and healthy samples from the Gene Expression Omnibus database. After a series of quality control and heterogeneity tests, 302 samples from six studies were deemed suitable for the study. The R package “MetaOmics” was applied for systematic meta-analysis of genome-wide expression data. Receiver operating characteristic (ROC) curve analysis was used to evaluate the diagnostic effectiveness of individual genes. To obtain a better diagnostic model, we also adopted the support vector machine (SVM), random forest (RF), k-nearest neighbors (kNN), and naive Bayesian (NB) tools for modeling, with the RF method being used for feature selection. Results: Our analysis revealed six differentially expressed genes (AKR1C3, ARG1, KLRB1, MAFG, TPST1, and WWC3) with a false discovery rate (FDR) < 0.05 between MDD patients and control subjects. We then evaluated the diagnostic ability of these genes individually. With single gene prediction, we achieved area under the curve (AUC) values of 0.63 ± 0.04, 0.67 ± 0.07, 0.70 ± 0.11, 0.64 ± 0.08, 0.68 ± 0.07, and 0.62 ± 0.09, respectively, for these genes. Next, we constructed SVM, RF, kNN, and NB classifiers with AUCs of 0.84 ± 0.09, 0.81 ± 0.10, 0.73 ± 0.11, and 0.83 ± 0.09, respectively, in validation datasets, suggesting that the SVM classifier might be superior for constructing an MDD diagnostic model. The final SVM classifier including 70 feature genes was capable of distinguishing MDD samples from healthy controls and yielded an AUC of 0.78 in an independent dataset. Conclusion: This study provides new insights into potential biomarkers through meta-analysis of GEO data. Constructing different machine learning models based on these biomarkers could be a valuable approach for diagnosing MDD in clinical practice.
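
The AUC values reported above can be read as the probability that a randomly chosen patient receives a higher score than a randomly chosen control. The sketch below computes AUC from that pairwise definition, counting ties as half. The labels and scores are hypothetical, and the quadratic-time loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = MDD, 0 = control) and classifier scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```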
