Improved Classification Models to Distinguish Natural from Anthropic Oil Slicks in the Gulf of Mexico: Seasonality and Radarsat-2 Beam Mode Effects under a Machine Learning Approach

Ítalo de Oliveira Matias; Patrícia Carneiro Genovez; Sarah Barrón Torres; Francisco Fábio de Araújo Ponte; Anderson José Silva de Oliveira; Fernando Pellon de Miranda; Gil Márcio Avellino

doi:10.3390/rs13224568

Improved Classification Models to Distinguish Natural from Anthropic Oil Slicks in the Gulf of Mexico: Seasonality and Radarsat-2 Beam Mode Effects under a Machine Learning Approach

Remote Sensing ◽

10.3390/rs13224568 ◽

2021 ◽

Vol 13 (22) ◽

pp. 4568

Author(s):

Ítalo de Oliveira Matias ◽

Patrícia Carneiro Genovez ◽

Sarah Barrón Torres ◽

Francisco Fábio de Araújo Ponte ◽

Anderson José Silva de Oliveira ◽

...

Keyword(s):

Machine Learning ◽

Gulf Of Mexico ◽

Classification Model ◽

Synthetic Aperture ◽

Classification Models ◽

Linear Discriminant ◽

Mode Effects ◽

Oil Slick ◽

Machine Learning Approach ◽

Beam Mode

Distinguishing between natural and anthropic oil slicks is a challenging task, especially in the Gulf of Mexico, where both seeps and spills can be observed simultaneously. In this study, machine learning (ML) methods were employed to develop, test, and implement a classification model (CM) that identifies an oil slick source (OSS) as natural or anthropic. A database containing 4916 validated oil samples, detected using synthetic aperture radar (SAR), was employed for this task. Six ML algorithms were evaluated: artificial neural networks (ANN), random forest (RF), decision trees (DT), naive Bayes (NB), linear discriminant analysis (LDA), and logistic regression (LR). Using RF, the global CM achieved a maximum accuracy value of 73.15. The study also evaluated how external factors, such as seasonality, satellite configurations, and the synergy between them, limit or improve OSS predictions. To accomplish this, specific classification models (SCMs) were derived from the global ones (CMs), tuning the best algorithms and parameters according to different scenarios. Median accuracies revealed winter and spring to be the best seasons and ScanSAR Narrow B (SCNB) to be the best beam mode. The maximum median accuracy in distinguishing seeps from spills was achieved in winter using SCNB (83.05). Among the tested algorithms, RF was the most robust, performing best in 81% of the investigated scenarios. The accuracy gain provided by the well-fitted models may reduce the confusion between seeps and spills, which represents a concrete contribution to reducing economic and geologic risks derived from exploration activities in offshore areas. Additionally, from an operational standpoint, specific models help specialists select the best SAR products and seasons for new acquisitions and optimize performance according to the available data.
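
The scenario comparison described in this abstract (median accuracy per season and beam mode) can be sketched roughly as follows. This is only an illustration under stated assumptions: the accuracy values, the "other" beam-mode label, and the function names are hypothetical and are not taken from the paper.

```python
from statistics import median

# Hypothetical per-run accuracies (%), keyed by (season, beam mode).
# These numbers are invented for illustration; they are NOT results from the paper.
runs = {
    ("winter", "SCNB"): [80.0, 84.0, 82.0],
    ("winter", "other"): [70.0, 75.0, 72.0],
    ("summer", "SCNB"): [68.0, 71.0, 66.0],
    ("summer", "other"): [65.0, 69.0, 63.0],
}


def median_by_scenario(results):
    """Return {scenario: median accuracy} for each scenario's list of runs."""
    return {scenario: median(accs) for scenario, accs in results.items()}


def best_scenario(results):
    """Return the (scenario, median accuracy) pair with the highest median."""
    medians = median_by_scenario(results)
    return max(medians.items(), key=lambda item: item[1])


scenario, acc = best_scenario(runs)
print(scenario, acc)
```

Using the median rather than the mean keeps a single unusually good or bad run from dominating the ranking of scenarios.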

Machine learning to distinguish natural and anthropic oil slicks: classification model and the Radarsat-2 beam mode effects

Rio Oil and Gas Expo and Conference ◽

10.48072/2525-7579.rog.2020.458 ◽

2020 ◽

Vol 20 (2020) ◽

pp. 458-459

Author(s):

Sarah Barrón Torres ◽

Italo de Oliveira Matias ◽

Gustavo Robichez ◽

Gil Marcio Avelino Silva ◽

Fernando Pellon De Miranda ◽

...

Keyword(s):

Machine Learning ◽

Classification Model ◽

Mode Effects ◽

Beam Mode

MicroRNA Profiling as a Methodology to Diagnose Ménière’s Disease: Potential Application of Machine Learning

Otolaryngology ◽

10.1177/0194599820940649 ◽

2020 ◽

pp. 019459982094064

Author(s):

Matthew Shew ◽

Helena Wichova ◽

Andres Bur ◽

Devin C. Koestler ◽

Madeleine St Peter ◽

...

Keyword(s):

Machine Learning ◽

Hearing Loss ◽

Ménière's Disease ◽

Classification Model ◽

Classification Models

Objective: Diagnosis and treatment of Ménière’s disease remain a significant challenge because of our inability to understand what is occurring on a molecular level. MicroRNA (miRNA) perilymph profiling is a safe methodology and may serve as a “liquid biopsy” equivalent. We used machine learning (ML) to evaluate miRNA expression profiles of various inner ear pathologies to predict diagnosis of Ménière’s disease. Study Design: Prospective cohort study. Setting: Tertiary academic hospital. Subjects and Methods: Perilymph was collected during labyrinthectomy (Ménière’s disease, n = 5), stapedotomy (otosclerosis, n = 5), and cochlear implantation (sensorineural hearing loss [SNHL], n = 9). miRNA was isolated and analyzed with the Affymetrix miRNA 4.0 array. Various ML classification models were evaluated with an 80/20 train/test split and cross-validation. Permutation feature importance was computed to identify the miRNAs that were critical to the classification models. Results: For miRNA profiles of conductive hearing loss versus Ménière’s, 4 models were able to differentiate and identify the 2 disease classes with 100% accuracy. The top-performing models used the same miRNAs in their decision classification model but with different weighted values. All candidate models for SNHL versus Ménière’s performed significantly worse, with the best models achieving 66% accuracy. Ménière’s models showed unique features distinct from SNHL. Conclusions: We can use ML to build Ménière’s-specific prediction models using miRNA profile alone. However, ML models were less accurate in distinguishing SNHL from Ménière’s, likely because of overlap of miRNA biomarkers. The power of this technique is that it identifies biomarkers without knowledge of the pathophysiology, potentially leading to identification of novel biomarkers and diagnostic tests.
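
Permutation feature importance, mentioned in this abstract, can be illustrated with a minimal standard-library sketch: shuffle one feature column at a time and record how much accuracy drops. The toy model, the two-feature data, and all names below are assumptions made for illustration; they do not reproduce the study's models or miRNA data.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction equals the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the data with only column j replaced by its shuffled copy.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical classifier that looks only at feature 0."""
    return int(row[0] > 0.5)


# Invented toy data: feature 0 determines the label, feature 1 is noise.
X = [[0.1, 0.9], [0.2, 0.1], [0.8, 0.8], [0.9, 0.2], [0.3, 0.5], [0.7, 0.4]]
y = [0, 0, 1, 1, 0, 1]

importances = permutation_importance(toy_model, X, y)
print(importances)
```

A feature the model ignores gets an importance of exactly zero, because shuffling it cannot change any prediction.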

Medium‐Term Forecasting of Loop Current Eddy Cameron and Eddy Darwin Formation in the Gulf of Mexico With a Divide‐and‐Conquer Machine Learning Approach

Journal of Geophysical Research Oceans ◽

10.1029/2019jc015172 ◽

2019 ◽

Vol 124 (8) ◽

pp. 5586-5606

Author(s):

Justin L. Wang ◽

Hanqi Zhuang ◽

Laurent M. Chérubin ◽

Ali K. Ibrahim ◽

Ali Muhamed Ali

Keyword(s):

Machine Learning ◽

Gulf Of Mexico ◽

Divide And Conquer ◽

Learning Approach ◽

Loop Current ◽

Medium Term ◽

Machine Learning Approach

Predicting pulmonary function from the analysis of voice: a machine learning approach

10.1101/2021.05.11.21256997 ◽

2021 ◽

Author(s):

Md. Zahangir Alam ◽

Albino Simonetti ◽

Rafaelle Billantino ◽

Nick Tayler ◽

Chris Grainge ◽

...

Keyword(s):

Machine Learning ◽

Lung Function ◽

Predictive Models ◽

Binary Classification ◽

Support Vector ◽

Learning Approach ◽

Classification Models ◽

Special Equipment ◽

Self Monitoring ◽

Machine Learning Approach

Self-monitoring can play a vital role in disease control by supporting proper and timely treatment of asthma. Existing methods (such as peak flow meters and smart spirometers) require special equipment and are not always used by patients. Voice recordings used as surrogate measures of lung function could help assess asthma, offer good potential for self-monitoring, and could be integrated into telehealth platforms. This study aims to apply a machine learning approach to predict lung function from recorded voice for asthma patients. A threshold-based mechanism was designed to separate speech and breathing from recordings (323 recordings from 26 participants), and features extracted from these were combined with biological attributes and lung function (percentage predicted forced expiratory volume in 1 second, FEV1%). Three types of predictive models were developed: (a) regression models to predict lung function, (b) multi-class classification models to predict the severity, and (c) binary classification models to predict abnormality. Random Forest (RF), Support Vector Machine (SVM), and Linear Regression (LR) algorithms were implemented to develop these predictive models. Training and test samples were separated (70%:30% using balanced partitioning). Features were normalised, and 10-fold cross-validation was used to measure the models' performance on the training samples. Models were then run on the test samples to measure final performance. The RF-based regression model performed best, with the lowest root mean square error (10.86) and a mean absolute score of 11.47. In predicting the severity of lung function, the SVM-based model performed best, with 73.20% accuracy. The RF-based model performed best among the binary classification models for predicting abnormality of lung function (accuracy = 0.85, F1-score = 0.84, and area under the receiver operating characteristic curve = 0.88).
The proposed machine learning approach can predict lung function (in terms of FEV1%) from recorded voice files better than other published approaches. These models can be extended to predict both the severity and abnormality of lung function with reasonable accuracy. This technique could be used to develop future telehealth solutions, including smartphone-based applications, which have the potential to aid decision making and self-monitoring in asthma.
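
For readers unfamiliar with the regression metrics named in this abstract, the following is a minimal sketch of root mean square error (RMSE) and mean absolute error (MAE). The FEV1% values are invented for illustration, and treating the abstract's "mean absolute score" as MAE is an assumption.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical FEV1% values; NOT data from the study.
fev1_true = [80.0, 65.0, 90.0, 70.0]
fev1_pred = [75.0, 70.0, 95.0, 55.0]

print(rmse(fev1_true, fev1_pred), mae(fev1_true, fev1_pred))
```

RMSE penalises large individual errors more heavily than MAE does, and on the same set of predictions MAE never exceeds RMSE.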

A Secure Data Classification Model in Cloud Computing Using Machine Learning Approach

International Journal of Grid and Distributed Computing ◽

10.14257/ijgdc.2016.9.8.02 ◽

2016 ◽

Vol 9 (8) ◽

pp. 13-22

Author(s):

Kulwinder Kaur ◽

Vikas Zandu

Keyword(s):

Machine Learning ◽

Cloud Computing ◽

Data Classification ◽

Classification Model ◽

Learning Approach ◽

Machine Learning Approach ◽

Secure Data

A Machine Learning Approach to Flood Depth and Extent Detection Using Sentinel 1A/B Synthetic Aperture Radar

10.1109/igarss47720.2021.9553601 ◽

2021 ◽

Author(s):

K. Tiampo ◽

C. Woods ◽

L. Huang ◽

P. Sharma ◽

Z. Chen ◽

...

Keyword(s):

Machine Learning ◽

Synthetic Aperture Radar ◽

Synthetic Aperture ◽

Learning Approach ◽

Flood Depth ◽

Machine Learning Approach ◽

Aperture Radar

Machine Learning Approach to Dysphonia Detection

Applied Sciences ◽

10.3390/app8101927 ◽

2018 ◽

Vol 8 (10) ◽

pp. 1927 ◽

Cited By ~ 1

Author(s):

Zuzana Dankovičová ◽

Dávid Sovák ◽

Peter Drotár ◽

Liberios Vokorokos

Keyword(s):

Machine Learning ◽

State Of The Art ◽

Nearest Neighbors ◽

Classification Model ◽

Support Vector ◽

Learning Approach ◽

K Nearest Neighbors ◽

Machine Learning Methods ◽

Machine Learning Approach ◽

Speech Features

This paper addresses the processing of speech data and their use in a decision support system. The main aim of this work is to use machine learning methods to recognize pathological speech, particularly dysphonia. We extracted 1560 speech features and used these to train the classification model. Three state-of-the-art methods were used as classifiers: K-nearest neighbors, random forests, and support vector machine. We analyzed the performance of the classifiers with and without gender taken into account. The experimental results showed that it is possible to recognize pathological speech with a classification accuracy as high as 91.3%.

Oil-Slick Category Discrimination (Seeps vs. Spills): A Linear Discriminant Analysis Using RADARSAT-2 Backscatter Coefficients (σ°, β°, and γ°) in Campeche Bay (Gulf of Mexico)

Remote Sensing ◽

10.3390/rs11141652 ◽

2019 ◽

Vol 11 (14) ◽

pp. 1652 ◽

Cited By ~ 1

Author(s):

Gustavo de Araújo Carvalho ◽

Peter J. Minnett ◽

Eduardo T. Paes ◽

Fernando P. de Miranda ◽

Luiz Landau

Keyword(s):

Discriminant Analysis ◽

Gulf Of Mexico ◽

Linear Discriminant Analysis ◽

Petroleum Industry ◽

Cube Root ◽

Data Transformations ◽

Linear Discriminant ◽

Oil Slick ◽

Category Discrimination ◽

The Impact

A novel empirical approach to categorize oil slicks’ sea surface expressions in synthetic aperture radar (SAR) measurements into oil seeps or oil spills is investigated, contributing both to academic remote sensing research and to practical applications for the petroleum industry. We use linear discriminant analysis (LDA) to seek accuracy improvements over our previously published methods of discriminating seeps from spills, which achieved ~70% overall accuracy. Analyzing 244 RADARSAT-2 scenes containing 4562 slicks observed in Campeche Bay (Gulf of Mexico), our exploratory data analysis evaluates the impact of 61 combinations of SAR backscatter coefficients (σ°, β°, γ°), SAR calibrated products (received radar beam given in amplitude or decibel, with or without a despeckle filter), and data transformations (none, cube root, log10). The ability of LDA to discriminate the oil-slick category is largely independent of backscatter coefficients and calibrated products, but is influenced by data transformations. The combination of attributes plays a role in the discrimination; combining oil-slicks’ size and SAR information is more effective. We have simplified our analyses, using fewer attributes to reach accuracies comparable to those of our earlier studies, and we suggest using other multivariate data analyses (cubist or random forest) to attempt to further improve oil-slick category discrimination.
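
The three data transformations listed in this abstract (none, cube root, log10) can be sketched as simple element-wise functions. The function names, the dictionary layout, and the sample values are assumptions for illustration; the paper's actual preprocessing pipeline is not described here.

```python
import math


def cube_root(x):
    """Real cube root that also handles negative inputs (sign is preserved)."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def log10_transform(x):
    """Base-10 logarithm; only defined for strictly positive values."""
    if x <= 0:
        raise ValueError("log10 requires a strictly positive value")
    return math.log10(x)


# Hypothetical lookup table mirroring the abstract's three options.
TRANSFORMS = {
    "none": lambda x: x,
    "cube root": cube_root,
    "log10": log10_transform,
}


def apply_transform(values, name):
    """Apply the named transformation to every value in a list."""
    transform = TRANSFORMS[name]
    return [transform(v) for v in values]


# Invented attribute values (for example slick areas); not data from the study.
areas = [1.0, 8.0, 1000.0]
for name in TRANSFORMS:
    print(name, apply_transform(areas, name))
```

Both the cube root and log10 compress large values, which tends to reduce the skew of strongly right-skewed attributes before a linear method such as LDA is applied.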

Ensemble machine learning approach for screening of coronary heart disease based on echocardiography and risk factors

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01535-5 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Jingyi Zhang ◽

Huolan Zhu ◽

Yongkai Chen ◽

Chenguang Yang ◽

Huimin Cheng ◽

...

Keyword(s):

Machine Learning ◽

Coronary Heart Disease ◽

Heart Disease ◽

Speckle Tracking ◽

Speckle Tracking Echocardiography ◽

Screening Tools ◽

Learning Approach ◽

Classification Models ◽

Ensemble Machine Learning ◽

Machine Learning Approach

Background: Extensive clinical evidence suggests that preventive screening of coronary heart disease (CHD) at an earlier stage can greatly reduce the mortality rate. We use 64 two-dimensional speckle tracking echocardiography (2D-STE) features and seven clinical features to predict whether one has CHD. Methods: We develop a machine learning approach that integrates a number of popular classification methods by model stacking, and we generalize the traditional stacking method to a two-step stacking method to improve the diagnostic performance. Results: By borrowing strengths from multiple classification models through the proposed method, we improve the CHD classification accuracy from around 70% to 87.7% on the testing set. The sensitivity of the proposed method is 0.903 and the specificity is 0.843, with an AUC of 0.904, which is significantly higher than those of the individual classification models. Conclusion: Our work lays a foundation for the deployment of speckle tracking echocardiography-based screening tools for coronary heart disease.
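
The sensitivity and specificity figures quoted in this abstract follow from the counts of a binary confusion matrix. The sketch below shows that calculation on invented labels (1 = CHD, 0 = no CHD); the data and function names are hypothetical and unrelated to the study's test set.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, tn, fp) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """True positive rate: share of actual positives that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of actual negatives correctly ruled out."""
    return tn / (tn + fp)


# Invented labels for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
print(sensitivity(tp, fn), specificity(tn, fp))
```

For a screening tool, sensitivity indicates how many true cases are caught, while specificity indicates how many healthy subjects are spared a false alarm.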

EEG correlation at a distance: A re-analysis of two studies using a machine learning approach

F1000Research ◽

10.12688/f1000research.17613.2 ◽

2019 ◽

Vol 8 ◽

pp. 43 ◽

Cited By ~ 2

Author(s):

Marco Bilucaglia ◽

Luciano Pederzoli ◽

William Giroldini ◽

Elena Prati ◽

Patrizio Tressoldi

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Eeg Activity ◽

Linear Discriminant ◽

Machine Learning Approach ◽

Sensorial Stimulation ◽

The Relationship ◽

Electroencephalogram Eeg ◽

Linear Discriminant Classifier

Background: In this paper, data from two studies on the relationship between the electroencephalogram (EEG) activities of two isolated and physically separated subjects were re-analyzed using machine-learning algorithms. The first dataset comprises the data of 25 pairs of participants where one member of each pair was stimulated with visual and auditory 500 Hz signals of 1 second duration. The second dataset consists of the data of 20 pairs of participants where one member of each pair received visual and auditory stimulation of 1 second duration with on-off modulation at 10, 12, and 14 Hz. Methods and Results: Applying a ‘linear discriminant classifier’ to the first dataset, it was possible to correctly classify 50.74% of the EEG activity of non-stimulated participants, correlated to the remote sensorial stimulation of the distant partner. In the second dataset, the percentage of correctly classified EEG activity in the non-stimulated partners was 51.17%, 50.45% and 51.91%, respectively, for the 10, 12, and 14 Hz stimulations, with respect to the condition of no stimulation in the distant partner. Conclusions: The analysis of EEG activity using machine-learning algorithms has produced advances in the study of the connection between the EEG activities of the stimulated partner and the isolated distant partner, opening new insight into the possibility of devising practical applications for non-conventional “mental telecommunications” between physically and sensorially separated participants.
