Using Machine Learning for Remote Behaviour Classification—Verifying Acceleration Data to Infer Feeding Events in Free-Ranging Cheetahs

Lisa Giese; Jörg Melzheimer; Dirk Bockmühl; Bernd Wasiolka; Wanja Rast; Anne Berger; Bettina Wachter

doi:10.3390/s21165426

Sensors ◽

10.3390/s21165426 ◽

2021 ◽

Vol 21 (16) ◽

pp. 5426

Author(s):

Lisa Giese ◽

Jörg Melzheimer ◽

Dirk Bockmühl ◽

Bernd Wasiolka ◽

Wanja Rast ◽

...

Keyword(s):

Machine Learning ◽

Ensemble Learning ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Learning Approaches ◽

Direct Observations ◽

Acceleration Data ◽

Probability Threshold ◽

Free Ranging

Behavioural studies of elusive wildlife species are challenging but important when they are threatened and involved in human-wildlife conflicts. Accelerometers (ACCs) and supervised machine learning algorithms (MLAs) are valuable tools to remotely determine behaviours. Here we used five captive cheetahs in Namibia to test the applicability of ACC data in identifying six behaviours by using six MLAs on data we ground-truthed by direct observations. We included two ensemble learning approaches and a probability threshold to improve prediction accuracy. We then used the model to identify the behaviours in four free-ranging cheetah males. Feeding behaviours identified by the model and matched with corresponding GPS clusters were verified with previously identified kill sites in the field. The MLAs and the two ensemble learning approaches in the captive cheetahs achieved precision (recall) ranging from 80.1% to 100.0% (87.3% to 99.2%) for resting, walking and trotting/running behaviour, from 74.4% to 81.6% (54.8% to 82.4%) for feeding behaviour and from 0.0% to 97.1% (0.0% to 56.2%) for drinking and grooming behaviour. The model application to the ACC data of the free-ranging cheetahs successfully identified all nine kill sites and 17 of the 18 feeding events of the two brother groups. We demonstrated that our behavioural model reliably detects feeding events of free-ranging cheetahs. This has useful applications for determining cheetah kill sites and for helping to mitigate human-cheetah conflicts.
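
The abstract does not say which ensemble methods or which probability threshold were used, so the following is only a minimal sketch of the general idea: average the class probabilities of several classifiers (soft voting) and leave a sample unlabelled when the winning probability falls below a cut-off. The function names, the example probabilities and the threshold of 0.6 are hypothetical.

```python
from typing import Dict, List, Optional

# The six behaviours named in the abstract.
BEHAVIOURS = ["resting", "walking", "trotting/running",
              "feeding", "drinking", "grooming"]


def soft_vote(prob_rows: List[Dict[str, float]]) -> Dict[str, float]:
    """Average the class probabilities predicted by several classifiers."""
    n = len(prob_rows)
    return {b: sum(row.get(b, 0.0) for row in prob_rows) / n
            for b in BEHAVIOURS}


def classify(prob_rows: List[Dict[str, float]],
             threshold: float = 0.6) -> Optional[str]:
    """Return the most probable behaviour, or None if it is below the
    (hypothetical) probability threshold."""
    avg = soft_vote(prob_rows)
    best = max(avg, key=avg.get)
    return best if avg[best] >= threshold else None


# Made-up outputs of three classifiers for one ACC sample.
confident = [
    {"feeding": 0.8, "grooming": 0.2},
    {"feeding": 0.7, "resting": 0.3},
    {"feeding": 0.9, "drinking": 0.1},
]
# Made-up outputs of two classifiers that disagree.
ambiguous = [
    {"feeding": 0.4, "grooming": 0.35, "drinking": 0.25},
    {"feeding": 0.3, "grooming": 0.5, "drinking": 0.2},
]

print(classify(confident))
print(classify(ambiguous))
```

Rejecting low-confidence predictions trades coverage for precision: fewer samples receive a label, but the labelled ones are more likely to be correct.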

Attack and Anomaly Detection in IoT Networks Using Supervised Machine Learning Approaches

Revue d'Intelligence Artificielle ◽

10.18280/ria.350102 ◽

2021 ◽

Vol 35 (1) ◽

pp. 11-21

Author(s):

Himani Tyagi ◽

Rajendra Kumar

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Detection System ◽

Feature Reduction ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Testing Time ◽

Learning Approaches ◽

Reduction Techniques ◽

Share Data

IoT is characterized by communication between things (devices) that constantly share data, analyze it, and make decisions while connected to the internet. This interconnected architecture attracts cyber criminals, who expose IoT systems to failure. It is therefore imperative to develop a system that can accurately and automatically detect anomalies and attacks occurring in IoT networks. In this paper, an Intrusion Detection System (IDS) is developed, based on a novel feature set extracted by synthesizing the BoT-IoT dataset, that can swiftly, accurately and automatically differentiate benign and malicious traffic. Instead of using available feature reduction techniques like PCA, which can change the core meaning of variables, a unique feature set consisting of only seven lightweight features is developed that is also IoT specific and attack traffic independent. The results of the study demonstrate the effectiveness of the seven constructed features in detecting four kinds of attack, namely DDoS, DoS, Reconnaissance, and Information Theft. Furthermore, the study demonstrates the applicability and efficiency of supervised machine learning algorithms (KNN, LR, SVM, MLP, DT, RF) in IoT security. The performance of the proposed system is validated using performance metrics such as accuracy, precision, recall, F-score and ROC. Although the accuracy of the Decision Tree (99.9%) and Random Forest (99.9%) classifiers is the same, other metrics such as training and testing time show Random Forest to be comparatively better.

Ensemble Learning Approaches Based on Covariance Pooling of CNN Features for High Resolution Remote Sensing Scene Classification

Remote Sensing ◽

10.3390/rs12203292 ◽

2020 ◽

Vol 12 (20) ◽

pp. 3292

Author(s):

Sara Akodad ◽

Lionel Bombrun ◽

Junshi Xia ◽

Yannick Berthoumieu ◽

Christian Germain

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Ensemble Learning ◽

Covariance Matrices ◽

Machine Learning Algorithms ◽

Learning Approach ◽

Learning Approaches ◽

Scene Classification ◽

First Order ◽

Fisher Vector Encoding

Remote sensing image scene classification, which consists of labeling remote sensing images with a set of categories based on their content, has received remarkable attention for many applications such as land use mapping. Standard approaches are based on the multi-layer representation of first-order convolutional neural network (CNN) features. However, second-order CNNs have recently been shown to outperform traditional first-order CNNs for many computer vision tasks. Hence, the aim of this paper is to show the use of second-order statistics of CNN features for remote sensing scene classification. This takes the form of covariance matrices computed locally or globally on the output of a CNN. However, these data points do not lie in a Euclidean space but on a Riemannian manifold. Euclidean tools are not suited to manipulating them, so other metrics should be considered, such as the log-Euclidean one. This consists of projecting the set of covariance matrices onto a tangent space defined at a reference point. In this tangent plane, which is a vector space, conventional machine learning algorithms can be considered, such as the Fisher vector encoding or an SVM classifier. Based on this log-Euclidean framework, we propose a novel transfer learning approach composed of two hybrid architectures based on covariance pooling of CNN features; the first is local and the second is global. They rely on the extraction of features from models pre-trained on the ImageNet dataset, processed with some machine learning algorithms. The first hybrid architecture consists of an ensemble learning approach with the log-Euclidean Fisher vector encoding of region covariance matrices computed locally on the first layers of a CNN. The second one concerns an ensemble learning approach based on the covariance pooling of CNN features extracted globally from the deepest layers. These two ensemble learning approaches are then combined based on the strategy of the most diverse ensembles. For validation and comparison purposes, the proposed approach is tested on various challenging remote sensing datasets. Experimental results exhibit a significant gain of approximately 2% in overall accuracy for the proposed approach compared to a similar state-of-the-art method based on covariance pooling of CNN features (on the UC Merced dataset).
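
For readers unfamiliar with the log-Euclidean framework mentioned above, the standard definitions can be sketched as follows. The symbols are introduced here for illustration and are not taken from the paper, and the choice of the identity matrix as reference point is an assumption (one common choice), not necessarily the one the authors make.

```latex
% A symmetric positive-definite covariance matrix C with
% eigendecomposition C = U diag(lambda_1, ..., lambda_d) U^T
% has the matrix logarithm
\log C = U \,\operatorname{diag}(\log\lambda_1, \dots, \log\lambda_d)\, U^{\top}

% Log-Euclidean distance between two covariance matrices:
d_{LE}(C_1, C_2) = \bigl\lVert \log C_1 - \log C_2 \bigr\rVert_F

% With the identity as reference point, the tangent-space
% representation is the vectorised logarithm, an ordinary vector
% that Euclidean tools such as an SVM can consume:
v(C) = \operatorname{vec}(\log C)
```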

Using tri-axial accelerometer loggers to identify spawning behaviours of large pelagic fish

Movement Ecology ◽

10.1186/s40462-021-00248-8 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Thomas M. Clarke ◽

Sasha K. Whitmarsh ◽

Jenna L. Hounslow ◽

Adrian C. Gleiss ◽

Nicholas L. Payne ◽

...

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Body Position ◽

High Amplitude ◽

Pelagic Fish ◽

Supervised Machine Learning ◽

Accelerometer Data ◽

Direct Observations ◽

Novel Approach ◽

Free Ranging

Background: Tri-axial accelerometers have been used to remotely describe and identify in situ behaviours of a range of animals without requiring direct observations. Datasets collected from these accelerometers (i.e. acceleration, body position) are often large, requiring development of semi-automated analyses to classify behaviours. Marine fishes exhibit many “burst” behaviours with high amplitude accelerations that are difficult to interpret and differentiate. This has constrained the development of accurate automated techniques to identify different “burst” behaviours occurring naturally, where direct observations are not possible. Methods: We trained a random forest machine learning algorithm based on 624 h of accelerometer data from six captive yellowtail kingfish during spawning periods. We identified five distinct behaviours (swim, feed, chafe, escape, and courtship), which were used to train the model based on 58 predictive variables. Results: Overall accuracy of the model was 94%. Classification performance varied between behavioural classes; F1 scores ranged from 0.48 (chafe) to 0.99 (swim). The model was subsequently applied to accelerometer data from eight free-ranging kingfish, and all behaviour classes described from captive fish were predicted by the model to occur, including 19 events of courtship behaviours ranging from 3 s to 108 min in duration. Conclusion: Our findings provide a novel approach of applying a supervised machine learning model on free-ranging animals, which has previously been predominantly constrained to direct observations of behaviours and not predicted from an unseen dataset. Additionally, our findings identify typically ambiguous spawning and courtship behaviours of a large pelagic fish as they naturally occur.
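
The per-class F1 scores reported above combine precision and recall for each behaviour. The following is a minimal sketch of that calculation on made-up labels; the function name and the toy data are hypothetical and do not reproduce the study's results.

```python
from typing import Dict, List


def f1_per_class(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Compute the F1 score of every class from true and predicted labels."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores[c] = 2 * precision * recall / total if total else 0.0
    return scores


# Made-up labels: one "chafe" sample is misclassified as "swim".
y_true = ["swim", "swim", "swim", "chafe", "chafe", "feed"]
y_pred = ["swim", "swim", "swim", "swim", "chafe", "feed"]

for behaviour, f1 in f1_per_class(y_true, y_pred).items():
    print(f"{behaviour}: F1 = {f1:.2f}")
```

A rare class can score poorly on F1 even when overall accuracy is high, which is why both figures are worth reporting.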

Ensemble learning predicts multiple sclerosis disease course in the SUMMIT study

npj Digital Medicine ◽

10.1038/s41746-020-00338-8 ◽

2020 ◽

Vol 3 (1) ◽

Author(s):

Yijun Zhao ◽

Tong Wang ◽

Riley Bove ◽

Bruce Cree ◽

...

Keyword(s):

Machine Learning ◽

Multiple Sclerosis ◽

San Francisco ◽

Ensemble Learning ◽

Disease Onset ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Learning Approaches ◽

Disease Course ◽

Learning Techniques

The rate of disability accumulation varies across multiple sclerosis (MS) patients. Machine learning techniques may offer more powerful means to predict disease course in MS patients. In our study, 724 patients from the Comprehensive Longitudinal Investigation in MS at Brigham and Women’s Hospital (CLIMB study) and 400 patients from the EPIC dataset, University of California, San Francisco, were included in the analysis. The primary outcome was an increase in Expanded Disability Status Scale (EDSS) ≥ 1.5 (worsening) or not (non-worsening) at up to 5 years after the baseline visit. Classification models were built using the CLIMB dataset with patients’ clinical and MRI longitudinal observations in the first 2 years, and further validated using the EPIC dataset. We compared the performance of three popular machine learning algorithms (SVM, Logistic Regression, and Random Forest) and three ensemble learning approaches (XGBoost, LightGBM, and a Meta-learner L). A “threshold” was established to trade off the performance between the two classes. Predictive features were identified and compared among different models. Machine learning models achieved 0.79 and 0.83 AUC scores for the CLIMB and EPIC datasets, respectively, shortly after disease onset. Ensemble learning methods were more effective and robust compared to standalone algorithms. Two ensemble models, XGBoost and LightGBM, were superior to the other four models evaluated in our study. Of the variables evaluated, EDSS, Pyramidal Function, and Ambulatory Index were the top common predictors in forecasting the MS disease course. Machine learning techniques, in particular ensemble methods, offer increased accuracy for the prediction of MS disease course.
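
The abstract does not state how the “threshold” was chosen, so the following is only a sketch of the general trade-off it describes: moving the decision threshold on a predicted probability shifts performance between the two classes. The function name, labels and scores below are made up for illustration.

```python
from typing import List, Tuple


def sensitivity_specificity(y_true: List[int], scores: List[float],
                            threshold: float) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when a score at or above the
    threshold is predicted as class 1 (here: worsening)."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


# Made-up outcomes (1 = worsening, 0 = non-worsening) and model scores.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

for threshold in (0.35, 0.5, 0.8):
    sens, spec = sensitivity_specificity(y_true, scores, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, "
          f"specificity={spec:.2f}")
```

Lowering the threshold catches more worsening patients at the cost of more false alarms; raising it does the opposite.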

Gene function finding through cross-organism ensemble learning

BioData Mining ◽

10.1186/s13040-021-00239-w ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Gianluca Moro ◽

Marco Masseroli

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Ensemble Learning ◽

Gene Function ◽

Web Application ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Biological Information ◽

Ensemble Prediction ◽

Learning Method

Background: Structured biological information about genes and proteins is a valuable resource to improve discovery and understanding of complex biological processes via machine learning algorithms. Gene Ontology (GO) controlled annotations describe, in a structured form, features and functions of genes and proteins of many organisms. However, such valuable annotations are not always reliable and sometimes are incomplete, especially for rarely studied organisms. Here, we present GeFF (Gene Function Finder), a novel cross-organism ensemble learning method able to reliably predict new GO annotations of a target organism from GO annotations of another source organism that is evolutionarily related and better studied. Results: Using a supervised method, GeFF predicts unknown annotations from random perturbations of existing annotations. The perturbation consists of randomly deleting a fraction of known annotations in order to produce a reduced annotation set. The key idea is to train a supervised machine learning algorithm with the reduced annotation set to predict, namely to rebuild, the original annotations. The resulting prediction model, in addition to accurately rebuilding the original known annotations for an organism from their perturbed version, also effectively predicts new unknown annotations for the organism. Moreover, the prediction model is also able to discover new unknown annotations in different target organisms without retraining. We combined our novel method with different ensemble learning approaches and compared them to each other and to an equivalent single model technique. We tested the method with five different organisms using their GO annotations: Homo sapiens, Mus musculus, Bos taurus, Gallus gallus and Dictyostelium discoideum. The outcomes demonstrate the effectiveness of the cross-organism ensemble approach, which can be customized with a trade-off between the desired number of predicted new annotations and their precision. A Web application to browse both the input annotations used and the predicted ones, choosing the ensemble prediction method to use, is publicly available at http://tiny.cc/geff/. Conclusions: Our novel cross-organism ensemble learning method provides reliable predicted novel gene annotations, i.e., functions, ranked according to an associated likelihood value. They are very valuable both to speed up annotation curation, focusing it on the prioritized new annotations predicted, and to complement the known annotations available.
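
The perturbation step described above can be sketched as follows. This is only an illustration of randomly deleting a fraction of known annotations to obtain a reduced set paired with the original; it is not the GeFF implementation, and the function name, the gene and term identifiers, the 30% fraction and the fixed seed are all hypothetical.

```python
import random
from typing import Set, Tuple

Annotation = Tuple[str, str]  # (gene identifier, annotation term)


def perturb(annotations: Set[Annotation], fraction: float,
            rng: random.Random) -> Set[Annotation]:
    """Return a reduced annotation set with a random fraction deleted."""
    k = round(len(annotations) * fraction)
    # Sort first so the sample is reproducible for a given seed.
    removed = set(rng.sample(sorted(annotations), k))
    return annotations - removed


# Placeholder annotations: 5 genes x 2 terms (not real GO identifiers).
original = {(f"gene_{g}", f"term_{t}") for g in range(1, 6) for t in (1, 2)}

rng = random.Random(0)
reduced = perturb(original, fraction=0.3, rng=rng)

# A supervised model would take `reduced` as input and be trained to
# rebuild `original`; annotations it predicts beyond `original` are
# candidate new annotations.
print(len(original), len(reduced))
```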

Sentiment Classification of Movie Reviews by Supervised Machine Learning Approaches Using Ensemble Learning & Voted Algorithm

2nd International Conference on Data, Engineering and Applications (IDEA) ◽

10.1109/idea49133.2020.9170684 ◽

2020 ◽

Author(s):

Rubeena Parveen ◽

Neelesh Shrivastava ◽

Pradeep Tripathi

Keyword(s):

Machine Learning ◽

Ensemble Learning ◽

Sentiment Classification ◽

Supervised Machine Learning ◽

Learning Approaches

Exploring the Use of Machine Learning to Automate the Qualitative Coding of Church-related Tweets

Fieldwork in Religion ◽

10.1558/firn.40610 ◽

2020 ◽

Vol 14 (2) ◽

pp. 140-159

Author(s):

Anthony-Paul Cooper ◽

Emmanuel Awuni Kolog ◽

Erkki Sutinen

Keyword(s):

Machine Learning ◽

Online Community ◽

High Volume ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Social Media Data ◽

Twitter Data ◽

Resource Intensity ◽

Media Data ◽

Better Than

This article builds on previous research around the exploration of the content of church-related tweets. It does so by exploring whether the qualitative thematic coding of such tweets can, in part, be automated by the use of machine learning. It compares three supervised machine learning algorithms to understand how useful each algorithm is at a classification task, based on a dataset of human-coded church-related tweets. The study finds that one such algorithm, Naïve-Bayes, performs better than the other algorithms considered, returning Precision, Recall and F-measure values which each exceed an acceptable threshold of 70%. This has far-reaching consequences at a time when the high volume of social media data, in this case Twitter data, means that the resource-intensity of manual coding approaches can act as a barrier to understanding how the online community interacts with, and talks about, church. The findings presented in this article offer a way forward for scholars of digital theology to better understand the content of online church discourse.

Application of Supervised Machine Learning Algorithms for Lithofacies Classification.

10.2523/19349-ms ◽

2019 ◽

Author(s):

Subhadeep Sarkar ◽

Chandan Majumdar

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Lithofacies Classification

Mol2vec: Unsupervised Machine Learning Approach with Chemical Intuition

10.26434/chemrxiv.5513581.v1 ◽

2017 ◽

Author(s):

Sabrina Jaeger ◽

Simone Fulle ◽

Samo Turk

Keyword(s):

Machine Learning ◽

Language Processing ◽

Supervised Machine Learning ◽

Learning Approach ◽

Learning Approaches ◽

Unsupervised Machine Learning ◽

Feature Representations ◽

Machine Learning Approach ◽

The Individual ◽

Vector Representations

Inspired by natural language processing techniques, we here introduce Mol2vec, an unsupervised machine learning approach to learn vector representations of molecular substructures. Similarly to the Word2vec models, where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can finally be encoded as vectors by summing up the vectors of the individual substructures and, for instance, fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pre-trained once, yields dense vector representations and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as the reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment independent and can thus also be easily used for proteins with low sequence similarities.
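
The encoding step described above, summing substructure vectors to obtain a compound vector, can be sketched as follows. This does not use the Mol2vec library; the substructure identifiers and the three-dimensional embedding values are made up, and a real model would supply learned vectors of much higher dimension.

```python
from typing import Dict, List


def compound_vector(substructures: List[str],
                    embeddings: Dict[str, List[float]]) -> List[float]:
    """Encode a compound as the sum of its substructure vectors."""
    dim = len(next(iter(embeddings.values())))
    vec = [0.0] * dim
    for sub in substructures:
        # Every occurrence contributes, so repeated substructures add up.
        for i, value in enumerate(embeddings[sub]):
            vec[i] += value
    return vec


# Hypothetical pre-trained embeddings keyed by substructure identifier.
embeddings = {
    "sub_a": [1.0, 0.0, 2.0],
    "sub_b": [0.5, 1.0, -1.0],
}

# A compound containing sub_a twice and sub_b once.
vector = compound_vector(["sub_a", "sub_b", "sub_a"], embeddings)
print(vector)
```

The resulting fixed-length vector can then be passed as a feature row to any supervised model.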

A Deep Analysis and Efficient Implementation of Supervised Machine Learning Algorithms for Enhancing The Classification Ability of System

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i3.10941101 ◽

2019 ◽

Vol 7 (3) ◽

pp. 1094-1101

Author(s):

Sandeep Kumar Verma ◽

Turendar Sahu ◽

Manjit Jaiswal

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Efficient Implementation ◽

Machine Learning Algorithms ◽

Supervised Machine Learning
