Quantifying cell densities and biovolumes of phytoplankton communities and functional groups using scanning flow cytometry, machine learning and unsupervised clustering

Mapping Intimacies ◽

10.1101/274357 ◽

2018 ◽

Author(s):

Mridul K. Thomas ◽

Simone Fontana ◽

Marta Reyes ◽

Francesco Pomati

Keyword(s):

Machine Learning ◽

Flow Cytometry ◽

Functional Groups ◽

Clustering Algorithm ◽

Unsupervised Clustering ◽

Learning Tools ◽

Time Resolved ◽

Automated Method ◽

Phytoplankton Communities ◽

Cell Densities

Abstract: Scanning flow cytometry (SFCM) is characterized by the measurement of time-resolved pulses of fluorescence and scattering, enabling the high-throughput quantification of phytoplankton morphology and pigmentation. Quantifying variation at the single cell and colony level improves our ability to understand dynamics in natural communities. Automated high-frequency monitoring of these communities is presently limited by the absence of repeatable, rapid protocols to analyse SFCM datasets, where images of individual particles are not available. Here we demonstrate a repeatable, semi-automated method to (1) rapidly clean SFCM data from a phytoplankton community by removing signals that do not belong to live phytoplankton cells, (2) classify individual cells into trait clusters that correspond to functional groups, and (3) quantify the biovolumes of individual cells, the total biovolume of the whole community and the total biovolumes of the major functional groups. Our method involves the development of training datasets using lab cultures, the use of an unsupervised clustering algorithm to identify trait clusters, and machine learning tools (random forests) to (1) evaluate variable importance, (2) classify data points, and (3) estimate biovolumes of individual cells. We provide example datasets and R code for our analytical approach that can be adapted for analysis of datasets from other flow cytometers or scanning flow cytometers.
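Once every cleaned particle carries a cluster label and a biovolume estimate, the final quantification step reduces to simple aggregation. The sketch below illustrates only that bookkeeping and is not the authors' R code: the input format (pairs of group label and biovolume in cubic micrometres), the analysed sample volume, and the names `summarize_groups` and `community_total` are all assumptions made for illustration.

```python
from collections import defaultdict


def summarize_groups(particles, volume_ml):
    """Aggregate labelled particles into per-group densities and biovolumes.

    particles: iterable of (group_label, biovolume_um3) pairs (hypothetical format).
    volume_ml: analysed sample volume in millilitres (assumed to be known).
    Returns {group: {"density_per_ml": ..., "biovolume_um3_per_ml": ...}}.
    """
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    counts = defaultdict(int)
    biovolume = defaultdict(float)
    for group, vol in particles:
        counts[group] += 1          # one particle counted per record
        biovolume[group] += vol     # summed biovolume for the group
    return {
        g: {
            "density_per_ml": counts[g] / volume_ml,
            "biovolume_um3_per_ml": biovolume[g] / volume_ml,
        }
        for g in counts
    }


def community_total(summary, key):
    """Sum one statistic (e.g. biovolume) over all groups."""
    return sum(stats[key] for stats in summary.values())


# Made-up example data: three particles in two hypothetical groups.
particles = [("A", 100.0), ("A", 150.0), ("B", 1000.0)]
summary = summarize_groups(particles, volume_ml=0.5)
total_biovolume = community_total(summary, "biovolume_um3_per_ml")
```

In a real workflow the labels and per-cell biovolumes would come from the classification and regression steps described in the abstract; here they are supplied by hand.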

Quantifying cell densities and biovolumes of phytoplankton communities and functional groups using scanning flow cytometry, machine learning and unsupervised clustering

PLoS ONE ◽

10.1371/journal.pone.0196225 ◽

2018 ◽

Vol 13 (5) ◽

pp. e0196225 ◽

Cited By ~ 11

Author(s):

Mridul K. Thomas ◽

Simone Fontana ◽

Marta Reyes ◽

Francesco Pomati

Keyword(s):

Machine Learning ◽

Flow Cytometry ◽

Functional Groups ◽

Unsupervised Clustering ◽

Phytoplankton Communities ◽

Cell Densities

Randomized Lasso Links Microbial Taxa with Aquatic Functional Groups Inferred from Flow Cytometry

mSystems ◽

10.1128/msystems.00093-19 ◽

2019 ◽

Vol 4 (5) ◽

Cited By ~ 2

Author(s):

Peter Rubbens ◽

Marian L. Schmidt ◽

Ruben Props ◽

Bopaiah A. Biddanda ◽

Nico Boon ◽

...

Keyword(s):

Machine Learning ◽

Flow Cytometry ◽

Nucleic Acid ◽

Cell Density ◽

Functional Groups ◽

Ecosystem Functioning ◽

Marker Gene ◽

Rrna Gene ◽

Freshwater Lakes ◽

Heterotrophic Production

ABSTRACT High-nucleic-acid (HNA) and low-nucleic-acid (LNA) bacteria are two operational groups identified by flow cytometry (FCM) in aquatic systems. A number of reports have shown that HNA cell density correlates strongly with heterotrophic production, while LNA cell density does not. However, which taxa are specifically associated with these groups, and by extension, productivity has remained elusive. Here, we addressed this knowledge gap by using a machine learning-based variable selection approach that integrated FCM and 16S rRNA gene sequencing data collected from 14 freshwater lakes spanning a broad range in physicochemical conditions. There was a strong association between bacterial heterotrophic production and HNA absolute cell abundances (R² = 0.65), but not with the more abundant LNA cells. This solidifies findings, mainly from marine systems, that HNA and LNA bacteria could be considered separate functional groups, the former contributing a disproportionately large share of carbon cycling. Taxa selected by the models could predict HNA and LNA absolute cell abundances at all taxonomic levels. Selected operational taxonomic units (OTUs) ranged from low to high relative abundance and were mostly lake system specific (89.5% to 99.2%). A subset of selected OTUs was associated with both LNA and HNA groups (12.5% to 33.3%), suggesting either phenotypic plasticity or within-OTU genetic and physiological heterogeneity. These findings may lead to the identification of system-specific putative ecological indicators for heterotrophic productivity. Generally, our approach allows for the association of OTUs with specific functional groups in diverse ecosystems in order to improve our understanding of (microbial) biodiversity-ecosystem functioning relationships.

IMPORTANCE A major goal in microbial ecology is to understand how microbial community structure influences ecosystem functioning. Various methods to directly associate bacterial taxa to functional groups in the environment are being developed. In this study, we applied machine learning methods to relate taxonomic data obtained from marker gene surveys to functional groups identified by flow cytometry. This allowed us to identify the taxa that are associated with heterotrophic productivity in freshwater lakes and indicated that the key contributors were highly system specific, regularly rare members of the community, and that some could possibly switch between being low and high contributors. Our approach provides a promising framework to identify taxa that contribute to ecosystem functioning and can be further developed to explore microbial contributions beyond heterotrophic production.

Randomized lasso associates freshwater lake-system specific bacterial taxa with heterotrophic production through flow cytometry

10.1101/392852 ◽

2018 ◽

Author(s):

Peter Rubbens ◽

Marian L. Schmidt ◽

Ruben Props ◽

Bopaiah A. Biddanda ◽

Nico Boon ◽

...

Keyword(s):

Machine Learning ◽

Flow Cytometry ◽

Cell Density ◽

Functional Groups ◽

Ecosystem Functioning ◽

Rrna Gene ◽

Freshwater Lakes ◽

Bacterial Physiology ◽

Lake System ◽

Heterotrophic Production

Abstract: High- (HNA) and low-nucleic acid (LNA) bacteria are two operational groups identified by flow cytometry (FCM) in aquatic systems. HNA cell density often correlates strongly with heterotrophic production, while LNA cell density does not. However, which taxa are specifically associated with these groups, and by extension, productivity has remained elusive. Here, we addressed this knowledge gap by using a machine learning-based variable selection approach that integrated FCM and 16S rRNA gene sequencing data collected from 14 freshwater lakes spanning a broad range in physicochemical conditions. There was a strong association between bacterial heterotrophic production and HNA absolute cell abundances (R² = 0.65), but not with the more abundant LNA cells. This solidifies findings, mainly from marine systems, that HNA and LNA could be considered separate functional groups, the former contributing a disproportionately large share of carbon cycling. Taxa selected by the models could predict HNA and LNA absolute cell abundances at all taxonomic levels, with the highest performance at the OTU level. Selected OTUs ranged from low to high relative abundance and were mostly lake system-specific (89.5%-99.2%). A subset of selected OTUs was associated with both LNA and HNA groups (12.5%-33.3%), suggesting either phenotypic plasticity or within-OTU genetic and physiological heterogeneity. These findings may lead to the identification of systems-specific putative ecological indicators for heterotrophic productivity. Generally, our approach allows for the association of OTUs with specific functional groups in diverse ecosystems in order to improve our understanding of (microbial) biodiversity-ecosystem functioning relationships.

Importance: A major goal in microbial ecology is to understand how microbial community structure influences ecosystem functioning. Research is limited by the ability to readily culture most bacteria present in the environment and the difference in bacterial physiology in situ compared to in laboratory culture. Various methods to directly associate bacterial taxa to functional groups in the environment are being developed. In this study, we applied machine learning methods to relate taxonomic data obtained from marker gene surveys to functional groups identified by flow cytometry. This allowed us to identify the taxa that are associated with heterotrophic productivity in freshwater lakes and indicated that the key contributors were highly system-specific, regularly rare members of the community, and that some could switch between being low and high contributors. Our approach provides a promising framework to identify taxa that contribute to ecosystem functioning and can be further developed to explore microbial contributions beyond heterotrophic production.

A Comparative Study of Different Machine Learning Tools

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i4.184190 ◽

2019 ◽

Vol 7 (4) ◽

pp. 184-190

Author(s):

Himani Maheshwari ◽

Pooja Goswami ◽

Isha Rana

Keyword(s):

Machine Learning ◽

Comparative Study ◽

Learning Tools

Application of Machine Learning in Animal Disease Analysis and Prediction

Current Bioinformatics ◽

10.2174/1574893615999200728195613 ◽

2020 ◽

Vol 15 ◽

Author(s):

Shuwen Zhang ◽

Qiang Su ◽

Qin Chen

Keyword(s):

Machine Learning ◽

Unsupervised Learning ◽

Supervised Learning ◽

Clustering Algorithm ◽

Principal Component ◽

Support Vector ◽

Animal Disease ◽

Human Beings ◽

Animal Diseases ◽

Disease Analysis

Abstract: Major animal diseases pose a great threat to animal husbandry and human beings. With the deepening of globalization and the abundance of data resources, the prediction and analysis of animal diseases using big data are becoming increasingly important. The focus of machine learning is to make computers learn from data and use the learned experience to analyze and predict. This paper first introduces the animal epidemic situation and machine learning. It then briefly introduces the application of machine learning in animal disease analysis and prediction. Machine learning is mainly divided into supervised learning and unsupervised learning. Supervised learning includes support vector machines, naive Bayes, decision trees, random forests, logistic regression, artificial neural networks, deep learning, and AdaBoost. Unsupervised learning includes the expectation-maximization algorithm, principal component analysis, hierarchical clustering, and MaxEnt. Through the discussion in this paper, readers can gain a clearer concept of machine learning and understand its application prospects in animal diseases.

Improved nutrient management in cereals using Nutrient Expert and machine learning tools: Productivity, profitability and nutrient use efficiency

Agricultural Systems ◽

10.1016/j.agsy.2021.103181 ◽

2021 ◽

Vol 192 ◽

pp. 103181

Author(s):

Jagadish Timsina ◽

Sudarshan Dutta ◽

Krishna Prasad Devkota ◽

Somsubhra Chakraborty ◽

Ram Krishna Neupane ◽

...

Keyword(s):

Machine Learning ◽

Nutrient Management ◽

Nutrient Use Efficiency ◽

Learning Tools ◽

Nutrient Use ◽

Use Efficiency

Paper2Wire – A Case Study of User-Centred Development of Machine Learning Tools for UX Designers

i-com ◽

10.1515/icom-2021-0002 ◽

2021 ◽

Vol 20 (1) ◽

pp. 19-32

Author(s):

Daniel Buschek ◽

Charlotte Anlauff ◽

Florian Lachner

Keyword(s):

Machine Learning ◽

Development Process ◽

User Study ◽

Concept Development ◽

Lessons Learned ◽

Design Tool ◽

Learning Tools ◽

Interface Elements ◽

Industry Partner

Abstract: This paper reflects on a case study of a user-centred concept development process for a Machine Learning (ML) based design tool, conducted at an industry partner. The resulting concept uses ML to match graphical user interface elements in sketches on paper to their digital counterparts to create consistent wireframes. A user study (N=20) with a working prototype shows that this concept is preferred by designers, compared to the previous manual procedure. Reflecting on our process and findings, we discuss lessons learned for developing ML tools that respect practitioners’ needs and practices.

Semi-automated classification of colonial Microcystis by FlowCAM imaging flow cytometry in mesocosm experiment reveals high heterogeneity during seasonal bloom

Scientific Reports ◽

10.1038/s41598-021-88661-2 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Yersultan Mirasbekov ◽

Adina Zhumakhanova ◽

Almira Zhantuyakova ◽

Kuanysh Sarkytbayev ◽

Dmitry V. Malashenkov ◽

...

Keyword(s):

Machine Learning ◽

Flow Cytometry ◽

Spatial Resolution ◽

Mesocosm Experiment ◽

Imaging Flow Cytometry ◽

Leibler Divergence ◽

Temporal And Spatial ◽

High Level ◽

Training Sets

Abstract: A machine learning approach was employed to detect and quantify Microcystis colonial morphospecies using FlowCAM-based imaging flow cytometry. The system was trained and tested using samples from a long-term mesocosm experiment (LMWE, Central Jutland, Denmark). The statistical validation of the classification approaches was performed using Hellinger distances, Bray–Curtis dissimilarity, and Kullback–Leibler divergence. The semi-automatic classification based on well-balanced training sets from the Microcystis seasonal bloom provided a high level of intergeneric accuracy (96–100%) but relatively low intrageneric accuracy (67–78%). Our results provide a proof of concept of how machine learning approaches can be applied to analyze colonial microalgae. This approach allowed us to evaluate the Microcystis seasonal bloom in individual mesocosms at a high level of temporal and spatial resolution. The observation that some Microcystis morphotypes completely disappeared and re-appeared along the mesocosm experiment timeline supports the hypothesis of the main transition pathways of colonial Microcystis morphoforms. We demonstrated that significant changes in the training sets of colonial images were required for accurate classification of Microcystis spp. from time points that differed by only two weeks, owing to the high phenotypic heterogeneity of Microcystis during the bloom. We conclude that automatic methods not only reach the performance level of a human taxonomist, and can thus be a valuable time-saving tool in the routine identification of colonial phytoplankton taxa, but can also be applied to increase the temporal and spatial resolution of a study.
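The three validation measures named in the abstract all compare two distributions of class counts, for example manual versus automated counts per morphospecies. The following is a minimal sketch of their textbook definitions, not the authors' validation code; the function names and the example counts are invented for illustration, and Bray–Curtis is computed here on raw counts while the other two use normalized proportions.

```python
import math


def normalize(counts):
    """Convert non-negative counts to proportions that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (range 0 to 1)."""
    return math.sqrt(
        sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2
    )


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two count vectors (range 0 to 1)."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats.

    Terms with p_i == 0 contribute nothing; q_i == 0 with p_i > 0 gives infinity.
    """
    total = 0.0
    for a, b in zip(p, q):
        if a == 0:
            continue
        if b == 0:
            return math.inf
        total += a * math.log(a / b)
    return total


# Made-up counts per class: manual identification vs. automated classification.
manual = [50, 30, 20]
automatic = [45, 35, 20]
p, q = normalize(manual), normalize(automatic)
h = hellinger(p, q)
bc = bray_curtis(manual, automatic)
kl = kl_divergence(p, q)
```

All three measures are zero when the two count vectors agree exactly, so smaller values indicate closer agreement between the automated and manual classifications.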

NLOS Multipath Classification of GNSS Signal Correlation Output Using Machine Learning

Sensors ◽

10.3390/s21072503 ◽

2021 ◽

Vol 21 (7) ◽

pp. 2503

Author(s):

Taro Suzuki ◽

Yoshiharu Amano

Keyword(s):

Machine Learning ◽

Satellite System ◽

Training Data ◽

Support Vector ◽

Positioning Errors ◽

Automated Method ◽

Global Navigation Satellite ◽

Better Than ◽

Signal Correlation

This paper proposes a method for detecting non-line-of-sight (NLOS) multipath, which causes large positioning errors in a global navigation satellite system (GNSS). We use GNSS signal correlation output, which is the most primitive GNSS signal processing output, to detect NLOS multipath based on machine learning. The shape of the multi-correlator outputs is distorted due to the NLOS multipath. The features of the shape of the multi-correlator are used to discriminate the NLOS multipath. We implement two supervised learning methods, a support vector machine (SVM) and a neural network (NN), and compare their performance. In addition, we also propose an automated method of collecting training data for LOS and NLOS signals of machine learning. The evaluation of the proposed NLOS detection method in an urban environment confirmed that NN was better than SVM, and 97.7% of NLOS signals were correctly discriminated.

Second order Kalman filtering channel estimation and machine learning methods for spectrum sensing in cognitive radio networks

Wireless Networks ◽

10.1007/s11276-021-02627-w ◽

2021 ◽

Author(s):

Olusegun Peter Awe ◽

Daniel Adebowale Babatunde ◽

Sangarapillai Lambotharan ◽

Basil AsSadhan

Keyword(s):

Machine Learning ◽

Kalman Filter ◽

Cognitive Radio ◽

Spectrum Sensing ◽

Cognitive Radio Networks ◽

Clustering Algorithm ◽

Polynomial Regression ◽

Primary User ◽

Radio Networks ◽

Second Order

Abstract: We address the problem of spectrum sensing in decentralized cognitive radio networks using a parametric machine learning method. In particular, to mitigate sensing performance degradation due to the mobility of the secondary users (SUs) in the presence of scatterers, we propose and investigate a classifier that uses a pilot-based second order Kalman filter tracker for estimating the slowly varying channel gain between the primary user (PU) transmitter and the mobile SUs. Using the energy measurements at SU terminals as feature vectors, the algorithm is initialized by a K-means clustering algorithm with two centroids corresponding to the active and inactive status of the PU transmitter. Under mobility, the centroid corresponding to the active PU status is adapted according to the estimates of the channels given by the Kalman filter, and an adaptive K-means clustering technique is used to make classification decisions on the PU activity. Furthermore, to address the possibility that the SU receiver might experience location-dependent co-channel interference, we have proposed a quadratic polynomial regression algorithm for estimating the noise plus interference power in the presence of mobility, which can be used for adapting the centroid corresponding to inactive PU status. Simulation results demonstrate the efficacy of the proposed algorithm.
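The initialization step described above, K-means with two centroids for the active and inactive PU states, can be illustrated with a deliberately simplified sketch. It assumes scalar energy measurements from a single SU and omits the Kalman-filter centroid adaptation and the interference regression entirely; the function names, the min/max initialization, and the sample values are assumptions for illustration, not the authors' algorithm.

```python
def kmeans_two_centroids(energies, max_iters=100):
    """Plain 1-D K-means with k=2; returns (low_centroid, high_centroid).

    Centroids start at the minimum and maximum energy (an illustrative choice),
    which guarantees that neither cluster is ever empty.
    """
    low, high = min(energies), max(energies)
    if low == high:
        raise ValueError("need at least two distinct energy values")
    for _ in range(max_iters):
        # Assignment step: each sample joins its nearest centroid.
        low_cluster = [e for e in energies if abs(e - low) <= abs(e - high)]
        high_cluster = [e for e in energies if abs(e - low) > abs(e - high)]
        # Update step: centroids move to the mean of their cluster.
        new_low = sum(low_cluster) / len(low_cluster)
        new_high = sum(high_cluster) / len(high_cluster)
        if new_low == low and new_high == high:
            break  # assignments are stable
        low, high = new_low, new_high
    return low, high


def classify(energy, centroids):
    """Label a new measurement by its nearest centroid."""
    low, high = centroids
    return "active" if abs(energy - high) < abs(energy - low) else "inactive"


# Made-up energy readings: noise-only samples near 1, PU-present samples near 5.
energies = [1.0, 1.2, 0.9, 5.0, 5.2, 4.8]
centroids = kmeans_two_centroids(energies)
decision = classify(4.5, centroids)
```

In the paper's setting the high-energy centroid would then be shifted over time according to the channel-gain estimates; this static version only shows how the two clusters are formed and used for a decision.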
