Drivers of harmful algal blooms in coastal areas of Eastern Mediterranean: a machine learning methodological approach

Androniki Tamvakis; George Tsirtsis; Michael Karydis; Kleanthis Patsidis; Giorgos D. Kokkoris

doi:10.3934/mbe.2021322

Mathematical Biosciences and Engineering ◽

10.3934/mbe.2021322 ◽

2021 ◽

Vol 18 (5) ◽

pp. 6484-6505

Author(s):

Androniki Tamvakis ◽

George Tsirtsis ◽

Michael Karydis ◽

Kleanthis Patsidis ◽

...

Keyword(s):

Machine Learning ◽

Harmful Algal Blooms ◽

Algal Blooms ◽

Eastern Mediterranean ◽

Methodological Approach ◽

Algal Species ◽

Predictive Performance ◽

Machine Learning Techniques ◽

Coastal Zones ◽

Abiotic Variables

Harmful algal species are present in the Mediterranean Sea and are often associated with toxic events affecting the nearby coastal zones. The presence of 18 marine microalgae, at genus level, associated with potentially harmful characteristics was predicted using a number of machine learning techniques based exclusively on a small set of abiotic variables already identified as drivers of blooms. The Random Forest (RF) algorithm achieved the best predictive performance by correctly identifying the presence of most genera with a mean of 89.2% of total samples. Although RF showed lower predictive performance for genera present in a low number of samples, its predictive power remained at least "fair" in these cases. The main tree-based advantage of RF was thereafter used to assess the importance of the input variables in predicting the presence of the algal genera. Temperature had the most powerful effect on the presence of genera, although this effect varied among genera. Finally, the genera were clustered based on their response to the considered abiotic variables, and common trends in an ecological context were identified.

Machine Learning-Based Scoring Functions. Development and Applications with SAnDReS.

Current Medicinal Chemistry ◽

10.2174/0929867327666200515101820 ◽

2020 ◽

Vol 27 ◽

Author(s):

Gabriela Bitencourt-Ferreira ◽

Camila Rizzotto ◽

Walter Filgueira de Azevedo Junior

Keyword(s):

Machine Learning ◽

Binding Affinity ◽

Drug Targets ◽

Computational Models ◽

Factor Xa ◽

Coagulation Factor ◽

Predictive Performance ◽

Machine Learning Techniques ◽

Scoring Functions ◽

Molegro Virtual Docker

Background: Analysis of atomic coordinates of protein-ligand complexes can provide three-dimensional data to generate computational models to evaluate binding affinity and thermodynamic state functions. Application of machine learning techniques can create models to assess protein-ligand potential energy and binding affinity. These methods show superior predictive performance when compared with classical scoring functions available in docking programs. Objective: Our purpose here is to review the development and application of the program SAnDReS. We describe the creation of machine learning models to assess the binding affinity of protein-ligand complexes. Method: SAnDReS implements machine learning methods available in the scikit-learn library. This program is available for download at https://github.com/azevedolab/sandres. SAnDReS uses crystallographic structures, binding, and thermodynamic data to create targeted scoring functions. Results: Recent applications of the program SAnDReS to drug targets such as coagulation factor Xa, cyclin-dependent kinases, and HIV-1 protease were able to create targeted scoring functions to predict inhibition of these proteins. These targeted models outperform classical scoring functions. Conclusion: Here, we reviewed the development of machine learning scoring functions to predict binding affinity through the application of the program SAnDReS. Our studies show the superior predictive performance of the SAnDReS-developed models when compared with classical scoring functions available in programs such as AutoDock4, Molegro Virtual Docker, and AutoDock Vina.

Application of Machine Learning Techniques to Predict Binding Affinity for Drug Targets: A Study of Cyclin-Dependent Kinase 2

Current Medicinal Chemistry ◽

10.2174/2213275912666191102162959 ◽

2020 ◽

Vol 28 (2) ◽

pp. 253-265 ◽

Cited By ~ 3

Author(s):

Gabriela Bitencourt-Ferreira ◽

Amauri Duarte da Silva ◽

Walter Filgueira de Azevedo

Keyword(s):

Machine Learning ◽

Binding Affinity ◽

Predictive Performance ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Scoring Functions ◽

Cyclin Dependent Kinase ◽

Learning Models ◽

Learning Techniques ◽

Machine Learning Models

Background: The elucidation of the structure of cyclin-dependent kinase 2 (CDK2) made it possible to develop targeted scoring functions for virtual screening aimed at identifying new inhibitors for this enzyme. CDK2 is a protein target for the development of drugs intended to modulate cell-cycle progression and control. Such drugs have potential anticancer activities. Objective: Our goal here is to review recent applications of machine learning methods to predict ligand-binding affinity for protein targets. To assess the predictive performance of classical scoring functions and targeted scoring functions, we focused our analysis on CDK2 structures. Methods: We have experimental structural data for hundreds of binary complexes of CDK2 with different ligands, many of them with inhibition constant information. We investigate here computational methods to calculate the binding affinity of CDK2 through classical scoring functions and machine-learning models. Results: Analysis of the predictive performance of classical scoring functions available in docking programs such as Molegro Virtual Docker, AutoDock4, and AutoDock Vina indicated that these methods failed to predict binding affinity with significant correlation with experimental data. Targeted scoring functions developed through supervised machine learning techniques showed a significant correlation with experimental data. Conclusion: Here, we described the application of supervised machine learning techniques to generate a scoring function to predict binding affinity. Machine learning models showed superior predictive performance when compared with classical scoring functions. Analysis of the computational models obtained through machine learning could capture essential structural features responsible for binding affinity against CDK2.

Improving the performance of machine learning models for early warning of harmful algal blooms using an adaptive synthetic sampling method

Water Research ◽

10.1016/j.watres.2021.117821 ◽

2021 ◽

pp. 117821

Author(s):

Jin Hwi Kim ◽

Jae-Ki Shin ◽

Hankyu Lee ◽

Dong Hoon Lee ◽

Joo-hyon Kang ◽

...

Keyword(s):

Machine Learning ◽

Early Warning ◽

Harmful Algal Blooms ◽

Algal Blooms ◽

Sampling Method ◽

Learning Models ◽

Machine Learning Models

INGOT-DR: an interpretable classifier for predicting drug resistance in M. tuberculosis

Algorithms for Molecular Biology ◽

10.1186/s13015-021-00198-1 ◽

2021 ◽

Vol 16 (1) ◽

Author(s):

Hooman Zabeti ◽

Nick Dexter ◽

Amir Hosein Safari ◽

Nafiseh Sedaghat ◽

Maxwell Libbrecht ◽

...

Keyword(s):

Machine Learning ◽

Drug Resistance ◽

Predictive Accuracy ◽

Group Testing ◽

Predictive Performance ◽

Machine Learning Techniques ◽

Evaluation Metrics ◽

Lower Accuracy ◽

Unseen Data ◽

The One

Motivation: Prediction of drug resistance and identification of its mechanisms in bacteria such as Mycobacterium tuberculosis, the etiological agent of tuberculosis, is a challenging problem. Solving this problem requires a transparent, accurate, and flexible predictive model. The methods currently used for this purpose rarely satisfy all of these criteria. On the one hand, approaches based on testing strains against a catalogue of previously identified mutations often yield poor predictive performance; on the other hand, machine learning techniques typically have higher predictive accuracy, but often lack interpretability and may learn patterns that produce accurate predictions for the wrong reasons. Current interpretable methods may either exhibit lower accuracy or lack the flexibility needed to generalize to previously unseen data. Contribution: In this paper we propose a novel technique, inspired by group testing and Boolean compressed sensing, which yields highly accurate predictions and interpretable results, and is flexible enough to be optimized for various evaluation metrics at the same time. Results: We test the predictive accuracy of our approach on five first-line and seven second-line antibiotics used for treating tuberculosis. We find that its accuracy is higher than or comparable to that of commonly used machine learning models, and that it is able to identify variants in genes with previously reported association to drug resistance. Our method is intrinsically interpretable, and can be customized for different evaluation metrics. Our implementation is available at github.com/hoomanzabeti/INGOT_DR and can be installed via the Python Package Index (PyPI) under ingotdr. This package is also compatible with most of the tools in the scikit-learn machine learning library.
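The group-testing idea mentioned in the abstract can be illustrated with a small sketch. In group testing, a pooled test is positive when at least one item in the pool is positive; a classifier in that spirit predicts resistance when a sample carries at least one variant from a selected set. The code below only shows how such a Boolean OR rule is applied and scored. It is not the INGOT-DR implementation: the rule-selection step is not shown, and the variant names, samples, and labels are hypothetical.

```python
def predict_resistant(sample_variants, rule_variants):
    """Boolean OR rule: resistant if the sample carries any variant in the rule."""
    return not set(rule_variants).isdisjoint(sample_variants)


def sensitivity_specificity(samples, labels, rule_variants):
    """Score a rule against known labels (True = resistant)."""
    tp = fn = tn = fp = 0
    for variants, resistant in zip(samples, labels):
        predicted = predict_resistant(variants, rule_variants)
        if resistant and predicted:
            tp += 1
        elif resistant and not predicted:
            fn += 1
        elif not resistant and predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical data: each sample is the set of variants it carries.
samples = [{"v1", "v3"}, {"v2"}, {"v4"}, set(), {"v3", "v4"}]
labels = [True, True, False, False, True]
rule = {"v1", "v2"}  # hypothetical selected variants

sens, spec = sensitivity_specificity(samples, labels, rule)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A rule of this form is readable directly: the selected variants are the explanation for every positive prediction, which is one sense in which such a classifier can be called interpretable.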

Harmful Algal Blooms: Physiology, Behavior, Population Dynamics and Global Impacts- A Review

Sultan Qaboos University Journal for Science [SQUJS] ◽

10.24200/squjs.vol10iss0pp1-30 ◽

2005 ◽

Vol 10 ◽

pp. 1

Author(s):

H.M. Al-Ghelani ◽

A.Y.A AlKindi ◽

S. Amer ◽

Y.K Al-Akhzami

Keyword(s):

Harmful Algal Blooms ◽

Algal Blooms ◽

Global Climate ◽

Algal Species ◽

Molecular Probes ◽

Screening Methods ◽

Global Climate Changes ◽

Global Impacts ◽

Anthropogenic Loading ◽

Global Increase

Harmful, toxic algae are now considered one of the important players among the newly emerging environmental risk factors. The apparent global increase in harmful algal blooms (HABs) is becoming a serious problem for both aquaculture and fisheries populations. Not only have the magnitude and intensity of the public health and economic impacts of these blooms increased in recent years, but the number of geographic locations experiencing toxic algal blooms has also increased dramatically. Two primary factors cause HAB outbreaks: natural processes such as upwelling and relaxation, and anthropogenic loading resulting in eutrophication. However, the influence of global climate change on the algal bloom phenomenon cannot be ignored. The problem warrants the development of effective strategies for the management and mitigation of HABs. Progress made in routine coastal monitoring programs, in the development of methods for detecting algal species and toxins, and in coastal modeling activities for predicting HABs reflects international concern regarding the impacts of HABs. Innovative techniques using molecular probes will hopefully result in the development of rapid, reliable screening methods for phycotoxins and the causative organisms.

Comparison of Short-Term Toxicity of 14 Common Phycotoxins (Alone and in Combination) To The Survival of Brine Shrimp Artemia Salina

10.21203/rs.3.rs-905984/v1 ◽

2021 ◽

Author(s):

Yu Ting Zhang ◽

Shanshan SONG ◽

Bin ZHANG ◽

Yang ZHANG ◽

Miao TIAN ◽

...

Keyword(s):

Ecological Risk ◽

Harmful Algal Blooms ◽

Algal Blooms ◽

Algal Species ◽

Natural Populations ◽

Marine Ecosystems ◽

Artemia Salina ◽

Marine Organisms ◽

Short Term ◽

The Impact

Toxic harmful algal blooms (HABs) can cause deleterious effects in marine organisms, threatening the stability of marine ecosystems. It is well known that different strains, natural populations and growth conditions of the same toxic algal species may lead to different amounts of phycotoxin production and the ensuing toxicity. To fully assess the ecological risk of toxic HABs, it is of great importance to investigate the toxic effects of phycotoxins in marine organisms. In this study, the short-term toxicity of 14 common phycotoxins (alone and in combination) in the marine zooplankton Artemia salina was investigated. On the basis of 48 h LC50, the order of toxicity in A. salina was AZA3 (with an LC50 of 0.0203 µg/ml) > AZA2 (0.0273 µg/ml) > PTX2 (0.0396 µg/ml) > DTX1 (0.0819 µg/ml) > AZA1 (0.106 µg/ml) > SPX1 (0.144 µg/ml) > YTX (0.172 µg/ml) > dcSTX (0.668 µg/ml) > OA (0.728 µg/ml) > STX (1.042 µg/ml) > GYM (1.069 µg/ml) > PbTx3 (1.239 µg/ml) > hYTX (1.799 µg/ml) > PbTx2 (2.415 µg/ml). For the binary exposures, additive effects (OA and DTX1, DTX1 and hYTX), antagonistic effects (OA and PTX2, OA and STX), and synergistic effects (DTX1 and STX, DTX1 and YTX, DTX1 and PTX2, PTX2 and hYTX) on the mortality of A. salina were observed. These results provide valuable toxicological data for assessing the impact of phycotoxins on marine planktonic species and highlight the potential ecological risk of toxic HABs in marine ecosystems.
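The toxicity ordering above follows directly from the reported LC50 values, since a lower LC50 means a lower concentration is needed to kill half of the test animals. As a minimal sketch, the snippet below re-derives the ranking from the 14 values given in the abstract and compares two toxins by the ratio of their LC50s. The function names are illustrative, and the ratio is only a rough comparison under the assumption that LC50s are directly comparable across toxins.

```python
# 48 h LC50 values (µg/ml) for A. salina, copied from the abstract above.
LC50_UG_PER_ML = {
    "AZA3": 0.0203, "AZA2": 0.0273, "PTX2": 0.0396, "DTX1": 0.0819,
    "AZA1": 0.106, "SPX1": 0.144, "YTX": 0.172, "dcSTX": 0.668,
    "OA": 0.728, "STX": 1.042, "GYM": 1.069, "PbTx3": 1.239,
    "hYTX": 1.799, "PbTx2": 2.415,
}


def rank_by_toxicity(lc50):
    """Return toxin names from most to least toxic (lowest LC50 first)."""
    return sorted(lc50, key=lc50.get)


def relative_potency(lc50, toxin, reference):
    """Ratio of LC50s: how many times lower the LC50 of `toxin` is than `reference`."""
    return lc50[reference] / lc50[toxin]


ranking = rank_by_toxicity(LC50_UG_PER_ML)
print(" > ".join(ranking))

# Compare the most and least toxic compounds in the list.
ratio = relative_potency(LC50_UG_PER_ML, "AZA3", "PbTx2")
print(f"AZA3 vs PbTx2 LC50 ratio: {ratio:.1f}")
```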

Modern machine learning outperforms GLMs at predicting spikes

10.1101/111450 ◽

2017 ◽

Cited By ~ 4

Author(s):

Ari S. Benjamin ◽

Hugo L. Fernandes ◽

Tucker Tomlinson ◽

Pavan Ramkumar ◽

Chris VerSteeg ◽

...

Keyword(s):

Machine Learning ◽

Neural Activity ◽

Linear Models ◽

Feedforward Neural Networks ◽

Predictive Performance ◽

Machine Learning Techniques ◽

Machine Learning Methods ◽

Learning Techniques ◽

Neural Spiking ◽

Modern Machine

Neuroscience has long focused on finding encoding models that effectively ask “what predicts neural spiking?” and generalized linear models (GLMs) are a typical approach. It is often unknown how much of explainable neural activity is captured, or missed, when fitting a GLM. Here we compared the predictive performance of GLMs to three leading machine learning methods: feedforward neural networks, gradient boosted trees (using XGBoost), and stacked ensembles that combine the predictions of several methods. We predicted spike counts in macaque motor (M1) and somatosensory (S1) cortices from standard representations of reaching kinematics, and in rat hippocampal cells from open field location and orientation. In general, the modern methods (particularly XGBoost and the ensemble) produced more accurate spike predictions and were less sensitive to the preprocessing of features. This discrepancy in performance suggests that standard feature sets may often relate to neural activity in a nonlinear manner not captured by GLMs. Encoding models built with machine learning techniques, which can be largely automated, more accurately predict spikes and can offer meaningful benchmarks for simpler models.

Review of Harmful Algal Blooms in the Coastal Mediterranean Sea, with a Focus on Greek Waters

Diversity ◽

10.3390/d13080396 ◽

2021 ◽

Vol 13 (8) ◽

pp. 396

Author(s):

Christina Tsikoti ◽

Savvas Genitsaris

Keyword(s):

Mediterranean Sea ◽

Harmful Algal Blooms ◽

Algal Blooms ◽

Ecosystem Health ◽

Algal Species ◽

Coastal Areas ◽

Population Increase ◽

Nutrient Inputs ◽

Environmental Pressures ◽

Marine Eutrophication

Anthropogenic marine eutrophication has been recognized as one of the major threats to aquatic ecosystem health. In recent years, eutrophication phenomena, prompted by global warming and population increase, have stimulated the proliferation of potentially harmful algal taxa resulting in the prevalence of frequent and intense harmful algal blooms (HABs) in coastal areas. Numerous coastal areas of the Mediterranean Sea (MS) are under environmental pressures arising from human activities that are driving ecosystem degradation and resulting in the increase of the supply of nutrient inputs. In this review, we aim to present the recent situation regarding the appearance of HABs in Mediterranean coastal areas linked to anthropogenic eutrophication, to highlight the features and particularities of the MS, and to summarize the harmful phytoplankton outbreaks along the length of coastal areas of many localities. Furthermore, we focus on HABs documented in Greek coastal areas according to the causative algal species, the period of occurrence, and the induced damage in human and ecosystem health. The occurrence of eutrophication-induced HAB incidents during the past two decades is emphasized.

Bio, psycho, or social: supervised machine learning to classify discursive framing of depression in online health communities

Quality & Quantity ◽

10.1007/s11135-021-01299-0 ◽

2022 ◽

Author(s):

Renáta Németh ◽

Fanni Máté ◽

Eszter Katona ◽

Márton Rakovics ◽

Domonkos Sik

Keyword(s):

Social Sciences ◽

Machine Learning ◽

Data Science ◽

Predictive Performance ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Online Health Communities ◽

Health Communities ◽

Open Question

Supervised machine learning on textual data has successful industrial/business applications, but it is an open question whether it can be utilized in social knowledge building outside the scope of hermeneutically more trivial cases. Combining sociology and data science raises several methodological and epistemological questions. In our study the discursive framing of depression is explored in online health communities. Three discursive frameworks are introduced: the bio-medical, psychological, and social framings of depression. Approximately 80,000 posts were collected, and a sample of them was manually classified. Conventional bag-of-words models, a Gradient Boosting Machine, word-embedding-based models, and a state-of-the-art Transformer-based model with transfer learning, called DistilBERT, were applied to extend this classification to the whole database. In our experience, ‘discursive framing’ proves to be a complex and hermeneutically difficult concept, which affects the degree of both inter-annotator agreement and predictive performance. Our findings confirm that the level of inter-annotator disagreement provides a good estimate of the objective difficulty of the classification. By identifying the most important terms, we also interpreted the classification algorithms, which is of great importance in the social sciences. We are convinced that machine learning techniques can extend the horizon of qualitative text analysis. Our paper supports a smooth fit of the new techniques into the traditional toolbox of the social sciences.
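Inter-annotator agreement, which the abstract uses as a gauge of classification difficulty, is often summarized with a chance-corrected statistic. The abstract does not say which statistic was used; the sketch below assumes Cohen's kappa for two annotators, computed as (observed agreement minus chance agreement) divided by (1 minus chance agreement). The labels are hypothetical and only mirror the three framings named above.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of eight posts by two annotators.
annotator_1 = ["bio", "bio", "psycho", "social", "social", "psycho", "bio", "social"]
annotator_2 = ["bio", "psycho", "psycho", "social", "bio", "psycho", "bio", "social"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so lower values on a labelling task suggest the categories are harder to tell apart, for human annotators and plausibly for a classifier trained on their labels.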

Customer determinants of used auto loan churn: comparing predictive performance using machine learning techniques

Journal of Marketing Analytics ◽

10.1057/s41270-021-00135-6 ◽

2021 ◽

Author(s):

Chandrasekhar Valluri ◽

Sudhakar Raju ◽

Vivek H. Patil

Keyword(s):

Machine Learning ◽

Predictive Performance ◽

Machine Learning Techniques ◽

Learning Techniques