Drivers of harmful algal blooms in coastal areas of Eastern Mediterranean: a machine learning methodological approach

2021 ◽  
Vol 18 (5) ◽  
pp. 6484-6505
Author(s):  
Androniki Tamvakis ◽  
George Tsirtsis ◽  
Michael Karydis ◽  
Kleanthis Patsidis ◽  
...  

Harmful algal species are present in the Mediterranean Sea and are often associated with toxic events affecting the nearby coastal zones. The presence of 18 marine microalgae, at genus level, associated with potentially harmful characteristics was predicted using a number of machine learning techniques based exclusively on a small set of abiotic variables, already identified as drivers of blooms. The Random Forest (RF) algorithm achieved the best predictive performance, correctly identifying the presence of most genera with a mean of 89.2% of total samples. Although RF showed lower predictive performance for genera present in a low number of samples, its predictive power remains at least "fair" in these cases. The main tree-based advantage of RF was thereafter used to assess the importance of the input variables in predicting the presence of the algal genera. Temperature had the most powerful effect on the presence of genera, although this effect varies among genera. Finally, the genera were clustered based on their response to the considered abiotic variables, and common trends in an ecological context were identified.

2020 ◽  
Vol 27 ◽  
Author(s):  
Gabriela Bitencourt-Ferreira ◽  
Camila Rizzotto ◽  
Walter Filgueira de Azevedo Junior

Background: Analysis of atomic coordinates of protein-ligand complexes can provide three-dimensional data to generate computational models to evaluate binding affinity and thermodynamic state functions. Application of machine learning techniques can create models to assess protein-ligand potential energy and binding affinity. These methods show superior predictive performance when compared with classical scoring functions available in docking programs. Objective: Our purpose here is to review the development and application of the program SAnDReS. We describe the creation of machine learning models to assess the binding affinity of protein-ligand complexes. Method: SAnDReS implements machine learning methods available in the scikit-learn library. This program is available for download at https://github.com/azevedolab/sandres. SAnDReS uses crystallographic structures, binding, and thermodynamic data to create targeted scoring functions. Results: Recent applications of the program SAnDReS to drug targets such as coagulation factor Xa, cyclin-dependent kinases, and HIV-1 protease were able to create targeted scoring functions to predict inhibition of these proteins. These targeted models outperform classical scoring functions. Conclusion: Here, we reviewed the development of machine learning scoring functions to predict binding affinity through the application of the program SAnDReS. Our studies show the superior predictive performance of the SAnDReS-developed models when compared with classical scoring functions available in programs such as AutoDock4, Molegro Virtual Docker, and AutoDock Vina.


2020 ◽  
Vol 28 (2) ◽  
pp. 253-265 ◽  
Author(s):  
Gabriela Bitencourt-Ferreira ◽  
Amauri Duarte da Silva ◽  
Walter Filgueira de Azevedo

Background: The elucidation of the structure of cyclin-dependent kinase 2 (CDK2) made it possible to develop targeted scoring functions for virtual screening aimed to identify new inhibitors for this enzyme. CDK2 is a protein target for the development of drugs intended to modulate cell-cycle progression and control. Such drugs have potential anticancer activities. Objective: Our goal here is to review recent applications of machine learning methods to predict ligand-binding affinity for protein targets. To assess the predictive performance of classical scoring functions and targeted scoring functions, we focused our analysis on CDK2 structures. Methods: We have experimental structural data for hundreds of binary complexes of CDK2 with different ligands, many of them with inhibition constant information. We investigate here computational methods to calculate the binding affinity of CDK2 through classical scoring functions and machine-learning models. Results: Analysis of the predictive performance of classical scoring functions available in docking programs such as Molegro Virtual Docker, AutoDock4, and AutoDock Vina indicated that these methods failed to predict binding affinity with significant correlation with experimental data. Targeted scoring functions developed through supervised machine learning techniques showed a significant correlation with experimental data. Conclusion: Here, we described the application of supervised machine learning techniques to generate a scoring function to predict binding affinity. Machine learning models showed superior predictive performance when compared with classical scoring functions. Analysis of the computational models obtained through machine learning could capture essential structural features responsible for binding affinity against CDK2.
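
The comparison described in this abstract rests on correlating predicted with experimental binding affinities. The following is a minimal sketch of that kind of check using a hand-written Pearson correlation. The affinity values and the names `experimental`, `targeted_model`, and `classical_score` are hypothetical and chosen only for illustration; they are not data or identifiers from the paper, and the authors may have used other correlation measures.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and the cross-product of deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values on a log-affinity scale, for illustration only.
experimental = [5.0, 6.0, 7.0, 8.0, 9.0]
targeted_model = [5.2, 6.1, 6.8, 8.3, 8.9]   # assumed output of a targeted scoring function
classical_score = [7.0, 5.5, 7.2, 6.0, 6.9]  # assumed output of a classical scoring function

print("targeted model r:", round(pearson_r(experimental, targeted_model), 3))
print("classical score r:", round(pearson_r(experimental, classical_score), 3))
```

A higher coefficient for one score than for the other on the same complexes is the sort of evidence the abstract summarizes; a significance test would be needed before calling either correlation significant.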


2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Hooman Zabeti ◽  
Nick Dexter ◽  
Amir Hosein Safari ◽  
Nafiseh Sedaghat ◽  
Maxwell Libbrecht ◽  
...  

Motivation: Prediction of drug resistance and identification of its mechanisms in bacteria such as Mycobacterium tuberculosis, the etiological agent of tuberculosis, is a challenging problem. Solving this problem requires a transparent, accurate, and flexible predictive model. The methods currently used for this purpose rarely satisfy all of these criteria. On the one hand, approaches based on testing strains against a catalogue of previously identified mutations often yield poor predictive performance; on the other hand, machine learning techniques typically have higher predictive accuracy, but often lack interpretability and may learn patterns that produce accurate predictions for the wrong reasons. Current interpretable methods may either exhibit a lower accuracy or lack the flexibility needed to generalize them to previously unseen data. Contribution: In this paper we propose a novel technique, inspired by group testing and Boolean compressed sensing, which yields highly accurate predictions and interpretable results, and is flexible enough to be optimized for various evaluation metrics at the same time. Results: We test the predictive accuracy of our approach on five first-line and seven second-line antibiotics used for treating tuberculosis. We find that it has a higher or comparable accuracy to that of commonly used machine learning models, and is able to identify variants in genes with previously reported association to drug resistance. Our method is intrinsically interpretable, and can be customized for different evaluation metrics. Our implementation is available at github.com/hoomanzabeti/INGOT_DR and can be installed via the Python Package Index (PyPI) under ingotdr. This package is also compatible with most of the tools in the scikit-learn machine learning library.
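
The group-testing idea mentioned in this abstract can be illustrated with a Boolean OR rule: an isolate is predicted resistant if it carries at least one variant from a selected set. The sketch below is not the INGOT-DR implementation and does not use its API. The variant names, the isolates, and the helper functions are hypothetical, and how the rule itself is selected is not shown.

```python
def predict_resistant(variants_present, rule):
    """Boolean OR rule: resistant if the isolate carries any variant in the rule."""
    return not rule.isdisjoint(variants_present)


def sensitivity_specificity(isolates, labels, rule):
    """Evaluate an OR rule against known resistance labels."""
    tp = fn = tn = fp = 0
    for variants, resistant in zip(isolates, labels):
        predicted = predict_resistant(variants, rule)
        if resistant and predicted:
            tp += 1
        elif resistant:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical isolates, each described by the set of variants it carries.
isolates = [
    {"variant_A"},
    {"variant_B", "variant_C"},
    {"variant_C"},
    set(),
    {"variant_D"},
]
labels = [True, True, False, False, True]  # hypothetical resistance phenotypes
rule = {"variant_A", "variant_B"}          # hypothetical selected variants

sens, spec = sensitivity_specificity(isolates, labels, rule)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Because the rule is just a set of variants, it can be read directly, which is the kind of interpretability the abstract emphasizes. Different evaluation metrics would trade sensitivity against specificity by changing which variants enter the rule.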


Author(s):  
H.M. Al-Ghelani ◽  
A.Y.A AlKindi ◽  
S. Amer ◽  
Y.K Al-Akhzami

Harmful, toxic algae are now considered one of the important players among the newly emerging environmental risk factors. The apparent global increase in harmful algal blooms (HABs) is becoming a serious problem in both aquaculture and fisheries populations. Not only has the magnitude and intensity of public health and economic impacts of these blooms increased in recent years, but the number of geographic locations experiencing toxic algal blooms has also increased dramatically. There are two primary factors causing HAB outbreaks: natural processes such as upwelling and relaxation, and anthropogenic loading resulting in eutrophication. However, the influence of global climate change on the algal bloom phenomenon cannot be ignored. The problem warrants development of effective strategies for the management and mitigation of HABs. Progress made in routine coastal monitoring programs, development of methods for detection of algal species and toxins, and coastal modeling activities for predicting HABs reflect the international concerns regarding the impacts of HABs. Innovative techniques using molecular probes will hopefully result in development of rapid, reliable screening methods for phycotoxins and the causative organisms.


2021 ◽  
Author(s):  
Yu Ting Zhang ◽  
Shanshan SONG ◽  
Bin ZHANG ◽  
Yang ZHANG ◽  
Miao TIAN ◽  
...  

Toxic harmful algal blooms (HABs) can cause deleterious effects in marine organisms, threatening the stability of marine ecosystems. It is well known that different strains, natural populations and growth conditions of the same toxic algal species may lead to different amounts of phycotoxin production and ensuing toxicity. To fully assess the ecological risk of toxic HABs, it is of great importance to investigate the toxic effects of phycotoxins in marine organisms. In this study, the short-term toxicity of 14 common phycotoxins (alone and in combination) in the marine zooplankton Artemia salina was investigated. On the basis of 48 h LC50, the order of toxicity in A. salina was AZA3 (with an LC50 of 0.0203 µg/ml) > AZA2 (0.0273 µg/ml) > PTX2 (0.0396 µg/ml) > DTX1 (0.0819 µg/ml) > AZA1 (0.106 µg/ml) > SPX1 (0.144 µg/ml) > YTX (0.172 µg/ml) > dcSTX (0.668 µg/ml) > OA (0.728 µg/ml) > STX (1.042 µg/ml) > GYM (1.069 µg/ml) > PbTx3 (1.239 µg/ml) > hYTX (1.799 µg/ml) > PbTx2 (2.415 µg/ml). For the binary exposure, additive effects of OA and DTX1, and of DTX1 and hYTX; antagonistic effects of OA and PTX2, and of OA and STX; and synergistic effects of DTX1 and STX, DTX1 and YTX, DTX1 and PTX2, and PTX2 and hYTX on the mortality of A. salina were observed. These results provide valuable toxicological data for assessing the impact of phycotoxins on marine planktonic species and highlight the potential ecological risk of toxic HABs in marine ecosystems.
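
The toxicity order in this abstract follows directly from the LC50 values, since a lower LC50 means higher toxicity. As a small sketch, the ranking can be reproduced by sorting the reported values in ascending order. The LC50 figures below are copied from the abstract; the function names and the relative-potency helper are illustrative assumptions and not part of the study.

```python
# 48 h LC50 values in µg/ml for Artemia salina, as reported in the abstract.
LC50_UG_PER_ML = {
    "AZA3": 0.0203, "AZA2": 0.0273, "PTX2": 0.0396, "DTX1": 0.0819,
    "AZA1": 0.106, "SPX1": 0.144, "YTX": 0.172, "dcSTX": 0.668,
    "OA": 0.728, "STX": 1.042, "GYM": 1.069, "PbTx3": 1.239,
    "hYTX": 1.799, "PbTx2": 2.415,
}


def rank_by_toxicity(lc50):
    """Return toxin names from most to least toxic (ascending LC50)."""
    return sorted(lc50, key=lc50.get)


def relative_potency(lc50, reference):
    """Ratio of the reference LC50 to each toxin's LC50.

    A value above 1 means the toxin is lethal at a lower concentration
    than the reference toxin. This helper is an illustrative assumption.
    """
    ref = lc50[reference]
    return {name: ref / value for name, value in lc50.items()}


ranking = rank_by_toxicity(LC50_UG_PER_ML)
print(" > ".join(ranking))
```

Sorting only recovers the single-toxin order. The additive, antagonistic, and synergistic effects reported for binary exposures cannot be derived from these values alone.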


2017 ◽  
Author(s):  
Ari S. Benjamin ◽  
Hugo L. Fernandes ◽  
Tucker Tomlinson ◽  
Pavan Ramkumar ◽  
Chris VerSteeg ◽  
...  

Neuroscience has long focused on finding encoding models that effectively ask “what predicts neural spiking?” and generalized linear models (GLMs) are a typical approach. It is often unknown how much of explainable neural activity is captured, or missed, when fitting a GLM. Here we compared the predictive performance of GLMs to three leading machine learning methods: feedforward neural networks, gradient boosted trees (using XGBoost), and stacked ensembles that combine the predictions of several methods. We predicted spike counts in macaque motor (M1) and somatosensory (S1) cortices from standard representations of reaching kinematics, and in rat hippocampal cells from open field location and orientation. In general, the modern methods (particularly XGBoost and the ensemble) produced more accurate spike predictions and were less sensitive to the preprocessing of features. This discrepancy in performance suggests that standard feature sets may often relate to neural activity in a nonlinear manner not captured by GLMs. Encoding models built with machine learning techniques, which can be largely automated, more accurately predict spikes and can offer meaningful benchmarks for simpler models.


Diversity ◽  
2021 ◽  
Vol 13 (8) ◽  
pp. 396
Author(s):  
Christina Tsikoti ◽  
Savvas Genitsaris

Anthropogenic marine eutrophication has been recognized as one of the major threats to aquatic ecosystem health. In recent years, eutrophication phenomena, prompted by global warming and population increase, have stimulated the proliferation of potentially harmful algal taxa resulting in the prevalence of frequent and intense harmful algal blooms (HABs) in coastal areas. Numerous coastal areas of the Mediterranean Sea (MS) are under environmental pressures arising from human activities that are driving ecosystem degradation and resulting in the increase of the supply of nutrient inputs. In this review, we aim to present the recent situation regarding the appearance of HABs in Mediterranean coastal areas linked to anthropogenic eutrophication, to highlight the features and particularities of the MS, and to summarize the harmful phytoplankton outbreaks along the length of coastal areas of many localities. Furthermore, we focus on HABs documented in Greek coastal areas according to the causative algal species, the period of occurrence, and the induced damage in human and ecosystem health. The occurrence of eutrophication-induced HAB incidents during the past two decades is emphasized.


Author(s):  
Renáta Németh ◽  
Fanni Máté ◽  
Eszter Katona ◽  
Márton Rakovics ◽  
Domonkos Sik

Supervised machine learning on textual data has successful industrial/business applications, but it is an open question whether it can be utilized in social knowledge building outside the scope of hermeneutically more trivial cases. Combining sociology and data science raises several methodological and epistemological questions. In our study, the discursive framing of depression is explored in online health communities. Three discursive frameworks are introduced: the bio-medical, psychological, and social framings of depression. ~80 000 posts were collected, and a sample of them was manually classified. Conventional bag-of-words models, Gradient Boosting Machine, word-embedding-based models, and a state-of-the-art Transformer-based model with transfer learning, called DistilBERT, were applied to extend this classification to the whole database. In our experience, ‘discursive framing’ proves to be a complex and hermeneutically difficult concept, which affects the degree of both inter-annotator agreement and predictive performance. Our finding confirms that the level of inter-annotator disagreement provides a good estimate for the objective difficulty of the classification. By identifying the most important terms, we also interpreted the classification algorithms, which is of great importance in social sciences. We are convinced that machine learning techniques can extend the horizon of qualitative text analysis. Our paper supports a smooth fit of the new techniques into the traditional toolbox of social sciences.

