Predicting classifications in marine biomonitoring with supervised machine learning: how much data is required?

2021 ◽  
Vol 4 ◽  
Author(s):  
Verena Dully ◽  
Tom Wilding ◽  
Timo Mühlhaus ◽  
Thorsten Stoeck

Marine coastal ecosystems provide numerous ecosystem services and are therefore subject to a variety of stressors from anthropogenic activities. Environmental biomonitoring programs are therefore crucial for the effective management and conservation of coastal marine ecosystems. Traditional monitoring has been based on macrofauna indices, which are laborious to obtain and require expert knowledge. Recently, eDNA metabarcoding has become increasingly popular because it does not involve macrofauna species identification and is therefore less costly and time-consuming. Studies have shown that ecosystem monitoring based on eDNA metabarcoding is feasible and that random forest (RF) algorithms can predict various biological indices, and therefore ecosystem health. To propose adequate designs for future eDNA metabarcoding-based marine coastal monitoring surveys, this study aims to answer two questions: (1) What is the lower limit of sequencing reads for accurate RF predictions in coastal marine monitoring using microbial communities? (2) Is this limit the same for different monitoring targets? To achieve this goal, we exploited four different Illumina amplicon datasets obtained from bacterial communities in different coastal environments. For each dataset, the corresponding prediction objectives (labels) relevant for biomonitoring were predicted using amplicon sequence variants (ASVs) as features. After constructing RF models using all available sequences of a dataset (full models, serving as benchmarks for the targeted prediction accuracy), we successively down-sampled each dataset to lower sequence numbers. Prediction accuracies of the reduced models were then compared to those of the full models to assess the minimum number of sequences needed to obtain the targeted prediction accuracy. Our results show that there is no general answer to question (1) and that (2) the limit varies between different monitoring targets.
We have identified the most informative criteria for assessing the sequencing depth required to predict a biomonitoring category using RF. This may guide future study designs and help estimate and control costs in applied routine DNA-based biomonitoring that uses RF to predict the biomonitoring target. In our contribution we elucidate and discuss these criteria.
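The down-sampling workflow described above can be sketched as follows. Everything here is a synthetic stand-in, not the study's data or code: the ASV count table, the two-class labels, and the chosen depths are simulated purely to illustrate rarefying samples to lower read numbers and comparing RF accuracy against the full-depth model.

```python
# Hypothetical sketch of read down-sampling for RF-based biomonitoring.
# All data below are simulated; this is not the authors' pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Simulated data: 120 samples x 300 ASVs, two ecological status labels.
n_samples, n_asvs = 120, 300
labels = rng.integers(0, 2, size=n_samples)
# Class-dependent ASV abundance profiles (purely synthetic).
base = rng.dirichlet(np.ones(n_asvs), size=2)
counts = np.array([rng.multinomial(50_000, base[y]) for y in labels])

def downsample(row, depth, rng):
    """Rarefy one sample to `depth` reads by subsampling without replacement."""
    reads = np.repeat(np.arange(row.size), row)  # one entry per read
    picked = rng.choice(reads, size=depth, replace=False)
    return np.bincount(picked, minlength=row.size)

# Full model first (50k reads), then successively reduced depths.
for depth in (50_000, 5_000, 500):
    X = np.array([downsample(r, depth, rng) for r in counts])
    rf = RandomForestClassifier(n_estimators=100, random_state=0)
    acc = cross_val_score(rf, X, labels, cv=5).mean()
    print(f"depth={depth:>6}: mean CV accuracy = {acc:.2f}")
```

Comparing the reduced-depth accuracies to the full-depth benchmark is what localizes the minimum read number for a given monitoring target.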

2019 ◽  
Vol 289 ◽  
pp. 10010
Author(s):  
Kayo Ohashi ◽  
Jun-ichi Arai ◽  
Toshiaki Mizobuchi

Clarifying the creep behaviour of concrete at early age not only improves the accuracy of temperature stress analysis but also contributes to the prediction and control of cracks caused by thermal stress. However, most past research on creep behaviour examined concrete after 28 days, and it is currently difficult to capture the creep behaviour of concrete at early age accurately with the creep test methods generally in use. It is therefore necessary to evaluate the creep behaviour of concrete at early age and to establish a convenient test method for estimating it. In this study, experiments were carried out on concrete at early ages within one week. The experiments showed that creep strain is proportional to the applied stress at early age and that specific creep decreases as the age at loading increases. Based on the experimental results, an estimation equation for creep strain at early age was proposed. Within the scope of these experimental results, the proposed estimation equation was confirmed to represent the creep behaviour of concrete at early age accurately.
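The two experimental findings above can be illustrated numerically. The numbers below are made-up illustration values, not the paper's measurements: they simply show that if creep strain is proportional to stress, then specific creep (creep strain per unit stress) is stress-independent, and that it decreases with the age at loading.

```python
# Illustrative sketch with synthetic numbers (not the study's data):
# specific creep = creep strain / applied stress.
creep_strain = {  # (age at loading [days], stress [MPa]) -> creep strain [x1e-6]
    (1, 2.0): 120.0, (1, 4.0): 240.0,   # doubling stress doubles creep strain
    (3, 2.0): 70.0,  (3, 4.0): 140.0,
    (7, 2.0): 40.0,  (7, 4.0): 80.0,
}

# Specific creep is the same at both stress levels for a given loading age,
# and it decreases as the loading age increases.
specific_creep = {k: eps / k[1] for k, eps in creep_strain.items()}
for (age, stress), sc in sorted(specific_creep.items()):
    print(f"loaded at day {age}, {stress} MPa -> specific creep {sc:.1f}")
```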


Author(s):  
Mazlin Mokhtar ◽  
Minhaz Farid Ahmed ◽  
Khai Ern Lee ◽  
Lubna Alam ◽  
Choo Ta Goh ◽  
...  

Despite many good policies and institutions, the coastal environment of Langkawi continues to deteriorate. This could be due to a lack of effective governance as well as unregulated waste discharge. Evidence collected from the literature for 1996 to 2013 also revealed a significant increase in the concentrations of Zn (R2 = 0.78) and Pb (R2 = 0.12) in the sediment. This appears to result from large volumes of terrestrial runoff carrying metals originating from extensive anthropogenic activities, and it is a vital indicator of coastal pollution. It is a matter of concern that in many cases the Pb concentration in the sediment exceeded the world average value of 20 μg/g as well as the Canadian Interim Sediment Quality Standard of 35 μg/g for coastal areas. Similarly, the metal pollution index (MPI) measured in fish over the period 2007 to 2009 also indicated an increasing trend of pollution in Langkawi, with the maximum MPI value (4.87) recorded in Spanish mackerel. Since pollution of the coastal environment has serious implications for marine biodiversity and the health of seafood consumers, measures are required to address this problem. Constructed wetlands might be effective in reducing coastal pollution, as they would filter effluent and waste before they mix with coastal water. Furthermore, enabling stakeholders to play an environmental stewardship role would ensure better governance of the coastal ecosystem and effective implementation of policies, envisaging improved monitoring of waste and effluent discharge into the coastal marine environment. These measures are among the actions necessary for achieving a sustainable coastal environment in Langkawi.
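The abstract does not define the MPI, but in the sediment- and biota-pollution literature it is commonly computed as the geometric mean of the measured metal concentrations. A minimal sketch under that assumption, with made-up concentration values rather than the study's data:

```python
# Sketch of the metal pollution index as commonly defined in the
# literature: the geometric mean of n metal concentrations.
# The sample values are hypothetical, not the Langkawi measurements.
import math

def metal_pollution_index(concentrations):
    """Geometric mean of the metal concentrations (same units throughout)."""
    product = math.prod(concentrations)
    return product ** (1.0 / len(concentrations))

fish_sample = [1.2, 8.5, 30.0, 0.4]  # e.g. Cd, Pb, Zn, Hg in ug/g (hypothetical)
print(f"MPI = {metal_pollution_index(fish_sample):.2f}")
```

A rising MPI over successive sampling years, as reported for 2007 to 2009, indicates an overall increase in metal burden even when individual metals move in different directions.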


2019 ◽  
Vol 16 (6) ◽  
pp. 172988141989132
Author(s):  
Ivan Chavdarov ◽  
Bozhidar Naydenov

The proposed study presents an original concept for the design of a walking robot with a minimum number of motors. The robot has a simple design and control system and successfully walks and avoids or overcomes obstacles using only two independently controlled motors. The basic geometric and kinematic dependencies related to its movement are described. An optimization of the robot's basic dimensions is proposed in order to reduce energy losses when moving on flat terrain. A 3-D printed prototype of the robot was developed and produced, and simulations and experiments for overcoming an obstacle are presented. The trajectories and instantaneous velocity centers of the robot's links are determined experimentally. The phases of walking and the stages of overcoming an obstacle are described, and the theoretical and experimental results are compared. The suggested approaches to dimensional optimization for reducing energy loss and to experimental determination of the instantaneous center of rotation are also applicable to other walking robots.
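The instantaneous-center determination mentioned above rests on a standard planar-kinematics fact: the instantaneous center of rotation (ICR) of a rigid link lies on the line through each point perpendicular to that point's velocity. The sketch below (not the authors' code) locates the ICR by intersecting two such perpendiculars:

```python
# Hedged sketch of locating the instantaneous center of rotation of a
# planar rigid link from the velocities of two of its points.
# Assumes the velocities are not parallel (otherwise the motion is a
# pure translation and the matrix below is singular).
import numpy as np

def icr(p1, v1, p2, v2):
    """Intersect the perpendicular to v1 at p1 with the perpendicular to v2 at p2."""
    d1 = np.array([-v1[1], v1[0]], float)  # direction perpendicular to v1
    d2 = np.array([-v2[1], v2[0]], float)  # direction perpendicular to v2
    # Solve p1 + t*d1 = p2 + s*d2 for (t, s).
    A = np.column_stack([d1, -d2])
    t, s = np.linalg.solve(A, np.asarray(p2, float) - np.asarray(p1, float))
    return np.asarray(p1, float) + t * d1

# Check on a pure rotation about the origin with angular velocity w: v = w x r.
w = 2.0
p1, p2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
v1 = w * np.array([-p1[1], p1[0]])
v2 = w * np.array([-p2[1], p2[0]])
print(icr(p1, v1, p2, v2))  # should recover the origin
```

Experimentally, the point velocities come from tracked marker trajectories, and the same intersection construction applies frame by frame.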


2020 ◽  
Vol 309 ◽  
pp. 05005
Author(s):  
Yonghong Chen ◽  
Ping Hu ◽  
Dong Zhang

Life cycle cost (LCC) is an important element of equipment integrated logistics support. Since the LCC spans the whole life cycle of equipment, from development, production, service and maintenance to retirement, it must be analyzed and predicted in order to manage and control it effectively and to better develop integrated logistics support. In this paper, the unbiased grey Markov model (UGMM) is introduced for LCC prediction. To check model accuracy, the posterior difference method (PDM) is used, and the influence of the number of state intervals in the UGMM on prediction accuracy is analyzed and studied. The results indicate that the UGMM can be used to predict the LCC and achieves the highest prediction accuracy compared with the unbiased grey model and the grey separating model; to ensure prediction accuracy, the state intervals should be divided according to the number of sequence data.
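The UGMM builds on the classical GM(1,1) grey model, which fits a first-order exponential trend to the accumulated series and is then corrected (here, with an unbiasing step and Markov state intervals, neither of which is shown). A minimal GM(1,1) sketch on a synthetic cost series, not the paper's data, assuming the development coefficient `a` is nonzero:

```python
# Minimal GM(1,1) sketch: the base grey model that unbiased grey Markov
# variants refine. The cost series below is illustrative only.
import numpy as np

def gm11_forecast(x0, steps=1):
    """Fit GM(1,1) to the series x0 and forecast `steps` further values."""
    x0 = np.asarray(x0, float)
    x1 = np.cumsum(x0)                      # accumulated generating series
    z1 = 0.5 * (x1[1:] + x1[:-1])           # mean background values
    # Least-squares fit of x0(k) = -a*z1(k) + b.
    B = np.column_stack([-z1, np.ones_like(z1)])
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    n = len(x0)
    k = np.arange(1, n + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a   # time-response model
    # Restore x0 estimates by differencing the accumulated estimates.
    x0_hat = np.diff(np.concatenate([[x0[0]], x1_hat]))
    return x0_hat[-steps:]

costs = [2.67, 3.13, 3.25, 3.36, 3.56, 3.72]  # hypothetical annual LCC figures
print(gm11_forecast(costs, steps=2))
```

A Markov correction then classifies each fitting residual into one of several state intervals and adjusts the GM(1,1) forecast by the expected residual of the predicted state, which is why the number of state intervals matters for accuracy.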


2000 ◽  
Vol 48 (2) ◽  
pp. 283-306 ◽  
Author(s):  
Carole Smith

A Foucauldian analysis of discourse and power relations suggests that law and the juridical field have lost their pre-eminent role in government via the delegated exercise of sovereign power. According to Foucault, the government of a population is achieved through the wide dispersal of technologies of power which are relatively invisible and which function in discursive sites and practices throughout the social fabric. Expert knowledge occupies a privileged position in government and its essentially discretionary and norm-governed judgements infiltrate and colonise previous sites of power. This paper sets out to challenge a Foucauldian view that principled law has ceded its power and authority to the disciplinary sciences and their expert practitioners. It argues, with particular reference to case law on sterilisation and caesarean sections, that law and the juridical field operate to manipulate and control expert knowledge to their own ends. In so doing, law continually exercises and re-affirms its power as part of the sovereign state. Far from acting, as Foucault suggests, to provide a legitimating gloss on the subversive operations of technologies of power, law turns the tables and itself operates a form of surveillance over the norm-governed exercise of expert knowledge.


With the ever-rising emission of pollutant gases from sources such as factories, automobiles and power plants, there is growing concern that strong measures are required to monitor and control these pollutants, since breathing these gases may cause serious harm. Among them, carbon monoxide (CO) is often called the "silent killer": being colourless, odourless and poisonous, it is undetectable by humans. When inhaled, it deprives the bloodstream of oxygen and suffocates its victim. In this paper we propose a simple system to monitor CO, using CO detectors for detection. The paper also analyzes CO levels based on a data set from Kaggle and predicts the possible amount of CO in air using regression. The prediction accuracy, measured as RMSE, is 0.17766.
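The regression-plus-RMSE evaluation described above can be sketched as follows. The features, coefficients, and data are synthetic stand-ins, not the Kaggle air-quality set, and the abstract does not specify which regression model was used; a plain linear regression is shown as one plausible choice.

```python
# Illustrative sketch (synthetic data, not the Kaggle set): fit a
# regression for CO concentration and score it with RMSE, the error
# metric quoted in the abstract.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
# Hypothetical predictors: e.g. temperature, humidity, traffic density.
X = rng.uniform(0, 1, size=(200, 3))
true_w = np.array([0.8, -0.3, 1.5])
y = X @ true_w + rng.normal(0, 0.1, size=200)  # synthetic CO level + noise

# Train on the first 150 samples, evaluate on the held-out 50.
model = LinearRegression().fit(X[:150], y[:150])
pred = model.predict(X[150:])
rmse = np.sqrt(np.mean((pred - y[150:]) ** 2))
print(f"RMSE = {rmse:.3f}")
```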


2019 ◽  
Author(s):  
Anna Mikhaylova ◽  
Timothy Thornton

Predicting gene expression with genetic data has garnered significant attention in recent years. PrediXcan is one of the most widely used gene-based association methods for testing imputed gene expression values with a phenotype due to the invaluable insight the method has shown into the relationship between complex traits and the component of gene expression that can be attributed to genetic variation. The prediction models for PrediXcan, however, were obtained using supervised machine learning methods and training data from the Depression and Gene Network (DGN) and the Genotype-Tissue Expression (GTEx) data, where the majority of subjects are of European descent. Many genetic studies, however, include samples from multi-ethnic populations, and in this paper we assess the accuracy of gene expression predictions with PrediXcan in diverse populations. Using transcriptomic data from the GEUVADIS (Genetic European Variation in Health and Disease) RNA sequencing project and whole genome sequencing data from the 1000 Genomes project, we evaluate and compare the predictive performance of PrediXcan in an African population (Yoruban) and four European populations. Prediction results are obtained using a range of models from PrediXcan weight databases, and Pearson’s correlation coefficient is used to measure prediction accuracy. We demonstrate that the predictive performance of PrediXcan varies across populations (F-test p-value < 0.001), where prediction accuracy is the worst in the Yoruban sample compared to European samples. Moreover, the performance of PrediXcan varies not only among distant populations, but also among closely related populations as well. We also find that the qualitative performance of PrediXcan for the populations considered is consistent across all weight databases used.
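The accuracy metric used above, Pearson's correlation between predicted and measured expression for a gene, can be sketched directly. The expression values below are toy numbers, not GEUVADIS data:

```python
# Sketch of the per-gene accuracy metric: Pearson's r between predicted
# and observed expression values (toy numbers, not GEUVADIS data).
import numpy as np

def pearson_r(predicted, observed):
    """Pearson correlation between two equal-length value vectors."""
    p = np.asarray(predicted, float) - np.mean(predicted)
    o = np.asarray(observed, float) - np.mean(observed)
    return float(p @ o / np.sqrt((p @ p) * (o @ o)))

predicted = [1.2, 0.4, 2.1, 1.8, 0.9]  # hypothetical imputed expression
observed  = [1.0, 0.6, 2.4, 1.5, 1.1]  # hypothetical measured expression
print(f"r = {pearson_r(predicted, observed):.3f}")
```

Computing this per gene and per population, then comparing the resulting correlation distributions (e.g. with an F-test, as in the abstract), is how cross-population performance differences are quantified.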

