Precise Water Leak Detection Using Machine Learning and Real-Time Sensor Data

IoT ◽  
2020 ◽  
Vol 1 (2) ◽  
pp. 474-493
Author(s):  
João Alves Coelho ◽  
André Glória ◽  
Pedro Sebastião

Water is a crucial natural resource, yet it is widely mishandled: an estimated one third of the world's water utilities lose around 40% of their water to leakage. This paper proposes a system based on a wireless sensor network designed to monitor water distribution systems, such as irrigation systems, which, with the help of an autonomous learning algorithm, allows for precise location of water leaks. The complete system architecture is detailed, including hardware, communication, and data analysis. A study to identify which machine learning algorithm among random forest, decision trees, neural networks, and Support Vector Machine (SVM) best fits leak detection is presented, including the methodology, training, and validation as well as the obtained results. Finally, the developed system is validated in a real-case implementation, which shows that it is able to detect leaks with 75% accuracy.

Author(s):  
Maryam Kammoun ◽  
Amina Kammoun ◽  
Mohamed Abid

Leakage in water distribution systems is a significant long-standing problem because of the huge economic and ecological losses it causes. Leak detection studies in the literature have used different types of technologies and data. Although machine learning techniques have achieved tremendous progress in outlier detection, their use in water leak detection applications is still limited. This research aims to improve leak detection performance by refining the choice of learning data and techniques. From this perspective, commonly used techniques for leak detection are assessed in this paper, and the characteristics of hydraulic data are investigated. Four intelligent algorithms are compared, namely k-nearest neighbors, support vector machines, logistic regression, and multi-layer perceptron. This study focuses on six experiments based on identifying outliers in various packages of pressure and flow data, yearly data, seasonal data, night data, and flow data difference to detect leakage in water distribution networks. Different scenarios of realistic water demand in two networks from the benchmark dataset LeakDB are used. Results demonstrate that the leak detection accuracy varies between 30% and 100% depending on the experiment and the choices of algorithms and data.
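
As a rough illustration of the "flow data difference" idea mentioned above, the sketch below flags outliers in day-to-day night-flow differences with a modified z-score (median and median absolute deviation). This is a deliberately simple stand-in, not one of the four algorithms compared in the study; the flow values, the `flag_outliers` helper, and the 3.5 threshold are all assumptions made for the example.

```python
from statistics import median

def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (median/MAD based) exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]

# Hypothetical nightly minimum flow in m³/h; a leak is assumed to start at index 6.
night_flow = [5.1, 5.0, 5.2, 5.1, 5.0, 5.1, 7.9, 8.0, 8.1]

# Difference between consecutive nights: a sudden, persistent jump suggests a leak.
diffs = [later - earlier for earlier, later in zip(night_flow, night_flow[1:])]
flags = flag_outliers(diffs)

for day, (d, is_outlier) in enumerate(zip(diffs, flags), start=1):
    print(f"night {day} -> {day + 1}: diff={d:+.1f} outlier={is_outlier}")
```

Only the jump between the sixth and seventh readings is flagged here; on real data the threshold would need tuning against known leak events.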


Smart Cities ◽  
2021 ◽  
Vol 4 (4) ◽  
pp. 1293-1315
Author(s):  
Neda Mashhadi ◽  
Isam Shahrour ◽  
Nivine Attoue ◽  
Jamal El Khattabi ◽  
Ammar Aljer

This paper presents an investigation of the capacity of machine learning (ML) methods to localize leakage in water distribution systems (WDS). This issue is critical because water leakage causes economic losses, damage to surrounding infrastructure, and soil contamination. Progress in real-time monitoring of WDS and in ML has created new opportunities to develop data-based methods for water leak localization. However, the managers of WDS need recommendations for the selection of appropriate ML methods as well as their practical use for leakage localization. This paper contributes to this issue through an investigation of the capacity of ML methods to localize leakage in WDS. The campus of Lille University was used as support for this research. The paper is presented as follows: First, flow and pressure data were determined using EPANET software; then, the generated data were used to investigate the capacity of six ML methods to localize water leakage. Finally, the results of the investigations were used for leakage localization from offline water flow data. The results showed excellent performance for leakage localization by the artificial neural network, logistic regression, and random forest, but low performance for the unsupervised methods because of overlapping clusters.


2020 ◽  
Author(s):  
Nalika Ulapane ◽  
Karthick Thiyagarajan ◽  
Sarath Kodagoda

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.


2019 ◽  
Vol 23 (1) ◽  
pp. 12-21 ◽  
Author(s):  
Shikha N. Khera ◽  
Divya

The information technology (IT) industry in India has been facing a systemic issue of high attrition in the past few years, resulting in monetary and knowledge-based losses to companies. The aim of this research is to develop a model that predicts employee attrition and gives organizations the opportunity to address issues and improve retention. A predictive model was developed based on a supervised machine learning algorithm, the support vector machine (SVM). Archival employee data (consisting of 22 input features) were collected from the Human Resource databases of three IT companies in India, including the employees' employment status (response variable) at the time of collection. The confusion matrix for the SVM model showed an accuracy of 85 per cent. The results also show that the model performs better at predicting who will leave the firm than at predicting who will not.
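
To make the link between a confusion matrix, overall accuracy, and per-class performance concrete, here is a minimal sketch. The counts are invented for illustration and are not the study's data; the `accuracy` and `recall` helpers are likewise hypothetical names.

```python
# Hypothetical confusion-matrix counts, keyed by (actual class, predicted class).
# These numbers are made up for the example; they are not taken from the study.
confusion = {
    ("leave", "leave"): 45,  # leavers correctly predicted to leave
    ("leave", "stay"): 5,    # leavers wrongly predicted to stay
    ("stay", "leave"): 10,   # stayers wrongly predicted to leave
    ("stay", "stay"): 40,    # stayers correctly predicted to stay
}

def accuracy(cm):
    """Share of all cases where the predicted class equals the actual class."""
    correct = sum(n for (actual, predicted), n in cm.items() if actual == predicted)
    return correct / sum(cm.values())

def recall(cm, label):
    """Share of cases of one actual class that the model identified correctly."""
    total = sum(n for (actual, _), n in cm.items() if actual == label)
    return cm[(label, label)] / total

overall = accuracy(confusion)
recall_leave = recall(confusion, "leave")
recall_stay = recall(confusion, "stay")
print(f"accuracy={overall:.2f} recall(leave)={recall_leave:.2f} recall(stay)={recall_stay:.2f}")
```

A single accuracy figure can hide this kind of asymmetry, which is why per-class recall is worth reporting alongside it.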


2021 ◽  
Vol 10 (5) ◽  
pp. 992
Author(s):  
Martina Barchitta ◽  
Andrea Maugeri ◽  
Giuliana Favara ◽  
Paolo Marco Riela ◽  
Giovanni Gallo ◽  
...  

Patients in intensive care units (ICUs) are at higher risk of worse prognosis and mortality. Here, we aimed to evaluate the ability of the Simplified Acute Physiology Score (SAPS II) to predict the risk of 7-day mortality, and to test a machine learning algorithm that combines the SAPS II with additional patients’ characteristics at ICU admission. We used data from the “Italian Nosocomial Infections Surveillance in Intensive Care Units” network. A Support Vector Machine (SVM) algorithm was used to classify 3782 patients according to sex, patient’s origin, type of ICU admission, non-surgical treatment for acute coronary disease, surgical intervention, SAPS II, presence of invasive devices, trauma, impaired immunity, antibiotic therapy and onset of HAI. The accuracy of SAPS II in distinguishing patients who died from those who did not was 69.3%, with an Area Under the Curve (AUC) of 0.678. Using the SVM algorithm, instead, we achieved an accuracy of 83.5% and an AUC of 0.896. Notably, SAPS II was the variable that weighed most in the model, and its removal resulted in an AUC of 0.653 and an accuracy of 68.4%. Overall, these findings suggest that the present SVM model is a useful tool for the early prediction of patients at higher risk of death at ICU admission.
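
Accuracy and AUC measure different things: accuracy depends on a chosen decision threshold, while AUC summarizes how well the scores rank positives above negatives. The sketch below computes both on a small set of invented scores; the data, the 0.5 threshold, and the helper names `auc` and `accuracy_at` are assumptions for illustration only.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def accuracy_at(scores, labels, threshold):
    """Accuracy when every score at or above the threshold is predicted positive."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Hypothetical risk scores; label 1 = died within 7 days, 0 = survived.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

print(f"AUC = {auc(scores, labels):.3f}")
print(f"accuracy at 0.5 = {accuracy_at(scores, labels, 0.5):.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for a demonstration but not for thousands of patients, where a rank-based formula would be preferable.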


2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Linda A. Antonucci ◽  
Alessandra Raio ◽  
Giulio Pergola ◽  
Barbara Gelao ◽  
Marco Papalino ◽  
...  

Background: Recent views posited that negative parenting and attachment insecurity can be considered as general environmental factors of vulnerability for psychosis, specifically for individuals diagnosed with psychosis (PSY). Furthermore, evidence highlighted a tight relationship between attachment style and social cognition abilities, a key PSY behavioral phenotype. The aim of this study is to generate a machine learning algorithm based on the perceived quality of parenting and attachment style-related features to discriminate between PSY and healthy controls (HC) and to investigate its ability to track PSY early stages and risk conditions, as well as its association with social cognition performance.
Methods: Perceived maternal and paternal parenting, as well as attachment anxiety and avoidance scores, were trained to separate 71 HC from 34 PSY (20 individuals diagnosed with schizophrenia + 14 diagnosed with bipolar disorder with psychotic manifestations) using support vector classification and repeated nested cross-validation. We then validated this model on independent datasets including individuals at the early stages of disease (ESD, i.e. first episode of psychosis or depression, or at-risk mental state for psychosis) and with familial high risk for PSY (FHR, i.e. having a first-degree relative suffering from psychosis). Then, we performed factorial analyses to test the group × classification rate interaction on emotion perception, social inference and managing of emotions abilities.
Results: The perceived parenting and attachment-based machine learning model discriminated PSY from HC with a Balanced Accuracy (BAC) of 72.2%. Slightly lower classification performance was measured in the ESD sample (HC-ESD BAC = 63.5%), while the model could not discriminate between FHR and HC (BAC = 44.2%). We observed a significant group × classification interaction in PSY and HC from the discovery sample on emotion perception and on the ability to manage emotions (both p = 0.02). The interaction on managing of emotion abilities was replicated in the ESD and HC validation sample (p = 0.03).
Conclusion: Our results suggest that parenting and attachment-related variables bear significant classification power when applied to both PSY and its early stages and are associated with variability in emotion processing. These variables could therefore be useful in psychosis early recognition programs aimed at softening the psychosis-associated disability.
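
Balanced accuracy is the mean of sensitivity and specificity, which matters when classes are unequal in size, as with the 71 HC and 34 PSY reported above. The sketch below uses those two class sizes with a deliberately trivial classifier that labels everyone HC; the classifier and the helper names are assumptions for illustration, not the study's model.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

def plain_accuracy(tp, fn, tn, fp):
    """Share of all cases classified correctly, regardless of class."""
    return (tp + tn) / (tp + fn + tn + fp)

# Class sizes from the abstract: 34 PSY (treated as positive), 71 HC (negative).
# Hypothetical trivial classifier: it predicts HC for every participant.
tp, fn = 0, 34   # no PSY case is detected
tn, fp = 71, 0   # every HC case is "correct" by default

print(f"plain accuracy    = {plain_accuracy(tp, fn, tn, fp):.3f}")
print(f"balanced accuracy = {balanced_accuracy(tp, fn, tn, fp):.3f}")
```

The trivial classifier reaches roughly 68% plain accuracy but only 50% balanced accuracy, so a BAC near 50% indicates chance-level discrimination.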


2021 ◽  
pp. 1-17
Author(s):  
Ahmed Al-Tarawneh ◽  
Ja’afer Al-Saraireh

Twitter is one of the most popular platforms used to share and post ideas. Hackers and anonymous attackers use these platforms maliciously, and their behavior can be used to predict the risk of future attacks by gathering and classifying hackers’ tweets using machine-learning techniques. Previous approaches for detecting infected tweets are based on human effort or text analysis, and are therefore limited in their ability to capture the hidden text between tweet lines. The main aim of this research paper is to enhance the efficiency of hacker detection on the Twitter platform using the complex networks technique with adapted machine learning algorithms. This work presents a methodology that collects a list of users, together with their followers, who share posts with similar interests from a hackers’ community on Twitter. The list is built based on a set of suggested keywords that are commonly used by hackers in their tweets. After that, a complex network is generated for all users to find relations among them in terms of network centrality, closeness, and betweenness. After extracting these values, a dataset of the most influential users in the hacker community is assembled. Subsequently, tweets belonging to users in the extracted dataset are gathered and classified into positive and negative classes. The output of this process is fed into a machine learning process that applies different algorithms. This research builds and investigates an accurate dataset containing real users who belong to a hackers’ community. Correctly classified instances were measured for accuracy using the average values of the K-nearest neighbor, Naive Bayes, Random Tree, and support vector machine techniques, demonstrating about 90% and 88% accuracy for cross-validation and percentage split, respectively. Consequently, the proposed network cyber Twitter model is able to detect hackers and determine whether tweets pose a future risk to institutions and individuals, providing early warning of possible attacks.
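
Closeness and betweenness centrality, mentioned above as the measures used to find influential users, can be sketched on a toy graph as follows. The five-node graph is hypothetical and treated as undirected and unweighted; the betweenness routine follows Brandes' algorithm, which is a standard choice rather than something the abstract specifies.

```python
from collections import deque

def closeness(graph, node):
    """Closeness: reachable nodes divided by the sum of shortest-path distances to them."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def betweenness(graph):
    """Unnormalized betweenness via Brandes' algorithm (unweighted, undirected graph)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:  # accumulate dependencies from the farthest nodes back to s
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical user graph: an edge means two accounts interact.
graph = {
    "a": ["b"],
    "b": ["a", "c", "e"],
    "c": ["b", "d"],
    "d": ["c"],
    "e": ["b"],
}

bet = betweenness(graph)
for user in graph:
    print(f"{user}: closeness={closeness(graph, user):.3f} betweenness={bet[user]:.1f}")
```

In this toy graph, user "b" ranks highest on both measures because most shortest paths pass through it. For real follower networks, a graph library such as NetworkX provides tested implementations of both measures.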


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 617
Author(s):  
Umer Saeed ◽  
Young-Doo Lee ◽  
Sana Ullah Jan ◽  
Insoo Koo

As key components of Cyber-Physical Systems, sensors are susceptible to failures caused by complex environments, low-quality production, and aging. When defective, sensors either stop communicating or convey incorrect information. These unsteady situations threaten the safety, economy, and reliability of a system. The objective of this study is to construct a lightweight machine learning-based fault detection and diagnostic system within the limited energy resources, memory, and computation of a Wireless Sensor Network (WSN). In this paper, a Context-Aware Fault Diagnostic (CAFD) scheme is proposed based on an ensemble learning algorithm called Extra-Trees. To evaluate the performance of the proposed scheme, a realistic WSN scenario composed of humidity and temperature sensor observations is replicated with extremely low-intensity faults. Six commonly occurring types of sensor fault are considered: drift, hard-over/bias, spike, erratic/precision degradation, stuck, and data-loss. The proposed CAFD scheme demonstrates the ability to accurately detect and diagnose low-intensity sensor faults in a timely manner. Moreover, the efficiency of the Extra-Trees algorithm in terms of diagnostic accuracy, F1-score, ROC-AUC, and training time is demonstrated by comparison with cutting-edge machine learning algorithms: a Support Vector Machine and a Neural Network.
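
The six fault types listed above can be illustrated by injecting them into a clean signal. The sketch below uses simple textbook-style definitions of each fault; these definitions, the `inject_fault` helper, the magnitudes, and the constant temperature signal are all assumptions for illustration and may differ from the fault models used in the paper.

```python
import random

def inject_fault(readings, kind, start, magnitude=0.5, seed=0):
    """Return a copy of readings with a fault of the given kind from index start onward.

    The fault definitions here are illustrative assumptions, not the paper's models.
    """
    rng = random.Random(seed)
    faulty = list(readings)
    span = len(faulty) - start
    for i in range(start, len(faulty)):
        if kind == "drift":        # offset grows linearly over time
            faulty[i] += magnitude * (i - start + 1) / span
        elif kind == "bias":       # constant offset (hard-over/bias)
            faulty[i] += magnitude
        elif kind == "spike":      # large excursion on every fifth sample
            if (i - start) % 5 == 0:
                faulty[i] += 10 * magnitude
        elif kind == "erratic":    # extra zero-mean noise (precision degradation)
            faulty[i] += rng.gauss(0, magnitude)
        elif kind == "stuck":      # output frozen at the value seen when the fault began
            faulty[i] = readings[start]
        elif kind == "data-loss":  # samples missing entirely
            faulty[i] = None
        else:
            raise ValueError(f"unknown fault kind: {kind}")
    return faulty

# Hypothetical clean temperature signal in degrees Celsius.
clean = [20.0] * 10

for kind in ["drift", "bias", "spike", "erratic", "stuck", "data-loss"]:
    print(kind, inject_fault(clean, kind, start=5))
```

A labelled dataset built this way (clean versus each fault type) is the kind of input a fault classifier could be trained on.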

