scholarly journals Pipe Fault Prediction for Water Transmission Mains

Water ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 2861
Author(s):  
Ariel Gorenstein ◽  
Meir Kalech ◽  
Daniela Fuchs Hanusch ◽  
Sharon Hassid

Every network of supply waterlines experiences thousands of yearly bursts, breaks, leakages, and other failures. These failures waste a great amount of resources, as not only the waterlines need to be repaired, but also water is wasted and the distribution service is interrupted. For that reason, many water facilities employ proactive maintenance strategies in their networks, where they replace likely-to-fail pipes in advance to prevent the failures. In this paper, we aim to establish a reliable prediction model that can accurately predict faults in waterlines prior to their occurrence. We propose a specific segmentation method for long transmission mains, as well as three data-driven models and one rule-based prediction model. We evaluate a real world waterline network used in Israel, operated by Mekorot company, using three common metrics. The results show that the data-driven algorithms outperform the rule-based model by at least 5% in each of the metrics. Additionally, their prediction becomes more accurate as they are trained with more data, but enhancing these data with geographically related features does not improve the accuracy further.

2021 ◽  
Vol 8 ◽  
Author(s):  
Bahar Irfan ◽  
Mehdi Hellou ◽  
Tony Belpaeme

While earlier research in human-robot interaction pre-dominantly uses rule-based architectures for natural language interaction, these approaches are not flexible enough for long-term interactions in the real world due to the large variation in user utterances. In contrast, data-driven approaches map the user input to the agent output directly, hence, provide more flexibility with these variations without requiring any set of rules. However, data-driven approaches are generally applied to single dialogue exchanges with a user and do not build up a memory over long-term conversation with different users, whereas long-term interactions require remembering users and their preferences incrementally and continuously and recalling previous interactions with users to adapt and personalise the interactions, known as the lifelong learning problem. In addition, it is desirable to learn user preferences from a few samples of interactions (i.e., few-shot learning). These are known to be challenging problems in machine learning, while they are trivial for rule-based approaches, creating a trade-off between flexibility and robustness. Correspondingly, in this work, we present the text-based Barista Datasets generated to evaluate the potential of data-driven approaches in generic and personalised long-term human-robot interactions with simulated real-world problems, such as recognition errors, incorrect recalls and changes to the user preferences. Based on these datasets, we explore the performance and the underlying inaccuracies of the state-of-the-art data-driven dialogue models that are strong baselines in other domains of personalisation in single interactions, namely Supervised Embeddings, Sequence-to-Sequence, End-to-End Memory Network, Key-Value Memory Network, and Generative Profile Memory Network. The experiments show that while data-driven approaches are suitable for generic task-oriented dialogue and real-time interactions, no model performs sufficiently well to be deployed in personalised long-term interactions in the real world, because of their inability to learn and use new identities, and their poor performance in recalling user-related data.


2021 ◽  
Vol 7 (15) ◽  
pp. eabe4166
Author(s):  
Philippe Schwaller ◽  
Benjamin Hoover ◽  
Jean-Louis Reymond ◽  
Hendrik Strobelt ◽  
Teodoro Laino

Humans use different domain languages to represent, explore, and communicate scientific concepts. During the last few hundred years, chemists compiled the language of chemical synthesis inferring a series of “reaction rules” from knowing how atoms rearrange during a chemical transformation, a process called atom-mapping. Atom-mapping is a laborious experimental task and, when tackled with computational methods, requires continuous annotation of chemical reactions and the extension of logically consistent directives. Here, we demonstrate that Transformer Neural Networks learn atom-mapping information between products and reactants without supervision or human labeling. Using the Transformer attention weights, we build a chemically agnostic, attention-guided reaction mapper and extract coherent chemical grammar from unannotated sets of reactions. Our method shows remarkable performance in terms of accuracy and speed, even for strongly imbalanced and chemically complex reactions with nontrivial atom-mapping. It provides the missing link between data-driven and rule-based approaches for numerous chemical reaction tasks.


2021 ◽  
Vol 13 (7) ◽  
pp. 168781402110277
Author(s):  
Yankai Hou ◽  
Zhaosheng Zhang ◽  
Peng Liu ◽  
Chunbao Song ◽  
Zhenpo Wang

Accurate estimation of the degree of battery aging is essential to ensure safe operation of electric vehicles. In this paper, using real-world vehicles and their operational data, a battery aging estimation method is proposed based on a dual-polarization equivalent circuit (DPEC) model and multiple data-driven models. The DPEC model and the forgetting factor recursive least-squares method are used to determine the battery system’s ohmic internal resistance, with outliers being filtered using boxplots. Furthermore, eight common data-driven models are used to describe the relationship between battery degradation and the factors influencing this degradation, and these models are analyzed and compared in terms of both estimation accuracy and computational requirements. The results show that the gradient descent tree regression, XGBoost regression, and light GBM regression models are more accurate than the other methods, with root mean square errors of less than 6.9 mΩ. The AdaBoost and random forest regression models are regarded as alternative groups because of their relative instability. The linear regression, support vector machine regression, and k-nearest neighbor regression models are not recommended because of poor accuracy or excessively high computational requirements. This work can serve as a reference for subsequent battery degradation studies based on real-time operational data.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Simone Göttlich ◽  
Sven Spieckermann ◽  
Stephan Stauber ◽  
Andrea Storck

AbstractThe visualization of conveyor systems in the sense of a connected graph is a challenging problem. Starting from communication data provided by the IT system, graph drawing techniques are applied to generate an appealing layout of the conveyor system. From a mathematical point of view, the key idea is to use the concept of stress majorization to minimize a stress function over the positions of the nodes in the graph. Different to the already existing literature, we have to take care of special features inspired by the real-world problems.


1993 ◽  
Vol 02 (01) ◽  
pp. 47-70
Author(s):  
SHARON M. TUTTLE ◽  
CHRISTOPH F. EICK

Forward-chaining rule-based programs, being data-driven, can function in changing environments in which backward-chaining rule-based programs would have problems. But, degugging forward-chaining programs can be tedious; to debug a forward-chaining rule-based program, certain ‘historical’ information about the program run is needed. Programmers should be able to directly request such information, instead of having to rerun the program one step at a time or search a trace of run details. As a first step in designing an explanation system for answering such questions, this paper discusses how a forward-chaining program run’s ‘historical’ details can be stored in its Rete inference network, used to match rule conditions to working memory. This can be done without seriously affecting the network’s run-time performance. We call this generalization of the Rete network a historical Rete network. Various algorithms for maintaining this network are discussed, along with how it can be used during debugging, and a debugging tool, MIRO, that incorporates these techniques is also discussed.


2017 ◽  
Vol 53 (3) ◽  
pp. 1789-1798 ◽  
Author(s):  
Xiaodong Liang ◽  
Scott A. Wallace ◽  
Duc Nguyen

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Yiqing Zhao ◽  
Saravut J. Weroha ◽  
Ellen L. Goode ◽  
Hongfang Liu ◽  
Chen Wang

Abstract Background Next-generation sequencing provides comprehensive information about individuals’ genetic makeup and is commonplace in oncology clinical practice. However, the utility of genetic information in the clinical decision-making process has not been examined extensively from a real-world, data-driven perspective. Through mining real-world data (RWD) from clinical notes, we could extract patients’ genetic information and further associate treatment decisions with genetic information. Methods We proposed a real-world evidence (RWE) study framework that incorporates context-based natural language processing (NLP) methods and data quality examination before final association analysis. The framework was demonstrated in a Foundation-tested women cancer cohort (N = 196). Upon retrieval of patients’ genetic information using NLP system, we assessed the completeness of genetic data captured in unstructured clinical notes according to a genetic data-model. We examined the distribution of different topics regarding BRCA1/2 throughout patients’ treatment process, and then analyzed the association between BRCA1/2 mutation status and the discussion/prescription of targeted therapy. Results We identified seven topics in the clinical context of genetic mentions including: Information, Evaluation, Insurance, Order, Negative, Positive, and Variants of unknown significance. Our rule-based system achieved a precision of 0.87, recall of 0.93 and F-measure of 0.91. Our machine learning system achieved a precision of 0.901, recall of 0.899 and F-measure of 0.9 for four-topic classification and a precision of 0.833, recall of 0.823 and F-measure of 0.82 for seven-topic classification. We found in result-containing sentences, the capture of BRCA1/2 mutation information was 75%, but detailed variant information (e.g. variant types) is largely missing. Using cleaned RWD, significant associations were found between BRCA1/2 positive mutation and targeted therapies. Conclusions In conclusion, we demonstrated a framework to generate RWE using RWD from different clinical sources. Rule-based NLP system achieved the best performance for resolving contextual variability when extracting RWD from unstructured clinical notes. Data quality issues such as incompleteness and discrepancies exist thus manual data cleaning is needed before further analysis can be performed. Finally, we were able to use cleaned RWD to evaluate the real-world utility of genetic information to initiate a prescription of targeted therapy.


2021 ◽  
Vol 2 (2) ◽  
Author(s):  
Rony Chowdhury Ripan ◽  
Iqbal H. Sarker ◽  
Syed Md. Minhaz Hossain ◽  
Md. Musfique Anwar ◽  
Raza Nowrozy ◽  
...  

Sign in / Sign up

Export Citation Format

Share Document