Pipe Fault Prediction for Water Transmission Mains

Water ◽

10.3390/w12102861 ◽

2020 ◽

Vol 12 (10) ◽

pp. 2861

Author(s):

Ariel Gorenstein ◽

Meir Kalech ◽

Daniela Fuchs Hanusch ◽

Sharon Hassid

Keyword(s):

Prediction Model ◽

Real World ◽

Fault Prediction ◽

Data Driven ◽

Segmentation Method ◽

Reliable Prediction ◽

Rule Based ◽

Maintenance Strategies ◽

Water Transmission ◽

Proactive Maintenance

Every water supply network experiences thousands of bursts, breaks, leakages, and other failures each year. These failures waste a great amount of resources: not only must the waterlines be repaired, but water is also lost and the distribution service is interrupted. For that reason, many water facilities employ proactive maintenance strategies in their networks, replacing likely-to-fail pipes in advance to prevent failures. In this paper, we aim to establish a reliable prediction model that can accurately predict faults in waterlines prior to their occurrence. We propose a specific segmentation method for long transmission mains, as well as three data-driven models and one rule-based prediction model. We evaluate the models on a real-world waterline network in Israel, operated by the Mekorot company, using three common metrics. The results show that the data-driven algorithms outperform the rule-based model by at least 5% in each of the metrics. Additionally, their predictions become more accurate as they are trained with more data, but enhancing these data with geographically related features does not improve the accuracy further.

Coffee With a Hint of Data: Towards Using Data-Driven Approaches in Personalised Long-Term Interactions

Frontiers in Robotics and AI ◽

10.3389/frobt.2021.676814 ◽

2021 ◽

Vol 8 ◽

Author(s):

Bahar Irfan ◽

Mehdi Hellou ◽

Tony Belpaeme

Keyword(s):

Real World ◽

Poor Performance ◽

User Preferences ◽

Human Robot Interaction ◽

Data Driven ◽

Rule Based ◽

The Real ◽

Related Data ◽

Memory Network

While earlier research in human-robot interaction predominantly uses rule-based architectures for natural language interaction, these approaches are not flexible enough for long-term interactions in the real world due to the large variation in user utterances. In contrast, data-driven approaches map the user input to the agent output directly and hence provide more flexibility with these variations without requiring any set of rules. However, data-driven approaches are generally applied to single dialogue exchanges with a user and do not build up a memory over long-term conversation with different users, whereas long-term interactions require remembering users and their preferences incrementally and continuously, and recalling previous interactions with users to adapt and personalise the interactions, known as the lifelong learning problem. In addition, it is desirable to learn user preferences from a few samples of interactions (i.e., few-shot learning). These are known to be challenging problems in machine learning, while they are trivial for rule-based approaches, creating a trade-off between flexibility and robustness. Correspondingly, in this work, we present the text-based Barista Datasets generated to evaluate the potential of data-driven approaches in generic and personalised long-term human-robot interactions with simulated real-world problems, such as recognition errors, incorrect recalls and changes to the user preferences. Based on these datasets, we explore the performance and the underlying inaccuracies of the state-of-the-art data-driven dialogue models that are strong baselines in other domains of personalisation in single interactions, namely Supervised Embeddings, Sequence-to-Sequence, End-to-End Memory Network, Key-Value Memory Network, and Generative Profile Memory Network. The experiments show that while data-driven approaches are suitable for generic task-oriented dialogue and real-time interactions, no model performs sufficiently well to be deployed in personalised long-term interactions in the real world, because of their inability to learn and use new identities, and their poor performance in recalling user-related data.

The software fault prediction model based on the AltaRica language

2019 IEEE 3rd Information Technology, Networking, Electronic and Automation Control Conference (ITNEC) ◽

10.1109/itnec.2019.8729235 ◽

2019 ◽

Author(s):

Jingyu Song ◽

Bo Chen ◽

Xueliang Li ◽

Yi Yang ◽

Chang Liu ◽

...

Keyword(s):

Prediction Model ◽

Fault Prediction ◽

Software Fault Prediction ◽

Model Based ◽

Software Fault

Extraction of organic chemistry grammar from unsupervised learning of chemical reactions

Science Advances ◽

10.1126/sciadv.abe4166 ◽

2021 ◽

Vol 7 (15) ◽

pp. eabe4166

Author(s):

Philippe Schwaller ◽

Benjamin Hoover ◽

Jean-Louis Reymond ◽

Hendrik Strobelt ◽

Teodoro Laino

Keyword(s):

Organic Chemistry ◽

Neural Networks ◽

Chemical Synthesis ◽

Unsupervised Learning ◽

Chemical Reactions ◽

Data Driven ◽

Experimental Task ◽

Rule Based ◽

Atom Mapping ◽

Mapping Information

Humans use different domain languages to represent, explore, and communicate scientific concepts. During the last few hundred years, chemists compiled the language of chemical synthesis inferring a series of “reaction rules” from knowing how atoms rearrange during a chemical transformation, a process called atom-mapping. Atom-mapping is a laborious experimental task and, when tackled with computational methods, requires continuous annotation of chemical reactions and the extension of logically consistent directives. Here, we demonstrate that Transformer Neural Networks learn atom-mapping information between products and reactants without supervision or human labeling. Using the Transformer attention weights, we build a chemically agnostic, attention-guided reaction mapper and extract coherent chemical grammar from unannotated sets of reactions. Our method shows remarkable performance in terms of accuracy and speed, even for strongly imbalanced and chemically complex reactions with nontrivial atom-mapping. It provides the missing link between data-driven and rule-based approaches for numerous chemical reaction tasks.

Research on a novel data-driven aging estimation method for battery systems in real-world electric vehicles

Advances in Mechanical Engineering ◽

10.1177/16878140211027735 ◽

2021 ◽

Vol 13 (7) ◽

pp. 168781402110277

Author(s):

Yankai Hou ◽

Zhaosheng Zhang ◽

Peng Liu ◽

Chunbao Song ◽

Zhenpo Wang

Keyword(s):

Electric Vehicles ◽

Real World ◽

Regression Models ◽

Estimation Method ◽

Recursive Least Squares ◽

Data Driven ◽

Accurate Estimation ◽

Support Vector ◽

Battery Degradation ◽

Operational Data

Accurate estimation of the degree of battery aging is essential to ensure safe operation of electric vehicles. In this paper, using real-world vehicles and their operational data, a battery aging estimation method is proposed based on a dual-polarization equivalent circuit (DPEC) model and multiple data-driven models. The DPEC model and the forgetting factor recursive least-squares method are used to determine the battery system’s ohmic internal resistance, with outliers being filtered using boxplots. Furthermore, eight common data-driven models are used to describe the relationship between battery degradation and the factors influencing this degradation, and these models are analyzed and compared in terms of both estimation accuracy and computational requirements. The results show that the gradient descent tree regression, XGBoost regression, and light GBM regression models are more accurate than the other methods, with root mean square errors of less than 6.9 mΩ. The AdaBoost and random forest regression models are regarded as alternative groups because of their relative instability. The linear regression, support vector machine regression, and k-nearest neighbor regression models are not recommended because of poor accuracy or excessively high computational requirements. This work can serve as a reference for subsequent battery degradation studies based on real-time operational data.
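
The forgetting factor recursive least-squares step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified scalar model in which the measured voltage drop equals resistance times current, whereas the paper identifies the ohmic resistance within a dual-polarization equivalent circuit. The function name `rls_forgetting`, the forgetting factor 0.98, and the sample data are hypothetical.

```python
def rls_forgetting(xs, ys, lam=0.98, theta0=0.0, p0=1000.0):
    """Scalar recursive least squares with forgetting factor `lam`.

    Estimates theta in y = theta * x. Older samples are discounted
    geometrically by `lam` (0 < lam <= 1).
    """
    theta, p = theta0, p0
    for x, y in zip(xs, ys):
        gain = p * x / (lam + x * p * x)   # update gain
        theta += gain * (y - x * theta)    # correct by the prediction error
        p = (p - gain * x * p) / lam       # update (inflated) covariance
    return theta


# Hypothetical, noise-free data: a 0.05 ohm resistance (assumption).
currents = [10.0, 20.0, 15.0, 30.0, 25.0]          # amperes
voltage_drops = [0.05 * i for i in currents]       # volts
r_est = rls_forgetting(currents, voltage_drops)
print(f"estimated resistance: {r_est:.5f} ohm")
```

A forgetting factor below 1 lets the estimate track slow drift, such as resistance growth with aging, at the cost of more sensitivity to noise.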

Data-driven Energy Management Strategy for Plug-in Hybrid Electric Vehicles with Real-World Trip Information

IFAC-PapersOnLine ◽

10.1016/j.ifacol.2020.12.1070 ◽

2020 ◽

Vol 53 (2) ◽

pp. 14224-14229

Author(s):

Yongkeun Choi ◽

Jacopo Guanetti ◽

Scott Moura ◽

Francesco Borrelli

Keyword(s):

Energy Management ◽

Electric Vehicles ◽

Real World ◽

Management Strategy ◽

Hybrid Electric Vehicles ◽

Data Driven ◽

Energy Management Strategy ◽

Hybrid Electric

Data-driven graph drawing techniques with applications for conveyor systems

Journal of Mathematics in Industry ◽

10.1186/s13362-020-00092-2 ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Simone Göttlich ◽

Sven Spieckermann ◽

Stephan Stauber ◽

Andrea Storck

Keyword(s):

Real World ◽

Connected Graph ◽

Stress Function ◽

Graph Drawing ◽

Point Of View ◽

Data Driven ◽

Challenging Problem ◽

Conveyor System ◽

System Graph ◽

Real World Problems

The visualization of conveyor systems in the sense of a connected graph is a challenging problem. Starting from communication data provided by the IT system, graph drawing techniques are applied to generate an appealing layout of the conveyor system. From a mathematical point of view, the key idea is to use the concept of stress majorization to minimize a stress function over the positions of the nodes in the graph. In contrast to the existing literature, we have to take care of special features inspired by the real-world problems.
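
As a rough illustration of the stress majorization idea named in this abstract, the sketch below evaluates a standard stress function and applies the well-known localized majorization update, moving one node at a time. It is a generic textbook variant, not the authors' method: the weights 1/d², the three-node path graph, the starting positions, and the function names are all assumptions, and none of the conveyor-specific features the abstract alludes to are modeled.

```python
import math


def stress(pos, dist):
    """Weighted stress: sum over pairs of (layout gap - target d)^2 / d^2."""
    total = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            d = dist[i][j]
            total += (math.dist(pos[i], pos[j]) - d) ** 2 / d ** 2
    return total


def localized_majorization_sweep(pos, dist):
    """Update each node in turn by minimizing the majorizing function."""
    n = len(pos)
    for i in range(n):
        num_x = num_y = weight_sum = 0.0
        for j in range(n):
            if j == i:
                continue
            d = dist[i][j]
            w = 1.0 / d ** 2
            gap = math.dist(pos[i], pos[j])
            inv = 1.0 / gap if gap > 0 else 0.0  # skip coincident nodes
            num_x += w * (pos[j][0] + d * (pos[i][0] - pos[j][0]) * inv)
            num_y += w * (pos[j][1] + d * (pos[i][1] - pos[j][1]) * inv)
            weight_sum += w
        pos[i] = (num_x / weight_sum, num_y / weight_sum)
    return pos


# Hypothetical path graph 0 - 1 - 2 with graph-theoretic target distances.
dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
pos = [(0.0, 0.0), (0.3, 0.9), (0.5, 0.2)]
stress_before = stress(pos, dist)
for _ in range(50):
    localized_majorization_sweep(pos, dist)
stress_after = stress(pos, dist)
print(stress_before, stress_after)
```

Each single-node update minimizes an upper bound that touches the stress at the current layout, so the stress cannot increase from one sweep to the next.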

HISTORICAL RETE NETWORKS TO SUPPORT THE DEBUGGING OF FORWARD-CHAINING RULE-BASED PROGRAMS

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213093000059 ◽

1993 ◽

Vol 02 (01) ◽

pp. 47-70

Author(s):

SHARON M. TUTTLE ◽

CHRISTOPH F. EICK

Keyword(s):

Working Memory ◽

Data Driven ◽

Rule Based ◽

Historical Information ◽

Forward Chaining ◽

Changing Environments ◽

Time Performance ◽

Inference Network ◽

One Step ◽

Explanation System

Forward-chaining rule-based programs, being data-driven, can function in changing environments in which backward-chaining rule-based programs would have problems. However, debugging forward-chaining programs can be tedious; to debug a forward-chaining rule-based program, certain ‘historical’ information about the program run is needed. Programmers should be able to directly request such information, instead of having to rerun the program one step at a time or search a trace of run details. As a first step in designing an explanation system for answering such questions, this paper discusses how a forward-chaining program run’s ‘historical’ details can be stored in its Rete inference network, which is used to match rule conditions to working memory. This can be done without seriously affecting the network’s run-time performance. We call this generalization of the Rete network a historical Rete network. Various algorithms for maintaining this network are discussed, along with how it can be used during debugging. A debugging tool, MIRO, that incorporates these techniques is also discussed.

Rule-Based Data-Driven Analytics for Wide-Area Fault Detection Using Synchrophasor Data

IEEE Transactions on Industry Applications ◽

10.1109/tia.2016.2644621 ◽

2017 ◽

Vol 53 (3) ◽

pp. 1789-1798 ◽

Cited By ~ 19

Author(s):

Xiaodong Liang ◽

Scott A. Wallace ◽

Duc Nguyen

Keyword(s):

Fault Detection ◽

Data Driven ◽

Wide Area ◽

Rule Based

Generating real-world evidence from unstructured clinical notes to examine clinical utility of genetic tests: use case in BRCAness

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-020-01364-y ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Yiqing Zhao ◽

Saravut J. Weroha ◽

Ellen L. Goode ◽

Hongfang Liu ◽

Chen Wang

Keyword(s):

Targeted Therapy ◽

Data Quality ◽

Real World ◽

Genetic Information ◽

Genetic Data ◽

Real World Data ◽

Rule Based ◽

Clinical Notes ◽

Real World Evidence ◽

F Measure

Background: Next-generation sequencing provides comprehensive information about individuals’ genetic makeup and is commonplace in oncology clinical practice. However, the utility of genetic information in the clinical decision-making process has not been examined extensively from a real-world, data-driven perspective. Through mining real-world data (RWD) from clinical notes, we could extract patients’ genetic information and further associate treatment decisions with genetic information.
Methods: We proposed a real-world evidence (RWE) study framework that incorporates context-based natural language processing (NLP) methods and data quality examination before final association analysis. The framework was demonstrated in a Foundation-tested women cancer cohort (N = 196). Upon retrieval of patients’ genetic information using the NLP system, we assessed the completeness of genetic data captured in unstructured clinical notes according to a genetic data-model. We examined the distribution of different topics regarding BRCA1/2 throughout patients’ treatment process, and then analyzed the association between BRCA1/2 mutation status and the discussion/prescription of targeted therapy.
Results: We identified seven topics in the clinical context of genetic mentions including: Information, Evaluation, Insurance, Order, Negative, Positive, and Variants of unknown significance. Our rule-based system achieved a precision of 0.87, recall of 0.93 and F-measure of 0.91. Our machine learning system achieved a precision of 0.901, recall of 0.899 and F-measure of 0.9 for four-topic classification and a precision of 0.833, recall of 0.823 and F-measure of 0.82 for seven-topic classification. We found that in result-containing sentences, the capture of BRCA1/2 mutation information was 75%, but detailed variant information (e.g. variant types) was largely missing. Using cleaned RWD, significant associations were found between BRCA1/2 positive mutation and targeted therapies.
Conclusions: We demonstrated a framework to generate RWE using RWD from different clinical sources. The rule-based NLP system achieved the best performance for resolving contextual variability when extracting RWD from unstructured clinical notes. Data quality issues such as incompleteness and discrepancies exist; thus, manual data cleaning is needed before further analysis can be performed. Finally, we were able to use cleaned RWD to evaluate the real-world utility of genetic information to initiate a prescription of targeted therapy.
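
For readers unfamiliar with the metrics reported in this abstract, the sketch below shows how precision, recall, and the balanced F-measure are conventionally computed from counts of true positives, false positives, and false negatives. The counts are hypothetical and are not taken from the study, and the abstract does not state which F-measure variant or averaging scheme the authors used.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from raw classification counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 90 correct extractions, 10 spurious, 10 missed.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=10)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```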

A Data-Driven Heart Disease Prediction Model Through K-Means Clustering-Based Anomaly Detection

SN Computer Science ◽

10.1007/s42979-021-00518-7 ◽

2021 ◽

Vol 2 (2) ◽

Author(s):

Rony Chowdhury Ripan ◽

Iqbal H. Sarker ◽

Syed Md. Minhaz Hossain ◽

Md. Musfique Anwar ◽

Raza Nowrozy ◽

...

Keyword(s):

Heart Disease ◽

Anomaly Detection ◽

Prediction Model ◽

Data Driven ◽

Disease Prediction
