Data-driven fault detection methods for detecting small-magnitude faults in anaerobic digestion process

Pezhman Kazemi; Jaume Giralt; Christophe Bengoa; Jean-Philippe Steyer

doi:10.2166/wst.2020.026

Water Science & Technology ◽

10.2166/wst.2020.026 ◽

2020 ◽

Vol 81 (8) ◽

pp. 1740-1748 ◽

Cited By ~ 2

Author(s):

Pezhman Kazemi ◽

Jaume Giralt ◽

Christophe Bengoa ◽

Jean-Philippe Steyer

Keyword(s):

Anaerobic Digestion ◽

Control Charts ◽

Volatile Fatty Acids ◽

Principal Component ◽

Data Driven ◽

Detection Methods ◽

Support Vector ◽

Statistical Control ◽

Excellent Performance ◽

Small Magnitude

Early detection of small-magnitude faults in anaerobic digestion (AD) processes is essential for preventing serious consequences later on. Since volatile fatty acid (VFA) accumulation is widely suggested as a process health indicator, a VFA soft sensor based on a support vector machine (SVM) was developed and used to generate residuals by comparing measured and predicted VFA. The residual signal was fed to univariate statistical control charts, such as cumulative sum (CUSUM) and squared prediction error (SPE), to detect the faults. A principal component analysis (PCA) model was also developed for comparison with this approach. The proposed framework showed excellent performance in detecting small-magnitude faults in the state parameters of AD processes.
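The residual-plus-control-chart idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the soft-sensor predictions are assumed to be available already, the function name `cusum_alarms` is hypothetical, and the slack `k` and threshold `h` are common textbook defaults rather than values from the paper.

```python
from statistics import mean, stdev


def cusum_alarms(residuals, baseline_n, k=0.5, h=5.0):
    """Two-sided tabular CUSUM on a residual sequence.

    residuals  : measured VFA minus soft-sensor prediction, one per sample.
    baseline_n : number of leading samples assumed fault-free; they are used
                 to estimate the in-control mean and standard deviation.
    k, h       : slack and decision threshold in units of that standard
                 deviation (assumed defaults, not taken from the paper).
    Returns the indices where either cumulative sum exceeds h.
    """
    mu = mean(residuals[:baseline_n])
    sigma = stdev(residuals[:baseline_n])
    upper = lower = 0.0
    alarms = []
    for i, r in enumerate(residuals):
        z = (r - mu) / sigma                # standardized residual
        upper = max(0.0, upper + z - k)     # accumulates upward drift
        lower = max(0.0, lower - z - k)     # accumulates downward drift
        if upper > h or lower > h:
            alarms.append(i)
    return alarms


if __name__ == "__main__":
    # Hypothetical residuals: 20 fault-free samples, then a small upward shift.
    normal = [0.1, -0.1] * 10
    shifted = [0.2, 0.0] * 20
    print(cusum_alarms(normal + shifted, baseline_n=20))
```

Because the chart accumulates deviations over time, a shift of about one standard deviation, which a single-sample threshold would miss, still triggers an alarm after a few samples.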

Robust Data-Driven Soft Sensors for Online Monitoring of Volatile Fatty Acids in Anaerobic Digestion Processes

Processes ◽

10.3390/pr8010067 ◽

2020 ◽

Vol 8 (1) ◽

pp. 67 ◽

Cited By ~ 3

Author(s):

Pezhman Kazemi ◽

Jean-Philippe Steyer ◽

Christophe Bengoa ◽

Josep Font ◽

Jaume Giralt

Keyword(s):

Fatty Acids ◽

Anaerobic Digestion ◽

Real Time ◽

Volatile Fatty Acids ◽

Synthetic Data ◽

Data Driven ◽

Organic Load ◽

Support Vector ◽

Soft Sensors ◽

Artificial Neural Network (ANN)

The concentration of volatile fatty acids (VFAs) is one of the most important measurements for evaluating the performance of anaerobic digestion (AD) processes. In real-time applications, VFAs can be measured by dedicated sensors, which are still expensive and very sensitive to harsh environmental conditions. Moreover, sensors usually have a delay that is undesirable for real-time monitoring. Because of these problems, data-driven soft sensors are very attractive alternatives. This study proposes different data-driven methods for estimating reliable VFA values. We evaluated random forest (RF), artificial neural network (ANN), extreme learning machine (ELM), support vector machine (SVM) and genetic programming (GP) models on synthetic data obtained from the International Water Association (IWA) Benchmark Simulation Model No. 2 (BSM2). The organic load to the AD in BSM2 was modified to simulate the behavior of an anaerobic co-digestion process. The prediction and generalization performances of the different models were also compared. This comparison showed that the GP soft sensor is more precise than the other soft sensors. In addition, model robustness was assessed to determine the performance of each model under different process states. It is also shown that, in addition to their robustness, GP soft sensors are easy to implement and provide useful insights into the process by providing explicit equations.

Physical-oriented and machine learning-based emission modeling in a diesel compression ignition engine: Dimensionality reduction and regression

International Journal of Engine Research ◽

10.1177/14680874211070736 ◽

2022 ◽

pp. 146808742110707

Author(s):

Aran Mohammad ◽

Reza Rezaei ◽

Christopher Hayduk ◽

Thaddaeus Delebinski ◽

Saeid Shahpouri ◽

...

Keyword(s):

Principal Component Analysis ◽

Support Vector Machine ◽

Factor Analysis ◽

Dimensionality Reduction ◽

Principal Component ◽

Component Analysis ◽

Data Driven ◽

Support Vector ◽

Emission Models ◽

Emission Modeling

The development of internal combustion engines is shaped by exhaust gas emissions legislation and the drive to increase performance. This calls for engine-out emission models that can be used to optimize engines for real driving emission controls. The prediction capability of physics-based and data-driven engine-out emission models depends on the system inputs, which are specified by the user; accuracy can improve as the number of inputs increases. However, irrelevant inputs then become more likely; these have a weak functional relation to the emissions and can lead to overfitting. Alternatively, data-driven methods can be used to detect irrelevant and redundant inputs. In this work, thermodynamic states are modeled based on 772 steady-state test bench measurements from a commercial vehicle diesel engine. Afterward, 37 measured and modeled variables are fed into a data-driven dimensionality reduction. For this purpose, supervised learning approaches, such as lasso regression and the linear support vector machine, and unsupervised learning methods, such as principal component analysis and factor analysis, are applied to select and extract the relevant features. The selected and extracted features are used for regression by the support vector machine and the feedforward neural network to model the NOx, CO, HC, and soot emissions. This enables an evaluation of the modeling accuracy resulting from the dimensionality reduction. Using the methods in this work, the 37 variables are reduced to 25, 22, 11, and 16 inputs for NOx, CO, HC, and soot emission modeling, respectively, while maintaining the accuracy. The features selected using the lasso algorithm allow the regression models to learn more accurately than the features extracted through principal component analysis and factor analysis. This results in test errors (RMSE_Te) for modeling NOx, CO, HC, and soot emissions of 19.22 ppm, 6.46 ppm, 1.29 ppm, and 0.06 FSN, respectively.

A data-driven method of health monitoring for spacecraft

Aircraft Engineering and Aerospace Technology ◽

10.1108/aeat-08-2016-0130 ◽

2018 ◽

Vol 90 (2) ◽

pp. 435-451 ◽

Cited By ~ 1

Author(s):

Xu Kang ◽

Dechang Pi

Keyword(s):

Health Monitoring ◽

Attitude Control ◽

Principal Component ◽

Data Driven ◽

Detection Methods ◽

Content Type ◽

Orbit Control ◽

Mode Decomposition ◽

Squared Prediction Error ◽

PCA Method

Purpose: The purpose of this paper is to detect anomalies and faults in a spacecraft, investigate the various trends of telemetry parameters and evaluate the operating state of the spacecraft in order to monitor its health.

Design/methodology/approach: This paper proposes a data-driven method (empirical mode decomposition-sample entropy-principal component analysis [EMD-SE-PCA]) for monitoring the health of the spacecraft. EMD is used to decompose telemetry data and obtain the trend items, SE is used to calculate the sample entropies of the trend items and extract the characteristic data, and the squared prediction error and statistic contribution rate are analysed using PCA to monitor the health of the spacecraft.

Findings: Experimental results indicate that the EMD-SE-PCA method can detect characteristic parameters that behave abnormally before an anomaly or fault occurs, can provide an early warning before the anomaly or fault appears, and can summarise the contribution of each parameter more accurately than other fault detection methods.

Practical implications: The proposed EMD-SE-PCA method has a high level of accuracy and efficiency. It can be used to monitor the health of a spacecraft and to detect anomalies and faults so that they can be avoided in a timely and efficient manner. The EMD-SE-PCA method could also be applied to monitoring the health of other equipment (e.g. attitude control and orbit control systems) in spacecraft and satellites.

Originality/value: The paper provides a data-driven method, EMD-SE-PCA, for practical health monitoring, which can discover anomalies or faults in a timely and efficient manner and is very useful for spacecraft health diagnosis.
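The sample entropy (SE) step named in this abstract can be sketched as follows. This is a generic, textbook-style sample entropy and not the authors' code: the function name `sample_entropy`, the embedding length `m=2`, and the absolute tolerance `r=0.2` are assumptions (in practice `r` is often set relative to the standard deviation of the series), and the EMD and PCA stages are not shown.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """Sample entropy of a 1-D sequence.

    Counts pairs of templates that stay within tolerance r (Chebyshev
    distance) for length m and for length m + 1, then returns
    -ln(matches_{m+1} / matches_m). Lower values mean a more regular signal.
    Returns infinity when either count is zero.
    """
    n = len(series)

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)
```

A strictly periodic sequence yields an entropy of zero under this definition, while an irregular sequence yields a larger value, which is what makes the statistic usable as a compact feature of each trend item.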

Data-Driven Fault Classification for Non-Inverting Buck–Boost DC–DC Power Converters Based on Expectation Maximisation Principal Component Analysis and Support Vector Machine Approaches

10.1109/peas53589.2021.9628697 ◽

2021 ◽

Author(s):

Yichuan Fu ◽

Zhiwei Gao ◽

Haimeng Wu ◽

Xiuxia Yin ◽

Aihua Zhang

Keyword(s):

Principal Component Analysis ◽

Support Vector Machine ◽

Power Converters ◽

Principal Component ◽

Component Analysis ◽

Data Driven ◽

Fault Classification ◽

Support Vector ◽

DC Power ◽

Expectation Maximisation

Metric Learning Method Aided Data-Driven Design of Fault Detection Systems

Mathematical Problems in Engineering ◽

10.1155/2014/974758 ◽

2014 ◽

Vol 2014 ◽

pp. 1-9

Author(s):

Guoyang Yan ◽

Jiangyuan Mei ◽

Shen Yin ◽

Hamid Reza Karimi

Keyword(s):

Fault Detection ◽

Feature Vector ◽

Metric Learning ◽

Principal Component ◽

Industrial Applications ◽

Data Driven ◽

Detection Methods ◽

Feature Extraction Method ◽

Detection Systems ◽

The Relationship

Fault detection is fundamental to many industrial applications. As system complexity grows, the number of sensors increases, which makes traditional fault detection methods less efficient. Metric learning is an efficient way to relate feature vectors to the categories of instances. In this paper, we first propose a metric learning-based fault detection framework. In addition, a novel feature extraction method based on the wavelet transform is used to obtain the feature vector from detection signals. Experiments on Tennessee Eastman (TE) chemical process datasets demonstrate that the proposed method performs better than existing methods such as principal component analysis (PCA) and Fisher discriminant analysis (FDA).

Multi-Source Information Fusion Based on Data Driven

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.40-41.121 ◽

2010 ◽

Vol 40-41 ◽

pp. 121-126

Author(s):

Xin Zhang ◽

Li Yang ◽

Yan Zhang

Keyword(s):

Information Fusion ◽

Rough Set Theory ◽

Dimensional Space ◽

System Modeling ◽

Principal Component ◽

Data Driven ◽

Support Vector ◽

Source Information ◽

Knowledge Reduction ◽

Lower Dimensional Space

Taking data-driven methods as the theoretical basis, this work studies multi-source information fusion technology. The approach uses the online and offline data of the fusion system and does not rely on a mathematical model of the system, which avoids the problems of mechanism-based system modeling. Principal component analysis, rough set theory and the support vector machine (SVM) are fused so that the three methods complement one another; through information processing and feature extraction on the system's input data, the most important information is mapped to a lower-dimensional space, achieving knowledge reduction. Information fusion is realized at three levels: the data level, the feature level and the decision level. The example indicates that the approach reduces computational complexity, reduces information loss in the fusion process and improves fusion accuracy.

Adaptable and Explainable Predictive Maintenance: Semi-Supervised Deep Learning for Anomaly Detection and Diagnosis in Press Machine Data

Applied Sciences ◽

10.3390/app11167376 ◽

2021 ◽

Vol 11 (16) ◽

pp. 7376

Author(s):

Oscar Serradilla ◽

Ekhi Zugasti ◽

Julian Ramirez de Okariz ◽

Jon Rodriguez ◽

Urko Zurutuza

Keyword(s):

Anomaly Detection ◽

Null Space ◽

Model Performance ◽

Principal Component ◽

Predictive Maintenance ◽

Data Driven ◽

Support Vector ◽

Operational Conditions ◽

Detection Model ◽

Cluster Data

Predictive maintenance (PdM) has the potential to reduce industrial costs by anticipating failures and extending the working life of components. Nowadays, factories monitor their assets, and most of the collected data correspond to correct working conditions. Semi-supervised data-driven models are therefore relevant for enabling PdM by learning from assets’ data. However, their main challenges for application in industry are achieving high accuracy in anomaly detection, diagnosing novel failures, and adapting to changing environmental and operational conditions (EOC). This article aims to tackle these challenges by experimenting with algorithms on press machine data from a production line. Initially, the performance of state-of-the-art and classic data-driven anomaly detection models is compared, including 2D autoencoder, null-space, principal component analysis (PCA), one-class support vector machine (OC-SVM), and extreme learning machine (ELM) algorithms. Then, diagnosis tools are developed on top of the autoencoder’s latent space feature vector, including clustering and projection algorithms that cluster data of synthetic failure types in a semi-supervised manner. In addition, explainable artificial intelligence techniques make it possible to trace the autoencoder’s loss back to the input data in order to detect anomalous signals. Finally, transfer learning is applied to adapt autoencoders to changing EOC data of the same process. The data-driven techniques used in this work can be adapted to address other industrial use cases, helping stakeholders gain trust and thus promoting the adoption of data-driven PdM systems in smart factories.

A Data-Driven Fault Diagnosis Method for Railway Turnouts

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/0361198119837222 ◽

2019 ◽

Vol 2673 (4) ◽

pp. 448-457 ◽

Cited By ~ 9

Author(s):

Dongxiu Ou ◽

Rui Xue ◽

Ke Cui

Keyword(s):

Fault Diagnosis ◽

Imbalanced Data ◽

Principal Component ◽

Real Data ◽

Feature Reduction ◽

Data Driven ◽

Support Vector ◽

Linear Discriminant ◽

Railway Line ◽

Diagnosis Method

Turnout systems on railways are crucial for safety protection and improvements in efficiency. Statistics show that the most common faults in railway systems are turnout system faults. Therefore, many railway systems have adopted the microcomputer monitoring system (MMS) to monitor their health and performance in real time. However, in practice, existing turnout fault diagnosis methods depend largely on human experience. In this paper, we propose a data-driven fault diagnosis method that uses monitoring data from point machines collected by the MMS. First, based on a derivative method, data features are extracted by segmenting the original sample. Then, we apply two methods for feature reduction: principal component analysis (PCA) and linear discriminant analysis (LDA). The results show that LDA performed better in the cases studied. A problem that cannot be overlooked is that the imbalance between rare fault samples and abundant normal samples reduces the accuracy of classic fault diagnosis models. To deal with this problem of imbalanced data, we propose a modified support vector machine (SVM) method. Finally, an experiment using real data collected from the Guangzhou Railway Line is presented, which demonstrates that our method is reliable and feasible for fault diagnosis. It can further assist engineers in performing timely repairs and maintenance work in the future.

The Dynamic Quality Control Method Based on Relation Analysis

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.307.433 ◽

2013 ◽

Vol 307 ◽

pp. 433-436 ◽

Cited By ~ 1

Author(s):

Guang Zhou Diao ◽

Li Ping Zhao ◽

Yi Yong Yao

Keyword(s):

Quality Control ◽

Control Charts ◽

Control Method ◽

Principal Component ◽

Support Vector ◽

Multiple Control ◽

Quality Control Method ◽

Relation Analysis ◽

Dynamic Quality ◽

Control Chart Pattern

To improve product quality in manufacturing processes, a dynamic quality control method based on relation analysis is proposed. In this method, a dynamic regulated principal component analysis is constructed by introducing a discount factor to eliminate the autocorrelation in the data, and the limit of the multiple control charts is calculated from squared prediction error (SPE) statistics. Then, a dynamic adjustment policy based on a support vector machine (SVM) is proposed, using control chart pattern recognition. Finally, a case study is presented to verify the applicability of the proposed method.
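As a rough illustration of the squared prediction error (SPE) statistic mentioned here, the sketch below computes SPE for one sample against a plain PCA model. It assumes the training mean and orthonormal loading vectors come from a model fitted elsewhere; the function name `spe` is hypothetical, and the discount-factor variant and the control-limit calculation described in the abstract are not reproduced.

```python
def spe(sample, mean, loadings):
    """Squared prediction error of one sample against a PCA model.

    sample   : list of floats, one value per monitored variable.
    mean     : training mean of each variable (assumed given).
    loadings : list of orthonormal principal directions, each a list of
               floats with the same length as the sample (assumed given).
    Returns the squared norm of the part of the centered sample that the
    retained components cannot reconstruct.
    """
    centered = [x - m for x, m in zip(sample, mean)]
    reconstructed = [0.0] * len(centered)
    for direction in loadings:
        # Project onto this component, then map the score back.
        score = sum(c * w for c, w in zip(centered, direction))
        for i, w in enumerate(direction):
            reconstructed[i] += score * w
    return sum((c - r) ** 2 for c, r in zip(centered, reconstructed))
```

A sample lying in the retained subspace gives an SPE near zero, while a sample that breaks the learned correlation between variables gives a large SPE, which is then compared against a control limit.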

FEATURE SELECTION AND MACHINE LEARNING CLASSIFICATION FOR MALWARE DETECTION

Jurnal Teknologi ◽

10.11113/jt.v77.3558 ◽

2015 ◽

Vol 77 (1) ◽

Cited By ~ 13

Author(s):

Ban Mohammed Khammas ◽

Alireza Monemi ◽

Joseph Stephen Bassi ◽

Ismahani Ismail ◽

Sulaiman Mohd Nor ◽

...

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Computer Security ◽

Malware Detection ◽

Principal Component ◽

Machine Learning Techniques ◽

Detection Methods ◽

Support Vector ◽

Machine Learning Classification ◽

Minimum Number

Malware is a computer security problem that can morph to evade traditional detection methods based on known signature matching. Since new malware variants contain patterns that are similar to those in observed malware, machine learning techniques can be used to identify new malware. This work presents a comparative study of several feature selection methods with four different machine learning classifiers in the context of static malware detection based on n-gram analysis. The results show that Principal Component Analysis (PCA) feature selection combined with Support Vector Machine (SVM) classification gives the best classification accuracy using a minimum number of features.
