Data-driven fault detection methods for detecting small-magnitude faults in anaerobic digestion process

2020 ◽  
Vol 81 (8) ◽  
pp. 1740-1748 ◽  
Author(s):  
Pezhman Kazemi ◽  
Jaume Giralt ◽  
Christophe Bengoa ◽  
Jean-Philippe Steyer

Abstract Early detection of small-magnitude faults in anaerobic digestion (AD) processes is a necessary step for preventing serious consequences later on. Since volatile fatty acid (VFA) accumulation is widely suggested as a process health indicator, a VFA soft sensor based on a support vector machine (SVM) was developed and used to generate residuals by comparing measured and predicted VFA. The residual signal was then applied to univariate statistical control charts, such as the cumulative sum (CUSUM) and squared prediction error (SPE) charts, to detect faults. A principal component analysis (PCA) model was also developed for comparison with this approach. The proposed framework showed excellent performance in detecting small-magnitude faults in the state parameters of AD processes.
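The residual-plus-control-chart idea can be illustrated with a minimal sketch of a two-sided tabular CUSUM applied to a soft-sensor residual. The function name `cusum_alarm`, the reference value `k`, the decision interval `h` and the synthetic residuals are assumptions made for illustration; none of them come from the paper.

```python
def cusum_alarm(residuals, target=0.0, k=0.25, h=2.0):
    """Return the index of the first CUSUM alarm, or None if no alarm occurs.

    target: expected residual under fault-free operation (assumed 0 here)
    k:      reference value (allowance); deviations smaller than k are ignored
    h:      decision interval; an alarm is raised when a cumulative sum exceeds h
    """
    c_pos = 0.0  # accumulates upward deviations from the target
    c_neg = 0.0  # accumulates downward deviations from the target
    for i, r in enumerate(residuals):
        c_pos = max(0.0, c_pos + (r - target - k))
        c_neg = max(0.0, c_neg + (target - k - r))
        if c_pos > h or c_neg > h:
            return i
    return None


# Synthetic residuals (illustrative only): 20 fault-free samples with small
# noise, followed by a sustained small shift of 0.5.
residuals = [0.125, -0.125] * 10 + [0.5] * 15
first_alarm = cusum_alarm(residuals)
print(first_alarm)
```

In this toy series the shift starts at index 20 and the alarm is raised at index 28. No single sample is large, but the persistent offset accumulates until it crosses `h`, which is why CUSUM-type charts are commonly chosen for small-magnitude faults.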

Processes ◽  
2020 ◽  
Vol 8 (1) ◽  
pp. 67 ◽  
Author(s):  
Pezhman Kazemi ◽  
Jean-Philippe Steyer ◽  
Christophe Bengoa ◽  
Josep Font ◽  
Jaume Giralt

The concentration of volatile fatty acids (VFAs) is one of the most important measurements for evaluating the performance of anaerobic digestion (AD) processes. In real-time applications, VFAs can be measured by dedicated sensors, which are still expensive and very sensitive to harsh environmental conditions. Moreover, these sensors usually have a delay that is undesirable for real-time monitoring. Because of these problems, data-driven soft sensors are very attractive alternatives. This study proposes different data-driven methods for estimating reliable VFA values. We evaluated random forest (RF), artificial neural network (ANN), extreme learning machine (ELM), support vector machine (SVM) and genetic programming (GP) models on synthetic data obtained from the International Water Association (IWA) Benchmark Simulation Model No. 2 (BSM2). The organic load to the AD in BSM2 was modified to simulate the behavior of an anaerobic co-digestion process. The prediction and generalization performances of the different models were also compared. This comparison showed that the GP soft sensor is more precise than the other soft sensors. In addition, model robustness was assessed to determine the performance of each model under different process states. It is also shown that, in addition to being robust, GP soft sensors are easy to implement and offer useful insight into the process by providing explicit equations.


2022 ◽  
pp. 146808742110707
Author(s):  
Aran Mohammad ◽  
Reza Rezaei ◽  
Christopher Hayduk ◽  
Thaddaeus Delebinski ◽  
Saeid Shahpouri ◽  
...  

The development of internal combustion engines is shaped by exhaust gas emissions legislation and the drive to increase performance. This calls for engine-out emission models that can be used to optimize engines for real driving emission controls. The prediction capability of physical and data-driven engine-out emission models is influenced by the system inputs, which are specified by the user; accuracy can improve as the number of inputs increases. At the same time, irrelevant inputs become more likely. These have only a weak functional relation to the emissions and can lead to overfitting. Alternatively, data-driven methods can be used to detect irrelevant and redundant inputs. In this work, thermodynamic states are modeled from 772 stationary test bench measurements of a commercial vehicle diesel engine. Afterward, 37 measured and modeled variables are fed into a data-driven dimensionality reduction. For this purpose, supervised learning approaches, such as lasso regression and the linear support vector machine, and unsupervised learning methods, such as principal component analysis and factor analysis, are applied to select and extract the relevant features. The selected and extracted features are used for regression by the support vector machine and the feedforward neural network to model the NOx, CO, HC, and soot emissions. This makes it possible to evaluate the modeling accuracy that results from the dimensionality reduction. Using the methods in this work, the 37 variables are reduced to 25, 22, 11, and 16 inputs for NOx, CO, HC, and soot emission modeling while maintaining the accuracy. The features selected using the lasso algorithm allow the regression models to learn more accurately than the features extracted through principal component analysis and factor analysis. The resulting test root-mean-square errors (RMSE_Te) for modeling NOx, CO, HC, and soot emissions are 19.22 ppm, 6.46 ppm, 1.29 ppm, and 0.06 FSN, respectively.
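How lasso regression discards irrelevant inputs can be shown with a small coordinate-descent sketch. It minimizes (1/(2n))·||y − Xw||² + alpha·||w||₁, and any input whose weight is driven exactly to zero counts as deselected. The function names, the value of `alpha` and the toy data are assumptions for illustration, not the setup used in the paper.

```python
def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, sweeps=100):
    """Plain coordinate descent for the lasso objective (no intercept)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(sweeps):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for row, target in zip(X, y):
                pred_without_j = sum(w[k] * row[k] for k in range(d) if k != j)
                rho += row[j] * (target - pred_without_j)
                z += row[j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z else 0.0
    return w


# Toy data (illustrative): the target depends only on the first input.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
weights = lasso_coordinate_descent(X, y, alpha=0.5)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print(weights, selected)
```

The second input receives a weight of exactly zero and is dropped. The surviving weight is shrunk from 2.0 to 1.5, which is the usual price of the L1 penalty.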


2018 ◽  
Vol 90 (2) ◽  
pp. 435-451 ◽  
Author(s):  
Xu Kang ◽  
Dechang Pi

Purpose: The purpose of this paper is to detect anomalies and faults in a spacecraft, investigate the trends of its telemetry parameters and evaluate its operating state in order to monitor the health of the spacecraft.

Design/methodology/approach: This paper proposes a data-driven method, empirical mode decomposition-sample entropy-principal component analysis (EMD-SE-PCA), for monitoring the health of the spacecraft. EMD is used to decompose the telemetry data and obtain the trend items, SE is used to calculate the sample entropies of the trend items and extract the characteristic data, and the squared prediction error and statistic contribution rate are analysed using PCA.

Findings: Experimental results indicate that the EMD-SE-PCA method can detect characteristic parameters that behave abnormally before an anomaly or fault occurs, can provide an early warning ahead of the anomaly or fault, and can summarise the contribution of each parameter more accurately than other fault detection methods.

Practical implications: The proposed EMD-SE-PCA method has a high level of accuracy and efficiency. It can be used to monitor the health of a spacecraft and to detect anomalies and faults in time to avoid them. The method could also be applied to monitoring the health of other equipment (e.g. attitude control and orbit control systems) in spacecraft and satellites.

Originality/value: The paper provides a data-driven method, EMD-SE-PCA, for practical health monitoring that can discover anomalies or faults in a timely and efficient manner and is useful for spacecraft health diagnosis.
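The sample entropy step can be sketched in a few lines. This follows the common textbook definition (Chebyshev distance between templates, self-matches excluded); the embedding dimension `m`, the absolute tolerance `r` and the toy series are assumptions, and the paper's exact settings are not given in the abstract.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of series x: -ln(A / B).

    B counts pairs of length-m templates within tolerance r of each other,
    A counts the same for length m + 1. Here r is an absolute tolerance;
    scaling it by the standard deviation of x is a common alternative.
    """
    n = len(x)

    def count_matches(length):
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf  # undefined for this series; treated as maximal irregularity
    return -math.log(a / b)


regular = [0.0, 1.0] * 20                    # perfectly periodic
irregular = [0.0, 1.0, 0.0, 1.0, 0.0, 0.0]   # pattern breaks at the end
print(sample_entropy(regular, r=0.1), sample_entropy(irregular, r=0.1))
```

The periodic series has zero sample entropy because every length-2 match extends to a length-3 match, while the series with a broken pattern scores ln 2. A rise in this value is the kind of change in regularity such a method can pick up.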


2014 ◽  
Vol 2014 ◽  
pp. 1-9
Author(s):  
Guoyang Yan ◽  
Jiangyuan Mei ◽  
Shen Yin ◽  
Hamid Reza Karimi

Fault detection is fundamental to many industrial applications. As systems grow more complex, the number of sensors increases, which makes traditional fault detection methods less efficient. Metric learning is an efficient way to relate feature vectors to the categories of instances. In this paper, we first propose a metric learning-based fault detection framework. In addition, a novel feature extraction method based on the wavelet transform is used to obtain the feature vectors from the detection signals. Experiments on the Tennessee Eastman (TE) chemical process datasets demonstrate that the proposed method performs better than existing methods such as principal component analysis (PCA) and Fisher discriminant analysis (FDA).
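As a rough illustration of wavelet-based feature extraction, the sketch below computes per-level detail energies with the Haar wavelet. The abstract does not say which wavelet or which features the authors use, so the choice of Haar, the energy features and the function names are all assumptions.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform; len(signal) must be even."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energy_features(signal, levels=2):
    """Detail-coefficient energy at each level, then the final approximation energy."""
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in current))
    return features


smooth = [1.0] * 8
spiky = [1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0]  # one abrupt disturbance
smooth_features = wavelet_energy_features(smooth)
spiky_features = wavelet_energy_features(spiky)
print(smooth_features, spiky_features)
```

The flat signal has no detail energy at either level, whereas the disturbance shows up in the detail energies. A feature vector of this kind is what a distance metric would then compare.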


2010 ◽  
Vol 40-41 ◽  
pp. 121-126
Author(s):  
Xin Zhang ◽  
Li Yang ◽  
Yan Zhang

This work studies multi-source information fusion technology on the basis of data-driven methods. The approach uses the online and offline data of the fusion system and does not rely on a mathematical model of the system, which avoids the problem of mechanism-based system modeling. Three methods, principal component analysis, rough set theory and support vector machine (SVM), are fused so that they complement one another. Through information processing and feature extraction on the system's input data, the most important information is projected into a lower-dimensional space to achieve knowledge reduction. Information fusion is realized at three levels: the data level, the feature level and the decision level. The example shows that the approach reduces computational complexity, reduces information loss in the fusion process and improves fusion accuracy.


2021 ◽  
Vol 11 (16) ◽  
pp. 7376
Author(s):  
Oscar Serradilla ◽  
Ekhi Zugasti ◽  
Julian Ramirez de Okariz ◽  
Jon Rodriguez ◽  
Urko Zurutuza

Predictive maintenance (PdM) has the potential to reduce industrial costs by anticipating failures and extending the working life of components. Factories now monitor their assets, and most of the collected data correspond to normal working conditions. Semi-supervised data-driven models are therefore relevant for enabling PdM by learning from assets’ data. However, the main challenges for their application in industry are achieving high accuracy in anomaly detection, diagnosing novel failures, and adapting to changing environmental and operational conditions (EOC). This article aims to tackle these challenges by experimenting with algorithms on press machine data from a production line. First, the performance of state-of-the-art and classic data-driven anomaly detection models is compared, including 2D autoencoder, null-space, principal component analysis (PCA), one-class support vector machine (OC-SVM), and extreme learning machine (ELM) algorithms. Then, diagnosis tools are developed based on the autoencoder’s latent space feature vector, including clustering and projection algorithms that cluster data from synthetic failure types in a semi-supervised manner. In addition, explainable artificial intelligence techniques made it possible to trace the autoencoder’s loss back to the input data in order to detect anomalous signals. Finally, transfer learning is applied to adapt autoencoders to changing EOC data from the same process. The data-driven techniques used in this work can be adapted to other industrial use cases, helping stakeholders gain trust and thus promoting the adoption of data-driven PdM systems in smart factories.


Author(s):  
Dongxiu Ou ◽  
Rui Xue ◽  
Ke Cui

Turnout systems on railways are crucial for safety protection and improvements in efficiency. Statistics show that the most common faults in railway systems are turnout system faults. Therefore, many railway systems have adopted the microcomputer monitoring system (MMS) to monitor turnout health and performance in real time. In practice, however, existing turnout fault diagnosis methods depend largely on human experience. In this paper, we propose a data-driven fault diagnosis method that uses monitoring data from point machines collected by the MMS. First, based on a derivative method, data features are extracted by segmenting the original sample. Then, we apply two methods for feature reduction: principal component analysis (PCA) and linear discriminant analysis (LDA). The results show that LDA performed better in the cases studied. A problem that cannot be overlooked is that the imbalance between rare fault samples and abundant normal samples reduces the accuracy of classic fault diagnosis models. To deal with this problem of imbalanced data, we propose a modified support vector machine (SVM) method. Finally, an experiment using real data collected from the Guangzhou Railway Line is presented, which demonstrates that our method is reliable and feasible for fault diagnosis. It can further help engineers perform timely repairs and maintenance work in the future.
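The abstract does not describe how the SVM is modified. One widely used way to counter class imbalance in general is to weight each class inversely to its frequency, and the sketch below shows that heuristic only. It is not the authors' method, and the label names and counts are invented for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1, so that misclassifying them costs
    more during training; abundant classes get weights below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical label set: many normal samples, few fault samples.
labels = ["normal"] * 8 + ["fault"] * 2
weights = balanced_class_weights(labels)
print(weights)
```

This is the same formula that scikit-learn applies for `class_weight="balanced"`, so the resulting dictionary could be passed to an SVM implementation that accepts per-class penalties.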


2013 ◽  
Vol 307 ◽  
pp. 433-436 ◽  
Author(s):  
Guang Zhou Diao ◽  
Li Ping Zhao ◽  
Yi Yong Yao

To improve product quality in manufacturing processes, a dynamic quality control method based on relation analysis is proposed. In this method, a dynamic regulated principal component analysis is constructed by introducing a discount factor to eliminate autocorrelation in the data, and the limits of the multiple control charts are calculated from squared prediction error (SPE) statistics. Then, a dynamic adjustment policy based on a support vector machine (SVM) is proposed, built on control chart pattern recognition. Finally, a case study is presented to verify the applicability of the proposed method.
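The SPE statistic itself is simple to compute once a PCA model is available: it is the squared distance between a sample and its projection onto the retained principal components. The sketch below assumes the loadings are already orthonormal and hard-codes a single loading vector and a control limit; both are hypothetical values chosen for illustration, not results from the paper.

```python
def spe(x, loadings):
    """Squared prediction error of sample x for a PCA model.

    loadings: list of orthonormal loading vectors (the retained components).
    x is assumed to be centered and scaled the same way as the training data.
    """
    reconstruction = [0.0] * len(x)
    for p in loadings:
        score = sum(pi * xi for pi, xi in zip(p, x))
        for j, pj in enumerate(p):
            reconstruction[j] += score * pj
    return sum((xi - ri) ** 2 for xi, ri in zip(x, reconstruction))


loadings = [[0.6, 0.8]]   # hypothetical single retained component (unit norm)
limit = 1.0               # hypothetical SPE control limit

in_model = [3.0, 4.0]     # lies along the retained component
off_model = [4.0, -3.0]   # orthogonal to the retained component

print(spe(in_model, loadings) > limit, spe(off_model, loadings) > limit)
```

A sample that follows the correlation structure captured by the model has an SPE near zero, while a sample that breaks that structure has a large SPE and exceeds the limit, even when its overall magnitude is the same.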


2015 ◽  
Vol 77 (1) ◽  
Author(s):  
Ban Mohammed Khammas ◽  
Alireza Monemi ◽  
Joseph Stephen Bassi ◽  
Ismahani Ismail ◽  
Sulaiman Mohd Nor ◽  
...  

Malware is a computer security problem that can morph to evade traditional detection methods based on known signature matching. Since new malware variants contain patterns that are similar to those in observed malware, machine learning techniques can be used to identify new malware. This work presents a comparative study of several feature selection methods with four different machine learning classifiers in the context of static malware detection based on n-gram analysis. The results show that Principal Component Analysis (PCA) feature selection combined with Support Vector Machine (SVM) classification gives the best classification accuracy using a minimum number of features.
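For readers unfamiliar with n-gram features, the sketch below counts overlapping byte n-grams in a binary blob. Whether the study uses byte or opcode n-grams, and which value of n, is not stated in the abstract, so the byte-level choice, `n=2` and the sample bytes are assumptions.

```python
from collections import Counter


def byte_ngrams(data, n=2):
    """Count overlapping n-grams (as bytes objects) in a bytes sequence."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


# Four illustrative bytes; a real sample would be the full file contents.
sample = bytes([0x90, 0x90, 0x90, 0xC3])
counts = byte_ngrams(sample, n=2)
print(counts)
```

Each distinct n-gram becomes one feature, so the feature space grows quickly with n. That growth is why a feature selection or reduction step is applied before the classifier.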

