Fault Diagnosis of Rotating Machinery Based on Improved Self-Supervised Learning Method and Very Few Labeled Samples

Meirong Wei; Yan Liu; Tao Zhang; Ze Wang; Jiaming Zhu

doi:10.3390/s22010192

Sensors ◽

10.3390/s22010192 ◽

2021 ◽

Vol 22 (1) ◽

pp. 192

Author(s):

Meirong Wei ◽

Yan Liu ◽

Tao Zhang ◽

Ze Wang ◽

Jiaming Zhu

Keyword(s):

Fault Diagnosis ◽

Supervised Learning ◽

Data Transformation ◽

Training Data ◽

Superior Performance ◽

Learning Method ◽

Training Process ◽

Specific Data ◽

Diagnosis Model ◽

Fully Connected

Convolutional neural network (CNN)-based fault diagnosis methods have been widely adopted to extract representative features and classify fault modes because of their strong feature extraction capability. However, CNNs require a large number of labeled samples, and training them on a limited number of labeled samples may lead to overfitting. In this article, a novel ResNet-based method is developed to diagnose machine faults with very few samples. Specifically, data transformation combinations (DTCs) are designed based on mutual information. The selected DTC, which allows the 1-D ResNet to be trained quickly without increasing the amount of training data, can be applied at random to any batch of training data. Meanwhile, a self-supervised learning method called 1-D SimCLR is adopted to obtain an effective feature encoder, which can be optimized with very few unlabeled samples. Then, a fault diagnosis model named DTC-SimCLR is constructed by combining the selected data transformation combination, the obtained feature encoder, and a fully connected layer-based classifier. In DTC-SimCLR, the parameters of the feature encoder are fixed, and the classifier is trained with very few labeled samples. Two machine fault datasets, from a cutting tooth and a bearing, are used to evaluate the performance of DTC-SimCLR. Testing results show that DTC-SimCLR achieves superior performance and diagnostic accuracy with very few samples.
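
The contrastive objective behind SimCLR-style pretraining can be sketched as follows. This is a minimal NumPy illustration of the standard NT-Xent loss, assuming two augmented views of the same batch of signals have already been passed through an encoder; the function name `nt_xent_loss`, the temperature value, and the toy embeddings are assumptions for illustration and are not taken from the paper, whose 1-D variant and hyperparameters are not given in the abstract.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss for embeddings of the same N samples under two augmentations.

    z1, z2: arrays of shape (N, d). Row i of z1 and row i of z2 form a positive pair.
    The temperature default is an illustrative assumption.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)                  # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)      # unit-normalise rows
    sim = (z @ z.T) / temperature                         # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                        # a sample is never its own negative

    # Index of the positive partner for every row: i <-> i + N.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Row-wise log-softmax, computed stably.
    sim_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - sim_max - np.log(np.exp(sim - sim_max).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), pos].mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z1 = rng.normal(size=(8, 16))
    print(nt_xent_loss(z1, z1 + 0.01 * rng.normal(size=(8, 16))))
```

In a pipeline of the kind the abstract describes, an encoder trained with such a loss would then be frozen, and only a small classifier on top of it would be fitted to the few labeled samples.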

SEMI-SUPERVISED SEQUENCE CLASSIFICATION WITH HMMs

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001405004034 ◽

2005 ◽

Vol 19 (02) ◽

pp. 165-182 ◽

Cited By ~ 7

Author(s):

SHI ZHONG

Keyword(s):

Supervised Learning ◽

Learning Strategies ◽

Test Data ◽

Unlabeled Data ◽

Training Data ◽

Model Complexity ◽

Model Parameters ◽

Training Process ◽

Transductive Learning ◽

Model Training

Using unlabeled data to help supervised learning has become an increasingly attractive methodology and has proven effective in many applications. This paper applies semi-supervised classification algorithms, based on hidden Markov models, to classify sequences. For model-based classification, semi-supervised learning amounts to using both labeled and unlabeled data to train model parameters. We examine three different strategies for using labeled and unlabeled data in the model training process. These strategies differ in how and when labeled and unlabeled data contribute to the model training process. We also compare regular semi-supervised learning, where there are separate unlabeled training data and unlabeled test data, with transductive learning, where we do not differentiate between unlabeled training data and unlabeled test data. Our experimental results on synthetic and real EEG time series show that substantially improved classification accuracy can be achieved by these semi-supervised learning strategies. The effect of model complexity on semi-supervised learning is also studied in our experiments.

Teaching Semi-Supervised Classifier via Generalized Distillation

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/298 ◽

2018 ◽

Cited By ~ 2

Author(s):

Chen Gong ◽

Xiaojun Chang ◽

Meng Fang ◽

Jian Yang

Keyword(s):

Supervised Learning ◽

Error Bounds ◽

State Of The Art ◽

Training Data ◽

Training Process ◽

Rademacher Complexity ◽

Optimization Framework ◽

Specific Teaching ◽

Teaching Function ◽

Intelligent Teacher

Semi-Supervised Learning (SSL) is able to build a reliable classifier from very scarce labeled examples by properly utilizing the abundant unlabeled examples. However, existing SSL algorithms often yield unsatisfactory performance due to the lack of supervision information. To address this issue, this paper formulates SSL as a Generalized Distillation (GD) problem, which treats an existing SSL algorithm as a learner and introduces a teacher to guide the learner's training process. Specifically, the intelligent teacher holds the privileged knowledge that “explains” the training data but remains unknown to the learner, and the teacher should convey its rich knowledge to the imperfect learner through a specific teaching function. After that, the learner gains knowledge by “imitating” the output of the teaching function under an optimization framework. Therefore, the learner in our algorithm learns from both the teacher and the training data, so its output can be substantially distilled and enhanced. By deriving the Rademacher complexity and error bounds of the proposed algorithm, the usefulness of the introduced teacher is theoretically demonstrated. The superiority of our algorithm over related state-of-the-art methods has also been empirically demonstrated by experiments on different datasets with various sources of privileged knowledge.
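
The idea of a learner that fits both the labels and a teacher's output can be sketched with the commonly used generalized distillation objective. This is a minimal illustration, not the paper's algorithm: the abstract does not specify the teaching function, so the softened-softmax teacher targets, the imitation weight `lam`, the `temperature`, and all function names below are assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax with an optional temperature (higher = softer)."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(targets, probs, eps=1e-12):
    """Mean cross-entropy between target distributions and predicted probabilities."""
    return float(-(targets * np.log(probs + eps)).sum(axis=1).mean())

def distillation_loss(learner_logits, teacher_logits, labels_onehot,
                      lam=0.5, temperature=2.0):
    """Weighted sum of a label-fitting term and a teacher-imitation term.

    lam and temperature are illustrative assumptions, not values from the paper.
    """
    probs = softmax(learner_logits)
    soft_targets = softmax(teacher_logits, temperature)   # assumed teaching function
    hard_term = cross_entropy(labels_onehot, probs)       # learn from the training data
    soft_term = cross_entropy(soft_targets, probs)        # imitate the teacher
    return (1.0 - lam) * hard_term + lam * soft_term

if __name__ == "__main__":
    teacher = np.array([[4.0, 0.0], [0.0, 4.0]])
    labels = np.array([[1.0, 0.0], [0.0, 1.0]])
    print(distillation_loss(teacher, teacher, labels))
```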

VECA: A Method for Detecting Overfitting in Neural Networks (Student Abstract)

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i10.7167 ◽

2020 ◽

Vol 34 (10) ◽

pp. 13791-13792

Author(s):

Liangzhu Ge ◽

Yuexian Hou ◽

Yaju Jiang ◽

Shuai Yao ◽

Chao Yang

Keyword(s):

Neural Networks ◽

Strong Correlation ◽

Good Predictor ◽

Deep Neural Networks ◽

Training Data ◽

Training Process ◽

Generalization Performance ◽

Validation Set ◽

Fully Connected ◽

Fully Connected Networks

Despite their widespread applications, deep neural networks often tend to overfit the training data. Here, we propose a measure called VECA (Variance of Eigenvalues of Covariance matrix of Activation matrix) and demonstrate that VECA is a good predictor of networks' generalization performance during the training process. Experiments performed on fully connected networks and convolutional neural networks trained on benchmark image datasets show a strong correlation between test loss and VECA, which suggests that VECA can be calculated to estimate generalization performance without sacrificing training data for use as a validation set.
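
Read literally, the measure's name describes a three-step computation, which can be sketched as follows. This is a hedged interpretation only: the abstract does not say which layer's activations are used, how they are normalised, or how the covariance is estimated, so the function `veca` and its input layout (samples in rows, units in columns) are assumptions.

```python
import numpy as np

def veca(activations):
    """Variance of the eigenvalues of the covariance matrix of an activation matrix.

    activations: array of shape (n_samples, n_units), e.g. one layer's outputs
    over a batch. The layout is an assumption made for this sketch.
    """
    cov = np.cov(activations, rowvar=False)   # (n_units, n_units) covariance
    eigvals = np.linalg.eigvalsh(cov)         # covariance is symmetric, so eigvalsh applies
    return float(np.var(eigvals))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    isotropic = rng.normal(size=(2000, 5))
    stretched = isotropic * np.array([5.0, 1.0, 1.0, 1.0, 1.0])
    # Activations dominated by one direction have a more spread-out spectrum.
    print(veca(isotropic), veca(stretched))
```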

Identification of Fiducial Points in Serial Data

Journal of Dynamic Systems Measurement and Control ◽

10.1115/1.2896347 ◽

1991 ◽

Vol 113 (1) ◽

pp. 178-183

Author(s):

D. M. Auslander ◽

J. C. Griffin ◽

A. Mayya

Keyword(s):

Selection Process ◽

Training Data ◽

Training Process ◽

Data Types ◽

Specific Data ◽

Serial Data ◽

Weighted Score ◽

Correct Point ◽

Mathematical Justification

A method is described for fiducial point identification that can be tuned to specific data types using training set data having manually marked fiducial points. The role of the “expert” in the training process is limited to providing the correct point identification in the training data. No articulation of the mathematical justification for the choice is needed. The method is based on the calculation of a weighted score for each point in an unknown data record. The score is derived from a doubly normalized computation of values for a set of generic discriminant functions. Candidate points are identified by an order and selection process. Because of the multiple normalization and use of sorting for selection, the method is independent of scale or range of the data to be identified. Neither the training process nor the identification process requires any dimensional input, other than the identification of fiducial points for use in the training process. Examples are given using cardiac electrogram data and ultrasonic data.

EFFICIENT SUPERVISED LEARNING METHOD WITH AUTOMATIC CLASSIFICATION OF TRAINING DATA SET

Journal of Japan Society of Kansei Engineering ◽

10.5057/jjske.8.837 ◽

2009 ◽

Vol 8 (3) ◽

pp. 837-842

Author(s):

Riki SHIGEMATSU ◽

Toshikazu KATO

Keyword(s):

Supervised Learning ◽

Automatic Classification ◽

Training Data ◽

Learning Method ◽

Data Set

Energy Efficiency Solutions for Buildings: Automated Fault Diagnosis of Air Handling Units Using Generative Adversarial Networks

Energies ◽

10.3390/en12030527 ◽

2019 ◽

Vol 12 (3) ◽

pp. 527 ◽

Cited By ~ 14

Author(s):

Chaowen Zhong ◽

Ke Yan ◽

Yuting Dai ◽

Ning Jin ◽

Bing Lou

Keyword(s):

Energy Efficiency ◽

Fault Diagnosis ◽

Supervised Learning ◽

Real World ◽

Generative Adversarial Networks ◽

Training Process ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Air Handling Units ◽

Real World Datasets

Automated fault diagnosis (AFD) for various energy consumption components is one of the main topics in energy efficiency solutions. However, the lack of faulty samples in the training process remains a difficulty for data-driven AFD of heating, ventilation and air conditioning (HVAC) subsystems, such as air handling units (AHUs). Existing works show that semi-supervised learning can effectively alleviate the issue by iteratively inserting newly tested faulty data samples into the training pool when the same fault happens again. However, a research gap exists between theoretical AFD algorithms and real-world applications. First, for real-world AFD applications, it is hard to predict when the same fault will happen again. Second, the training set must be pre-defined and fixed before being packed into the building management system (BMS) for automatic HVAC fault diagnosis. The semi-supervised learning process of iteratively absorbing testing data into the training pool can therefore be irrelevant for industrial use of AFD methods. The generative adversarial network (GAN) is well known as an unsupervised learning technique that can enrich the training pool with fake samples that are close to real faulty samples. In this study, a hybrid GAN is proposed that combines the Wasserstein GAN with traditional classifiers to perform fault diagnosis, mimicking real-world scenarios with limited faulty samples in the training process. Experimental results on real-world datasets demonstrate the effectiveness of the proposed approach for fault diagnosis of AHU subsystems.
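
The Wasserstein GAN component mentioned above is trained with two simple objectives, sketched here in NumPy. This shows only the standard Wasserstein losses on already-computed critic scores; the function names are hypothetical, and the Lipschitz constraint on the critic (weight clipping or a gradient penalty), the networks themselves, and the paper's classifier stage are all omitted.

```python
import numpy as np

def critic_loss(scores_real, scores_fake):
    """Standard WGAN critic loss: minimising it pushes real scores up and fake scores down."""
    return float(np.mean(scores_fake) - np.mean(scores_real))

def generator_loss(scores_fake):
    """Standard WGAN generator loss: minimising it raises the critic's score on fake samples."""
    return float(-np.mean(scores_fake))

if __name__ == "__main__":
    # Toy critic outputs for real faulty samples and generated (fake) faulty samples.
    real = np.array([2.0, 3.0])
    fake = np.array([-1.0, 0.0])
    print(critic_loss(real, fake), generator_loss(fake))
```

Once such a generator produces samples the critic scores like real faulty data, those samples could be added to the training pool of a conventional classifier, which is the role the abstract assigns to the GAN.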

Hierarchical Classification of Urban ALS Data by Using Geometry and Intensity Information

Sensors ◽

10.3390/s19204583 ◽

2019 ◽

Vol 19 (20) ◽

pp. 4583 ◽

Cited By ~ 1

Author(s):

Xiaoqiang Liu ◽

Yanming Chen ◽

Shuyi Li ◽

Liang Cheng ◽

Manchun Li

Keyword(s):

Supervised Learning ◽

Laser Scanning ◽

Large Scale ◽

Three Dimensional ◽

Hierarchical Classification ◽

Training Data ◽

Classification Model ◽

Learning Method ◽

Intensity Information

Airborne laser scanning (ALS) can acquire both geometry and intensity information of geo-objects, which is important in mapping a large-scale three-dimensional (3D) urban environment. However, the intensity information recorded by ALS varies with flight height and atmospheric attenuation, which decreases the robustness of a trained supervised classifier. This paper proposes a hierarchical classification method that uses the geometry and intensity information of urban ALS data separately. The method uses supervised learning for the stable geometry information and unsupervised learning for the fluctuating intensity information. The experimental results show that the proposed method can utilize the intensity information effectively in three respects. (1) The proposed method improves classification accuracy by using intensity. (2) When the ALS data to be classified are acquired under the same conditions as the training data, the proposed method performs as well as the supervised learning method. (3) When the ALS data to be classified are acquired under different conditions from the training data, the proposed method performs better than the supervised learning method. Therefore, the classification model derived from the proposed method can be transferred to other ALS data whose intensity is inconsistent with the training data. Furthermore, the proposed method can contribute to the hierarchical use of other ALS information, such as multi-spectral information.

Evaluation of automated cephalometric analysis based on the latest deep learning method

The Angle Orthodontist ◽

10.2319/021220-100.1 ◽

2021 ◽

Author(s):

Hye-Won Hwang ◽

Jun-Ho Moon ◽

Min-Gyu Kim ◽

Richard E. Donatelli ◽

Shin-Jae Lee

Keyword(s):

Deep Learning ◽

Test Data ◽

Training Data ◽

Superior Performance ◽

Cephalometric Analysis ◽

Data Sets ◽

Learning Method ◽

Test Results ◽

Classification Rate ◽

Data Set

Objectives: To compare an automated cephalometric analysis based on the latest deep learning method of automatically identifying cephalometric landmarks (AI) with previously published AI according to the test style of the worldwide AI challenges at the International Symposium on Biomedical Imaging conferences held by the Institute of Electrical and Electronics Engineers (IEEE ISBI). Materials and Methods: This latest AI was developed by using a total of 1983 cephalograms as training data. In the training procedures, a modification of a contemporary deep learning method, YOLO version 3 algorithm, was applied. Test data consisted of 200 cephalograms. To follow the same test style of the AI challenges at IEEE ISBI, a human examiner manually identified the IEEE ISBI-designated 19 cephalometric landmarks, both in training and test data sets, which were used as references for comparison. Then, the latest AI and another human examiner independently detected the same landmarks in the test data set. The test results were compared by the measures that appeared at IEEE ISBI: the success detection rate (SDR) and the success classification rates (SCR). Results: SDR of the latest AI in the 2-mm range was 75.5% and SCR was 81.5%. These were greater than any other previous AIs. Compared to the human examiners, AI showed a superior success classification rate in some cephalometric analysis measures. Conclusions: This latest AI seems to have superior performance compared to previous AI methods. It also seems to demonstrate cephalometric analysis comparable to human examiners.
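
The success detection rate can be sketched as the share of predicted landmarks that fall within a given radius of the reference landmark. This is a minimal illustration under stated assumptions: coordinates are taken to be in millimetres already, the boundary is treated as inclusive, and the function name and toy coordinates are hypothetical rather than taken from the study.

```python
import numpy as np

def success_detection_rate(pred_mm, ref_mm, radius_mm=2.0):
    """Fraction of landmarks whose Euclidean error is within radius_mm.

    pred_mm, ref_mm: arrays of shape (..., 2) holding landmark coordinates in mm.
    Treating the radius as inclusive (<=) is an assumption of this sketch.
    """
    dist = np.linalg.norm(pred_mm - ref_mm, axis=-1)
    return float((dist <= radius_mm).mean())

if __name__ == "__main__":
    ref = np.zeros((4, 2))
    pred = np.array([[0.0, 1.0], [3.0, 0.0], [1.0, 1.0], [0.0, 2.5]])
    # Errors are 1.0, 3.0, about 1.41 and 2.5 mm, so two of four landmarks count.
    print(success_detection_rate(pred, ref))
```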

The Lateral Conflict Risk Assessment for Low-altitude Training Airspace using Weakly Supervised Learning Method

Intelligent Automation & Soft Computing ◽

10.31209/2018.100000027 ◽

2018 ◽

Vol 24 (3) ◽

pp. 603-611

Author(s):

Kaijun Xu ◽

Xueting Chen ◽

Yusheng Yao ◽

Shanshan Li

Keyword(s):

Risk Assessment ◽

Supervised Learning ◽

Altitude Training ◽

Learning Method ◽

Weakly Supervised Learning ◽

Low Altitude ◽

Weakly Supervised ◽

Conflict Risk

An Effective Perturbation Based Semi-Supervised Learning Method for Sound Event Detection

10.21437/interspeech.2020-2329 ◽

2020 ◽

Author(s):

Xu Zheng ◽

Yan Song ◽

Jie Yan ◽

Li-Rong Dai ◽

Ian McLoughlin ◽

...

Keyword(s):

Supervised Learning ◽

Event Detection ◽

Learning Method ◽

Sound Event ◽

Sound Event Detection
