An Imbalanced Data Handling Framework for Industrial Big Data Using a Gaussian Process Regression-Based Generative Adversarial Network

Eunseo Oh; Hyunsoo Lee

doi:10.3390/sym12040669

Symmetry ◽

10.3390/sym12040669 ◽

2020 ◽

Vol 12 (4) ◽

pp. 669 ◽

Cited By ~ 1

Author(s):

Eunseo Oh ◽

Hyunsoo Lee

Keyword(s):

Big Data ◽

Gaussian Process ◽

Missing Values ◽

Gaussian Process Regression ◽

Estimation Methods ◽

Data Handling ◽

Generative Adversarial Network ◽

Data Set ◽

Adversarial Network ◽

Industrial Big Data

Developments in the industrial Internet of Things (IIoT) and big data technologies have made it possible to collect large amounts of meaningful industrial process and quality data. The gathered data are analyzed using contemporary statistical methods and machine learning techniques, and the extracted knowledge can then be used for predictive maintenance or prognostic health management. However, it is difficult to gather complete data due to several issues in IIoT, such as devices breaking down, running out of battery, or undergoing scheduled maintenance. Data with missing values are often ignored, as they may contain insufficient information from which to draw conclusions. To overcome these issues, we propose a novel, effective missing data handling mechanism based on symmetry principles. While existing methods only attempt to estimate the missing parts, the proposed method generates a complete data set using Gaussian process regression and a generative adversarial network. To demonstrate the effectiveness of the proposed framework, we examine a real-world industrial case involving an air pressure system (APS), where we use the proposed method to make quality predictions and compare the results with existing state-of-the-art estimation methods.
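
The Gaussian process regression building block mentioned in this abstract can be illustrated with a small sketch. This is not the authors' framework: it omits the generative adversarial network entirely and only shows how a zero-mean GP with a squared-exponential kernel could estimate missing readings in a one-dimensional sensor trace. The kernel choice, the hyperparameters (`length`, `var`, `noise`), and the sample data are all assumptions made for illustration.

```python
import math

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L x = b."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def solve_upper_from_lower(L, b):
    """Back substitution: solve L^T x = b using the lower factor L."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / L[i][i]
    return x

def gpr_predict(t_obs, y_obs, t_new, length=1.0, var=1.0, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at each point in t_new."""
    n = len(t_obs)
    K = [[rbf(t_obs[i], t_obs[j], length, var) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_from_lower(L, solve_lower(L, y_obs))
    means, variances = [], []
    for t in t_new:
        k_star = [rbf(t, ti, length, var) for ti in t_obs]
        v = solve_lower(L, k_star)
        means.append(sum(k * a for k, a in zip(k_star, alpha)))
        variances.append(rbf(t, t, length, var) - sum(vi * vi for vi in v))
    return means, variances

# Hypothetical sensor trace with gaps (None marks a missing reading).
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
readings = [0.0, 0.84, None, 0.14, None, -0.96, -0.28]

t_obs = [t for t, y in zip(times, readings) if y is not None]
y_obs = [y for y in readings if y is not None]
t_missing = [t for t, y in zip(times, readings) if y is None]

means, variances = gpr_predict(t_obs, y_obs, t_missing)
filled = list(readings)
for t, m in zip(t_missing, means):
    filled[times.index(t)] = m
```

The returned variances indicate how uncertain each filled-in value is, which is one reason GPR is attractive for this task. In practice the hyperparameters would be fitted to the data rather than fixed by hand.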

Exchange Spin Coupling from Gaussian Process Regression

10.26434/chemrxiv.12589541.v3 ◽

2020 ◽

Author(s):

Marc Philipp Bahlke ◽

Natnael Mogos ◽

Jonny Proppe ◽

Carmen Herrmann

Keyword(s):

Machine Learning ◽

Gaussian Process ◽

Gaussian Process Regression ◽

Molecular Magnets ◽

Molecular Structures ◽

Spin Coupling ◽

Structure Property ◽

Data Set ◽

Uncertainty Estimates

Heisenberg exchange spin coupling between metal centers is essential for describing and understanding the electronic structure of many molecular catalysts, metalloenzymes, and molecular magnets for potential application in information technology. We explore the machine-learnability of exchange spin coupling, which has not been studied yet. We employ Gaussian process regression since it can potentially deal with small training sets (as likely associated with the rather complex molecular structures required for exploring spin coupling) and since it provides uncertainty estimates (“error bars”) along with predicted values. We compare a range of descriptors and kernels for 257 small dicopper complexes and find that a simple descriptor based on chemical intuition, consisting only of copper-bridge angles and copper-copper distances, clearly outperforms several more sophisticated descriptors when it comes to extrapolating towards larger experimentally relevant complexes. Exchange spin coupling is about as easy to learn as the polarizability, while learning dipole moments is much harder. The strength of the sophisticated descriptors lies in their ability to linearize structure-property relationships, to the point that a simple linear ridge regression performs just as well as the kernel-based machine-learning model for our small dicopper data set. The superior extrapolation performance of the simple descriptor is unique to exchange spin coupling, reinforcing the crucial role of choosing a suitable descriptor, and highlighting the interesting question of the role of chemical intuition vs. systematic or automated selection of features for machine learning in chemistry and materials science.

Data Augmentation Using Generative Adversarial Network for Automatic Machine Fault Detection Based on Vibration Signals

Applied Sciences ◽

10.3390/app11052166 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2166

Author(s):

Van Bui ◽

Tung Lam Pham ◽

Huy Nguyen ◽

Yeong Min Jang

Keyword(s):

Fault Detection ◽

Data Augmentation ◽

Model Performance ◽

Original Data ◽

Fault Classification ◽

Training Process ◽

Generative Adversarial Network ◽

Data Set ◽

Adversarial Network ◽

Machine Fault

In the last decade, predictive maintenance has attracted a lot of attention in industrial factories because of its wide use of the Internet of Things and artificial intelligence algorithms for data management. However, in the early phases, when abnormal and faulty machines rarely appear in factories, only limited sets of machine fault samples are available. With limited fault samples, it is difficult to train a fault classifier because the input data are imbalanced. Data augmentation is therefore required to increase the accuracy of the learning model, yet few methods exist to generate and evaluate data for this kind of analysis. In this paper, we introduce a method that uses a generative adversarial network for fault signal augmentation to enrich the data set. The enhanced data set can increase the accuracy of the machine fault detection model in the training process. We also performed fault detection using a variety of preprocessing approaches and classification models to evaluate the similarity between the generated data and the authentic data. The generated fault data are highly similar to the original data and significantly improve the accuracy of the model. The accuracy of faulty machine detection reaches 99.41% when 20% of the original fault machine data set is used and 93.1% when 0% is used (generated data only). Based on this, we conclude that the generated data can be mixed with the original data to improve model performance.

Underwater Acoustic Target Recognition Based on Generative Adversarial Network Data Augmentation

INTER-NOISE and NOISE-CON Congress and Conference Proceedings ◽

10.3397/in-2021-2737 ◽

2021 ◽

Vol 263 (2) ◽

pp. 4558-4564

Author(s):

Minghong Zhang ◽

Xinwei Luo

Keyword(s):

Data Augmentation ◽

Target Recognition ◽

Training Data ◽

Small Samples ◽

Generative Adversarial Network ◽

Data Set ◽

Underwater Acoustic ◽

Adversarial Network ◽

Acoustic Target ◽

The Impact

Underwater acoustic target recognition is an important aspect of underwater acoustic research. In recent years, machine learning has developed continuously and has been widely and effectively applied to underwater acoustic target recognition. Adequate data sets are essential for achieving good recognition results and reducing overfitting. However, underwater acoustic samples are relatively rare, which limits recognition accuracy. In this paper, in addition to traditional audio data augmentation, a new data augmentation method based on a generative adversarial network is proposed; it uses a generator and a discriminator to learn the characteristics of underwater acoustic samples and generate reliable underwater acoustic signals that expand the training data set. The expanded data set is fed into a deep neural network, and transfer learning is applied, fixing part of the pre-trained parameters, to further reduce the impact of small sample sizes. The experimental results show that this method achieves better recognition results than the general underwater acoustic recognition method, which verifies its effectiveness.

GAINESIS: Generative Artificial Intelligence NEtlists SynthesIS

Electronics ◽

10.3390/electronics11020245 ◽

2022 ◽

Vol 11 (2) ◽

pp. 245

Author(s):

Konstantinos G. Liakos ◽

Georgios K. Georgakilas ◽

Fotis C. Plessas ◽

Paris Kitsos

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Power Analysis ◽

Public Libraries ◽

Data Sets ◽

Hardware Trojan ◽

Generative Adversarial Network ◽

Data Set ◽

Encrypted Data ◽

Adversarial Network

A significant problem in the field of hardware security is that of hardware trojan (HT) viruses. HTs can be inserted into a circuit at each phase of its production chain. HTs degrade the infected circuit, destroy it, or leak encrypted data. Nowadays, efforts are being made to address HTs through machine learning (ML) techniques, mainly for the gate-level netlist (GLN) phase, but there are some restrictions. Specifically, the number and variety of normal and infected circuits available through free public libraries, such as Trust-HUB, are based on the few benchmark samples that have been created from large circuits. Thus, it is difficult, based on these data, to develop robust ML-based models against HTs. In this paper, we propose a new deep learning (DL) tool named Generative Artificial Intelligence Netlists SynthesIS (GAINESIS). GAINESIS is based on the Wasserstein Conditional Generative Adversarial Network (WCGAN) algorithm and area–power analysis features from the GLN phase, and it synthesizes new normal and infected circuit samples for this phase. Using GAINESIS, we synthesized new data sets of different sizes and developed and compared seven ML classifiers. The results demonstrate that the newly generated data sets significantly enhance the performance of ML classifiers compared with the initial Trust-HUB data set.

Data Augmentation for Electricity Theft Detection Using Conditional Variational Auto-Encoder

Energies ◽

10.3390/en13174291 ◽

2020 ◽

Vol 13 (17) ◽

pp. 4291

Author(s):

Xuejiao Gong ◽

Bo Tang ◽

Ruijin Zhu ◽

Wenlong Liao ◽

Like Song

Keyword(s):

Latent Variables ◽

Data Augmentation ◽

Sampling Technique ◽

Smart Meters ◽

Generative Adversarial Network ◽

Data Set ◽

Electricity Theft ◽

Adversarial Network ◽

Low Dimensional ◽

Power Curves

Due to the strong concealment of electricity theft and the limitation of inspection resources, the number of power theft samples available to the power department is insufficient, which limits the accuracy of power theft detection. Therefore, a data augmentation method for electricity theft detection based on the conditional variational auto-encoder (CVAE) is proposed. First, the stealing power curves are mapped into low-dimensional latent variables by an encoder composed of convolutional layers, and new stealing power curves are reconstructed by a decoder composed of deconvolutional layers. Then, five typical attack models are proposed, and a convolutional neural network is constructed as a classifier according to the data characteristics of the stealing power curves. Finally, the effectiveness and adaptability of the proposed method are verified on a smart meter data set from London. The simulation results show that the CVAE can take into account the shapes and distribution characteristics of samples at the same time, and that the generated stealing power curves improve classifier performance more than traditional augmentation methods such as random oversampling, the synthetic minority over-sampling technique, and the conditional generative adversarial network. Moreover, the method is suitable for different classifiers.
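
For context, one of the baselines named in this abstract, the synthetic minority over-sampling technique (SMOTE), can be sketched in a few lines. This is a minimal illustration of the interpolation idea, not the configuration used in the paper: the function name `smote`, the neighbour count `k`, and the toy feature vectors are assumptions.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical 2-D feature vectors for the rare ("theft") class.
theft = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.2], [1.1, 2.4]]
new_samples = smote(theft, n_new=6)
```

Because every synthetic point lies on a line segment between two existing minority samples, SMOTE cannot produce shapes outside the convex hull of the observed class, which is the kind of limitation that generative models such as the CVAE aim to address.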

Dealing with Observation Outages within Navigation Data using Gaussian Process Regression

Journal of Navigation ◽

10.1017/s0373463314000010 ◽

2014 ◽

Vol 67 (4) ◽

pp. 603-615 ◽

Cited By ~ 6

Author(s):

Hongmei Chen ◽

Xianghong Cheng ◽

Haipeng Wang ◽

Xu Han

Keyword(s):

Kalman Filter ◽

Theoretical Analysis ◽

Gaussian Process ◽

Inertial Navigation System ◽

Dynamic Models ◽

Gaussian Process Regression ◽

Parametric Model ◽

Integrated Navigation ◽

Data Set ◽

Navigation Data

Gaussian process regression (GPR) is used in a Sparse-grid Quadrature Kalman filter (SGQKF) for Strap-down Inertial Navigation System (SINS)/odometer integrated navigation to bridge uncertain observation outages and maintain an estimate of the evolving SINS biases. The SGQKF uses nonlinearized dynamic models with complex stochastic nonlinearities, so its performance degrades significantly during observation outages owing to the uncertainties and noise. The GPR calculates the residual output after factoring in the contributions of the parametric model that is used as a nonlinear SINS error predictor integrated into the SGQKF. The sensor measurements and SINS output deviations from the odometer are collected in a data set while observations are available. The GPR is then applied to predict SINS deviations from the odometer, and the predicted deviations are fed to the SGQKF as an actual update to estimate all SINS biases during observation outages. We demonstrate our method's effectiveness in bridging uncertain observation outages in simulations and in real road tests. The results agree with the theoretical analysis, which demonstrates that the SGQKF using GPR can maintain an estimate of the evolving SINS biases during signal outages.

Design of Painting Art Style Rendering System Based on Convolutional Neural Network

Scientific Programming ◽

10.1155/2021/4708758 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Xingyu Xie ◽

Bin Lv

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Generative Adversarial Network ◽

Convolutional Network ◽

Data Set ◽

Adversarial Network ◽

Paired Samples ◽

Network Generator ◽

The Stability ◽

Art Style

Convolutional neural network (CNN)-based GAN models mainly suffer from problems such as data set limitation and rendering efficiency in the segmentation and rendering of painting art. To solve these problems, this paper uses an improved cycle generative adversarial network (CycleGAN) to render the current image style. This method replaces the deep residual network (ResNet) of the original network generator with a densely connected convolutional network (DenseNet) and uses the perceptual loss function for adversarial training. The painting art style rendering system built in this paper is based on a perceptual adversarial network (PAN) for the improved CycleGAN, which relaxes the network model's dependence on paired samples. The proposed method also improves the quality of the generated painting-style images, improves stability, and speeds up network convergence. Experiments were conducted on the painting art style rendering system based on the proposed model. The results show that the image style rendering method that uses the perceptual adversarial error to improve the CycleGAN + PAN model achieves better results: the PSNR value of the generated image increases by 6.27% on average, and the SSIM values all increase by about 10%. Therefore, the improved CycleGAN + PAN method produces better painting art style images and has strong application value.
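
The PSNR metric reported in this abstract has a standard definition, PSNR = 10 · log10(MAX² / MSE), which the sketch below implements for small grayscale images. The function name `psnr`, the nested-list image representation, and the sample patches are assumptions for illustration; the paper's own evaluation pipeline is not described in the abstract.

```python
import math

def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-sized images,
    given as nested lists of pixel intensities."""
    ref = [p for row in reference for p in row]
    gen = [p for row in generated for p in row]
    if len(ref) != len(gen):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - g) ** 2 for r, g in zip(ref, gen)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 2x2 grayscale patches.
target = [[52, 55], [61, 59]]
output = [[54, 55], [60, 59]]
score = psnr(target, output)
```

Higher PSNR means the generated image is closer to the reference. Note that PSNR is a logarithmic quantity, so a percentage increase in PSNR, as quoted in the abstract, is not the same as a percentage reduction in pixel error.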

Four-dimensional mesospheric and lower thermospheric wind fields using Gaussian process regression on multistatic specular meteor radar observations

10.5194/amt-2021-40 ◽

2021 ◽

Author(s):

Ryan Volz ◽

Jorge L. Chau ◽

Philip J. Erickson ◽

Juha P. Vierinen ◽

J. Miguel Urco ◽

...

Keyword(s):

Gaussian Process ◽

Wind Velocity ◽

Wind Field ◽

Gaussian Process Regression ◽

Small Scale ◽

Model Parameters ◽

Meteor Radar ◽

Wind Fields ◽

Data Set ◽

Uncertainty Estimates

Mesoscale dynamics in the mesosphere and lower thermosphere (MLT) region have been difficult to study from either ground- or satellite-based observations. For understanding atmospheric coupling processes, important spatial scales at these altitudes range from tens to hundreds of kilometers in the horizontal plane. To date, this scale size is challenging observationally, and so structures are usually parameterized in global circulation models. The advent of multistatic specular meteor radar networks allows exploration of MLT mesoscale dynamics on these scales using an increased number of detections and a diversity of viewing angles inherent to multistatic networks. In this work, we introduce a four-dimensional wind field inversion method that makes use of Gaussian process regression (GPR), a non-parametric and Bayesian approach. The method takes measured projected wind velocities and prior distributions of the wind velocity as a function of space and time, specified by the user or estimated from the data, and produces posterior distributions for the wind velocity. The predictive posterior distribution is computed at sampled points of interest, which need not be regularly spaced. The main benefits of the GPR method include this non-gridded sampling, the built-in statistical uncertainty estimates, and the ability to horizontally resolve winds on relatively small scales. The performance of the GPR implementation has been evaluated on Monte Carlo simulations with known distributions using the same spatial and temporal sampling as one day of real meteor measurements. Based on the simulation results we find that the GPR implementation is robust, providing wind fields that are statistically unbiased and with statistical variances that depend on the geometry and are proportional to the prior velocity variances. A conservative and fast approach can be straightforwardly implemented by employing overestimated prior variances and distances, while a more robust but computationally intensive approach can be implemented by employing training and fitting of model parameters. The latter GPR approach has been applied to a 24-hour data set and shown to compare well to previously used homogeneous and gradient methods. Small scale features have reasonably low statistical uncertainties, implying geophysical wind field horizontal structures as low as 20–50 km. We suggest that this GPR approach forms a suitable method for MLT regional and weather studies.
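
For readers unfamiliar with GPR, the posterior distributions and built-in uncertainty estimates mentioned in this abstract follow from the standard predictive equations for a zero-mean Gaussian process with Gaussian observation noise. The notation below is an assumption, not taken from the paper: $X$ are the observed locations, $X_*$ the (possibly non-gridded) points of interest, $\mathbf{y}$ the observations, $k$ the prior covariance function, and $\sigma_n^2$ the noise variance. The paper's measurements are projected wind velocities, which would additionally require a projection step that is not shown here.

```latex
\begin{align}
K &= k(X, X), \qquad K_* = k(X, X_*), \qquad K_{**} = k(X_*, X_*), \\
\boldsymbol{\mu}_* &= K_*^{\top}\left(K + \sigma_n^2 I\right)^{-1}\mathbf{y}, \\
\Sigma_* &= K_{**} - K_*^{\top}\left(K + \sigma_n^2 I\right)^{-1}K_* .
\end{align}
```

The diagonal of $\Sigma_*$ gives the posterior variance at each point of interest. Since $\Sigma_*$ does not depend on $\mathbf{y}$, the uncertainty is determined by the sampling geometry and the prior covariance, which is consistent with the abstract's remark that the variances depend on the geometry and scale with the prior velocity variances.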

Parametric Gaussian process regression for big data

Computational Mechanics ◽

10.1007/s00466-019-01711-5 ◽

2019 ◽

Vol 64 (2) ◽

pp. 409-416 ◽

Cited By ~ 4

Author(s):

Maziar Raissi ◽

Hessam Babaee ◽

George Em Karniadakis

Keyword(s):

Big Data ◽

Gaussian Process ◽

Gaussian Process Regression

A video inpainting method for unmanned vehicle based on fusion of time series optical flow information and spatial information

International Journal of Advanced Robotic Systems ◽

10.1177/17298814211053103 ◽

2021 ◽

Vol 18 (5) ◽

pp. 172988142110531

Author(s):

Rui Zhao ◽

Hengyu Li ◽

Jingyi Liu ◽

Huayan Pu ◽

Shaorong Xie ◽

...

Keyword(s):

Optical Flow ◽

Spatial Information ◽

Vision System ◽

External Environment ◽

Multiple Perspectives ◽

Generative Adversarial Network ◽

Data Set ◽

Adversarial Network ◽

Video Inpainting ◽

Flow Information

This article addresses video inpainting by combining multiview spatial information with interframe information between video sequences. A vision system is an important way for autonomous vehicles to obtain information about the external environment. Loss or distortion of visual images caused by camera damage or contamination seriously impairs the vision system's ability to correctly perceive and understand the external environment. We solve the problem of image restoration by combining the optical flow information between frames in the video with spatial information from multiple perspectives. To address noise in the single-frame images of a video, we propose a complete two-stage video repair method. We combine the spatial information of images from different perspectives and the optical flow information of the video sequence to assist and constrain the repair of damaged images in the video. The method combines the interframe information of the preceding and following image frames with the multiview image information in the video and performs video repair based on optical flow and a conditional generative adversarial network. It treats video inpainting as a pixel propagation problem, uses the interframe information in the video for inpainting, and introduces multiview information to assist the repair. This method was trained and tested in Zurich using a data set recorded by a pair of cameras mounted on a mobile platform.
