An Unsupervised Generative Adversarial Network-Based Method for Defect Inspection of Texture Surfaces

Jichun Wang; Guodong Yi; Shuyou Zhang; Yang Wang

doi:10.3390/app11010283

An Unsupervised Generative Adversarial Network-Based Method for Defect Inspection of Texture Surfaces

Applied Sciences ◽

10.3390/app11010283 ◽

2020 ◽

Vol 11 (1) ◽

pp. 283

Author(s):

Jichun Wang ◽

Guodong Yi ◽

Shuyou Zhang ◽

Yang Wang

Keyword(s):

Input Image ◽

Defect Inspection ◽

Generative Adversarial Network ◽

Texture Surface ◽

Industrial Community ◽

Adversarial Network ◽

Image Patches ◽

Learning Capabilities ◽

Initial Segmentation ◽

To Receive

Recently, deep learning-based defect inspection methods have begun to receive more attention from both researchers and the industrial community, owing to their powerful representation and learning capabilities. These methods, however, require a large number of samples and manual annotation to achieve an acceptable detection rate. In this paper, we propose an unsupervised method of detecting and locating defects on patterned texture surface images which, in the training phase, needs only a moderate number of defect-free samples. An extended deep convolutional generative adversarial network (DCGAN) is utilized to reconstruct input image patches; the resulting residual map can be used to realize the initial segmentation of defects. To further improve the accuracy of defect segmentation, a submodule termed “local difference analysis” (LDA) is embedded into the overall module to eliminate false positives. We conduct comparative experiments on a series of datasets and the final results verify the effectiveness of the proposed method.
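
The residual-map step described in this abstract can be illustrated with a minimal sketch. It assumes the reconstruction is already available (a hand-written list stands in for the DCGAN output) and uses a hypothetical fixed threshold of 0.3. The paper's network and its LDA submodule are not reproduced here.

```python
from typing import List

Image = List[List[float]]  # grayscale patch with values in [0, 1]


def residual_map(original: Image, reconstructed: Image) -> Image:
    """Per-pixel absolute difference between a patch and its reconstruction."""
    return [
        [abs(o - r) for o, r in zip(row_o, row_r)]
        for row_o, row_r in zip(original, reconstructed)
    ]


def segment_defects(residual: Image, threshold: float = 0.3) -> List[List[int]]:
    """Binary mask: 1 where the residual exceeds the (hypothetical) threshold."""
    return [[1 if value > threshold else 0 for value in row] for row in residual]


# Toy 3x3 patch: the centre pixel deviates from the defect-free texture.
patch = [[0.5, 0.5, 0.5], [0.5, 0.9, 0.5], [0.5, 0.5, 0.5]]
# Stand-in for the generator output, which would reproduce only normal texture.
recon = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
mask = segment_defects(residual_map(patch, recon))
```

Only the centre pixel is flagged in this toy case. In practice the threshold would have to be tuned per texture, which is presumably one reason a false-positive filter is added afterwards.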

Measuring Traffic Volumes Using an Autoencoder with No Need to Tag Images with Labels

Electronics ◽

10.3390/electronics9050702 ◽

2020 ◽

Vol 9 (5) ◽

pp. 702

Author(s):

Seungbin Roh ◽

Johyun Shin ◽

Keemin Sohn

Keyword(s):

Input Image ◽

Video Frame ◽

Generative Adversarial Network ◽

Detection Algorithms ◽

Adversarial Network ◽

Simpler Algorithm ◽

Proposed Model ◽

Traffic Volumes ◽

Step Algorithm ◽

Almost All

Almost all vision technologies that are used to measure traffic volume use a two-step procedure that involves detection and tracking. Object detection algorithms such as YOLO and Fast-RCNN have been successfully applied to detecting vehicles. The tracking of vehicles requires an additional algorithm that can trace the vehicles that appear in a previous video frame to their appearance in a subsequent frame. This two-step algorithm prevails in the field but requires substantial computation resources for training, testing, and evaluation. The present study devised a simpler algorithm based on an autoencoder that requires no labeled data for training. An autoencoder was trained, in an unsupervised manner, on the pixel intensities of a virtual line placed on images. The last hidden node of the encoding portion of the autoencoder generates a scalar signal that can be used to judge whether a vehicle is passing. A cycle-consistent generative adversarial network (CycleGAN) was used to transform an original input photo of complex vehicle images and backgrounds into a simple illustration input image that enhances the performance of the autoencoder in judging the presence of a vehicle. The proposed model is much lighter and faster than a YOLO-based model, and its accuracy is equivalent to, or better than, that of a YOLO-based model. In measuring traffic volumes, the proposed approach turned out to be robust in terms of both accuracy and efficiency.
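
One way a per-frame scalar signal could be turned into a vehicle count is to count upward threshold crossings. The sketch below is an assumption for illustration only: the abstract does not say how the signal is converted into counts, and the threshold of 0.5 and the sample values are hypothetical.

```python
from typing import Sequence


def count_vehicles(signal: Sequence[float], threshold: float = 0.5) -> int:
    """Count upward threshold crossings in a per-frame scalar signal."""
    count = 0
    above = False  # whether the previous frame was above the threshold
    for value in signal:
        now_above = value > threshold
        if now_above and not above:
            count += 1  # a new passage starts on this frame
        above = now_above
    return count


# Hypothetical encoder outputs for nine consecutive frames.
frame_signal = [0.1, 0.2, 0.8, 0.9, 0.3, 0.1, 0.7, 0.6, 0.2]
vehicles = count_vehicles(frame_signal)
```

Consecutive frames above the threshold are treated as a single passage, so a slow vehicle is not counted more than once.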

EnsNet: Ensconce Text in the Wild

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.3301801 ◽

2019 ◽

Vol 33 ◽

pp. 801-808 ◽

Cited By ~ 5

Author(s):

Shuaitao Zhang ◽

Yuliang Liu ◽

Lianwen Jin ◽

Yaoxiong Huang ◽

Songxuan Lai

Keyword(s):

Generative Adversarial Network ◽

Local Consistency ◽

Adversarial Network ◽

Image Patches ◽

Scene Text ◽

In The Wild ◽

Lateral Connection ◽

Previous State ◽

End To End ◽

General Object

A new method is proposed for removing text from natural images. The challenge is to first accurately localize text on the stroke-level and then replace it with a visually plausible background. Unlike previous methods that require image patches to erase scene text, our method, namely ensconce network (EnsNet), can operate end-to-end on a single image without any prior knowledge. The overall structure is an end-to-end trainable FCN-ResNet-18 network with a conditional generative adversarial network (cGAN). The feature of the former is first enhanced by a novel lateral connection structure and then refined by four carefully designed losses: multiscale regression loss and content loss, which capture the global discrepancy of different level features; texture loss and total variation loss, which primarily target filling the text region and preserving the reality of the background. The latter is a novel local-sensitive GAN, which attentively assesses the local consistency of the text-erased regions. Both qualitative and quantitative sensitivity experiments on synthetic images and the ICDAR 2013 dataset demonstrate that each component of the EnsNet is essential to achieve a good performance. Moreover, our EnsNet can significantly outperform previous state-of-the-art methods in terms of all metrics. In addition, a qualitative experiment conducted on the SBMNet dataset further demonstrates that the proposed method can also perform well on general object (such as pedestrians) removal tasks. EnsNet is also extremely fast, running at 333 fps on an i5-8600 CPU device.
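
Of the four losses named above, total variation is the simplest to illustrate. The sketch below assumes the common anisotropic form (sum of absolute differences between horizontally and vertically adjacent pixels). The abstract does not state which variant or weighting EnsNet uses.

```python
from typing import List

Image = List[List[float]]


def total_variation(image: Image) -> float:
    """Anisotropic TV: sum of absolute differences between adjacent pixels."""
    height, width = len(image), len(image[0])
    horizontal = sum(
        abs(image[r][c + 1] - image[r][c])
        for r in range(height)
        for c in range(width - 1)
    )
    vertical = sum(
        abs(image[r + 1][c] - image[r][c])
        for r in range(height - 1)
        for c in range(width)
    )
    return horizontal + vertical


flat = [[0.5, 0.5], [0.5, 0.5]]      # smooth region: no variation
speckled = [[0.0, 0.0], [0.0, 1.0]]  # one outlier pixel
tv_flat = total_variation(flat)
tv_speckled = total_variation(speckled)
```

Minimizing this quantity penalizes isolated pixel artifacts, which is why it is often paired with inpainting-style objectives.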

Super-Resolution Enhancement Method Based on Generative Adversarial Network for Integral Imaging Microscopy

Sensors ◽

10.3390/s21062164 ◽

2021 ◽

Vol 21 (6) ◽

pp. 2164

Author(s):

Md. Shahinur Alam ◽

Ki-Chul Kwon ◽

Munkh-Uchral Erdenebat ◽

Mohammed Y. Abbass ◽

Md. Ashraful Alam ◽

...

Keyword(s):

High Resolution ◽

Resolution Enhancement ◽

Three Dimensional ◽

Super Resolution ◽

Input Image ◽

Low Resolution ◽

Generative Adversarial Network ◽

Microscopic Object ◽

Integral Imaging ◽

Adversarial Network

The integral imaging microscopy system provides a three-dimensional visualization of a microscopic object. However, it suffers from low resolution because of the fundamental F-number (aperture stop) limitation imposed by the micro lens array (MLA) and because of a poor illumination environment. In this paper, a generative adversarial network (GAN)-based super-resolution algorithm is proposed to enhance the resolution where the directional view image is directly fed as input. In a GAN, the generator regresses the high-resolution output from the low-resolution input image, whereas the discriminator distinguishes between the original and generated image. In the generator part, we use consecutive residual blocks with the content loss to retrieve the photo-realistic original image. It can restore the edges and enhance the resolution by factors of 2, 4, and even 8 without seriously hampering the image quality. The model is tested with a variety of low-resolution microscopic sample images and successfully generates high-resolution directional view images with better illumination. The quantitative analysis shows that the proposed model performs better for microscopic images than the existing algorithms.

Explainable Medical Image Segmentation via Generative Adversarial Networks and Layer-wise Relevance Propagation

Nordic Machine Intelligence ◽

10.5617/nmi.9126 ◽

2021 ◽

Vol 1 (1) ◽

pp. 20-22

Author(s):

Awadelrahman M. A. Ahmed ◽

Leen A. M. Ali

Keyword(s):

Image Segmentation ◽

Medical Image ◽

Input Image ◽

Medical Image Segmentation ◽

Jaccard Index ◽

Generative Adversarial Networks ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Adversarial Networks

This paper contributes to automating medical image segmentation by proposing generative adversarial network-based models to segment both polyps and instruments in endoscopy images. A main contribution of this paper is providing explanations for the predictions using the layer-wise relevance propagation approach, showing which pixels in the input image are more relevant to the predictions. The models achieved Jaccard indices of 0.46 and 0.70 and accuracies of 0.84 and 0.96 on polyp segmentation and instrument segmentation, respectively.
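
The two reported metrics can be computed from binary masks as in the following sketch. The toy masks are hypothetical, and the convention of returning 1.0 for two empty masks is an assumption, not something the abstract specifies.

```python
from typing import List

Mask = List[List[int]]  # 1 = object pixel, 0 = background


def jaccard_index(pred: Mask, truth: Mask) -> float:
    """Intersection over union of the foreground pixels."""
    inter = union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter / union if union else 1.0  # assumed convention for empty masks


def pixel_accuracy(pred: Mask, truth: Mask) -> float:
    """Fraction of pixels (foreground and background) labeled correctly."""
    total = correct = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            total += 1
            correct += 1 if p == t else 0
    return correct / total


pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
jaccard = jaccard_index(pred, truth)
accuracy = pixel_accuracy(pred, truth)
```

Accuracy also rewards correctly labeled background pixels, so it is usually higher than the Jaccard index when the object covers a small part of the image.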

A Generative Adversarial Network for Infrared and Visible Image Fusion Based on Semantic Segmentation

Entropy ◽

10.3390/e23030376 ◽

2021 ◽

Vol 23 (3) ◽

pp. 376

Author(s):

Jilei Hou ◽

Dazhi Zhang ◽

Wei Wu ◽

Jiayi Ma ◽

Huabing Zhou

Keyword(s):

Image Fusion ◽

Information Source ◽

Semantic Segmentation ◽

Input Image ◽

Infrared Images ◽

Generative Adversarial Network ◽

Visible Image ◽

Adversarial Network ◽

Visible Images ◽

High Level

This paper proposes a new generative adversarial network for infrared and visible image fusion based on semantic segmentation (SSGAN), which can consider not only the low-level features of infrared and visible images, but also the high-level semantic information. Source images can be divided into foregrounds and backgrounds by semantic masks. The generator with a dual-encoder-single-decoder framework is used to extract the features of foregrounds and backgrounds through separate encoder paths. Moreover, the discriminator’s input image is designed based on semantic segmentation, which is obtained by combining the foregrounds of the infrared images with the backgrounds of the visible images. Consequently, the prominence of thermal targets in the infrared images and texture details in the visible images can be preserved in the fused images simultaneously. Qualitative and quantitative experiments on publicly available datasets demonstrate that the proposed approach can significantly outperform the state-of-the-art methods.
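
The mask-based combination described for the discriminator's input can be sketched as a per-pixel selection. This assumes a hard binary mask and single-channel images of equal size. The function name and toy values are hypothetical, and the paper's actual preprocessing may differ.

```python
from typing import List

Image = List[List[float]]
Mask = List[List[int]]  # 1 = foreground (e.g. thermal target), 0 = background


def compose_by_mask(infrared: Image, visible: Image, mask: Mask) -> Image:
    """Take infrared pixels in the foreground and visible pixels elsewhere."""
    return [
        [ir if m else vis for ir, vis, m in zip(row_ir, row_vis, row_m)]
        for row_ir, row_vis, row_m in zip(infrared, visible, mask)
    ]


infrared = [[0.9, 0.8], [0.7, 0.6]]
visible = [[0.1, 0.2], [0.3, 0.4]]
mask = [[1, 0], [0, 1]]
composite = compose_by_mask(infrared, visible, mask)
```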

Image Anonymization using Deep Convolutional Generative Adversarial Network

Journal of Physics Conference Series ◽

10.1088/1742-6596/2089/1/012012 ◽

2021 ◽

Vol 2089 (1) ◽

pp. 012012

Author(s):

K Nitalaksheswara Rao ◽

P Jayasree ◽

Ch.V.Murali Krishna ◽

K Sai Prasanth ◽

Ch Satyananda Reddy

Keyword(s):

Deep Learning ◽

Data Privacy ◽

Input Image ◽

Space Representation ◽

Synthetic Image ◽

Original Image ◽

Generative Adversarial Network ◽

Model Inversion ◽

Adversarial Network ◽

Recent Developments

Advances in deep learning require very large amounts of training data, so the protection of individual data plays a key role in data privacy and publication. Recent developments in deep learning pose a major challenge to traditional approaches to image anonymization, for example the model inversion attack, in which an adversary repeatedly queries the model in order to reconstruct the original image from the anonymized image. In order to strengthen the protection offered by image anonymization, an approach is presented here to convert the input (raw) image into a new synthetic image by applying optimized noise to the latent space representation (LSR) of the original image. The synthetic image is anonymized by adding well-designed noise calculated over the gradient during the learning process, so that the resultant image is both realistic and immune to model inversion attack. More precisely, we extend the approach proposed by T. Kim and J. Yang (2019) by using a Deep Convolutional Generative Adversarial Network (DCGAN) in order to make the approach more efficient. Our aim is to improve the efficiency of the model by changing the loss function to achieve optimal privacy in less time and computation. Finally, the proposed approach is demonstrated using a benchmark dataset. The experimental study shows that the proposed method can efficiently convert the input image into another synthetic image which is of high quality as well as immune to model inversion attack.

Tag Disentangled Generative Adversarial Network for Object Image Re-rendering

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/404 ◽

2017 ◽

Cited By ~ 18

Author(s):

Chaoyue Wang ◽

Chaohui Wang ◽

Chang Xu ◽

Dacheng Tao

Keyword(s):

Input Image ◽

Generative Adversarial Networks ◽

Single Image ◽

Generative Adversarial Network ◽

Training Strategy ◽

Adversarial Network ◽

Adversarial Networks ◽

Adversarial Training ◽

Realistic Images

In this paper, we propose a principled Tag Disentangled Generative Adversarial Network (TD-GAN) that re-renders new images of an object of interest from a single image of it by specifying multiple scene properties (such as viewpoint, illumination, expression, etc.). The whole framework consists of a disentangling network, a generative network, a tag mapping net, and a discriminative network, which are trained jointly based on a given set of images that are completely/partially tagged (i.e., supervised/semi-supervised setting). Given an input image, the disentangling network extracts disentangled and interpretable representations, which are then used to generate images by the generative network. In order to boost the quality of disentangled representations, the tag mapping net is integrated to explore the consistency between the image and its tags. Furthermore, the discriminative network is introduced to implement the adversarial training strategy for generating more realistic images. Experiments on two challenging datasets demonstrate the state-of-the-art performance of the proposed framework in the problem of interest.

Semantic Image to Image Translation using Machine Learning Algorithms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.f4712.049620 ◽

2020 ◽

Vol 9 (4) ◽

pp. 1973-1976

Keyword(s):

Input Image ◽

Machine Learning Algorithms ◽

Real Image ◽

Generative Adversarial Network ◽

Training Time ◽

Photographic Images ◽

Output Image ◽

Adversarial Network ◽

Image Translation

In semantic image-to-image translation, the goal is to learn a mapping between an input image and the output image. A model of the semantic image-to-image translation problem using the Cycle GAN algorithm is proposed. Given a set of paired or unpaired images, a transformation is learned to translate the input image into the specified domain. The dataset considered is the cityscape dataset, in which the semantic images are converted into photographic images. Here a Generative Adversarial Network algorithm called Cycle GAN, with cycle consistency loss, is used. The Cycle GAN algorithm can be used to transform the semantic image into a photographic or real image. The cycle consistency loss compares the real image with the output image of the second generator and yields the loss value. In this paper, the model shows that longer training yields more accurate results and improved image quality. The model can be used when images from one domain need to be converted into another domain in order to obtain high-quality images.
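
The cycle consistency comparison described above can be sketched as follows. The sketch assumes the mean absolute (L1) error used in the original CycleGAN formulation, since the abstract does not name the norm. The two "generators" are hypothetical scalar functions standing in for trained networks.

```python
from typing import Callable, List

Vector = List[float]
Generator = Callable[[Vector], Vector]


def cycle_consistency_loss(x: Vector, g: Generator, f: Generator) -> float:
    """Mean absolute error between x and its round trip F(G(x))."""
    cycled = f(g(x))
    return sum(abs(a - b) for a, b in zip(x, cycled)) / len(x)


# Hypothetical stand-ins for the two generators (semantic -> photo -> semantic).
def g(v: Vector) -> Vector:
    return [2.0 * a for a in v]


def f_exact(v: Vector) -> Vector:
    return [0.5 * a for a in v]  # exact inverse of g


def f_off(v: Vector) -> Vector:
    return [0.5 * a + 0.25 for a in v]  # inverse with a constant bias


x = [0.0, 0.5, 1.0]
loss_exact = cycle_consistency_loss(x, g, f_exact)
loss_off = cycle_consistency_loss(x, g, f_off)
```

The loss is zero when the second generator exactly undoes the first and grows with the round-trip error. This is what lets training proceed without paired images.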

Unpaired medical image colorization using generative adversarial network

Multimedia Tools and Applications ◽

10.1007/s11042-020-10468-6 ◽

2021 ◽

Author(s):

Yihuai Liang ◽

Dongho Lee ◽

Yan Li ◽

Byeong-Seok Shin

Keyword(s):

Loss Function ◽

Medical Image ◽

Color Image ◽

Medical Images ◽

Diagnostic Errors ◽

Training Image ◽

Input Image ◽

Generative Adversarial Network ◽

Adversarial Network

We consider medical image transformation problems where a grayscale image is transformed into a color image. The colorized medical image should have the same features as the input image because extra synthesized features can increase the possibility of diagnostic errors. In this paper, to secure colorized medical images and improve the quality of synthesized images, as well as to leverage unpaired training image data, a colorization network is proposed based on the cycle generative adversarial network (CycleGAN) model, combining a perceptual loss function and a total variation (TV) loss function. Visual comparisons and experimental indicators from the NRMSE, PSNR, and SSIM metrics are used to evaluate the performance of the proposed method. The experimental results show that GAN-based style conversion can be applied to colorization of medical images. In addition, introducing the perceptual loss and TV loss improves the quality of the colorized images compared with using the CycleGAN model alone.
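
Two of the metrics named above, PSNR and NRMSE, can be computed as in this sketch over flattened pixel lists. It assumes intensities in [0, 1] and normalizes the RMSE by the reference range, which is only one of several conventions in use. The abstract does not say which one the authors adopt.

```python
import math
from typing import List


def mse(ref: List[float], test: List[float]) -> float:
    """Mean squared error between two equally sized pixel lists."""
    return sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)


def psnr(ref: List[float], test: List[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(ref, test)
    return math.inf if err == 0 else 10.0 * math.log10(max_value ** 2 / err)


def nrmse(ref: List[float], test: List[float]) -> float:
    """RMSE divided by the reference range (an assumed normalization)."""
    span = max(ref) - min(ref)
    return math.sqrt(mse(ref, test)) / span


reference = [0.0, 0.5, 1.0, 0.5]
colorized = [0.1, 0.5, 0.9, 0.5]  # hypothetical output, one channel
psnr_value = psnr(reference, colorized)
nrmse_value = nrmse(reference, colorized)
```

Higher PSNR and lower NRMSE both indicate an output closer to the reference.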

Enlargement of the Field of View Based on Image Region Prediction Using Thermal Videos

Mathematics ◽

10.3390/math9192379 ◽

2021 ◽

Vol 9 (19) ◽

pp. 2379

Author(s):

Ganbayar Batchuluun ◽

Na Rae Baek ◽

Kang Ryoung Park

Keyword(s):

Similarity Index ◽

Structural Similarity ◽

Input Image ◽

The Body ◽

Human Detection ◽

Field Of View ◽

Image Region ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Image Prediction

Various studies have been conducted on detecting humans in images. However, there are cases where part of a human body disappears from the input image as it leaves the camera field of view (FOV). There are also cases where a pedestrian enters the FOV and the body appears only gradually. In these cases, existing methods fail at human detection and tracking. Therefore, we propose a method for predicting a wider region than the FOV of a thermal camera based on the image prediction generative adversarial network version 2 (IPGAN-2). When an experiment was conducted using the marathon subdataset of the Boston University-thermal infrared video benchmark open dataset, the proposed method showed higher image prediction (structural similarity index measure (SSIM) of 0.9437) and object detection (F1 score of 0.866, accuracy of 0.914, and intersection over union (IoU) of 0.730) accuracies than state-of-the-art methods.
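
The detection metrics quoted above can be illustrated with a short sketch. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max) and detection counts that are already tallied. The box coordinates and counts are hypothetical and are not taken from the paper.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from detection counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


iou = box_iou((0.0, 0.0, 4.0, 4.0), (2.0, 2.0, 6.0, 6.0))
f1 = f1_score(tp=8, fp=2, fn=2)
```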
