An Unsupervised Generative Adversarial Network-Based Method for Defect Inspection of Texture Surfaces

2020 ◽  
Vol 11 (1) ◽  
pp. 283
Author(s):  
Jichun Wang ◽  
Guodong Yi ◽  
Shuyou Zhang ◽  
Yang Wang

Deep learning-based defect inspection methods have recently begun to receive more attention from both researchers and the industrial community because of their powerful representation and learning capabilities. These methods, however, require a large number of samples and manual annotation to achieve an acceptable detection rate. In this paper, we propose an unsupervised method for detecting and locating defects in images of patterned texture surfaces which, in the training phase, needs only a moderate number of defect-free samples. An extended deep convolutional generative adversarial network (DCGAN) is used to reconstruct input image patches; the resulting residual map can be used to obtain an initial segmentation of defects. To further improve the accuracy of defect segmentation, a submodule termed “local difference analysis” (LDA) is embedded into the overall module to eliminate false positives. We conduct comparative experiments on a series of datasets, and the final results verify the effectiveness of the proposed method.
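
The residual-map step can be illustrated with a minimal sketch. It assumes grayscale patches stored as nested lists with values in [0, 1]; the function names, the toy reconstruction, and the threshold of 0.3 are hypothetical and are not taken from the paper.

```python
def residual_map(patch, reconstruction):
    """Per-pixel absolute difference between a patch and its reconstruction."""
    return [[abs(p - r) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(patch, reconstruction)]


def segment_defects(residual, threshold):
    """Binary mask: 1 where the residual exceeds the threshold, else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in residual]


# Toy 3x3 patch with one bright anomaly in the centre.
patch = [[0.2, 0.2, 0.2],
         [0.2, 0.9, 0.2],
         [0.2, 0.2, 0.2]]

# Hypothetical output of a generator trained only on defect-free texture:
# it reproduces the background but not the anomaly.
reconstruction = [[0.2, 0.2, 0.2],
                  [0.2, 0.25, 0.2],
                  [0.2, 0.2, 0.2]]

mask = segment_defects(residual_map(patch, reconstruction), threshold=0.3)
```

In this toy case only the centre pixel is flagged. A real pipeline would still need a false-positive filter such as the LDA submodule described above, which this sketch does not attempt to reproduce.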

Electronics ◽  
2020 ◽  
Vol 9 (5) ◽  
pp. 702
Author(s):  
Seungbin Roh ◽  
Johyun Shin ◽  
Keemin Sohn

Almost all vision technologies that are used to measure traffic volume use a two-step procedure that involves detecting and tracking. Object detection algorithms such as YOLO and Fast-RCNN have been successfully applied to detecting vehicles. The tracking of vehicles requires an additional algorithm that can trace the vehicles that appear in a previous video frame to their appearance in a subsequent frame. This two-step algorithm prevails in the field but requires substantial computation resources for training, testing, and evaluation. The present study devised a simpler algorithm based on an autoencoder that requires no labeled data for training. An autoencoder was trained in an unsupervised manner on the pixel intensities of a virtual line placed on images. The last hidden node of the encoder portion of the autoencoder generates a scalar signal that can be used to judge whether a vehicle is passing. A cycle-consistent generative adversarial network (CycleGAN) was used to transform an original input photo of complex vehicle images and backgrounds into a simple illustration input image that enhances the performance of the autoencoder in judging the presence of a vehicle. The proposed model is much lighter and faster than a YOLO-based model, and its accuracy is equivalent to, or better than, that of a YOLO-based model. In measuring traffic volumes, the proposed approach turned out to be robust in terms of both accuracy and efficiency.
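
One way a per-frame scalar signal could be turned into a vehicle count is to count upward threshold crossings. The sketch below is an assumption for illustration only: the abstract does not say how the signal is thresholded, and `count_passages`, the sample signal, and the threshold of 0.5 are all hypothetical.

```python
def count_passages(signal, threshold):
    """Count rising edges, i.e. frames where the signal first exceeds the threshold."""
    count = 0
    above = False
    for value in signal:
        if value > threshold and not above:
            count += 1  # a new passage starts at this frame
        above = value > threshold
    return count


# Hypothetical encoder output over eight consecutive frames.
signal = [0.1, 0.2, 0.8, 0.9, 0.3, 0.1, 0.7, 0.2]
passages = count_passages(signal, threshold=0.5)
```

Counting edges rather than frames above the threshold avoids counting a slow vehicle several times, at the cost of merging vehicles that cross the line with no gap between them.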


Author(s):  
Shuaitao Zhang ◽  
Yuliang Liu ◽  
Lianwen Jin ◽  
Yaoxiong Huang ◽  
Songxuan Lai

A new method is proposed for removing text from natural images. The challenge is to first accurately localize text at the stroke level and then replace it with a visually plausible background. Unlike previous methods that require image patches to erase scene text, our method, namely ensconce network (EnsNet), can operate end-to-end on a single image without any prior knowledge. The overall structure is an end-to-end trainable FCN-ResNet-18 network with a conditional generative adversarial network (cGAN). The features of the former are first enhanced by a novel lateral connection structure and then refined by four carefully designed losses: multiscale regression loss and content loss, which capture the global discrepancy of features at different levels; and texture loss and total variation loss, which primarily target filling the text region and preserving the realism of the background. The latter is a novel local-sensitive GAN, which attentively assesses the local consistency of the text-erased regions. Both qualitative and quantitative sensitivity experiments on synthetic images and the ICDAR 2013 dataset demonstrate that each component of EnsNet is essential to achieving good performance. Moreover, EnsNet can significantly outperform previous state-of-the-art methods in terms of all metrics. In addition, a qualitative experiment conducted on the SBMNet dataset further demonstrates that the proposed method can also perform well on general object (such as pedestrian) removal tasks. EnsNet is extremely fast: it can run at 333 fps on an i5-8600 CPU device.
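
Of the four losses, total variation is the simplest to show in isolation. The sketch below uses the common anisotropic form (sum of absolute differences between neighbouring pixels) on a single-channel image stored as nested lists; the exact formulation and weighting used in EnsNet are not given in the abstract, so this is an assumption.

```python
def total_variation(image):
    """Anisotropic total variation of a 2D single-channel image."""
    height, width = len(image), len(image[0])
    # Differences between horizontally adjacent pixels.
    horizontal = sum(abs(image[y][x + 1] - image[y][x])
                     for y in range(height) for x in range(width - 1))
    # Differences between vertically adjacent pixels.
    vertical = sum(abs(image[y + 1][x] - image[y][x])
                   for y in range(height - 1) for x in range(width))
    return horizontal + vertical


smooth = [[1, 1], [1, 1]]   # flat region: no variation
noisy = [[0, 1], [1, 0]]    # checkerboard: maximal variation for 0/1 pixels

tv_smooth = total_variation(smooth)
tv_noisy = total_variation(noisy)
```

Penalizing this quantity pushes the filled-in region toward smooth output, which is why it is usually combined with texture or content terms that restore detail.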


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2164
Author(s):  
Md. Shahinur Alam ◽  
Ki-Chul Kwon ◽  
Munkh-Uchral Erdenebat ◽  
Mohammed Y. Abbass ◽  
Md. Ashraful Alam ◽  
...  

The integral imaging microscopy system provides a three-dimensional visualization of a microscopic object. However, it suffers from low resolution because of the fundamental limitation on the F-number (the aperture stops) imposed by the micro lens array (MLA) and because of poor illumination. In this paper, a generative adversarial network (GAN)-based super-resolution algorithm is proposed to enhance the resolution, with the directional view image fed directly as input. In a GAN, the generator regresses the high-resolution output from the low-resolution input image, whereas the discriminator distinguishes between the original and generated images. In the generator, we use consecutive residual blocks with the content loss to retrieve the photo-realistic original image. It can restore the edges and enhance the resolution by factors of ×2, ×4, and even ×8 without seriously hampering the image quality. The model is tested with a variety of low-resolution microscopic sample images and successfully generates high-resolution directional view images with better illumination. The quantitative analysis shows that the proposed model performs better for microscopic images than the existing algorithms.


2021 ◽  
Vol 1 (1) ◽  
pp. 20-22
Author(s):  
Awadelrahman M. A. Ahmed ◽  
Leen A. M. Ali

This paper contributes to automating medical image segmentation by proposing generative adversarial network-based models to segment both polyps and instruments in endoscopy images. A main contribution of this paper is providing explanations for the predictions using the layer-wise relevance propagation approach, showing which pixels in the input image are most relevant to the predictions. The models achieved Jaccard indices of 0.46 and 0.70 and accuracies of 0.84 and 0.96 on polyp segmentation and instrument segmentation, respectively.
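
The two reported metrics can be sketched as follows for flattened binary masks (1 = object, 0 = background). The function names and the toy masks are hypothetical; the definitions are the standard ones, and the convention of returning 1.0 for two empty masks is an assumption.

```python
def jaccard_index(pred, target):
    """Intersection over union of two flattened binary masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return intersection / union if union else 1.0


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the target."""
    return sum(1 for p, t in zip(pred, target) if p == t) / len(pred)


# Toy 8-pixel masks dominated by background.
pred = [1, 1, 0, 0, 0, 0, 0, 0]
target = [1, 0, 1, 0, 0, 0, 0, 0]

iou = jaccard_index(pred, target)
accuracy = pixel_accuracy(pred, target)
```

Here the Jaccard index is 1/3 while the accuracy is 0.75, because accuracy also rewards correctly labelled background pixels. This may be one reason accuracy figures sit well above Jaccard figures when the object occupies a small part of the image.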


Entropy ◽  
2021 ◽  
Vol 23 (3) ◽  
pp. 376
Author(s):  
Jilei Hou ◽  
Dazhi Zhang ◽  
Wei Wu ◽  
Jiayi Ma ◽  
Huabing Zhou

This paper proposes a new generative adversarial network for infrared and visible image fusion based on semantic segmentation (SSGAN), which considers not only the low-level features of infrared and visible images but also the high-level semantic information. Source images can be divided into foregrounds and backgrounds by semantic masks. The generator, with a dual-encoder-single-decoder framework, extracts the features of foregrounds and backgrounds through different encoder paths. Moreover, the discriminator’s input image is designed based on semantic segmentation and is obtained by combining the foregrounds of the infrared images with the backgrounds of the visible images. Consequently, the prominence of thermal targets in the infrared images and texture details in the visible images can be preserved in the fused images simultaneously. Qualitative and quantitative experiments on publicly available datasets demonstrate that the proposed approach can significantly outperform the state-of-the-art methods.
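
The mask-based combination described for the discriminator input can be sketched as a per-pixel selection. This assumes aligned single-channel images and a binary semantic mask (1 = foreground) stored as nested lists; `compose_by_mask` and the toy values are hypothetical, and the paper may use a different blending rule.

```python
def compose_by_mask(infrared, visible, mask):
    """Take infrared pixels where mask is 1 and visible pixels elsewhere."""
    return [[ir if m else vis for ir, vis, m in zip(ir_row, vis_row, m_row)]
            for ir_row, vis_row, m_row in zip(infrared, visible, mask)]


infrared = [[9, 9],
            [9, 9]]
visible = [[1, 2],
           [3, 4]]
mask = [[1, 0],
        [0, 0]]   # only the top-left pixel is a thermal target

composite = compose_by_mask(infrared, visible, mask)
```

The composite keeps the thermal target from the infrared image and the texture from the visible image, which matches the stated goal of preserving both in the fused result.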


2021 ◽  
Vol 2089 (1) ◽  
pp. 012012
Author(s):  
K Nitalaksheswara Rao ◽  
P Jayasree ◽  
Ch.V.Murali Krishna ◽  
K Sai Prasanth ◽  
Ch Satyananda Reddy

Advances in deep learning require very large amounts of training data, so protecting individual data plays a key role in data privacy and publication. Recent developments in deep learning pose a major challenge to traditional approaches to image anonymization, for example the model inversion attack, in which an adversary repeatedly queries the model in order to reconstruct the original image from the anonymized image. To strengthen image anonymization, an approach is presented here that converts the input (raw) image into a new synthetic image by applying optimized noise to the latent space representation (LSR) of the original image. The synthetic image is anonymized by adding well-designed noise calculated over the gradient during the learning process, so that the resulting image is both realistic and immune to model inversion attacks. More precisely, we extend the approach proposed by T. Kim and J. Yang (2019) by using a Deep Convolutional Generative Adversarial Network (DCGAN) in order to make the approach more efficient. Our aim is to improve the efficiency of the model by changing the loss function to achieve optimal privacy with less time and computation. Finally, the proposed approach is demonstrated using a benchmark dataset. The experimental study shows that the proposed method can efficiently convert the input image into another synthetic image that is of high quality as well as immune to model inversion attacks.


Author(s):  
Chaoyue Wang ◽  
Chaohui Wang ◽  
Chang Xu ◽  
Dacheng Tao

In this paper, we propose a principled Tag Disentangled Generative Adversarial Network (TD-GAN) for re-rendering new images of an object of interest from a single image of it by specifying multiple scene properties (such as viewpoint, illumination, and expression). The whole framework consists of a disentangling network, a generative network, a tag mapping net, and a discriminative network, which are trained jointly on a given set of images that are completely or partially tagged (i.e., a supervised or semi-supervised setting). Given an input image, the disentangling network extracts disentangled and interpretable representations, which are then used by the generative network to generate images. To boost the quality of the disentangled representations, the tag mapping net is integrated to explore the consistency between the image and its tags. Furthermore, the discriminative network is introduced to implement the adversarial training strategy for generating more realistic images. Experiments on two challenging datasets demonstrate the state-of-the-art performance of the proposed framework on the problem of interest.


In semantic image-to-image translation, the goal is to learn a mapping between an input image and an output image. A model for the semantic image-to-image translation problem using the CycleGAN algorithm is proposed. Given a set of paired or unpaired images, a transformation is learned to translate the input image into the specified domain. The dataset considered is the Cityscapes dataset, in which the semantic images are converted into photographic images. A generative adversarial network algorithm called CycleGAN, with a cycle consistency loss, is used to transform the semantic image into a photographic or real image. The cycle consistency loss compares the real image with the output image of the second generator and gives the loss value. The model shows that longer training time yields more accurate results and improved image quality. The model can be used when images from one domain need to be converted into another domain in order to obtain high-quality images.
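
The cycle consistency term can be sketched as the mean absolute difference between an image and its round trip through both generators. The generators below are hypothetical arithmetic stand-ins for trained networks, and images are flattened lists of floats; only the structure of the loss is meant to carry over.

```python
def l1_distance(a, b):
    """Mean absolute difference between two flattened images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(image, first_generator, second_generator):
    """Compare the input with the second generator's output after a round trip."""
    return l1_distance(image, second_generator(first_generator(image)))


# Hypothetical stand-ins for the two generators.
def to_photo(image):
    return [2.0 * v for v in image]


def to_semantic_exact(image):
    return [0.5 * v for v in image]          # perfect inverse of to_photo


def to_semantic_biased(image):
    return [0.5 * v + 0.1 for v in image]    # inverse with a constant error


semantic = [0.0, 0.5, 1.0]
loss_exact = cycle_consistency_loss(semantic, to_photo, to_semantic_exact)
loss_biased = cycle_consistency_loss(semantic, to_photo, to_semantic_biased)
```

The loss is zero when the second generator exactly undoes the first and grows with the round-trip error. This is what lets CycleGAN train on unpaired images.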


Author(s):  
Yihuai Liang ◽  
Dongho Lee ◽  
Yan Li ◽  
Byeong-Seok Shin

We consider medical image transformation problems where a grayscale image is transformed into a color image. The colorized medical image should have the same features as the input image because extra synthesized features can increase the possibility of diagnostic errors. In this paper, to secure colorized medical images and improve the quality of synthesized images, as well as to leverage unpaired training image data, a colorization network is proposed based on the cycle generative adversarial network (CycleGAN) model, combining a perceptual loss function and a total variation (TV) loss function. Visual comparisons and the NRMSE, PSNR, and SSIM metrics are used to evaluate the performance of the proposed method. The experimental results show that GAN-based style conversion can be applied to the colorization of medical images. In addition, introducing the perceptual loss and TV loss improves the quality of the colorized images compared with using the CycleGAN model alone.
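
Two of the three metrics are short enough to sketch directly. The code assumes flattened 8-bit images (peak value 255) and normalizes the RMSE by the value range of the reference image; NRMSE has several conventions, and the abstract does not say which one the paper uses.

```python
import math


def mse(reference, test):
    """Mean squared error between two flattened images."""
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(reference, test)
    return math.inf if error == 0 else 10.0 * math.log10(max_value ** 2 / error)


def nrmse(reference, test):
    """RMSE divided by the reference value range (an assumed convention)."""
    value_range = max(reference) - min(reference)
    return math.sqrt(mse(reference, test)) / value_range


reference = [0, 100, 200, 255]
colorized = [0, 110, 190, 255]   # hypothetical output channel with small errors

error = mse(reference, colorized)
quality_db = psnr(reference, colorized)
relative_error = nrmse(reference, colorized)
```

Higher PSNR and lower NRMSE both indicate a closer match to the reference. SSIM is omitted here because it needs windowed local statistics and is better taken from an established library.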


Mathematics ◽  
2021 ◽  
Vol 9 (19) ◽  
pp. 2379
Author(s):  
Ganbayar Batchuluun ◽  
Na Rae Baek ◽  
Kang Ryoung Park

Various studies have addressed the detection of humans in images. However, there are cases where part of a human body disappears from the input image as it leaves the camera field of view (FOV), and cases where a pedestrian enters the FOV so that the body appears only gradually. In these cases, existing methods fail at human detection and tracking. Therefore, we propose a method for predicting a wider region than the FOV of a thermal camera based on the image prediction generative adversarial network version 2 (IPGAN-2). In an experiment using the marathon subdataset of the Boston University-thermal infrared video benchmark open dataset, the proposed method showed higher image prediction accuracy (structural similarity index measure (SSIM) of 0.9437) and object detection accuracy (F1 score of 0.866, accuracy of 0.914, and intersection over union (IoU) of 0.730) than state-of-the-art methods.
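
The detection metrics quoted above follow standard definitions, sketched here for axis-aligned boxes given as `(x1, y1, x2, y2)` tuples. The box coordinates and the true-positive, false-positive, and false-negative counts are made up for illustration and do not come from the paper.

```python
def box_area(box):
    """Area of an axis-aligned box (x1, y1, x2, y2)."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]),
               min(a[2], b[2]), min(a[3], b[3]))
    intersection = box_area(overlap)  # zero when the boxes do not overlap
    union = box_area(a) + box_area(b) - intersection
    return intersection / union if union else 0.0


def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall, written in count form."""
    denominator = 2 * true_pos + false_pos + false_neg
    return 2 * true_pos / denominator if denominator else 0.0


predicted = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)

overlap_ratio = box_iou(predicted, ground_truth)
f1 = f1_score(true_pos=8, false_pos=2, false_neg=2)
```

For these toy inputs the boxes share one unit of area out of seven, and the counts give an F1 score of 0.8.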

