Detecting and Measuring Defects in Wafer Die Using GAN and YOLOv3

2020 ◽  
Vol 10 (23) ◽  
pp. 8725
Author(s):  
Ssu-Han Chen ◽  
Chih-Hsiang Kang ◽  
Der-Baau Perng

This research used deep learning methods to develop a set of algorithms for detecting die particle defects. A generative adversarial network (GAN) generated natural and realistic images, which improved the ability of You Only Look Once version 3 (YOLOv3) to detect die defects. Defects were then measured from the bounding boxes predicted by YOLOv3, which potentially provides the criteria for die quality sorting. The pseudo defective images generated by the GAN from real defective images were used as part of the training image set. The results obtained after training with the combination of real and pseudo defective images were 7.33% higher in testing average precision (AP) and more accurate by one decimal place in testing coordinate error than after training with the real images alone. The GAN enhances the diversity of defects, which modestly improves the versatility of YOLOv3. In summary, the method combining a GAN and YOLOv3 employed in this study yields a feature-free algorithm that requires neither a massive collection of defective samples nor additional annotation of the pseudo defects. The proposed method is feasible and advantageous for cases that deal with various kinds of die patterns.
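The measurement step described above, turning YOLOv3's predicted bounding boxes into physical defect dimensions, amounts to scaling the normalised box size by the image size and the pixel pitch. A minimal sketch (the function name and the 2 µm pixel pitch are illustrative assumptions, not values from the paper):

```python
def bbox_to_defect_size(box, img_w, img_h, um_per_px):
    """Convert a YOLOv3-style normalised box (cx, cy, w, h) into
    physical defect width/height in micrometres."""
    _, _, w, h = box
    return w * img_w * um_per_px, h * img_h * um_per_px

# Example: a box covering 10% x 5% of a 416x416 image at 2 um per pixel
w_um, h_um = bbox_to_defect_size((0.5, 0.5, 0.10, 0.05), 416, 416, 2.0)
```

A sorting rule for die quality could then simply threshold these physical dimensions.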

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ji Eun Park ◽  
Dain Eun ◽  
Ho Sung Kim ◽  
Da Hyun Lee ◽  
Ryoung Woo Jang ◽  
...  

Abstract: Generative adversarial networks (GANs) create synthetic images to increase data quantity, but whether GANs ensure meaningful morphologic variation is still unknown. We investigated whether GAN-based synthetic images provide sufficient morphologic variation to improve molecular-based prediction for a rare disease, isocitrate dehydrogenase (IDH)-mutant glioblastoma. The GAN was initially trained on 500 normal brains and 110 IDH-mutant high-grade astrocytomas, and paired contrast-enhanced T1-weighted and FLAIR MRI data were generated. Diagnostic models were developed from real IDH-wild-type cases (n = 80) combined with real IDH-mutant glioblastomas (n = 38), with synthetic IDH-mutant glioblastomas, or augmented by adding both real and synthetic IDH-mutant glioblastomas. Turing tests showed that the synthetic data were realistic (classification rate of 55%). In both the real and synthetic data, a more frontal or insular location (odds ratio [OR] 1.34 vs. 1.52; P = 0.04) and distinct non-enhancing tumor margins (OR 2.68 vs. 3.88; P < 0.001) were significant predictors of IDH mutation. In an independent validation set, diagnostic accuracy was higher for the augmented model (90.9% [40/44] and 93.2% [41/44] for the two readers, respectively) than for the real-data model (84.1% [37/44] and 86.4% [38/44], respectively). GAN-based synthetic images thus yield morphologically variable, realistic-seeming IDH-mutant glioblastomas. GANs will be useful for creating training sets that are realistic in terms of morphologic variation and quality, thereby improving the diagnostic performance of clinical models.


Mathematics ◽  
2019 ◽  
Vol 7 (10) ◽  
pp. 883 ◽  
Author(s):  
Shuyu Li ◽  
Sejun Jang ◽  
Yunsick Sung

In traditional music composition, the composer draws on specialized knowledge of music and combines emotion with creative experience. As computer technology has evolved, various music-related technologies have been developed, but creating new music still requires considerable time; a system is therefore needed that can automatically compose music from input music. This study proposes a novel melody composition method that enhances the original generative adversarial network (GAN) model and operates on individual bars. Two discriminators form the enhanced GAN model: a long short-term memory (LSTM) model that ensures correlation between bars, and a convolutional neural network (CNN) model that ensures the rationality of the bar structure. Experiments were conducted using bar encoding and the enhanced GAN model to compose new melodies and evaluate their quality. In the evaluation, the TF-IDF algorithm was used to calculate the structural differences between four types of musical instrument digital interface (MIDI) files: a randomly composed melody, a melody composed by the original GAN, a melody composed by the proposed method, and the real melody. Using TF-IDF, the structure of the melody composed by the proposed method was compared with that of the real melody, as was the structure of the traditionally composed melody. The experimental results showed that the melody composed by the proposed method was closer to the real melody structure, differing by only 8%, than the traditionally composed melody.
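The TF-IDF comparison described above treats each melody as a document whose "terms" are notes (or note patterns). A minimal pure-Python sketch of that idea, with toy note sequences standing in for the MIDI files (the note lists and function names are illustrative assumptions, not the paper's encoding):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute a TF-IDF weight dict for each document (a list of terms)."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))          # document frequency of each term
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: (c / len(doc)) * math.log(n / df[t])
                     for t, c in tf.items()})
    return vecs

def cosine(a, b):
    """Cosine similarity between two sparse TF-IDF vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

real = ["C4", "E4", "G4", "C4"]            # toy real melody
composed = ["C4", "E4", "G4", "A4"]        # toy GAN-composed melody
random_melody = ["F#2", "B5", "D3", "G#4"] # toy random melody
v = tfidf_vectors([real, composed, random_melody])
```

Under this sketch, a composed melody that shares structure with the real one scores a higher cosine similarity than a random melody, mirroring the paper's structural-difference comparison.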


Sensors ◽  
2019 ◽  
Vol 19 (13) ◽  
pp. 2919 ◽  
Author(s):  
Wangyong He ◽  
Zhongzhao Xie ◽  
Yongbo Li ◽  
Xinmei Wang ◽  
Wendi Cai

Hand pose estimation is a critical technology in computer vision and human-computer interaction, but deep-learning methods for it require a considerable amount of labeled training data. This paper aims to generate depth hand images: given a ground-truth 3D hand pose, the developed method can generate the corresponding depth hand image. Specifically, the ground truth is a 3D hand pose encoding the hand structure, while the synthesized image has the same size as the training images and a visual appearance similar to the training set. The method, inspired by progress in generative adversarial networks (GANs) and image-style transfer, models the latent statistical relationship between a ground-truth hand pose and the corresponding depth hand image. Images synthesized with the developed method are shown to be effective for enhancing performance. Comprehensive experiments on public hand pose datasets (NYU, MSRA, ICVL) show that the developed method outperforms existing works.


2020 ◽  
Vol 34 (07) ◽  
pp. 11490-11498
Author(s):  
Che-Tsung Lin ◽  
Yen-Yi Wu ◽  
Po-Hao Hsu ◽  
Shang-Hong Lai

Unpaired image-to-image translation has proven quite effective in boosting a CNN-based object detector for a different domain, by means of data augmentation that preserves the image-objects in the translated images. Recently, multimodal GAN (generative adversarial network) models have been proposed and were expected to further boost detector accuracy by generating a diverse collection of images in the target domain, given only a single labelled image in the source domain. However, images generated by multimodal GANs can yield even worse detection accuracy than those from a unimodal GAN with better object preservation. In this work, we introduce cycle-structure consistency for generating diverse, structure-preserving translated images across complex domains, such as between day and night, for object detector training. Qualitative results show that our model, Multimodal AugGAN, can generate diverse and realistic images for the target domain. For quantitative comparison, we evaluate competing methods and ours by using the generated images to train YOLO, Faster R-CNN and FCN models, and show that our model achieves significant improvement and outperforms the other methods on detection accuracy and FCN scores. We also demonstrate that our model provides more diverse object appearances in the target domain, through comparison on a perceptual distance metric.
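The cycle-consistency constraint underlying the cycle-structure consistency described above requires that translating an image to the target domain and back reconstructs the original. A toy numeric sketch of the plain cycle-consistency term, with simple brightness shifts standing in for the learned generators (the paper's additional structure term is not reproduced here; all names and values are illustrative):

```python
def l1_loss(a, b):
    """Mean absolute error between two flat image tensors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(x, G, F):
    """||F(G(x)) - x||_1: translating to the target domain with G and
    back with F should reconstruct the original image x."""
    return l1_loss(F(G(x)), x)

# Toy stand-ins: G "brightens" a day image into night, F darkens it back
day = [0.2, 0.5, 0.8]
G = lambda img: [v + 0.1 for v in img]   # day -> night generator (toy)
F = lambda img: [v - 0.1 for v in img]   # night -> day generator (toy)
```

When F inverts G, the loss is near zero; a mismatched pair (e.g. applying G twice) is penalised, which is what keeps image-objects intact for downstream detector training.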


Electronics ◽  
2020 ◽  
Vol 9 (3) ◽  
pp. 395 ◽  
Author(s):  
Naeem Ul Islam ◽  
Sungmin Lee ◽  
Jaebyung Park

Image-to-image translation based on deep learning has attracted interest in the robotics and vision communities because of its potential impact on terrain analysis and image representation, interpretation, modification, and enhancement. Currently, the most successful approach to generating a translated image is the conditional generative adversarial network (cGAN), which trains an autoencoder with skip connections. Despite its impressive performance, the cGAN suffers from low accuracy, a lack of consistency, and imbalanced training. This paper proposes a balanced training strategy for image-to-image translation, resulting in an accurate and consistent network. The proposed approach uses two generators and a single discriminator: the generators translate images from one domain to the other, while the discriminator takes inputs in three different configurations and guides both generators to produce realistic images in their corresponding domains while ensuring high accuracy and consistency. Experiments are conducted on different datasets; in particular, the proposed approach outperforms the cGAN in realistic image translation in terms of accuracy and consistency in training.


2021 ◽  
Vol 59 (11) ◽  
pp. 838-847
Author(s):  
In-Kyu Hwang ◽  
Hyun-Ji Lee ◽  
Sang-Jun Jeong ◽  
In-Sung Cho ◽  
Hee-Soo Kim

In this study, we constructed a deep convolutional generative adversarial network (DCGAN) to generate microstructural images that imitate the real microstructures of binary Al-Si cast alloys. We prepared four alloy compositions for machine learning: Al-6wt%Si, Al-9wt%Si, Al-12wt%Si and Al-15wt%Si. A DCGAN is composed of a generator and a discriminator; the discriminator is a typical convolutional neural network (CNN), and the generator is an inversely shaped CNN. The fake images generated by the DCGAN were similar to real microstructural images, although they showed some strange morphology, including dendrites without directionality and deformed Si crystals. Verification with Inception V3 revealed that the fake images were well classified into the target categories. Even the visually imperfect images from the initial training iterations showed high similarity to the target; it appears that these imperfect images contained enough microstructural characteristics to satisfy the classifier, even though humans cannot recognize them. Cross-validation was carried out using real, fake and other test images. When the training dataset contained only the fake images, the real and test images still showed high similarity to the target categories. When the training dataset contained both real and fake images, the similarities to the target categories were high enough to give the correct answers. We conclude that the DCGAN developed for microstructural images in this study is highly useful for data augmentation of rare microstructures.
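The "inversely shaped CNN" generator described above upsamples a small latent feature map through stacked transposed convolutions, mirroring the discriminator's downsampling. The standard output-size formulas make this mirroring concrete (the 4→64 path and the kernel/stride/padding values are typical DCGAN choices, assumed for illustration rather than taken from the paper):

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a standard (discriminator-side) convolution."""
    return (size + 2 * pad - kernel) // stride + 1

def tconv_out(size, kernel, stride, pad):
    """Spatial output size of a transposed convolution -- the inverse-shaped
    layer a DCGAN generator stacks to upsample the latent feature map."""
    return (size - 1) * stride - 2 * pad + kernel

# A typical DCGAN generator path: 4 -> 8 -> 16 -> 32 -> 64 pixels
size = 4
for _ in range(4):
    size = tconv_out(size, kernel=4, stride=2, pad=1)
```

With these hyperparameters each transposed-convolution layer exactly doubles the spatial size, and the matching discriminator convolution halves it, which is the sense in which the two networks are shape inverses.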


2021 ◽  
Author(s):  
Eleni Chiou ◽  
Vanya Valindria ◽  
Francesco Giganti ◽  
Shonit Punwani ◽  
Iasonas Kokkinos ◽  
...  

Abstract: Purpose: VERDICT maps have shown promising results in clinical settings, discriminating normal from malignant tissue and identifying specific Gleason grades non-invasively. However, the quantitative estimation of VERDICT maps requires a specific diffusion-weighted imaging (DWI) acquisition. In this study we investigate the feasibility of synthesizing VERDICT maps from the DWI data of multi-parametric (mp)-MRI, which is widely used in clinical practice for prostate cancer diagnosis. Methods: We use data from 67 patients who underwent both mp-MRI and VERDICT MRI. We compute the ground-truth VERDICT maps from VERDICT MRI and propose a generative adversarial network (GAN)-based approach to synthesize VERDICT maps from mp-MRI DWI data. We use correlation analysis and mean squared error to quantitatively evaluate the quality of the synthetic VERDICT maps against the real ones. Results: Quantitative results show that the mean values of tumour areas in the synthetic and real VERDICT maps were strongly correlated, while qualitative results indicate that our method can generate realistic VERDICT maps from mp-MRI DWI data. Conclusion: Realistic VERDICT maps can be generated using DWI from standard mp-MRI. The synthetic maps preserve important quantitative information, enabling the exploitation of VERDICT MRI for precise prostate cancer characterization with a single mp-MRI acquisition.
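The quantitative evaluation described above, correlation analysis and mean squared error between synthetic and real maps, can be sketched with the plain formulas (the tumour-area values below are made-up toy numbers, not data from the study):

```python
import math

def mse(a, b):
    """Mean squared error between paired measurements."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def pearson_r(a, b):
    """Pearson correlation coefficient between paired measurements."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

# Toy per-patient tumour-area mean values from real vs. synthetic maps
real = [0.31, 0.45, 0.52, 0.60, 0.72]
synthetic = [0.30, 0.47, 0.50, 0.63, 0.70]
```

A high correlation with low MSE on such paired values is the sense in which the synthetic maps "preserve important quantitative information".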


Author(s):  
L. E. Christovam ◽  
M. H. Shimabukuro ◽  
M. L. B. T. Galo ◽  
E. Honkavaara

Abstract. Most methods developed to map crop fields with high quality are based on optical image time-series. However, the accuracy of these approaches is often deteriorated by clouds and cloud shadows, which can decrease the availability of the optical data required to represent crop phenological stages. In this sense, the objective of this study was to implement and evaluate the conditional generative adversarial network (cGAN), which has been indicated as a potential tool for cloud and cloud-shadow removal; we also compared it with the Whittaker Smoother (WS), a well-known data-cleaning algorithm. The dataset used to train and assess the methods was the Luis Eduardo Magalhães benchmark for tropical agricultural remote sensing applications. We selected one MSI/Sentinel-2 and C-SAR/Sentinel-1 image pair acquired on dates as close as possible. A total of 5000 image-pair patches were generated to train the cGAN model, which was used to derive synthetic optical pixels for a testing area. Visual analysis, spectral behaviour comparison, and classification were used to evaluate and compare the pixels generated with the cGAN and WS against the pixel values of the real image. The cGAN provided consistent pixel values for most crop types compared to the real pixel values and significantly outperformed the WS. The results indicate that the cGAN has potential to fill cloud and cloud-shadow gaps in optical image time-series.


2020 ◽  
Author(s):  
Kazuma Kokomoto ◽  
Rena Okawa ◽  
Kazuhiko Nakano ◽  
Kazunori Nozaki

Abstract: Dentists need experience with many clinical cases to practice specialized skills. However, the need to protect patients' private information limits the ability to utilize large numbers of intraoral images obtained from clinical cases. In this study, since generating realistic images could make large numbers of intraoral images available, intraoral images were generated using a progressively growing generative adversarial network. 35,254 intraoral images were used as training data at resolutions of 128×128, 256×256, 512×512, and 1,024×1,024. The results of training datasets with and without data augmentation were compared, and the sliced Wasserstein distance (SWD) was calculated to evaluate the generated images. Next, 50 real and 50 generated images at each resolution were randomly selected and shuffled, and twelve pediatric dentists were asked to assess whether each image was real or generated. The accuracy of this assessment was significantly higher for the 1,024×1,024 images than for the other resolutions. In conclusion, generated intraoral images with resolutions of 512×512 or lower were realistic enough that the dentists could not distinguish whether they were real or generated. This implies that generated images can be used for dental education or for deep-learning data augmentation, free from privacy restrictions.
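The sliced Wasserstein distance used above to evaluate the generated images compares two sets of feature vectors by projecting them onto random directions and matching the sorted 1-D projections. A minimal pure-Python sketch (the descriptor sets and parameters are illustrative; practical SWD implementations operate on patch descriptors from a Laplacian pyramid of the images):

```python
import math
import random

def swd(set_a, set_b, n_projections=100, seed=0):
    """Sliced Wasserstein distance between two equally sized sets of
    feature vectors: project onto random unit directions, sort the
    1-D projections, and average the absolute differences."""
    rng = random.Random(seed)
    dim = len(set_a[0])
    total = 0.0
    for _ in range(n_projections):
        d = [rng.gauss(0, 1) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in d)) or 1.0
        d = [v / norm for v in d]                      # random unit direction
        pa = sorted(sum(x * w for x, w in zip(v, d)) for v in set_a)
        pb = sorted(sum(x * w for x, w in zip(v, d)) for v in set_b)
        total += sum(abs(x - y) for x, y in zip(pa, pb)) / len(pa)
    return total / n_projections

# Identical descriptor sets have distance zero; shifted sets do not
a = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
b = [[v + 1.0 for v in row] for row in a]
```

A lower SWD between descriptors of real and generated images indicates that the generator has matched the training distribution more closely.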

