Detecting and Measuring Defects in Wafer Die Using GAN and YOLOv3

2020 ◽  
Vol 10 (23) ◽  
pp. 8725
Author(s):  
Ssu-Han Chen ◽  
Chih-Hsiang Kang ◽  
Der-Baau Perng

This research used deep learning methods to develop a set of algorithms for detecting die particle defects. A generative adversarial network (GAN) generated natural and realistic images, which improved the ability of You Only Look Once version 3 (YOLOv3) to detect die defects. Defects were then measured from the bounding boxes predicted by YOLOv3, which potentially provides the criteria for die quality sorting. The pseudo defective images generated by the GAN from real defective images were used as part of the training image set. The results obtained after training with the combination of real and pseudo defective images were 7.33% higher in testing average precision (AP) and more accurate by one decimal place in testing coordinate error than after training with the real images alone. The GAN enhances the diversity of defects, which modestly improves the versatility of YOLOv3. In summary, the method combining a GAN and YOLOv3 employed in this study yields a feature-free algorithm that requires neither a massive collection of defective samples nor additional annotation of the pseudo defects. The proposed method is feasible and advantageous for cases that deal with various kinds of die patterns.
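The measurement step described above, turning YOLOv3's predicted bounding boxes into physical defect dimensions, amounts to scaling the normalised box size by the image size and the pixel pitch. A minimal sketch (the function name and the 2 µm pixel pitch are illustrative assumptions, not values from the paper):

```python
def bbox_to_defect_size(box, img_w, img_h, um_per_px):
    """Convert a YOLOv3-style normalised box (cx, cy, w, h) into
    physical defect width/height in micrometres."""
    _, _, w, h = box
    return w * img_w * um_per_px, h * img_h * um_per_px

# Example: a box covering 10% x 5% of a 416x416 image at 2 um per pixel
w_um, h_um = bbox_to_defect_size((0.5, 0.5, 0.10, 0.05), 416, 416, 2.0)
```

A sorting rule for die quality could then simply threshold these physical dimensions.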

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ji Eun Park ◽  
Dain Eun ◽  
Ho Sung Kim ◽  
Da Hyun Lee ◽  
Ryoung Woo Jang ◽  
...  

Abstract: Generative adversarial networks (GANs) create synthetic images to increase data quantity, but whether GANs ensure meaningful morphologic variation is still unknown. We investigated whether GAN-based synthetic images provide sufficient morphologic variation to improve molecular-based prediction for a rare disease, isocitrate dehydrogenase (IDH)-mutant glioblastoma. The GAN was initially trained on 500 normal brains and 110 IDH-mutant high-grade astrocytomas, and paired contrast-enhanced T1-weighted and FLAIR MRI data were generated. Diagnostic models were developed from real IDH-wild-type cases (n = 80) combined with real IDH-mutant glioblastomas (n = 38), with synthetic IDH-mutant glioblastomas, or augmented by adding both real and synthetic IDH-mutant glioblastomas. Turing tests showed that the synthetic data were realistic (classification rate of 55%). In both the real and synthetic data, a more frontal or insular location (odds ratio [OR] 1.34 vs. 1.52; P = 0.04) and distinct non-enhancing tumor margins (OR 2.68 vs. 3.88; P < 0.001) were significant predictors of IDH mutation. In an independent validation set, diagnostic accuracy was higher for the augmented model (90.9% [40/44] and 93.2% [41/44] for the two readers, respectively) than for the real-data model (84.1% [37/44] and 86.4% [38/44], respectively). GAN-based synthetic images thus yield morphologically variable, realistic-seeming IDH-mutant glioblastomas. GANs will be useful for creating training sets that are realistic in terms of morphologic variation and quality, thereby improving the diagnostic performance of clinical models.


Mathematics ◽  
2019 ◽  
Vol 7 (10) ◽  
pp. 883 ◽  
Author(s):  
Shuyu Li ◽  
Sejun Jang ◽  
Yunsick Sung

In traditional music composition, the composer draws on specialized knowledge of music and combines emotion with creative experience. As computer technology has evolved, various music-related technologies have been developed, but creating new music still requires considerable time; a system is therefore needed that can automatically compose music from input music. This study proposes a novel melody composition method that enhances the original generative adversarial network (GAN) model and operates on individual bars. Two discriminators form the enhanced GAN model: a long short-term memory (LSTM) model that ensures correlation between bars, and a convolutional neural network (CNN) model that ensures the rationality of the bar structure. Experiments were conducted using bar encoding and the enhanced GAN model to compose new melodies and evaluate their quality. In the evaluation, the TF-IDF algorithm was used to calculate the structural differences between four types of musical instrument digital interface (MIDI) files: a randomly composed melody, a melody composed by the original GAN, a melody composed by the proposed method, and the real melody. Using TF-IDF, the structure of the melody composed by the proposed method was compared with that of the real melody, as was the structure of the traditionally composed melody. The experimental results showed that the melody composed by the proposed method was closer to the real melody structure, differing by only 8%, than the traditionally composed melody.
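The TF-IDF comparison described above treats each melody as a document whose "terms" are notes (or note patterns). A minimal pure-Python sketch of that idea, with toy note sequences standing in for the MIDI files (the note lists and function names are illustrative assumptions, not the paper's encoding):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute a TF-IDF weight dict for each document (a list of terms)."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))          # document frequency of each term
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: (c / len(doc)) * math.log(n / df[t])
                     for t, c in tf.items()})
    return vecs

def cosine(a, b):
    """Cosine similarity between two sparse TF-IDF vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

real = ["C4", "E4", "G4", "C4"]            # toy real melody
composed = ["C4", "E4", "G4", "A4"]        # toy GAN-composed melody
random_melody = ["F#2", "B5", "D3", "G#4"] # toy random melody
v = tfidf_vectors([real, composed, random_melody])
```

Under this sketch, a composed melody that shares structure with the real one scores a higher cosine similarity than a random melody, mirroring the paper's structural-difference comparison.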


Sensors ◽  
2019 ◽  
Vol 19 (13) ◽  
pp. 2919 ◽  
Author(s):  
Wangyong He ◽  
Zhongzhao Xie ◽  
Yongbo Li ◽  
Xinmei Wang ◽  
Wendi Cai

Hand pose estimation is a critical technology in computer vision and human-computer interaction, but deep-learning methods for it require a considerable amount of labeled training data. This paper aims to generate depth hand images: given a ground-truth 3D hand pose, the developed method can generate the corresponding depth hand image. Specifically, the ground truth is a 3D hand pose encoding the hand structure, while the synthesized image has the same size as the training images and a visual appearance similar to the training set. The method, inspired by progress in generative adversarial networks (GANs) and image-style transfer, models the latent statistical relationship between a ground-truth hand pose and the corresponding depth hand image. Images synthesized with the developed method are shown to be effective for enhancing performance. Comprehensive experiments on public hand pose datasets (NYU, MSRA, ICVL) show that the developed method outperforms existing works.


2020 ◽  
Vol 34 (07) ◽  
pp. 11490-11498
Author(s):  
Che-Tsung Lin ◽  
Yen-Yi Wu ◽  
Po-Hao Hsu ◽  
Shang-Hong Lai

Unpaired image-to-image translation has proven quite effective in boosting a CNN-based object detector for a different domain, by means of data augmentation that preserves the image-objects in the translated images. Recently, multimodal GAN (generative adversarial network) models have been proposed and were expected to further boost detector accuracy by generating a diverse collection of images in the target domain, given only a single labelled image in the source domain. However, images generated by multimodal GANs can yield even worse detection accuracy than those from a unimodal GAN with better object preservation. In this work, we introduce cycle-structure consistency for generating diverse, structure-preserving translated images across complex domains, such as between day and night, for object detector training. Qualitative results show that our model, Multimodal AugGAN, can generate diverse and realistic images for the target domain. For quantitative comparison, we evaluate competing methods and ours by using the generated images to train YOLO, Faster R-CNN and FCN models, and show that our model achieves significant improvement and outperforms the other methods on detection accuracy and FCN scores. We also demonstrate that our model provides more diverse object appearances in the target domain, through comparison on a perceptual distance metric.
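The cycle-consistency constraint underlying the cycle-structure consistency described above requires that translating an image to the target domain and back reconstructs the original. A toy numeric sketch of the plain cycle-consistency term, with simple brightness shifts standing in for the learned generators (the paper's additional structure term is not reproduced here; all names and values are illustrative):

```python
def l1_loss(a, b):
    """Mean absolute error between two flat image tensors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(x, G, F):
    """||F(G(x)) - x||_1: translating to the target domain with G and
    back with F should reconstruct the original image x."""
    return l1_loss(F(G(x)), x)

# Toy stand-ins: G "brightens" a day image into night, F darkens it back
day = [0.2, 0.5, 0.8]
G = lambda img: [v + 0.1 for v in img]   # day -> night generator (toy)
F = lambda img: [v - 0.1 for v in img]   # night -> day generator (toy)
```

When F inverts G, the loss is near zero; a mismatched pair (e.g. applying G twice) is penalised, which is what keeps image-objects intact for downstream detector training.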


Electronics ◽  
2020 ◽  
Vol 9 (3) ◽  
pp. 395 ◽  
Author(s):  
Naeem Ul Islam ◽  
Sungmin Lee ◽  
Jaebyung Park

Image-to-image translation based on deep learning has attracted interest in the robotics and vision communities because of its potential impact on terrain analysis and image representation, interpretation, modification, and enhancement. Currently, the most successful approach to generating a translated image is the conditional generative adversarial network (cGAN), which trains an autoencoder with skip connections. Despite its impressive performance, the cGAN suffers from low accuracy, a lack of consistency, and imbalanced training. This paper proposes a balanced training strategy for image-to-image translation, resulting in an accurate and consistent network. The proposed approach uses two generators and a single discriminator: the generators translate images from one domain to the other, while the discriminator takes inputs in three different configurations and guides both generators to produce realistic images in their corresponding domains while ensuring high accuracy and consistency. Experiments are conducted on different datasets; in particular, the proposed approach outperforms the cGAN in realistic image translation in terms of accuracy and consistency in training.


2021 ◽  
Vol 59 (11) ◽  
pp. 838-847
Author(s):  
In-Kyu Hwang ◽  
Hyun-Ji Lee ◽  
Sang-Jun Jeong ◽  
In-Sung Cho ◽  
Hee-Soo Kim

In this study, we constructed a deep convolutional generative adversarial network (DCGAN) to generate microstructural images that imitate the real microstructures of binary Al-Si cast alloys. We prepared four alloy compositions for machine learning: Al-6wt%Si, Al-9wt%Si, Al-12wt%Si and Al-15wt%Si. A DCGAN is composed of a generator and a discriminator; the discriminator is a typical convolutional neural network (CNN), and the generator is an inversely shaped CNN. The fake images generated by the DCGAN were similar to real microstructural images, although they showed some strange morphology, including dendrites without directionality and deformed Si crystals. Verification with Inception V3 revealed that the fake images were well classified into the target categories. Even the visually imperfect images from the initial training iterations showed high similarity to the target; it appears that these imperfect images contained enough microstructural characteristics to satisfy the classifier, even though humans cannot recognize them. Cross-validation was carried out using real, fake and other test images. When the training dataset contained only the fake images, the real and test images still showed high similarity to the target categories. When the training dataset contained both real and fake images, the similarities to the target categories were high enough to give the correct answers. We conclude that the DCGAN developed for microstructural images in this study is highly useful for data augmentation of rare microstructures.
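The "inversely shaped CNN" generator described above upsamples a small latent feature map through stacked transposed convolutions, mirroring the discriminator's downsampling. The standard output-size formulas make this mirroring concrete (the 4→64 path and the kernel/stride/padding values are typical DCGAN choices, assumed for illustration rather than taken from the paper):

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a standard (discriminator-side) convolution."""
    return (size + 2 * pad - kernel) // stride + 1

def tconv_out(size, kernel, stride, pad):
    """Spatial output size of a transposed convolution -- the inverse-shaped
    layer a DCGAN generator stacks to upsample the latent feature map."""
    return (size - 1) * stride - 2 * pad + kernel

# A typical DCGAN generator path: 4 -> 8 -> 16 -> 32 -> 64 pixels
size = 4
for _ in range(4):
    size = tconv_out(size, kernel=4, stride=2, pad=1)
```

With these hyperparameters each transposed-convolution layer exactly doubles the spatial size, and the matching discriminator convolution halves it, which is the sense in which the two networks are shape inverses.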


2021 ◽  
Author(s):  
Eleni Chiou ◽  
Vanya Valindria ◽  
Francesco Giganti ◽  
Shonit Punwani ◽  
Iasonas Kokkinos ◽  
...  

Abstract: Purpose: VERDICT maps have shown promising results in clinical settings, discriminating normal from malignant tissue and identifying specific Gleason grades non-invasively. However, the quantitative estimation of VERDICT maps requires a specific diffusion-weighted imaging (DWI) acquisition. In this study we investigate the feasibility of synthesizing VERDICT maps from the DWI data of multi-parametric (mp)-MRI, which is widely used in clinical practice for prostate cancer diagnosis. Methods: We use data from 67 patients who underwent both mp-MRI and VERDICT MRI. We compute the ground-truth VERDICT maps from VERDICT MRI and propose a generative adversarial network (GAN)-based approach to synthesize VERDICT maps from mp-MRI DWI data. We use correlation analysis and mean squared error to quantitatively evaluate the quality of the synthetic VERDICT maps against the real ones. Results: Quantitative results show that the mean values of tumour areas in the synthetic and real VERDICT maps were strongly correlated, while qualitative results indicate that our method can generate realistic VERDICT maps from mp-MRI DWI data. Conclusion: Realistic VERDICT maps can be generated using DWI from standard mp-MRI. The synthetic maps preserve important quantitative information, enabling the exploitation of VERDICT MRI for precise prostate cancer characterization with a single mp-MRI acquisition.
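The quantitative evaluation described above, correlation analysis and mean squared error between synthetic and real maps, can be sketched with the plain formulas (the tumour-area values below are made-up toy numbers, not data from the study):

```python
import math

def mse(a, b):
    """Mean squared error between paired measurements."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def pearson_r(a, b):
    """Pearson correlation coefficient between paired measurements."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

# Toy per-patient tumour-area mean values from real vs. synthetic maps
real = [0.31, 0.45, 0.52, 0.60, 0.72]
synthetic = [0.30, 0.47, 0.50, 0.63, 0.70]
```

A high correlation with low MSE on such paired values is the sense in which the synthetic maps "preserve important quantitative information".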


Author(s):  
L. E. Christovam ◽  
M. H. Shimabukuro ◽  
M. L. B. T. Galo ◽  
E. Honkavaara

Abstract. Most methods developed to map crop fields with high quality are based on optical image time-series. However, the accuracy of these approaches is often deteriorated by clouds and cloud shadows, which can decrease the availability of the optical data required to represent crop phenological stages. In this sense, the objective of this study was to implement and evaluate the conditional generative adversarial network (cGAN), which has been indicated as a potential tool for cloud and cloud-shadow removal; we also compared it with the Whittaker Smoother (WS), a well-known data-cleaning algorithm. The dataset used to train and assess the methods was the Luis Eduardo Magalhães benchmark for tropical agricultural remote sensing applications. We selected one MSI/Sentinel-2 and C-SAR/Sentinel-1 image pair acquired on dates as close as possible. A total of 5000 image-pair patches were generated to train the cGAN model, which was used to derive synthetic optical pixels for a testing area. Visual analysis, spectral behaviour comparison, and classification were used to evaluate and compare the pixels generated with the cGAN and WS against the pixel values of the real image. The cGAN provided consistent pixel values for most crop types compared to the real pixel values and significantly outperformed the WS. The results indicate that the cGAN has potential to fill cloud and cloud-shadow gaps in optical image time-series.


2020 ◽  
Author(s):  
Kazuma Kokomoto ◽  
Rena Okawa ◽  
Kazuhiko Nakano ◽  
Kazunori Nozaki

Abstract: Dentists need experience with many clinical cases to practice specialized skills. However, the need to protect patients' private information limits the ability to utilize large numbers of intraoral images obtained from clinical cases. In this study, since generating realistic images could make large numbers of intraoral images available, intraoral images were generated using a progressively growing generative adversarial network. 35,254 intraoral images were used as training data at resolutions of 128×128, 256×256, 512×512, and 1,024×1,024. The results of training datasets with and without data augmentation were compared, and the sliced Wasserstein distance (SWD) was calculated to evaluate the generated images. Next, 50 real and 50 generated images at each resolution were randomly selected and shuffled, and twelve pediatric dentists were asked to assess whether each image was real or generated. The accuracy of this assessment was significantly higher for the 1,024×1,024 images than for the other resolutions. In conclusion, generated intraoral images with resolutions of 512×512 or lower were realistic enough that the dentists could not distinguish whether they were real or generated. This implies that generated images can be used for dental education or for deep-learning data augmentation, free from privacy restrictions.
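The sliced Wasserstein distance used above to evaluate the generated images compares two sets of feature vectors by projecting them onto random directions and matching the sorted 1-D projections. A minimal pure-Python sketch (the descriptor sets and parameters are illustrative; practical SWD implementations operate on patch descriptors from a Laplacian pyramid of the images):

```python
import math
import random

def swd(set_a, set_b, n_projections=100, seed=0):
    """Sliced Wasserstein distance between two equally sized sets of
    feature vectors: project onto random unit directions, sort the
    1-D projections, and average the absolute differences."""
    rng = random.Random(seed)
    dim = len(set_a[0])
    total = 0.0
    for _ in range(n_projections):
        d = [rng.gauss(0, 1) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in d)) or 1.0
        d = [v / norm for v in d]                      # random unit direction
        pa = sorted(sum(x * w for x, w in zip(v, d)) for v in set_a)
        pb = sorted(sum(x * w for x, w in zip(v, d)) for v in set_b)
        total += sum(abs(x - y) for x, y in zip(pa, pb)) / len(pa)
    return total / n_projections

# Identical descriptor sets have distance zero; shifted sets do not
a = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
b = [[v + 1.0 for v in row] for row in a]
```

A lower SWD between descriptors of real and generated images indicates that the generator has matched the training distribution more closely.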

