An improved pix2pix model based on Gabor filter for robust color image rendering

2021 ◽  
Vol 19 (1) ◽  
pp. 86-101
Author(s):  
Hong-an Li ◽  
Min Zhang ◽  
Zhenhua Yu ◽  
Zhanli Li ◽  
...  

In recent years, with the development of deep learning, image color rendering has again become a research hotspot. To overcome detail problems such as color overstepping and boundary blurring in robust image color rendering, as well as the training instability of generative adversarial networks, we propose a color rendering method for robust images that combines a Gabor filter with an improved pix2pix model. Firstly, the multi-direction and multi-scale selection characteristics of the Gabor filter are used to preprocess the image to be rendered, which retains the detailed features of the image during preprocessing and avoids feature loss. Moreover, among the Gabor texture feature maps with 6 scales and 4 directions, the texture map with a scale of 7 and a direction of 0° achieves comparable rendering performance. Finally, by improving the loss function of the pix2pix model and adding a penalty term, not only can training be stabilized, but an ideal color image can also be obtained. To evaluate the color rendering quality of different models more objectively, the PSNR and SSIM indexes are adopted to assess the rendered images. Experimental results show that robust images rendered by the proposed method have better visual quality, and that the influence of light and noise on the image is reduced to a certain extent.
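As a sketch of the Gabor preprocessing step described above, the code below builds a bank of real-valued Gabor kernels over several scales and directions and applies one to an image. The abstract does not give the paper's exact σ, λ, or kernel-size settings, so the values here are illustrative assumptions.

```python
import numpy as np

def gabor_kernel(ksize, sigma, theta, lambd, gamma=0.5, psi=0.0):
    """Real-valued Gabor kernel (Gaussian envelope x cosine carrier).
    ksize is assumed odd so the kernel has a well-defined center."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_t = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates
    y_t = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_t**2 + (gamma * y_t)**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * x_t / lambd + psi)

def gabor_bank(scales=(3, 5, 7, 9, 11, 13), thetas_deg=(0, 45, 90, 135)):
    """6 scales x 4 directions, mirroring the filter-bank layout in the
    abstract; sigma/lambda are tied to scale here as an assumption."""
    return {(k, t): gabor_kernel(k, sigma=0.56 * k / 2,
                                 theta=np.deg2rad(t), lambd=k / 2)
            for k in scales for t in thetas_deg}

def filter_image(img, kernel):
    """'Same'-size correlation with zero padding (loop version for clarity)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out
```

In a preprocessing pipeline, each texture feature map produced by `filter_image` would be fed (alongside or instead of the raw image) into the pix2pix generator.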

2021 ◽  
Vol 11 (2) ◽  
pp. 721
Author(s):  
Hyung Yong Kim ◽  
Ji Won Yoon ◽  
Sung Jun Cheon ◽  
Woo Hyun Kang ◽  
Nam Soo Kim

Recently, generative adversarial networks (GANs) have been successfully applied to speech enhancement. However, there still remain two issues that need to be addressed: (1) GAN-based training is typically unstable due to its non-convex property, and (2) most of the conventional methods do not fully take advantage of the speech characteristics, which could result in a sub-optimal solution. In order to deal with these problems, we propose a progressive generator that can handle the speech in a multi-resolution fashion. Additionally, we propose a multi-scale discriminator that discriminates the real and generated speech at various sampling rates to stabilize GAN training. The proposed structure was compared with the conventional GAN-based speech enhancement algorithms using the VoiceBank-DEMAND dataset. Experimental results showed that the proposed approach can make the training faster and more stable, which improves the performance on various metrics for speech enhancement.
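The multi-scale discriminator idea above (scoring the real and generated waveform at several sampling rates) can be sketched as follows. The channel counts, kernel sizes, number of scales, and the use of average pooling for downsampling are assumptions for illustration, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SubDiscriminator(nn.Module):
    """A small 1-D conv critic applied to the waveform at one sampling rate."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, 15, stride=2, padding=7), nn.LeakyReLU(0.2),
            nn.Conv1d(16, 32, 15, stride=2, padding=7), nn.LeakyReLU(0.2),
            nn.Conv1d(32, 1, 3, padding=1),  # frame-level real/fake scores
        )

    def forward(self, x):
        return self.net(x)

class MultiScaleDiscriminator(nn.Module):
    """One sub-discriminator per sampling rate (x1, x1/2, x1/4)."""
    def __init__(self, n_scales=3):
        super().__init__()
        self.discs = nn.ModuleList(SubDiscriminator() for _ in range(n_scales))

    def forward(self, wav):
        outs = []
        for d in self.discs:
            outs.append(d(wav))
            # halve the sampling rate before the next sub-discriminator
            wav = F.avg_pool1d(wav, kernel_size=4, stride=2, padding=1)
        return outs
```

Each scale's score map would enter the adversarial loss separately, so the generator receives feedback at several temporal resolutions at once.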


Sensors ◽  
2020 ◽  
Vol 20 (22) ◽  
pp. 6673
Author(s):  
Lichuan Zou ◽  
Hong Zhang ◽  
Chao Wang ◽  
Fan Wu ◽  
Feng Gu

In high-resolution Synthetic Aperture Radar (SAR) ship detection, the number of SAR samples seriously affects the performance of deep-learning-based algorithms. In this paper, to meet the application requirements of high-resolution ship detection with small sample sets, a high-resolution SAR ship detection method is proposed that combines an improved sample generation network, Multiscale Wasserstein Auxiliary Classifier Generative Adversarial Networks (MW-ACGAN), with the Yolo v3 network. Firstly, the multi-scale Wasserstein distance and a gradient penalty loss are used to improve the original Auxiliary Classifier Generative Adversarial Networks (ACGAN), so that the improved network can stably generate high-resolution SAR ship images. Secondly, a multi-scale loss term and corresponding multi-scale image output layers are added to the network, so that multi-scale SAR ship images can be generated. Then, the original ship data set and the generated data are combined into a composite data set to train the Yolo v3 target detection network, addressing the problem of low detection accuracy on small sample data sets. Experimental results on Gaofen-3 (GF-3) 3 m SAR data show that the MW-ACGAN network can generate multi-scale and multi-class ship slices, and that the ResNet18 confidence level for these slices is higher than that for the ACGAN network, with an average score of 0.91. The detection results of the Yolo v3 network show that the model trained on the composite data set reaches a detection accuracy of 94%, far better than that trained only on the original SAR data set. These results show that our method can make the best use of the original data set and improve the accuracy of ship detection.
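The gradient penalty term used to stabilize the improved ACGAN follows the standard WGAN-GP formulation: penalize the critic's gradient norm on samples interpolated between real and generated images. A minimal sketch, with the critic architecture and λ = 10 as assumed placeholders:

```python
import torch

def gradient_penalty(critic, real, fake, lam=10.0):
    """WGAN-GP: push the critic's gradient norm toward 1 on random
    interpolations between real and generated samples."""
    eps = torch.rand(real.size(0), 1, 1, 1)            # per-sample mix ratio
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(x_hat)
    grads = torch.autograd.grad(scores.sum(), x_hat, create_graph=True)[0]
    grad_norm = grads.flatten(1).norm(2, dim=1)        # per-sample L2 norm
    return lam * ((grad_norm - 1) ** 2).mean()
```

In MW-ACGAN this penalty would be added to the critic loss at each scale; `create_graph=True` keeps the penalty differentiable so it trains the critic.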


Author(s):  
Ojas A. Ramwala ◽  
Smeet A. Dhakecha ◽  
Chirag N. Paunwala ◽  
Mita C. Paunwala

Documents are an essential source of valuable information and knowledge, and photographs are a great way of reminiscing about old memories and past events. However, it is difficult to preserve the quality of such ancient documents and old photographs for an extremely long time, as these images usually get damaged or creased due to various extrinsic effects. Utilizing image editing software like Photoshop to manually reconstruct such old photographs and documents is a strenuous and time-consuming process. This paper leverages the generative modeling capabilities of Conditional Generative Adversarial Networks by utilizing specialized architectures for the Generator and the Discriminator. The proposed Reminiscent Net has a U-Net-based Generator with numerous feature maps for complete information transfer, incorporating location and contextual details; the absence of dense layers allows it to handle images of diverse sizes. A PatchGAN-based Discriminator that penalizes the image at the scale of patches is proposed, and the NADAM optimizer is used to enable faster and better convergence of the loss function. The proposed method produces visually appealing de-creased images, and experiments indicate that the architecture performs better than various recent approaches, both qualitatively and quantitatively.
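A minimal sketch of a PatchGAN-style discriminator for a pix2pix-like conditional setup: instead of one real/fake score per image, it outputs a grid of scores, each covering one receptive-field patch. The layer widths and the 6-channel input (condition and target concatenated) are common pix2pix conventions assumed here, not the exact Reminiscent Net configuration.

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Fully convolutional, so any sufficiently large input size works;
    the output is a spatial map of per-patch real/fake logits."""
    def __init__(self, in_ch=6):  # condition + target, concatenated
        super().__init__()
        def block(i, o, stride):
            return [nn.Conv2d(i, o, 4, stride, 1),
                    nn.BatchNorm2d(o), nn.LeakyReLU(0.2)]
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, 2, 1), nn.LeakyReLU(0.2),
            *block(64, 128, 2),
            *block(128, 256, 2),
            *block(256, 512, 1),
            nn.Conv2d(512, 1, 4, 1, 1),  # one logit per patch
        )

    def forward(self, cond, target):
        return self.net(torch.cat([cond, target], dim=1))
```

Because every output logit judges only a local patch, the adversarial loss focuses on high-frequency texture (creases, edges), while a pixel-wise L1 term typically handles global structure.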

