BFRVSR: A Bidirectional Frame Recurrent Method for Video Super-Resolution

2020 ◽  
Vol 10 (23) ◽  
pp. 8749
Author(s):  
Xiongxiong Xue ◽  
Zhenqi Han ◽  
Weiqin Tong ◽  
Mingqi Li ◽  
Lizhuang Liu

Video super-resolution is a challenging task. One possible solution, called the sliding window method, tries to divide the generation of high-resolution video sequences into independent subtasks. Another popular method, named the recurrent algorithm, utilizes the generated high-resolution images of previous frames to generate the high-resolution image. However, both methods have some unavoidable disadvantages. The former method usually leads to bad temporal consistency and has higher computational cost, while the latter method cannot always make full use of information contained by optical flow or any other calculated features. Thus, more investigations need to be done to explore the balance between these two methods. In this work, a bidirectional frame recurrent video super-resolution method is proposed. To be specific, reverse training is proposed that also utilizes a generated high-resolution frame to help estimate the high-resolution version of the former frame. The bidirectional recurrent method guarantees temporal consistency and also makes full use of the adjacent information due to the bidirectional training operation, while the computational cost is acceptable. Experimental results demonstrate that the bidirectional super-resolution framework gives remarkable performance and it solves time-related problems.

Author(s):  
Xiongxiong Xue ◽  
Zhenqi Han ◽  
Weiqin Tong ◽  
Mingqi Li ◽  
Lizhuang Liu

Video super-resolution, which uses the relevant information in several low-resolution frames to generate high-resolution images, is a challenging task. One possible solution, the sliding window method, divides the generation of high-resolution video sequences into independent subtasks, and only adjacent low-resolution images are used to estimate the high-resolution version of the central low-resolution image. Another popular method, the recurrent algorithm, uses not only the low-resolution images but also the generated high-resolution images of previous frames to generate the high-resolution image. However, both methods have unavoidable disadvantages. The former usually leads to poor temporal consistency and has a higher computational cost, while the latter cannot always make full use of the information contained in optical flow or other calculated features. Thus, more investigation is needed to explore the balance between these two methods. In this work, a bidirectional frame recurrent video super-resolution method is proposed. Specifically, reverse training is proposed, in which the generated high-resolution frame is also used to help estimate the high-resolution version of the former frame. By combining reverse training with forward training, the bidirectional recurrent method not only guarantees temporal consistency but also makes full use of adjacent information, while the computational cost remains acceptable. Experimental results demonstrate that the bidirectional super-resolution framework performs remarkably well: it solves the time-related problems, and the generated high-resolution images are impressive compared with those of recurrent-based video super-resolution methods.
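The data flow of a forward and a reverse recurrent pass can be sketched as follows. This is only an illustration under stated assumptions, not the authors' network: `toy_step` is a hypothetical stand-in for the super-resolution model (2x nearest-neighbor upscaling blended with the neighboring high-resolution estimate), and frames are toy 1-D lists rather than images. The abstract does not say how the two directions are combined, so the sketch simply returns both.

```python
from typing import Callable, List, Optional

Frame = List[float]  # toy 1-D "frame"; a real frame would be an image tensor


def toy_step(lr: Frame, neighbor_hr: Optional[Frame]) -> Frame:
    """Hypothetical stand-in for the SR network (assumption, not from the paper).

    Upscales the low-resolution frame 2x by nearest neighbor and, when a
    neighboring high-resolution estimate exists, blends the two equally.
    """
    up = [v for v in lr for _ in range(2)]
    if neighbor_hr is None:
        return up
    return [0.5 * a + 0.5 * b for a, b in zip(up, neighbor_hr)]


def recurrent_pass(
    lr_frames: List[Frame],
    step: Callable[[Frame, Optional[Frame]], Frame],
    reverse: bool = False,
) -> List[Frame]:
    """Run one recurrent pass over the sequence.

    Forward: frame t is estimated with the HR estimate of frame t-1.
    Reverse: frame t is estimated with the HR estimate of frame t+1.
    """
    indices = list(range(len(lr_frames)))
    if reverse:
        indices.reverse()
    out: List[Optional[Frame]] = [None] * len(lr_frames)
    prev: Optional[Frame] = None
    for t in indices:
        prev = step(lr_frames[t], prev)
        out[t] = prev
    return out  # type: ignore[return-value]


lr_video = [[0.0], [2.0], [4.0]]
forward = recurrent_pass(lr_video, toy_step)
backward = recurrent_pass(lr_video, toy_step, reverse=True)
print(forward)
print(backward)
```

In the forward pass the first frame has no recurrent context, while in the reverse pass it receives information propagated from later frames, which is the intuition behind using both directions.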


2020 ◽  
Vol 10 (12) ◽  
pp. 4282
Author(s):  
Ghada Zamzmi ◽  
Sivaramakrishnan Rajaraman ◽  
Sameer Antani

Medical images are acquired at different resolutions based on clinical goals or available technology. In general, however, high-resolution images with fine structural details are preferred for visual task analysis. Recognizing this significance, several deep learning networks have been proposed to enhance medical images for reliable automated interpretation. These deep networks are often computationally complex and require a massive number of parameters, which restrict them to highly capable computing platforms with large memory banks. In this paper, we propose an efficient deep learning approach, called Hydra, which simultaneously reduces computational complexity and improves performance. The Hydra consists of a trunk and several computing heads. The trunk is a super-resolution model that learns the mapping from low-resolution to high-resolution images. It has a simple architecture that is trained using multiple scales at once to minimize a proposed learning-loss function. We also propose to append multiple task-specific heads to the trained Hydra trunk for simultaneous learning of multiple visual tasks in medical images. The Hydra is evaluated on publicly available chest X-ray image collections to perform image enhancement, lung segmentation, and abnormality classification. Our experimental results support our claims and demonstrate that the proposed approach can improve the performance of super-resolution and visual task analysis in medical images at a remarkably reduced computational cost.


Author(s):  
ROOPA R ◽  
MRS. VANI.K. S ◽  
MRS. NAGAVENI. V

Image processing is any form of signal processing for which the input is an image, such as a photograph or video frame. The output of image processing may be either an image or a set of characteristics or parameters related to the image. In many facial analysis systems, such as face recognition, the face is used as an important biometric. Facial analysis systems need high-resolution images for their processing, but the video obtained from inexpensive surveillance cameras is of poor quality, and processing poor-quality images leads to unexpected results. To detect face images in video captured by inexpensive surveillance cameras, we use the AdaBoost algorithm. If the detected face images, which have low resolution and low quality, are fed to face recognition systems, they produce unstable and erroneous results, because these systems have problems working with low-resolution images. Hence we need a method to bridge the gap between low-resolution, low-quality images on the one hand and facial analysis systems on the other. Our approach is to use a reconstruction-based super-resolution method, in which we generate a face-log containing images of similar frontal faces of the highest possible quality using a head pose estimation technique. Then we apply a learning-based super-resolution algorithm to the result of the reconstruction-based part to improve the quality by another factor of two. Hence the total system quality is improved by a factor of four.


2014 ◽  
Vol 568-570 ◽  
pp. 659-662
Author(s):  
Xue Jun Zhang ◽  
Bing Liang Hu

The paper proposes a new approach to single-image super-resolution (SR) based on sparse representation. Previous work focuses only on globally intensive patches and ignores locally intensive patches. A dictionary trained on local saliency-intensive patches gives more significant performance. Motivated by this, we add saliency detection to find the marked areas in the image. We propose a sparse representation for the salient patches of the low-resolution input and use the coefficients of this representation to generate the high-resolution output. Compared to previous approaches, which simply sample a large number of image patch pairs, the saliency dictionary pair is a more compact representation of the patch pairs and reduces the computational cost substantially. Experiments demonstrate that our algorithm generates high-resolution images that are competitive with or even superior in quality to images produced by other similar SR methods.


2019 ◽  
Vol 11 (21) ◽  
pp. 2593
Author(s):  
Li ◽  
Zhang ◽  
Jiao ◽  
Liu ◽  
Yang ◽  
...  

In the convolutional sparse coding-based image super-resolution problem, the coefficients of low- and high-resolution images at the same position are assumed to be equivalent, which enforces an identical structure on low- and high-resolution images. In fact, however, the structure of high-resolution images is much more complicated than that of low-resolution images. To reduce the coupling between low- and high-resolution representations, a semi-coupled convolutional sparse learning method (SCCSL) is proposed for image super-resolution. The proposed method uses nonlinear convolution operations as the mapping function between low- and high-resolution features, and conventional linear mapping can be seen as a special case of the proposed method. Furthermore, the neighborhoods within the filter size are used to calculate the current pixel, which improves the flexibility of the proposed model, and the filter size is adjustable. To illustrate the effectiveness of the SCCSL method, we compare it with four state-of-the-art methods on 15 commonly used images. Experimental results show that this work provides a more flexible and efficient approach to the image super-resolution problem.


Sensors ◽  
2020 ◽  
Vol 20 (16) ◽  
pp. 4601
Author(s):  
Juan Wen ◽  
Yangjing Shi ◽  
Xiaoshi Zhou ◽  
Yiming Xue

Currently, various agricultural image classification tasks are carried out on high-resolution images. However, in some cases, we cannot get enough high-resolution images for classification, which significantly affects classification performance. In this paper, we design a crop disease classification network based on Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN) for the case in which only an insufficient number of low-resolution target images are available. First, ESRGAN is used to recover super-resolution crop images from low-resolution images. Transfer learning is applied in model training to compensate for the lack of training samples. Then, we test the performance of the generated super-resolution images in the crop disease classification task. Extensive experiments show that the fine-tuned ESRGAN model can recover realistic crop information and improve the accuracy of crop disease classification compared with four other image super-resolution methods.


2020 ◽  
Vol 8 (4) ◽  
pp. 304-310
Author(s):  
Windra Swastika ◽  
Ekky Rino Fajar Sakti ◽  
Mochamad Subianto

Low-resolution images can be reconstructed into high-resolution images using the Super-Resolution Convolutional Neural Network (SRCNN) algorithm. This study aims to improve the recognition accuracy of vehicle license plate numbers by generating a high-resolution vehicle image using SRCNN. Recognition is carried out with two character recognition methods: Tesseract OCR and SPNet. The training data for SRCNN is the DIV2K dataset, consisting of 900 images, while the training data for character recognition is the Chars74 dataset. The high-resolution images constructed using SRCNN can increase the average accuracy of vehicle license plate number recognition by 16.9% using Tesseract and 13.8% using SPNet.


Author(s):  
Alejandro Güemes ◽  
Carlos Sanmiguel Vila ◽  
Stefano Discetti

A data-driven approach to reconstruct high-resolution flow fields is presented. The method is based on exploiting the recent advances of SRGANs (Super-Resolution Generative Adversarial Networks) to enhance the resolution of Particle Image Velocimetry (PIV). The proposed approach exploits the availability of incomplete projections on high-resolution fields using the same set of images processed by standard PIV. Such incomplete projection is made available by sparse particle-based measurements such as super-resolution particle tracking velocimetry. Consequently, in contrast to other works, the method does not need a dual set of low/high-resolution images, and can be applied directly on a single set of raw images for training and estimation. This data-enhanced particle approach is assessed employing two datasets generated from direct numerical simulations: a fluidic pinball and a turbulent channel flow. The results prove that this data-driven method is able to enhance the resolution of PIV measurements even in complex flows without the need of a separate high-resolution experiment for training.


2021 ◽  
Author(s):  
Jiali Wang ◽  
Zhengchun Liu ◽  
Ian Foster ◽  
Won Chang ◽  
Rajkumar Kettimuthu ◽  
...  

Abstract. This study develops a neural network-based approach for emulating high-resolution modeled precipitation data with comparable statistical properties but at greatly reduced computational cost. The key idea is to use a combination of low- and high-resolution simulations to train a neural network to map from the former to the latter. Specifically, we define two types of CNNs, one that stacks variables directly and one that encodes each variable before stacking, and we train each CNN type both with a conventional loss function, such as mean square error (MSE), and with a conditional generative adversarial network (CGAN), for a total of four CNN variants. We compare the four new CNN-derived high-resolution precipitation results with precipitation generated from the original high-resolution simulations, a bilinear interpolator, and the state-of-the-art CNN-based super-resolution (SR) technique. Results show that the SR technique produces results similar to those of the bilinear interpolator, with smoother spatial and temporal distributions and smaller data variabilities and extremes than the high-resolution simulations. While the new CNNs trained by MSE generate better results over some regions than the interpolator and SR technique do, their predictions are still not as close to the ground truth. The CNNs trained by CGAN generate more realistic and physically reasonable results, better capturing not only data variability in time and space but also extremes such as intense and long-lasting storms. The newly proposed CNN-based downscaling approach can downscale precipitation from 50 km to 12 km in 14 min for 30 years once the network is trained (training takes 4 hours using 1 GPU), while conventional dynamical downscaling would take 1 month using 600 CPU cores to generate simulations at a resolution of 12 km over the contiguous United States.
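For readers unfamiliar with the bilinear interpolator used as a baseline above, the following is a minimal pure-Python sketch of bilinear resampling of a coarse grid onto a finer one. It is a generic textbook version, not the study's code: the function name `bilinear_resize` and the corner-aligned sampling convention are assumptions made for illustration.

```python
from typing import List

Grid = List[List[float]]


def bilinear_resize(grid: Grid, out_h: int, out_w: int) -> Grid:
    """Resample a 2-D grid to (out_h, out_w) with bilinear interpolation.

    Assumes a corner-aligned convention: the corner cells of the output
    coincide with the corner cells of the input.
    """
    in_h, in_w = len(grid), len(grid[0])
    out: Grid = []
    for i in range(out_h):
        # Fractional source row for output row i.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row: List[float] = []
        for j in range(out_w):
            # Fractional source column for output column j.
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Interpolate along x on the two bracketing rows, then along y.
            top = (1 - wx) * grid[y0][x0] + wx * grid[y0][x1]
            bottom = (1 - wx) * grid[y1][x0] + wx * grid[y1][x1]
            row.append((1 - wy) * top + wy * bottom)
        out.append(row)
    return out


coarse = [[0.0, 2.0], [4.0, 6.0]]
fine = bilinear_resize(coarse, 3, 3)
print(fine)
```

Because each output value is a weighted average of its four nearest coarse cells, the result can never exceed the coarse-grid extremes, which is consistent with the abstract's remark that interpolation yields smoother fields with smaller extremes.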


Author(s):  
Zheng Wang ◽  
Mang Ye ◽  
Fan Yang ◽  
Xiang Bai ◽  
Shin'ichi Satoh

Person re-identification (REID) is an important task in video surveillance and forensics applications. Most previous approaches rest on the key assumption that all person images have uniform and sufficiently high resolutions. In practice, varying low resolutions and scale mismatches always exist in open-world REID. We name this problem Scale-Adaptive Low Resolution Person Re-identification (SALR-REID). The most intuitive way to address it is to bring the various low resolutions (not only low, but also at different scales) up to a uniform high resolution. SR-GAN is one of the most competitive image super-resolution deep networks, but it is designed with a fixed upscaling factor. It is therefore not suitable for the SALR-REID task, which requires a network that not only synthesizes high-resolution images with different upscaling factors but also extracts discriminative image features for judging a person's identity. (1) To promote the ability of scale-adaptive upscaling, we cascade multiple SR-GANs in series. (2) To supplement the ability of image feature representation, we plug in a re-identification network. With a unified formulation, a Cascaded Super-Resolution GAN (CSR-GAN) framework is proposed. Extensive evaluations on two simulated datasets and one public dataset demonstrate the advantages of our method over related state-of-the-art methods.
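The idea of obtaining different upscaling factors by chaining fixed-factor stages can be illustrated with a small sketch. Everything here is an assumption for illustration: the abstract does not state the per-stage factor, so a factor of 2 is assumed, and `upscale_x2` is a nearest-neighbor placeholder standing in for one trained super-resolution stage.

```python
from typing import Callable, List

Image = List[List[float]]


def upscale_x2(img: Image) -> Image:
    """Placeholder for one fixed-factor SR stage (assumed 2x, nearest neighbor)."""
    out: Image = []
    for row in img:
        wide = [v for v in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))  # duplicate each row
    return out


def cascade(img: Image, stages: int,
            stage: Callable[[Image], Image] = upscale_x2) -> Image:
    """Apply the same fixed-factor stage `stages` times in series."""
    for _ in range(stages):
        img = stage(img)
    return img


def stages_for(height: int, target_height: int) -> int:
    """Smallest number of 2x stages whose output reaches target_height."""
    n = 0
    while height < target_height:
        height *= 2
        n += 1
    return n


tiny = [[1.0, 2.0]]                    # 1 x 2 input
result = cascade(tiny, stages_for(1, 4))
print(len(result), len(result[0]))     # output height and width
```

With a fixed 2x stage, n stages give an overall factor of 2^n, so inputs at different low resolutions can be routed through different numbers of stages to approach a common target size.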

