Multi-task Learning-based All-in-one Collaboration Framework for Degraded Image Super-resolution

Author(s):  
Xin Jin ◽  
Jianfeng Xu ◽  
Kazuyuki Tasaka ◽  
Zhibo Chen

In this article, we address the degraded image super-resolution problem in a multi-task learning (MTL) manner. To better share representations between multiple tasks, we propose an all-in-one collaboration framework (ACF) with a learnable “junction” unit to handle two major problems that exist in MTL—“How to share” and “How much to share.” Specifically, ACF consists of a sharing phase and a reconstruction phase. Considering the intrinsic characteristics of multiple image degradations, we propose to first deal with compression artifacts, motion blur, and the spatial structure information of the input image in parallel under a three-branch architecture in the sharing phase. Subsequently, in the reconstruction phase, we up-sample the previous features for high-resolution image reconstruction with a channel-wise and spatial attention mechanism. To coordinate the two phases, we introduce a learnable “junction” unit with a dual-voting mechanism to selectively filter or preserve shared feature representations coming from the sharing phase, learning an optimal combination for the subsequent reconstruction phase. Finally, a curriculum learning-based training scheme is further proposed to improve the convergence of the whole framework. Extensive experimental results on synthetic and real-world low-resolution images show that the proposed all-in-one collaboration framework not only produces favorable high-resolution results while removing serious degradations, but also achieves high computational efficiency, outperforming state-of-the-art methods. We have also applied ACF to image-quality-sensitive practical tasks, such as pose estimation, to improve estimation accuracy on low-resolution images.
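The channel-wise and spatial attention used in the reconstruction phase can be sketched roughly as follows. This is a minimal NumPy illustration under simplifying assumptions: the gating weights here are derived directly from feature statistics, whereas the actual ACF module is learned.

```python
import numpy as np

def channel_spatial_attention(feat):
    """Reweight a feature map (C, H, W) with channel-wise, then spatial gating.

    A hedged sketch of the attention idea only: the real module learns its
    gates, while here they come straight from the feature statistics.
    """
    # Channel attention: squeeze spatial dims, gate each channel.
    channel_desc = feat.mean(axis=(1, 2))                 # (C,)
    channel_gate = 1.0 / (1.0 + np.exp(-channel_desc))    # sigmoid
    feat = feat * channel_gate[:, None, None]
    # Spatial attention: squeeze channels, gate each spatial location.
    spatial_desc = feat.mean(axis=0)                      # (H, W)
    spatial_gate = 1.0 / (1.0 + np.exp(-spatial_desc))
    return feat * spatial_gate[None, :, :]

feat = np.random.randn(8, 4, 4)
out = channel_spatial_attention(feat)
```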

Author(s):  
Dong Seon Cheng ◽  
Marco Cristani ◽  
Vittorio Murino

Image super-resolution is one of the most appealing applications of image processing, capable of retrieving a high-resolution image by fusing several registered low-resolution images depicting an object of interest. However, employing super-resolution on video data is challenging: a video sequence generally contains a lot of scattered information regarding several objects of interest in cluttered scenes. Especially with hand-held cameras, the overall quality may be poor due to low resolution or unsteadiness. The objective of this chapter is to demonstrate why standard image super-resolution fails on video data, which problems arise, and how these problems can be overcome. In our first contribution, we propose a novel Bayesian framework for super-resolution of persistent objects of interest in video sequences. We call this process Distillation. In the traditional formulation of the image super-resolution problem, the observed target is (1) always the same, (2) acquired using a camera making small movements, and (3) found in a number of low-resolution images sufficient to recover high-frequency information. These assumptions are usually unsatisfied in real-world video acquisitions and often beyond the control of the video operator. With Distillation, we aim to extend and generalize the image super-resolution task, embedding it in a structured framework that accurately distills all the informative bits of an object of interest. In practice, the Distillation process: i) identifies, in a semi-supervised way, a set of objects of interest, clustering the related video frames and registering them with respect to global rigid transformations; ii) for each one, produces a high-resolution image by weighting each pixel according to the information retrieved about the object of interest. As a second contribution, we extend the Distillation process to deal with objects of interest whose transformations in appearance are not (only) rigid.
This second process, built on top of Distillation, is hierarchical, in the sense that clustering is applied recursively, beginning with the analysis of whole frames and selectively focusing on smaller sub-regions whose isolated motion can reasonably be assumed rigid. The ultimate product of the overall process is a strip of images that describes at high resolution the dynamics of the video, switching between alternative local descriptions in response to visual changes. Our approach is first tested on synthetic data, obtaining encouraging comparative results with respect to known super-resolution techniques and good robustness against noise. Second, real data coming from different videos are considered, aiming to resolve the major details of the objects in motion.
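The per-pixel weighting at the heart of step ii) can be illustrated with a small sketch. The frames and weights below are hypothetical stand-ins for the registered observations and per-pixel information scores that the chapter's Bayesian framework actually computes:

```python
import numpy as np

def distill(registered_frames, weights):
    """Fuse registered observations (N, H, W) into one estimate by weighting
    each pixel according to how informative it is (weights: N, H, W)."""
    num = (weights * registered_frames).sum(axis=0)
    den = weights.sum(axis=0)
    return num / np.maximum(den, 1e-12)  # guard against zero total weight

# Three toy registered frames with constant values, the third trusted twice as much.
frames = np.stack([np.full((3, 3), v) for v in (1.0, 2.0, 3.0)])
w = np.stack([np.full((3, 3), v) for v in (1.0, 1.0, 2.0)])
fused = distill(frames, w)  # weighted mean per pixel: (1 + 2 + 2*3) / 4 = 2.25
```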


2019 ◽  
Vol 78 ◽  
pp. 236-245 ◽  
Author(s):  
Dewan Fahim Noor ◽  
Yue Li ◽  
Zhu Li ◽  
Shuvra Bhattacharyya ◽  
George York

Author(s):  
Guoqing Zhang ◽  
Yuhao Chen ◽  
Weisi Lin ◽  
Arun Chandran ◽  
Xuan Jing

As a prevailing task in the video surveillance and forensics field, person re-identification (re-ID) aims to match person images captured by non-overlapping cameras. In unconstrained scenarios, person images often suffer from the resolution mismatch problem, i.e., Cross-Resolution Person Re-ID. To overcome this problem, most existing methods restore low-resolution (LR) images to high resolution (HR) by super-resolution (SR). However, they focus only on HR feature extraction and ignore the useful information contained in the original LR images. In this work, we explore the influence of resolution on feature extraction and develop a novel method for cross-resolution person re-ID called Multi-Resolution Representations Joint Learning (MRJL). Our method consists of a Resolution Reconstruction Network (RRN) and a Dual Feature Fusion Network (DFFN). The RRN uses an input image to construct an HR version and an LR version with an encoder and two decoders, while the DFFN adopts a dual-branch structure to generate person representations from multi-resolution images. Comprehensive experiments on five benchmarks verify the superiority of the proposed MRJL over the relevant state-of-the-art methods.
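The dual-branch idea behind the DFFN, keeping information from both resolutions rather than only the HR branch, can be sketched as follows. The branch features and the fixed linear fusion below are illustrative assumptions, since the real network learns both branches and the fusion end-to-end:

```python
import numpy as np

rng = np.random.default_rng(0)

def dual_feature_fusion(hr_feat, lr_feat, proj):
    """Fuse HR-branch and LR-branch descriptors into one person embedding by
    concatenation followed by a linear projection (hypothetical stand-in for
    the learned fusion)."""
    joint = np.concatenate([hr_feat, lr_feat])  # keep info from both resolutions
    return proj @ joint                          # project to the final embedding

hr_feat = rng.standard_normal(128)   # feature from the reconstructed HR branch
lr_feat = rng.standard_normal(128)   # feature from the original LR branch
proj = rng.standard_normal((64, 256))
embedding = dual_feature_fusion(hr_feat, lr_feat, proj)
```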


Author(s):  
Zheng Wang ◽  
Mang Ye ◽  
Fan Yang ◽  
Xiang Bai ◽  
Shin'ichi Satoh

Person re-identification (REID) is an important task in video surveillance and forensics applications. Most previous approaches are based on the key assumption that all person images have uniform and sufficiently high resolutions. In reality, various low resolutions and scale mismatches always exist in open-world REID. We name this kind of problem Scale-Adaptive Low Resolution Person Re-identification (SALR-REID). The most intuitive way to address it is to increase the various low resolutions (not only low, but also of different scales) to a uniform high resolution. SRGAN is one of the most competitive image super-resolution deep networks, designed with a fixed upscaling factor. However, it is still not suitable for the SALR-REID task, which requires a network that not only synthesizes high-resolution images with different upscaling factors, but also extracts discriminative image features for judging a person’s identity. (1) To promote the ability of scale-adaptive upscaling, we cascade multiple SRGANs in series. (2) To supplement the ability of image feature representation, we plug in a re-identification network. With a unified formulation, a Cascaded Super-Resolution GAN (CSR-GAN) framework is proposed. Extensive evaluations on two simulated datasets and one public dataset demonstrate the advantages of our method over related state-of-the-art methods.
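Cascading fixed-factor upscalers to obtain scale-adaptive upscaling can be sketched as follows; nearest-neighbour interpolation stands in for a learned SRGAN stage:

```python
import numpy as np

def upscale2x(img):
    """Stand-in for one SRGAN stage with a fixed x2 factor (nearest-neighbour
    here; the real stage is a learned generator)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def cascaded_upscale(img, target_side):
    """Apply x2 stages in series until the image reaches at least
    `target_side`, mimicking how cascading fixed-factor networks yields
    scale-adaptive upscaling for inputs of different sizes."""
    while img.shape[0] < target_side:
        img = upscale2x(img)
    return img

lr = np.arange(16.0).reshape(4, 4)
hr = cascaded_upscale(lr, 16)   # two x2 stages in series: 4 -> 8 -> 16
```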


2018 ◽  
Vol 10 (10) ◽  
pp. 1574 ◽  
Author(s):  
Dongsheng Gao ◽  
Zhentao Hu ◽  
Renzhen Ye

Due to sensor limitations, hyperspectral images (HSIs) are acquired by hyperspectral sensors with high-spectral-resolution but low-spatial-resolution. It is difficult for sensors to acquire images with high-spatial-resolution and high-spectral-resolution simultaneously. Hyperspectral image super-resolution tries to enhance the spatial resolution of HSI by software techniques. In recent years, various methods have been proposed to fuse HSI and multispectral image (MSI) from an unmixing or a spectral dictionary perspective. However, these methods extract the spectral information from each image individually, and therefore ignore the cross-correlation between the observed HSI and MSI. It is difficult to achieve high-spatial-resolution while preserving the spatial-spectral consistency between low-resolution HSI and high-resolution HSI. In this paper, a self-dictionary regression based method is proposed to utilize cross-correlation between the observed HSI and MSI. Both the observed low-resolution HSI and MSI are simultaneously considered to estimate the endmember dictionary and the abundance code. To preserve the spectral consistency, the endmember dictionary is extracted by performing a common sparse basis selection on the concatenation of observed HSI and MSI. Then, a consistent constraint is exploited to ensure the spatial consistency between the abundance code of low-resolution HSI and the abundance code of high-resolution HSI. Extensive experiments on three datasets demonstrate that the proposed method outperforms the state-of-the-art methods.
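The abundance-estimation step, fitting nonnegative codes to a given endmember dictionary, can be sketched with NMF-style multiplicative updates. This is a stand-in under simplifying assumptions: the paper's common sparse basis selection and spatial-consistency constraint are not reproduced here.

```python
import numpy as np

def estimate_abundances(X, E, iters=500):
    """Nonnegative fit of abundances A in X ~ E @ A via multiplicative
    updates (Lee-Seung style), given a fixed endmember dictionary E."""
    rng = np.random.default_rng(0)
    A = rng.random((E.shape[1], X.shape[1]))          # positive init
    for _ in range(iters):
        A *= (E.T @ X) / np.maximum(E.T @ E @ A, 1e-12)
    return A

# Toy scene: 2 endmember spectra over 3 bands, 3 pixels mixing them.
E = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
A_true = np.array([[0.7, 0.2, 0.5],
                   [0.3, 0.8, 0.5]])
X = E @ A_true
A_est = estimate_abundances(X, E)
```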


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2164
Author(s):  
Md. Shahinur Alam ◽  
Ki-Chul Kwon ◽  
Munkh-Uchral Erdenebat ◽  
Mohammed Y. Abbass ◽  
Md. Ashraful Alam ◽  
...  

The integral imaging microscopy system provides three-dimensional visualization of a microscopic object. However, it suffers from low resolution due to the fundamental F-number (aperture stop) limitation imposed by the micro lens array (MLA) and a poor illumination environment. In this paper, a generative adversarial network (GAN)-based super-resolution algorithm is proposed to enhance the resolution, where the directional view image is directly fed as input. In a GAN, the generator regresses the high-resolution output from the low-resolution input image, whereas the discriminator distinguishes between original and generated images. In the generator, we use consecutive residual blocks with a content loss to retrieve the photo-realistic original image. The model can restore edges and enhance the resolution by ×2, ×4, and even ×8 without seriously hampering image quality. The model is tested with a variety of low-resolution microscopic sample images and successfully generates high-resolution directional view images with better illumination. Quantitative analysis shows that the proposed model performs better for microscopic images than the existing algorithms.
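The generator's core ingredients, residual blocks and a content loss, can be sketched as follows. Per-pixel channel mixing stands in for the real convolutions, and pixel-space MSE stands in for the feature-space content loss:

```python
import numpy as np

def residual_block(x, W1, W2):
    """One residual block on a (C, H, W) feature map: the branch learns only
    the missing detail while the skip connection preserves the input, which
    is why stacking such blocks suits super-resolution."""
    h = np.maximum(0.0, np.einsum('oc,chw->ohw', W1, x))  # 1x1 mixing + ReLU
    return x + np.einsum('oc,chw->ohw', W2, h)             # skip connection

def content_loss(generated, target):
    """Pixel-space MSE, a simplified stand-in for the content loss that keeps
    the output close to the photo-realistic original."""
    return float(np.mean((generated - target) ** 2))

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 5, 5))
W1 = rng.standard_normal((8, 8)) * 0.1
W2 = rng.standard_normal((8, 8)) * 0.1
y = residual_block(x, W1, W2)
loss = content_loss(y, x)
```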


Author(s):  
Vikas Kumar ◽  
Tanupriya Choudhury ◽  
Suresh Chandra Satapathy ◽  
Ravi Tomar ◽  
Archit Aggarwal

Recently, huge progress has been achieved in the field of single-image super-resolution, which augments the resolution of images. The idea behind super-resolution is to convert low-resolution images into high-resolution images. SRCNN (Super-Resolution Convolutional Neural Network) was a huge improvement over the existing methods of single-image super-resolution. However, video super-resolution, despite being an active field of research, is yet to benefit fully from deep learning. Using still images and videos downloaded from various sources, we explore the possibility of using SRCNN along with image fusion techniques (minima, maxima, average, PCA, DWT) to improve over existing video super-resolution methods. Video super-resolution has inherent difficulties such as unexpected motion, blur, and noise. We propose the Video Super Resolution – Image Fusion (VSR-IF) architecture, which utilizes information from multiple frames to produce a single high-resolution frame for a video. We use SRCNN as a reference model to obtain high-resolution adjacent frames and use a concatenation layer to group those frames into a single frame. Since our method is data-driven and requires only minimal initial training, it is faster than other video super-resolution methods. After testing our program, we find that our technique shows a significant improvement over SRCNN and other single-image and frame super-resolution techniques.
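The listed pixel-wise fusion rules can be sketched directly; PCA fusion here weights frames by the leading principal component of their covariance, and DWT fusion is omitted since it needs a wavelet library:

```python
import numpy as np

def fuse(frames, method="average"):
    """Fuse N aligned frames (N, H, W) into one image using the simple
    pixel-wise rules named in the text."""
    if method == "minima":
        return frames.min(axis=0)
    if method == "maxima":
        return frames.max(axis=0)
    if method == "average":
        return frames.mean(axis=0)
    if method == "pca":
        flat = frames.reshape(frames.shape[0], -1)    # one row per frame
        w = np.linalg.eigh(np.cov(flat))[1][:, -1]    # leading eigenvector
        w = np.abs(w) / np.abs(w).sum()               # nonnegative weights
        return np.tensordot(w, frames, axes=1)
    raise ValueError(method)

frames = np.stack([np.full((2, 2), 1.0), np.full((2, 2), 3.0)])
avg = fuse(frames, "average")   # 2.0 everywhere
lo = fuse(frames, "minima")     # 1.0 everywhere
```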


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 7903
Author(s):  
Muhammad Hassan Maqsood ◽  
Rafia Mumtaz ◽  
Ihsan Ul Haq ◽  
Uferah Shafi ◽  
Syed Mohammad Hassan Zaidi ◽  
...  

Wheat yellow rust is a common agricultural disease that affects the crop every year across the world. The disease not only negatively impacts the quality of the yield but the quantity as well, which results in an adverse impact on the economy and food supply. It is highly desirable to develop methods for fast and accurate detection of yellow rust in wheat crops; however, high-resolution images are not always available, which hinders the ability of trained models in detection tasks. The approach presented in this study harnesses the power of super-resolution generative adversarial networks (SRGAN) for upsampling the images before using them to train deep learning models for the detection of wheat yellow rust. After preprocessing the data for noise removal, SRGANs are used to upsample the images and increase their resolution, which helps the convolutional neural network (CNN) learn high-quality features during training. This study empirically shows that SRGANs can be used effectively to improve the quality of images and produce significantly better results when compared with models trained using low-resolution images. This is evident from the results obtained on upsampled images, i.e., 83% overall test accuracy, which is substantially better than the overall test accuracy achieved for low-resolution images, i.e., 75%. The proposed approach can be used in other real-world scenarios where images are of low resolution due to the unavailability of high-resolution cameras in edge devices.


2014 ◽  
Vol 568-570 ◽  
pp. 652-655 ◽  
Author(s):  
Zhao Li ◽  
Le Wang ◽  
Tao Yu ◽  
Bing Liang Hu

This paper presents a novel method for solving single-image super-resolution problems, based upon low-rank representation (LRR). Given a set of low-resolution image patches, LRR seeks the lowest-rank representation among all the candidates that represent all patches as linear combinations of the patches in a low-resolution dictionary. By jointly training two dictionaries for the low-resolution and high-resolution images, we can enforce the similarity of LRRs between the low-resolution and high-resolution image pair with respect to their own dictionaries. Therefore, the LRR of a low-resolution image can be applied with the high-resolution dictionary to generate a high-resolution image. Unlike the well-known sparse representation, which computes the sparsest representation of each image patch individually, LRR aims at finding the lowest-rank representation of a collection of patches jointly, and thus better captures the global structure of the image. Experiments show that our method gives good results both visually and quantitatively.
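The contrast between joint low-rank coding and patch-by-patch sparse coding can be illustrated with a toy example. The pseudoinverse solution below is only one convenient feasible representation, not the nuclear-norm minimizer the paper actually solves for; it comes out low-rank here because the patches are constructed to share a common subspace, the kind of global structure LRR targets:

```python
import numpy as np

rng = np.random.default_rng(2)

# Dictionary of 20 low-resolution atoms (patch size 16), and 30 patches that
# all live in a 3-dimensional subspace of the dictionary's span.
D = rng.standard_normal((16, 20))
basis = rng.standard_normal((20, 3))
X = D @ (basis @ rng.standard_normal((3, 30)))

# Represent ALL patches jointly: a minimum-norm solution of X = D @ Z.
# Coding each patch independently would never expose this shared structure.
Z = np.linalg.pinv(D) @ X
joint_rank = np.linalg.matrix_rank(Z)   # the shared 3-dim structure is exposed
```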


2021 ◽  
Vol 13 (12) ◽  
pp. 2308
Author(s):  
Masoomeh Aslahishahri ◽  
Kevin G. Stanley ◽  
Hema Duddu ◽  
Steve Shirtliffe ◽  
Sally Vail ◽  
...  

Unmanned aerial vehicle (UAV) imaging is a promising data acquisition technique for image-based plant phenotyping. However, UAV images have a lower spatial resolution than those from similarly equipped in-field ground-based vehicle systems, such as carts, because of their distance from the crop canopy, which can be particularly problematic for measuring small plant features. In this study, the performance of three deep learning-based super-resolution models, employed as a pre-processing tool to enhance the spatial resolution of low-resolution images of three different kinds of crops, was evaluated. To train the super-resolution models, aerial images were collected with two separate sensors co-mounted on a UAV flown over lentil, wheat, and canola breeding trials. A software workflow was created to pre-process and align real-world low-resolution and high-resolution images and use them as inputs and targets for training the super-resolution models. To demonstrate the effectiveness of real-world images, three different experiments were conducted, employing synthetic images, manually downsampled high-resolution images, or real-world low-resolution images as input to the models. The results demonstrate that models trained with synthetic images cannot generalize to real-world images and fail to produce outputs comparable with the targets. However, the same models trained with real-world datasets can reconstruct higher-fidelity outputs, which are better suited for measuring plant phenotypes.
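The "manually downsampled" setting can be sketched as simple block-averaging of a high-resolution image. Real UAV low-resolution images additionally carry sensor noise and optical blur that this clean degradation misses, which is one reason models trained only on such synthetic pairs fail to generalize:

```python
import numpy as np

def synthetic_lr(hr, factor=4):
    """Create a synthetic low-resolution image by block-averaging a
    high-resolution one (assumes both sides divide evenly by `factor`)."""
    H, W = hr.shape
    # Split each axis into (blocks, pixels-per-block) and average the blocks.
    return hr.reshape(H // factor, factor, W // factor, factor).mean(axis=(1, 3))

hr = np.arange(64.0).reshape(8, 8)
lr = synthetic_lr(hr, factor=4)   # (2, 2) image of 4x4 block means
```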

