Multi-task Learning-based All-in-one Collaboration Framework for Degraded Image Super-resolution

Author(s):  
Xin Jin ◽  
Jianfeng Xu ◽  
Kazuyuki Tasaka ◽  
Zhibo Chen

In this article, we address the degraded image super-resolution problem in a multi-task learning (MTL) manner. To better share representations between multiple tasks, we propose an all-in-one collaboration framework (ACF) with a learnable “junction” unit to handle two major problems that exist in MTL—“How to share” and “How much to share.” Specifically, ACF consists of a sharing phase and a reconstruction phase. Considering the intrinsic characteristics of multiple image degradations, we propose to first deal with compression artifacts, motion blur, and the spatial structure information of the input image in parallel under a three-branch architecture in the sharing phase. Subsequently, in the reconstruction phase, we up-sample the previous features for high-resolution image reconstruction with a channel-wise and spatial attention mechanism. To coordinate the two phases, we introduce a learnable “junction” unit with a dual-voting mechanism to selectively filter or preserve shared feature representations coming from the sharing phase, learning an optimal combination for the subsequent reconstruction phase. Finally, a curriculum learning-based training scheme is further proposed to improve the convergence of the whole framework. Extensive experimental results on synthetic and real-world low-resolution images show that the proposed all-in-one collaboration framework not only produces favorable high-resolution results while removing serious degradations, but also achieves high computational efficiency, outperforming state-of-the-art methods. We have also applied ACF to image-quality-sensitive practical tasks, such as pose estimation, to improve estimation accuracy on low-resolution images.
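The channel-wise and spatial attention used in the reconstruction phase can be sketched roughly as follows. This is a minimal NumPy illustration under simplifying assumptions: the gating weights here are derived directly from feature statistics, whereas the actual ACF module is learned.

```python
import numpy as np

def channel_spatial_attention(feat):
    """Reweight a feature map (C, H, W) with channel-wise, then spatial gating.

    A hedged sketch of the attention idea only: the real module learns its
    gates, while here they come straight from the feature statistics.
    """
    # Channel attention: squeeze spatial dims, gate each channel.
    channel_desc = feat.mean(axis=(1, 2))                 # (C,)
    channel_gate = 1.0 / (1.0 + np.exp(-channel_desc))    # sigmoid
    feat = feat * channel_gate[:, None, None]
    # Spatial attention: squeeze channels, gate each spatial location.
    spatial_desc = feat.mean(axis=0)                      # (H, W)
    spatial_gate = 1.0 / (1.0 + np.exp(-spatial_desc))
    return feat * spatial_gate[None, :, :]

feat = np.random.randn(8, 4, 4)
out = channel_spatial_attention(feat)
```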

Author(s):  
Dong Seon Cheng ◽  
Marco Cristani ◽  
Vittorio Murino

Image super-resolution is one of the most appealing applications of image processing, capable of retrieving a high-resolution image by fusing several registered low-resolution images depicting an object of interest. However, employing super-resolution on video data is challenging: a video sequence generally contains a lot of scattered information regarding several objects of interest in cluttered scenes. Especially with hand-held cameras, the overall quality may be poor due to low resolution or unsteadiness. The objective of this chapter is to demonstrate why standard image super-resolution fails on video data, which problems arise, and how these problems can be overcome. In our first contribution, we propose a novel Bayesian framework for super-resolution of persistent objects of interest in video sequences. We call this process Distillation. In the traditional formulation of the image super-resolution problem, the observed target is (1) always the same, (2) acquired using a camera making small movements, and (3) found in a number of low-resolution images sufficient to recover high-frequency information. These assumptions are usually unsatisfied in real-world video acquisitions and often beyond the control of the video operator. With Distillation, we aim to extend and generalize the image super-resolution task, embedding it in a structured framework that accurately distills all the informative bits of an object of interest. In practice, the Distillation process: i) identifies, in a semi-supervised way, a set of objects of interest, clustering the related video frames and registering them with respect to global rigid transformations; ii) for each one, produces a high-resolution image by weighting each pixel according to the information retrieved about the object of interest. As a second contribution, we extend the Distillation process to deal with objects of interest whose transformations in appearance are not (only) rigid.
This second process, built on top of Distillation, is hierarchical, in the sense that clustering is applied recursively, beginning with the analysis of whole frames and selectively focusing on smaller sub-regions whose isolated motion can reasonably be assumed rigid. The ultimate product of the overall process is a strip of images that describes at high resolution the dynamics of the video, switching between alternative local descriptions in response to visual changes. Our approach is first tested on synthetic data, obtaining encouraging comparative results with respect to known super-resolution techniques and good robustness against noise. Second, real data coming from different videos are considered, aiming to resolve the major details of the objects in motion.
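The per-pixel weighting at the heart of step ii) can be illustrated with a small sketch. The frames and weights below are hypothetical stand-ins for the registered observations and per-pixel information scores that the chapter's Bayesian framework actually computes:

```python
import numpy as np

def distill(registered_frames, weights):
    """Fuse registered observations (N, H, W) into one estimate by weighting
    each pixel according to how informative it is (weights: N, H, W)."""
    num = (weights * registered_frames).sum(axis=0)
    den = weights.sum(axis=0)
    return num / np.maximum(den, 1e-12)  # guard against zero total weight

# Three toy registered frames with constant values, the third trusted twice as much.
frames = np.stack([np.full((3, 3), v) for v in (1.0, 2.0, 3.0)])
w = np.stack([np.full((3, 3), v) for v in (1.0, 1.0, 2.0)])
fused = distill(frames, w)  # weighted mean per pixel: (1 + 2 + 2*3) / 4 = 2.25
```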


2019 ◽  
Vol 78 ◽  
pp. 236-245 ◽  
Author(s):  
Dewan Fahim Noor ◽  
Yue Li ◽  
Zhu Li ◽  
Shuvra Bhattacharyya ◽  
George York

Author(s):  
Guoqing Zhang ◽  
Yuhao Chen ◽  
Weisi Lin ◽  
Arun Chandran ◽  
Xuan Jing

As a prevailing task in the video surveillance and forensics field, person re-identification (re-ID) aims to match person images captured by non-overlapping cameras. In unconstrained scenarios, person images often suffer from the resolution mismatch problem, i.e., Cross-Resolution Person Re-ID. To overcome this problem, most existing methods restore low-resolution (LR) images to high resolution (HR) by super-resolution (SR). However, they focus only on HR feature extraction and ignore the useful information contained in the original LR images. In this work, we explore the influence of resolution on feature extraction and develop a novel method for cross-resolution person re-ID called Multi-Resolution Representations Joint Learning (MRJL). Our method consists of a Resolution Reconstruction Network (RRN) and a Dual Feature Fusion Network (DFFN). The RRN uses an input image to construct an HR version and an LR version with an encoder and two decoders, while the DFFN adopts a dual-branch structure to generate person representations from multi-resolution images. Comprehensive experiments on five benchmarks verify the superiority of the proposed MRJL over the relevant state-of-the-art methods.
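The dual-branch idea behind the DFFN, keeping information from both resolutions rather than only the HR branch, can be sketched as follows. The branch features and the fixed linear fusion below are illustrative assumptions, since the real network learns both branches and the fusion end-to-end:

```python
import numpy as np

rng = np.random.default_rng(0)

def dual_feature_fusion(hr_feat, lr_feat, proj):
    """Fuse HR-branch and LR-branch descriptors into one person embedding by
    concatenation followed by a linear projection (hypothetical stand-in for
    the learned fusion)."""
    joint = np.concatenate([hr_feat, lr_feat])  # keep info from both resolutions
    return proj @ joint                          # project to the final embedding

hr_feat = rng.standard_normal(128)   # feature from the reconstructed HR branch
lr_feat = rng.standard_normal(128)   # feature from the original LR branch
proj = rng.standard_normal((64, 256))
embedding = dual_feature_fusion(hr_feat, lr_feat, proj)
```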


Author(s):  
Zheng Wang ◽  
Mang Ye ◽  
Fan Yang ◽  
Xiang Bai ◽  
Shin'ichi Satoh

Person re-identification (REID) is an important task in video surveillance and forensics applications. Most previous approaches are based on the key assumption that all person images have uniform and sufficiently high resolutions. In reality, various low resolutions and scale mismatches always exist in open-world REID. We name this kind of problem Scale-Adaptive Low Resolution Person Re-identification (SALR-REID). The most intuitive way to address it is to increase the various low resolutions (not only low, but also of different scales) to a uniform high resolution. SRGAN is one of the most competitive image super-resolution deep networks, designed with a fixed upscaling factor. However, it is still not suitable for the SALR-REID task, which requires a network that not only synthesizes high-resolution images with different upscaling factors, but also extracts discriminative image features for judging a person’s identity. (1) To promote the ability of scale-adaptive upscaling, we cascade multiple SRGANs in series. (2) To supplement the ability of image feature representation, we plug in a re-identification network. With a unified formulation, a Cascaded Super-Resolution GAN (CSR-GAN) framework is proposed. Extensive evaluations on two simulated datasets and one public dataset demonstrate the advantages of our method over related state-of-the-art methods.
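Cascading fixed-factor upscalers to obtain scale-adaptive upscaling can be sketched as follows; nearest-neighbour interpolation stands in for a learned SRGAN stage:

```python
import numpy as np

def upscale2x(img):
    """Stand-in for one SRGAN stage with a fixed x2 factor (nearest-neighbour
    here; the real stage is a learned generator)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def cascaded_upscale(img, target_side):
    """Apply x2 stages in series until the image reaches at least
    `target_side`, mimicking how cascading fixed-factor networks yields
    scale-adaptive upscaling for inputs of different sizes."""
    while img.shape[0] < target_side:
        img = upscale2x(img)
    return img

lr = np.arange(16.0).reshape(4, 4)
hr = cascaded_upscale(lr, 16)   # two x2 stages in series: 4 -> 8 -> 16
```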


2018 ◽  
Vol 10 (10) ◽  
pp. 1574 ◽  
Author(s):  
Dongsheng Gao ◽  
Zhentao Hu ◽  
Renzhen Ye

Due to sensor limitations, hyperspectral images (HSIs) are acquired by hyperspectral sensors with high-spectral-resolution but low-spatial-resolution. It is difficult for sensors to acquire images with high-spatial-resolution and high-spectral-resolution simultaneously. Hyperspectral image super-resolution tries to enhance the spatial resolution of HSI by software techniques. In recent years, various methods have been proposed to fuse HSI and multispectral image (MSI) from an unmixing or a spectral dictionary perspective. However, these methods extract the spectral information from each image individually, and therefore ignore the cross-correlation between the observed HSI and MSI. It is difficult to achieve high-spatial-resolution while preserving the spatial-spectral consistency between low-resolution HSI and high-resolution HSI. In this paper, a self-dictionary regression based method is proposed to utilize cross-correlation between the observed HSI and MSI. Both the observed low-resolution HSI and MSI are simultaneously considered to estimate the endmember dictionary and the abundance code. To preserve the spectral consistency, the endmember dictionary is extracted by performing a common sparse basis selection on the concatenation of observed HSI and MSI. Then, a consistent constraint is exploited to ensure the spatial consistency between the abundance code of low-resolution HSI and the abundance code of high-resolution HSI. Extensive experiments on three datasets demonstrate that the proposed method outperforms the state-of-the-art methods.
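The abundance-estimation step, fitting nonnegative codes to a given endmember dictionary, can be sketched with NMF-style multiplicative updates. This is a stand-in under simplifying assumptions: the paper's common sparse basis selection and spatial-consistency constraint are not reproduced here.

```python
import numpy as np

def estimate_abundances(X, E, iters=500):
    """Nonnegative fit of abundances A in X ~ E @ A via multiplicative
    updates (Lee-Seung style), given a fixed endmember dictionary E."""
    rng = np.random.default_rng(0)
    A = rng.random((E.shape[1], X.shape[1]))          # positive init
    for _ in range(iters):
        A *= (E.T @ X) / np.maximum(E.T @ E @ A, 1e-12)
    return A

# Toy scene: 2 endmember spectra over 3 bands, 3 pixels mixing them.
E = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
A_true = np.array([[0.7, 0.2, 0.5],
                   [0.3, 0.8, 0.5]])
X = E @ A_true
A_est = estimate_abundances(X, E)
```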


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2164
Author(s):  
Md. Shahinur Alam ◽  
Ki-Chul Kwon ◽  
Munkh-Uchral Erdenebat ◽  
Mohammed Y. Abbass ◽  
Md. Ashraful Alam ◽  
...  

The integral imaging microscopy system provides three-dimensional visualization of a microscopic object. However, it suffers from low resolution due to the fundamental F-number (aperture stop) limitation imposed by the micro lens array (MLA) and a poor illumination environment. In this paper, a generative adversarial network (GAN)-based super-resolution algorithm is proposed to enhance the resolution, where the directional view image is directly fed as input. In a GAN, the generator regresses the high-resolution output from the low-resolution input image, whereas the discriminator distinguishes between original and generated images. In the generator, we use consecutive residual blocks with a content loss to retrieve the photo-realistic original image. The model can restore edges and enhance the resolution by ×2, ×4, and even ×8 without seriously hampering image quality. The model is tested with a variety of low-resolution microscopic sample images and successfully generates high-resolution directional view images with better illumination. Quantitative analysis shows that the proposed model performs better for microscopic images than the existing algorithms.
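The generator's core ingredients, residual blocks and a content loss, can be sketched as follows. Per-pixel channel mixing stands in for the real convolutions, and pixel-space MSE stands in for the feature-space content loss:

```python
import numpy as np

def residual_block(x, W1, W2):
    """One residual block on a (C, H, W) feature map: the branch learns only
    the missing detail while the skip connection preserves the input, which
    is why stacking such blocks suits super-resolution."""
    h = np.maximum(0.0, np.einsum('oc,chw->ohw', W1, x))  # 1x1 mixing + ReLU
    return x + np.einsum('oc,chw->ohw', W2, h)             # skip connection

def content_loss(generated, target):
    """Pixel-space MSE, a simplified stand-in for the content loss that keeps
    the output close to the photo-realistic original."""
    return float(np.mean((generated - target) ** 2))

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 5, 5))
W1 = rng.standard_normal((8, 8)) * 0.1
W2 = rng.standard_normal((8, 8)) * 0.1
y = residual_block(x, W1, W2)
loss = content_loss(y, x)
```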


Author(s):  
Vikas Kumar ◽  
Tanupriya Choudhury ◽  
Suresh Chandra Satapathy ◽  
Ravi Tomar ◽  
Archit Aggarwal

Recently, huge progress has been achieved in the field of single-image super-resolution, which augments the resolution of images. The idea behind super-resolution is to convert low-resolution images into high-resolution images. SRCNN (Super-Resolution Convolutional Neural Network) was a huge improvement over the existing methods of single-image super-resolution. However, video super-resolution, despite being an active field of research, is yet to benefit fully from deep learning. Using still images and videos downloaded from various sources, we explore the possibility of using SRCNN along with image fusion techniques (minima, maxima, average, PCA, DWT) to improve over existing video super-resolution methods. Video super-resolution has inherent difficulties such as unexpected motion, blur, and noise. We propose the Video Super Resolution – Image Fusion (VSR-IF) architecture, which utilizes information from multiple frames to produce a single high-resolution frame for a video. We use SRCNN as a reference model to obtain high-resolution adjacent frames and use a concatenation layer to group those frames into a single frame. Since our method is data-driven and requires only minimal initial training, it is faster than other video super-resolution methods. After testing our program, we find that our technique shows a significant improvement over SRCNN and other single-image and frame super-resolution techniques.
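The listed pixel-wise fusion rules can be sketched directly; PCA fusion here weights frames by the leading principal component of their covariance, and DWT fusion is omitted since it needs a wavelet library:

```python
import numpy as np

def fuse(frames, method="average"):
    """Fuse N aligned frames (N, H, W) into one image using the simple
    pixel-wise rules named in the text."""
    if method == "minima":
        return frames.min(axis=0)
    if method == "maxima":
        return frames.max(axis=0)
    if method == "average":
        return frames.mean(axis=0)
    if method == "pca":
        flat = frames.reshape(frames.shape[0], -1)    # one row per frame
        w = np.linalg.eigh(np.cov(flat))[1][:, -1]    # leading eigenvector
        w = np.abs(w) / np.abs(w).sum()               # nonnegative weights
        return np.tensordot(w, frames, axes=1)
    raise ValueError(method)

frames = np.stack([np.full((2, 2), 1.0), np.full((2, 2), 3.0)])
avg = fuse(frames, "average")   # 2.0 everywhere
lo = fuse(frames, "minima")     # 1.0 everywhere
```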


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 7903
Author(s):  
Muhammad Hassan Maqsood ◽  
Rafia Mumtaz ◽  
Ihsan Ul Haq ◽  
Uferah Shafi ◽  
Syed Mohammad Hassan Zaidi ◽  
...  

Wheat yellow rust is a common agricultural disease that affects the crop every year across the world. The disease not only negatively impacts the quality of the yield but the quantity as well, which results in an adverse impact on the economy and food supply. It is highly desirable to develop methods for fast and accurate detection of yellow rust in wheat crops; however, high-resolution images are not always available, which hinders the ability of trained models in detection tasks. The approach presented in this study harnesses the power of super-resolution generative adversarial networks (SRGAN) for upsampling the images before using them to train deep learning models for the detection of wheat yellow rust. After preprocessing the data for noise removal, SRGANs are used to upsample the images and increase their resolution, which helps the convolutional neural network (CNN) learn high-quality features during training. This study empirically shows that SRGANs can be used effectively to improve the quality of images and produce significantly better results when compared with models trained using low-resolution images. This is evident from the results obtained on upsampled images, i.e., 83% overall test accuracy, which is substantially better than the overall test accuracy achieved for low-resolution images, i.e., 75%. The proposed approach can be used in other real-world scenarios where images are of low resolution due to the unavailability of high-resolution cameras in edge devices.


2014 ◽  
Vol 568-570 ◽  
pp. 652-655 ◽  
Author(s):  
Zhao Li ◽  
Le Wang ◽  
Tao Yu ◽  
Bing Liang Hu

This paper presents a novel method for solving single-image super-resolution problems, based upon low-rank representation (LRR). Given a set of low-resolution image patches, LRR seeks the lowest-rank representation among all the candidates that represent all patches as linear combinations of the patches in a low-resolution dictionary. By jointly training two dictionaries for the low-resolution and high-resolution images, we can enforce the similarity of LRRs between the low-resolution and high-resolution image pair with respect to their own dictionaries. Therefore, the LRR of a low-resolution image can be applied with the high-resolution dictionary to generate a high-resolution image. Unlike the well-known sparse representation, which computes the sparsest representation of each image patch individually, LRR aims at finding the lowest-rank representation of a collection of patches jointly, and thus better captures the global structure of the image. Experiments show that our method gives good results both visually and quantitatively.
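The contrast between joint low-rank coding and patch-by-patch sparse coding can be illustrated with a toy example. The pseudoinverse solution below is only one convenient feasible representation, not the nuclear-norm minimizer the paper actually solves for; it comes out low-rank here because the patches are constructed to share a common subspace, the kind of global structure LRR targets:

```python
import numpy as np

rng = np.random.default_rng(2)

# Dictionary of 20 low-resolution atoms (patch size 16), and 30 patches that
# all live in a 3-dimensional subspace of the dictionary's span.
D = rng.standard_normal((16, 20))
basis = rng.standard_normal((20, 3))
X = D @ (basis @ rng.standard_normal((3, 30)))

# Represent ALL patches jointly: a minimum-norm solution of X = D @ Z.
# Coding each patch independently would never expose this shared structure.
Z = np.linalg.pinv(D) @ X
joint_rank = np.linalg.matrix_rank(Z)   # the shared 3-dim structure is exposed
```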


2021 ◽  
Vol 13 (12) ◽  
pp. 2308
Author(s):  
Masoomeh Aslahishahri ◽  
Kevin G. Stanley ◽  
Hema Duddu ◽  
Steve Shirtliffe ◽  
Sally Vail ◽  
...  

Unmanned aerial vehicle (UAV) imaging is a promising data acquisition technique for image-based plant phenotyping. However, UAV images have a lower spatial resolution than those from similarly equipped in-field ground-based vehicle systems, such as carts, because of their distance from the crop canopy, which can be particularly problematic for measuring small plant features. In this study, the performance of three deep learning-based super-resolution models, employed as a pre-processing tool to enhance the spatial resolution of low-resolution images of three different kinds of crops, was evaluated. To train the super-resolution models, aerial images were collected with two separate sensors co-mounted on a UAV flown over lentil, wheat, and canola breeding trials. A software workflow was created to pre-process and align real-world low-resolution and high-resolution images and use them as inputs and targets for training the super-resolution models. To demonstrate the effectiveness of real-world images, three different experiments were conducted, employing synthetic images, manually downsampled high-resolution images, or real-world low-resolution images as input to the models. The results demonstrate that models trained with synthetic images cannot generalize to real-world images and fail to produce outputs comparable with the targets. However, the same models trained with real-world datasets can reconstruct higher-fidelity outputs, which are better suited for measuring plant phenotypes.
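The "manually downsampled" setting can be sketched as simple block-averaging of a high-resolution image. Real UAV low-resolution images additionally carry sensor noise and optical blur that this clean degradation misses, which is one reason models trained only on such synthetic pairs fail to generalize:

```python
import numpy as np

def synthetic_lr(hr, factor=4):
    """Create a synthetic low-resolution image by block-averaging a
    high-resolution one (assumes both sides divide evenly by `factor`)."""
    H, W = hr.shape
    # Split each axis into (blocks, pixels-per-block) and average the blocks.
    return hr.reshape(H // factor, factor, W // factor, factor).mean(axis=(1, 3))

hr = np.arange(64.0).reshape(8, 8)
lr = synthetic_lr(hr, factor=4)   # (2, 2) image of 4x4 block means
```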

