A Fast 4K Video Frame Interpolation Using a Hybrid Task-Based Convolutional Neural Network

Ha-Eun Ahn; Jinwoo Jeong; Je Woo Kim

doi:10.3390/sym11050619

A Fast 4K Video Frame Interpolation Using a Hybrid Task-Based Convolutional Neural Network

Symmetry ◽

10.3390/sym11050619 ◽

2019 ◽

Vol 11 (5) ◽

pp. 619 ◽

Cited By ~ 2

Author(s):

Ha-Eun Ahn ◽

Jinwoo Jeong ◽

Je Woo Kim

Keyword(s):

Neural Network ◽

High Resolution ◽

Convolutional Neural Network ◽

High Frequency ◽

State Of The Art ◽

Visual Quality ◽

Video Frame ◽

Frame Interpolation ◽

Algorithm Efficiency ◽

Coarse To Fine

Visual quality and algorithm efficiency are the two main concerns in video frame interpolation. We propose a hybrid task-based convolutional neural network for fast and accurate frame interpolation of 4K videos. The proposed method synthesizes low-resolution frames, then reconstructs high-resolution frames in a coarse-to-fine fashion. We also propose an edge loss to preserve high-frequency information and make the synthesized frames look sharper. Experimental results show that the proposed method achieves state-of-the-art performance and runs 2.69× faster than existing methods that can operate on 4K videos, while maintaining comparable visual and quantitative quality.
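The abstract does not give a formula for the edge loss. As a minimal sketch, assuming the loss is the mean absolute difference between finite-difference edge maps of the predicted and ground-truth frames (an illustrative assumption, not the authors' definition; `edge_map` and `edge_loss` are hypothetical names), the idea of penalizing lost high-frequency detail might look like this:

```python
def edge_map(img):
    """Forward-difference edge strength |dx| + |dy| of a 2D grayscale image."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Differences are taken as zero at the right and bottom borders.
            dx = img[y][x + 1] - img[y][x] if x + 1 < w else 0.0
            dy = img[y + 1][x] - img[y][x] if y + 1 < h else 0.0
            out[y][x] = abs(dx) + abs(dy)
    return out


def edge_loss(pred, target):
    """Mean absolute difference between the two edge maps (assumed form)."""
    ep, et = edge_map(pred), edge_map(target)
    n = len(pred) * len(pred[0])
    return sum(abs(a - b) for rp, rt in zip(ep, et) for a, b in zip(rp, rt)) / n


# A sharp step edge versus a blurred version of the same edge.
sharp = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
blurry = [[0.0, 0.25, 0.75, 1.0], [0.0, 0.25, 0.75, 1.0]]
print(edge_loss(sharp, sharp))   # identical frames give zero loss
print(edge_loss(blurry, sharp))  # the blurred edge is penalized
```

A loss of this shape is zero only when the edges of the two frames match, so a blurred prediction is penalized for its softened edges in addition to whatever a per-pixel intensity loss reports.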

A Fast 4K Video Frame Interpolation Using a Multi-Scale Optical Flow Reconstruction Network

Symmetry ◽

10.3390/sym11101251 ◽

2019 ◽

Vol 11 (10) ◽

pp. 1251 ◽

Cited By ~ 2

Author(s):

Ahn ◽

Jeong ◽

Kim ◽

Kwon ◽

Yoo

Keyword(s):

High Resolution ◽

Optical Flow ◽

State Of The Art ◽

Interpolation Method ◽

Video Frame ◽

Frame Interpolation ◽

Multi Scale ◽

Reconstruction Scheme ◽

Flow Reconstruction

Recently, video frame interpolation methods based on convolutional neural networks have shown remarkable results. However, these methods demand huge amounts of memory and run time for high-resolution videos, and are unable to process a 4K frame in a single pass. In this paper, we propose a fast 4K video frame interpolation method based on a multi-scale optical flow reconstruction scheme. The proposed method predicts low-resolution bi-directional optical flow and reconstructs it at high resolution. We also propose consistency and multi-scale smoothness losses to enhance the quality of the predicted optical flow. Furthermore, we use an adversarial loss to make the interpolated frame more seamless and natural. We demonstrate that the proposed method outperforms the existing state-of-the-art methods in quantitative evaluation, while running up to 4.39× faster than those methods on 4K videos.
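One detail that any low-to-high-resolution flow reconstruction has to respect is that flow vectors are measured in pixels, so they must be multiplied by the upsampling factor when the grid is enlarged. The sketch below uses plain nearest-neighbor upsampling purely for illustration; the paper uses a learned reconstruction network whose details are not in the abstract, and `upsample_flow` is a hypothetical helper.

```python
def upsample_flow(flow, scale):
    """Nearest-neighbor upsample a 2D grid of (u, v) displacements.

    Each vector is repeated `scale` times in both directions and multiplied
    by `scale`, because a 1-pixel motion at low resolution corresponds to a
    `scale`-pixel motion at high resolution.
    """
    out = []
    for row in flow:
        up_row = []
        for (u, v) in row:
            up_row.extend([(u * scale, v * scale)] * scale)
        out.extend([list(up_row) for _ in range(scale)])
    return out


low = [[(1.0, 0.0), (0.5, -0.5)]]  # 1 x 2 low-resolution flow field
high = upsample_flow(low, 4)       # 4 x 8 high-resolution flow field
print(len(high), len(high[0]))
print(high[0][0], high[3][7])
```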

Unsupervised Representation High-Resolution Remote Sensing Image Scene Classification via Contrastive Learning Convolutional Neural Network

Photogrammetric Engineering & Remote Sensing ◽

10.14358/pers.87.8.577 ◽

2021 ◽

Vol 87 (8) ◽

pp. 577-591

Author(s):

Fengpeng Li ◽

Jiabao Li ◽

Wei Han ◽

Ruyi Feng ◽

Lizhe Wang

Keyword(s):

Neural Network ◽

Remote Sensing ◽

Deep Learning ◽

High Resolution ◽

Convolutional Neural Network ◽

State Of The Art ◽

Remote Sensing Image ◽

Scene Classification ◽

Data Set ◽

Unsupervised Deep Learning

Inspired by the outstanding achievements of deep learning, supervised deep learning representation methods for high-spatial-resolution remote sensing image scene classification have obtained state-of-the-art performance. However, supervised deep learning representation methods need a considerable amount of labeled data to capture class-specific features, which limits the application of deep learning-based methods when only a few labeled training samples are available. To address this issue, this work proposes an unsupervised deep learning representation method for high-resolution remote sensing image scene classification. The proposed method, called contrastive learning, narrows the distance between positive view pairs (color channels belonging to the same image) and widens the gaps between negative view pairs (color channels from different images) to obtain class-specific representations of the input data without any supervised information. The classifier uses the features extracted by the convolutional neural network (CNN)-based feature extractor, together with the label information of the training data, to set the space of each category and then makes predictions in the testing procedure using linear regression. Compared with existing unsupervised deep learning representation methods for high-resolution remote sensing image scene classification, the contrastive learning CNN achieves state-of-the-art performance on three benchmark data sets of different scales: the small-scale RSSCN7 data set, the midscale aerial image data set, and the large-scale NWPU-RESISC45 data set.
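The abstract does not state which contrastive objective is used. As a stand-in, the sketch below uses the widely used InfoNCE form with cosine similarity to show how a loss can pull a positive pair together while pushing negatives apart. The function names, the temperature value, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE loss: low when the anchor is closer to the positive than to the negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


anchor = [1.0, 0.0]  # e.g. embedding of one color channel of an image
# Positive view lies close to the anchor: small loss.
good = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
# Positive view lies far from the anchor while a negative lies close: large loss.
bad = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
print(good < bad)
```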

Video Frame Interpolation Using Deep Convolutional Neural Network

Proceedings of the International Conference on ISMAC in Computational Vision and Bio-Engineering 2018 (ISMAC-CVB) - Lecture Notes in Computational Vision and Biomechanics ◽

10.1007/978-3-030-00665-5_82 ◽

2019 ◽

pp. 847-855

Author(s):

Varghese Mathai ◽

Arun Baby ◽

Akhila Sabu ◽

Jeexson Jose ◽

Bineeth Kuriakose

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Deep Convolutional Neural Network ◽

Video Frame ◽

Frame Interpolation

ART-UP: A Novel Method for Generating Scanning-Robust Aesthetic QR Codes

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3418214 ◽

2021 ◽

Vol 17 (1) ◽

pp. 1-23

Author(s):

Mingliang Xu ◽

Qingfeng Li ◽

Jianwei Niu ◽

Hao Su ◽

Xiting Liu ◽

...

Keyword(s):

State Of The Art ◽

Visual Quality ◽

Qr Code ◽

Quick Response ◽

Estimation Model ◽

Qr Codes ◽

Excellent Performance ◽

Novel Method ◽

Coarse To Fine

Quick response (QR) codes are usually scanned in different environments, so they must be robust to variations in illumination, scale, coverage, and camera angle. Aesthetic QR codes improve visual quality, but subtle changes in their appearance may cause scanning failure. In this article, a new method for generating scanning-robust aesthetic QR codes is proposed. It is based on a module-based scanning probability estimation model that can effectively balance the tradeoff between visual quality and scanning robustness. Our method locally adjusts the luminance of each module by estimating the probability of successful sampling. The approach adopts a hierarchical, coarse-to-fine strategy to enhance the visual quality of aesthetic QR codes, sequentially generating three codes: a binary aesthetic QR code, a grayscale aesthetic QR code, and the final color aesthetic QR code. Our approach can also be used to create QR codes with different visual styles by adjusting some initialization parameters. User surveys and decoding experiments were used to evaluate our method against state-of-the-art algorithms; the results indicate that the proposed approach performs well in terms of both visual quality and scanning robustness.

Video Frame Interpolation via Deformable Separable Convolution

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6634 ◽

2020 ◽

Vol 34 (07) ◽

pp. 10607-10614 ◽

Cited By ~ 2

Author(s):

Xianhang Cheng ◽

Zhenzhong Chen

Keyword(s):

State Of The Art ◽

Video Frame ◽

Kernel Size ◽

Frame Interpolation ◽

Interpolation Methods ◽

Video Frames ◽

Convolution Process ◽

Strong Performance ◽

Existing Frames ◽

Better Than

Learning to synthesize non-existing frames from the original consecutive video frames is a challenging task. Recent kernel-based interpolation methods predict pixels with a single convolution process, removing the dependency on optical flow. However, when scene motion is larger than the pre-defined kernel size, these methods yield poor results even though they take thousands of neighboring pixels into account. To solve this problem, we propose to use deformable separable convolution (DSepConv) to adaptively estimate kernels, offsets and masks, allowing the network to obtain information from far fewer but more relevant pixels. In addition, we show that kernel-based methods and conventional flow-based methods are specific instances of the proposed DSepConv. Experimental results demonstrate that our method significantly outperforms the other kernel-based interpolation methods and shows strong performance, on par with or even better than the state-of-the-art algorithms, both qualitatively and quantitatively.
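The "separable" part of this approach can be illustrated on a single output pixel: instead of an n × n kernel, the pixel is weighted by the outer product of a vertical and a horizontal 1D kernel, which needs 2n coefficients instead of n². The sketch below covers only that separable weighting; the learned offsets and masks of DSepConv are not modeled, and the function names and values are hypothetical.

```python
def apply_separable(patch, kv, kh):
    """Weight an n x n patch by the outer product of 1D kernels kv (vertical) and kh (horizontal)."""
    # Horizontal pass: collapse each row with kh.
    rows = [sum(w * p for w, p in zip(kh, row)) for row in patch]
    # Vertical pass: collapse the row results with kv.
    return sum(w * r for w, r in zip(kv, rows))


def apply_full(patch, kernel):
    """Weight the same patch by an explicit n x n kernel."""
    return sum(k * p for krow, prow in zip(kernel, patch) for k, p in zip(krow, prow))


patch = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
kv = [0.25, 0.5, 0.25]
kh = [0.0, 1.0, 0.0]
full_kernel = [[a * b for b in kh] for a in kv]  # outer product of kv and kh

print(apply_separable(patch, kv, kh))
print(apply_full(patch, full_kernel))
```

Both calls give the same value, which is why a network can predict two short 1D kernels per pixel rather than a full 2D kernel.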

PCDRN: Progressive Cascade Deep Residual Network for Pansharpening

Remote Sensing ◽

10.3390/rs12040676 ◽

2020 ◽

Vol 12 (4) ◽

pp. 676 ◽

Cited By ~ 1

Author(s):

Yong Yang ◽

Wei Tu ◽

Shuying Huang ◽

Hangyuan Lu

Keyword(s):

High Resolution ◽

Loss Function ◽

High Frequency ◽

Experimental Results ◽

Superior Performance ◽

Residual Network ◽

High Quality ◽

Convolution Approach ◽

Visual Assessments ◽

Coarse To Fine

Pansharpening is the process of fusing a low-resolution multispectral (LRMS) image with a high-resolution panchromatic (PAN) image. In the process of pansharpening, the LRMS image is often directly upsampled by a factor of 4, which may result in the loss of high-frequency details in the fused high-resolution multispectral (HRMS) image. To solve this problem, we put forward a novel progressive cascade deep residual network (PCDRN) with two residual subnetworks for pansharpening. The network resizes the MS image toward the size of the PAN image in two steps and gradually fuses the LRMS image with the PAN image in a coarse-to-fine manner. To prevent over-smoothing and achieve high-quality fusion results, a multitask loss function is defined to train our network. Furthermore, to eliminate checkerboard artifacts in the fusion results, we employ a resize-convolution approach instead of transposed convolution for upsampling LRMS images. Experimental results on the Pléiades and WorldView-3 datasets show that PCDRN exhibits superior performance compared to other popular pansharpening methods in terms of quantitative and visual assessments.
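The checkerboard issue mentioned here can be seen in a 1D toy case. With a transposed convolution whose kernel size is not divisible by the stride, neighboring output positions receive different numbers of contributions; upsampling first and then applying an ordinary convolution gives every output the same coverage. The sketch below is illustrative only: the layer sizes are assumptions, not the PCDRN configuration, and the function names are hypothetical.

```python
def transposed_conv1d(x, kernel, stride):
    """1D transposed convolution: each input sample scatters a scaled copy of the kernel."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, xi in enumerate(x):
        for j, kj in enumerate(kernel):
            out[i * stride + j] += xi * kj
    return out


def resize_conv1d(x, kernel, scale):
    """Nearest-neighbor upsampling followed by an ordinary (valid) 1D convolution."""
    up = [xi for xi in x for _ in range(scale)]
    k = len(kernel)
    return [sum(kernel[j] * up[i + j] for j in range(k)) for i in range(len(up) - k + 1)]


ones = [1.0] * 4
kernel = [1.0, 1.0, 1.0]
t = transposed_conv1d(ones, kernel, 2)  # uneven overlap on a constant input
r = resize_conv1d(ones, kernel, 2)      # uniform response on a constant input
print(t)
print(r)
```

On a constant input, the transposed convolution output alternates between two levels in its interior, while the resize-convolution output stays flat.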

High-resolution CT image analysis based on 3D convolutional neural network can enhance the classification performance of radiologists in classifying pulmonary non-solid nodules

European Journal of Radiology ◽

10.1016/j.ejrad.2021.109810 ◽

2021 ◽

pp. 109810

Author(s):

Teng Zhang ◽

Yida Wang ◽

Yingli Sun ◽

Mei Yuan ◽

Yan Zhong ◽

...

Keyword(s):

Neural Network ◽

Image Analysis ◽

High Resolution ◽

Convolutional Neural Network ◽

Classification Performance ◽

Ct Image ◽

High Resolution Ct ◽

Ct Image Analysis

Low-Rank and Sparse Based Deep-Fusion Convolutional Neural Network for Crowd Counting

Mathematical Problems in Engineering ◽

10.1155/2017/5046727 ◽

2017 ◽

Vol 2017 ◽

pp. 1-11 ◽

Cited By ~ 2

Author(s):

Siqi Tang ◽

Zhisong Pan ◽

Xingyu Zhou

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

State Of The Art ◽

Regression Method ◽

Low Rank ◽

Counting Method ◽

Direct Integral ◽

Crowd Counting ◽

Counting Methods ◽

Density Map

This paper proposes an accurate crowd counting method based on a convolutional neural network and a low-rank and sparse structure. To this end, we first propose an effective deep-fusion convolutional neural network to improve the accuracy of density map regression. Furthermore, we observe that most existing CNN-based crowd counting methods obtain the overall count by directly integrating the estimated density map, which limits counting accuracy. Instead of direct integration, we adopt a regression method based on a low-rank and sparse penalty to improve the accuracy of the projection from the density map to the global count. Experiments demonstrate the importance of this regression process for crowd counting performance. The proposed low-rank and sparse based deep-fusion convolutional neural network (LFCNN) outperforms existing crowd counting methods and achieves state-of-the-art performance.
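For reference, the direct-integral baseline that the abstract contrasts against simply sums the estimated density map. The sketch below shows only that baseline, with a made-up density map; the paper's low-rank and sparse regression is not reproduced here, and `count_by_integration` is a hypothetical name.

```python
def count_by_integration(density_map):
    """Estimate the crowd count as the sum (discrete integral) of a 2D density map."""
    return sum(sum(row) for row in density_map)


# Toy density map (made-up values): each person contributes a total mass of 1,
# spread over neighboring cells.
density = [
    [0.0, 0.5, 0.5],
    [0.25, 0.25, 0.5],
]
print(count_by_integration(density))
```

Because the count is a plain sum, any systematic error in the estimated density values carries straight into the count, which is the limitation the regression step is meant to address.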

Structured Building Extraction from High-Resolution Satellite Images with a Hybrid Convolutional Neural Network

10.1109/igarss47720.2021.9554401 ◽

2021 ◽

Author(s):

Jianing Wang ◽

Hanjiang Xiong ◽

Jianya Gong ◽

Xianwei Zheng

Keyword(s):

Neural Network ◽

High Resolution ◽

Convolutional Neural Network ◽

Satellite Images ◽

Building Extraction ◽

High Resolution Satellite Images

Performance Analysis of State of the Art Convolutional Neural Network Architectures in Bangla Handwritten Character Recognition

Pattern Recognition and Image Analysis ◽

10.1134/s1054661821010089 ◽

2021 ◽

Vol 31 (1) ◽

pp. 60-71

Author(s):

Tapotosh Ghosh ◽

Min-Ha-Zul Abedin ◽

Hasan Al Banna ◽

Nasirul Mumenin ◽

Mohammad Abu Yousuf

Keyword(s):

Neural Network ◽

Performance Analysis ◽

Convolutional Neural Network ◽

Character Recognition ◽

State Of The Art ◽

Network Architectures ◽

Handwritten Character Recognition ◽

Handwritten Character ◽

Neural Network Architectures
