Advanced Driving Assistance Based on the Fusion of Infrared and Visible Images

Yansong Gu; Xinya Wang; Can Zhang; Baiyang Li

doi:10.3390/e23020239


Entropy ◽

10.3390/e23020239 ◽

2021 ◽

Vol 23 (2) ◽

pp. 239

Author(s):

Yansong Gu ◽

Xinya Wang ◽

Can Zhang ◽

Baiyang Li

Keyword(s):

Visual Information ◽

State Of The Art ◽

Infrared Image ◽

Activity Level ◽

Qualitative And Quantitative ◽

Intensity Information ◽

Driving Assistance ◽

Fused Image ◽

Visible Images ◽

End To End

Obtaining rich visual information under complex road conditions is one of the key requirements for advanced driving assistance. In this paper, a novel end-to-end model, termed FusionADA, is proposed for advanced driving assistance based on the fusion of infrared and visible images. Our model aims to extract and fuse the optimal texture details and salient thermal targets from the source images. To achieve this goal, it sets up an adversarial framework between a generator and a discriminator. Specifically, the generator aims to generate a fused image with basic intensity information together with the optimal texture details from the source images, while the discriminator aims to force the fused image to restore the salient thermal targets from the source infrared image. In addition, FusionADA is a fully end-to-end model, which avoids the manual design of complicated activity level measurements and fusion rules required by traditional methods. Qualitative and quantitative experiments on the publicly available RoadScene and TNO datasets demonstrate the superiority of FusionADA over state-of-the-art approaches.


Learning a Generative Model for Fusing Infrared and Visible Images via Conditional Generative Adversarial Network with Dual Discriminators

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/549 ◽

2019 ◽

Cited By ~ 12

Author(s):

Han Xu ◽

Pengwei Liang ◽

Wei Yu ◽

Junjun Jiang ◽

Jiayi Ma

Keyword(s):

Probability Distribution ◽

State Of The Art ◽

Infrared Image ◽

Infrared Images ◽

Generative Adversarial Network ◽

Visible Image ◽

Qualitative And Quantitative ◽

Adversarial Network ◽

Fused Image ◽

Visible Images

In this paper, we propose a new end-to-end model, called the dual-discriminator conditional generative adversarial network (DDcGAN), for fusing infrared and visible images of different resolutions. Unlike pixel-level methods and existing deep learning-based methods, the fusion task is accomplished through the adversarial process between a generator and two discriminators, in addition to a specially designed content loss. The generator is trained to generate realistic fused images to fool the discriminators. The two discriminators are trained to calculate the JS divergence between the probability distributions of downsampled fused images and infrared images, and the JS divergence between the probability distributions of gradients of fused images and gradients of visible images, respectively. Thus, the fused images can compensate for the features that are not constrained by the content loss alone. Consequently, the prominence of thermal targets in the infrared image and the texture details in the visible image can be preserved or even enhanced in the fused image simultaneously. Moreover, by constraining and distinguishing between the downsampled fused image and the low-resolution infrared image, DDcGAN is well suited to fusing images of different resolutions. Qualitative and quantitative experiments on publicly available datasets demonstrate the superiority of our method over the state of the art.
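
The JS divergence named in the abstract above can be illustrated for two small discrete distributions. This is only a sketch of the textbook definition, not the DDcGAN training code: in a GAN the divergence is estimated implicitly by the discriminators, and the function names below are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as equal-length lists.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: the mean KL divergence to the mixture m."""
    # m_i is positive wherever p_i or q_i is positive, so no division by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

The value is 0 for identical distributions and reaches its maximum, log 2 in nats, when the two distributions have disjoint support.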


FusionDN: A Unified Densely Connected Network for Image Fusion

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6936 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12484-12491 ◽

Cited By ~ 8

Author(s):

Han Xu ◽

Jiayi Ma ◽

Zhuliang Le ◽

Junjun Jiang ◽

Xiaojie Guo

Keyword(s):

Image Fusion ◽

State Of The Art ◽

Data Driven ◽

Single Model ◽

Connected Network ◽

Qualitative And Quantitative ◽

Multiple Tasks ◽

Different Types ◽

Fused Image ◽

Quantitative Results

In this paper, we present a new unsupervised and unified densely connected network for different types of image fusion tasks, termed FusionDN. In our method, the densely connected network is trained to generate the fused image conditioned on the source images. Meanwhile, a weight block is applied to obtain two data-driven weights as the retention degrees of features in the different source images; these weights measure the quality and the amount of information in each source image. Similarity losses based on these weights are applied for unsupervised learning. In addition, we obtain a single model applicable to multiple fusion tasks by applying elastic weight consolidation to avoid forgetting what has been learned from previous tasks when training multiple tasks sequentially, rather than training individual models for every fusion task or jointly training tasks roughly. Qualitative and quantitative results demonstrate the advantages of FusionDN compared with state-of-the-art methods in different fusion tasks.
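
Elastic weight consolidation, which the abstract above uses to train fusion tasks sequentially, adds a quadratic penalty that anchors each parameter to its value after the previous task. The sketch below shows the standard form of that penalty on plain Python lists; the function name, the argument names, and the use of a per-parameter importance estimate (`fisher`) follow the usual EWC formulation and are assumptions, not details taken from FusionDN.

```python
def ewc_penalty(params, old_params, fisher, lam):
    """Standard EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2.

    params     -- current parameter values while training the new task
    old_params -- parameter values saved after the previous task
    fisher     -- per-parameter importance (diagonal Fisher information estimate)
    lam        -- strength of the penalty (hypothetical hyperparameter name)
    """
    return 0.5 * lam * sum(
        f * (p - o) ** 2 for p, o, f in zip(params, old_params, fisher)
    )
```

In training, this value would be added to the loss of the new task, so that parameters with a large importance estimate stay close to their earlier values.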


Fusing Infrared and Visible Images of Different Resolutions via Total Variation Model

Sensors ◽

10.3390/s18113827 ◽

2018 ◽

Vol 18 (11) ◽

pp. 3827 ◽

Cited By ~ 9

Author(s):

Qinglei Du ◽

Han Xu ◽

Yong Ma ◽

Jun Huang ◽

Fan Fan

Keyword(s):

Thermal Radiation ◽

Total Variation ◽

Infrared Image ◽

Infrared Images ◽

Visible Image ◽

Pixel Intensity ◽

Texture Information ◽

Fused Image ◽

Fusion Methods ◽

Visible Images

In infrared and visible image fusion, existing methods typically require that the source images share the same resolution. However, due to limitations of hardware devices and application environments, infrared images often have markedly lower resolution than the corresponding visible images. In this case, current fusion methods inevitably lose texture information from the visible images or blur thermal radiation information from the infrared images. Moreover, existing fusion rules typically focus on preserving texture details in the source images. This may be inappropriate for fusing infrared thermal radiation information, which is characterized by pixel intensities, and may neglect the prominence of targets in the fused images. To address these difficulties, we propose a novel method that fuses infrared and visible images of different resolutions and generates clear and accurate high-resolution fused images. Specifically, the fusion problem is formulated as a total variation (TV) minimization problem. The data fidelity term constrains the pixel intensity similarity of the downsampled fused image with respect to the infrared image, and the regularization term compels the gradient similarity of the fused image with respect to the visible image. The fast iterative shrinkage-thresholding algorithm (FISTA) framework is applied to improve the convergence rate. Our resulting fused images are similar to super-resolved infrared images, which are sharpened by the texture information from visible images. The advantages and innovations of our method are demonstrated by qualitative and quantitative comparisons with six state-of-the-art methods on publicly available datasets.
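
The two terms described in the abstract above can be sketched on 1D signals. This is an illustration under stated assumptions only: the abstract does not give the norms, the downsampling operator, or the weighting, so the L1 norms, the block-average downsampling, and the weight `lam` below are all assumed, and the FISTA solver itself is not shown.

```python
def downsample(signal, factor):
    """Average non-overlapping blocks of `factor` samples (assumed operator)."""
    return [
        sum(signal[i:i + factor]) / factor
        for i in range(0, len(signal) - factor + 1, factor)
    ]


def gradient(signal):
    """Forward differences between neighbouring samples."""
    return [b - a for a, b in zip(signal, signal[1:])]


def fusion_objective(fused, infrared_lr, visible, factor, lam):
    """Data fidelity to the low-resolution infrared signal plus a
    gradient-similarity term with respect to the visible signal."""
    # Data term: the downsampled fused signal should match infrared intensities.
    data = sum(abs(d - u) for d, u in zip(downsample(fused, factor), infrared_lr))
    # Regularization term: fused gradients should match visible gradients.
    reg = sum(abs(gf - gv) for gf, gv in zip(gradient(fused), gradient(visible)))
    return data + lam * reg
```

A candidate fused signal scores 0 when its downsampled version equals the infrared signal and its gradients equal those of the visible signal; a minimizer trades the two terms off through `lam`.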


Infrared and visible image fusion based on optimal segmenting and contour extraction

SN Applied Sciences ◽

10.1007/s42452-020-04050-w ◽

2021 ◽

Vol 3 (3) ◽

Author(s):

Javad Abbasi Aghamaleki ◽

Alireza Ghorbani

Keyword(s):

Image Fusion ◽

Infrared Image ◽

Fusion Method ◽

Visible Image ◽

Output Image ◽

Contour Lines ◽

Fused Image ◽

Fusion Methods ◽

Visible Images ◽

The Individual

Image fusion is the process of combining complementary information from multiple images of the same scene into one output image. The resulting output image, called the fused image, gives a more precise description of the scene than any of the individual input images. In this paper, we propose a novel, simple, and fast fusion strategy for infrared (IR) and visible images based on locally important areas of the IR image. The fusion method consists of three steps. First, only the segmented regions of the infrared image are extracted. Next, image fusion is applied to the segmented areas. Finally, contour lines are used to improve the quality of the results of the second step. The proposed method is evaluated on a publicly available database and compared with other fusion methods. The experimental results show the effectiveness of the proposed method compared with state-of-the-art methods.


A Cross-Direction and Progressive Network for Pan-Sharpening

Remote Sensing ◽

10.3390/rs13153045 ◽

2021 ◽

Vol 13 (15) ◽

pp. 3045

Author(s):

Han Xu ◽

Zhuliang Le ◽

Jun Huang ◽

Jiayi Ma

Keyword(s):

State Of The Art ◽

Ground Truth ◽

Single Type ◽

Spectral Information ◽

Cross Direction ◽

Qualitative And Quantitative ◽

Multi Scale ◽

Fused Image ◽

Partial Inactivation ◽

The One

In this paper, we propose a cross-direction and progressive network, termed CPNet, to solve the pan-sharpening problem. The main characteristic of our model is its full processing of information, which is reflected in two ways. On the one hand, we process the source images in a cross-direction manner to obtain source images of different scales as the input of the fusion modules at different stages, which maximizes the usage of multi-scale information in the source images. On the other hand, a progressive reconstruction loss is designed to boost the training of our network and avoid partial inactivation, while maintaining the consistency of the fused result with the ground truth. Since the extraction of information from the source images and the reconstruction of the fused image are based on the entire image rather than a single type of information, there is little loss of spatial or spectral information due to insufficient information processing. Extensive experiments, including qualitative and quantitative comparisons, demonstrate that our model can maintain more spatial and spectral information than state-of-the-art pan-sharpening methods.


Segmentation of SAR Image using Fuzzy C-Means and Filters

Science & Technology Journal ◽

10.22232/stj.2020.08.01.11 ◽

2020 ◽

Vol 8 (1) ◽

pp. 84-90

Author(s):

R. Lalchhanhima ◽


Debdatta Kandar ◽

R. Chawngsangpuii ◽

Vanlalmuansangi Khenglawt ◽

...

Keyword(s):

Clustering Algorithm ◽

State Of The Art ◽

Speckle Noise ◽

Synthetic Aperture Radar Image ◽

Synthetic Aperture ◽

SAR Image ◽

Spatial Filters ◽

Fuzzy C Means ◽

Automatic Clustering ◽

Intensity Information

Fuzzy C-Means (FCM) is an unsupervised clustering algorithm for the automatic clustering of data. Synthetic Aperture Radar (SAR) image segmentation is a challenging task because of the presence of speckle noise. The segmentation process therefore cannot rely on the intensity information alone but must consider several derived features in order to obtain satisfactory results. In this paper, we use the fuzzy nature of classification for unsupervised region segmentation, for which FCM is employed. Different features are obtained by filtering the image with different spatial filters and are selected as segmentation criteria. The segmentation performance is assessed by comparing its accuracy with that of different recently proposed state-of-the-art techniques.
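
The Fuzzy C-Means update rules mentioned above can be sketched on 1D data. This is a minimal illustration of the standard algorithm, not the paper's pipeline: the spatial filtering and feature selection are omitted, and the evenly spaced initial centres, the fuzzifier `m = 2.0`, and the fixed iteration count are assumptions made to keep the example deterministic.

```python
def fcm_memberships(data, centers, m=2.0, eps=1e-9):
    """Membership of each point in each cluster; every row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in data:
        # eps avoids division by zero when a point coincides with a centre.
        dists = [abs(x - c) + eps for c in centers]
        rows.append(
            [1.0 / sum((d_k / d_j) ** power for d_j in dists) for d_k in dists]
        )
    return rows


def fuzzy_c_means(data, n_clusters=2, m=2.0, n_iter=50):
    """Alternate membership and centre updates for a fixed number of rounds."""
    lo, hi = min(data), max(data)
    # Deterministic start: spread the centres evenly over the data range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    for _ in range(n_iter):
        u = fcm_memberships(data, centers, m)
        # Each centre is the mean of the data weighted by membership ** m.
        centers = [
            sum((row[k] ** m) * x for row, x in zip(u, data))
            / sum(row[k] ** m for row in u)
            for k in range(n_clusters)
        ]
    return centers, fcm_memberships(data, centers, m)
```

For segmentation, each pixel (or its filtered feature value) would be assigned to the cluster with the highest membership.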


Efficient End-to-End Sentence-Level Lipreading with Temporal Convolutional Networks

Applied Sciences ◽

10.3390/app11156975 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6975

Author(s):

Tao Zhang ◽

Lun He ◽

Xudong Li ◽

Guoqing Feng

Keyword(s):

Performance Improvement ◽

State Of The Art ◽

Error Rates ◽

Convolutional Network ◽

Convolutional Networks ◽

Sentence Level ◽

End To End ◽

High Level ◽

Improved Accuracy ◽

Talking Face

Lipreading aims to recognize sentences being spoken by a talking face. In recent years, lipreading methods have achieved a high level of accuracy on large datasets and made breakthrough progress. However, lipreading is still far from being solved: existing methods tend to have high error rates on in-the-wild data and suffer from vanishing gradients during training and slow convergence. To overcome these problems, we propose an efficient end-to-end sentence-level lipreading model that uses an encoder based on a 3D convolutional network, ResNet50, and a Temporal Convolutional Network (TCN), with a CTC objective function as the decoder. More importantly, the proposed architecture incorporates the TCN as a feature learner to decode features. This partly eliminates the vanishing gradients and insufficient performance of RNNs (LSTM, GRU), and yields a notable performance improvement as well as faster convergence. Experiments show that training and convergence are 50% faster than with the state-of-the-art method, and that accuracy improves by 2.4% on the GRID dataset.
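
A model trained with a CTC objective emits one score vector per video frame, including a special blank symbol. One common way to turn such output into text is greedy decoding: take the best symbol per frame, merge consecutive repeats, then drop blanks. The sketch below shows that rule only; the abstract does not state how the model is decoded at inference time, so the choice of greedy decoding, the function name, and the blank index are assumptions.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Collapse per-frame scores into a string using the CTC greedy rule.

    frame_scores -- list of per-frame score lists, one score per symbol
    alphabet     -- symbols indexed like the scores; alphabet[blank] is unused
    """
    # Pick the highest-scoring symbol index in every frame.
    best = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]
    decoded = []
    previous = None
    for index in best:
        # Emit a symbol only when it differs from the previous frame
        # and is not the blank; a blank between repeats keeps both.
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)
```

The blank symbol is what allows a doubled letter to survive: two runs of the same symbol separated by a blank frame decode to two letters, while an unbroken run decodes to one.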


Cross-Modal Effect of Presenting Visual and Force Feedback That Create the Illusion of Stair-Climbing

Applied Sciences ◽

10.3390/app11072987 ◽

2021 ◽

Vol 11 (7) ◽

pp. 2987

Author(s):

Takumi Okumura ◽

Yuichi Kurita

Keyword(s):

Visual Information ◽

Force Feedback ◽

Activity Level ◽

Experimental Results ◽

Stair Climbing ◽

Mirror Therapy ◽

Visual Condition ◽

Stroke Patients ◽

Head Mount Display ◽

The Brain

Image therapy, which creates illusions with a mirror and a head mount display, assists movement relearning in stroke patients. Mirror therapy presents the movement of the unaffected limb in a mirror, creating the illusion of movement of the affected limb. As the visual information of images cannot create a fully immersive experience, we propose a cross-modal strategy that supplements the image with sensory information. By interacting with the stimuli received from multiple sensory organs, the brain complements missing senses, and the patient experiences a different sense of motion. Our system generates the sense of stair-climbing in a subject walking on a level floor. The force sensation is presented by a pneumatic gel muscle (PGM). Based on motion analysis in a human lower-limb model and the characteristics of the force exerted by the PGM, we set the appropriate air pressure of the PGM. The effectiveness of the proposed system was evaluated by surface electromyography and a questionnaire. The experimental results showed that by synchronizing the force sensation with visual information, we could match the motor and perceived sensations at the muscle-activity level, enhancing the sense of stair-climbing. They also showed that the visual condition significantly improved the illusion intensity during stair-climbing.


Pseudo-3D Physical Design Flow for Monolithic 3D ICs: Comparisons and Enhancements

ACM Transactions on Design Automation of Electronic Systems ◽

10.1145/3453480 ◽

2021 ◽

Vol 26 (5) ◽

pp. 1-25

Author(s):

Heechun Park ◽

Bon Woong Ku ◽

Kyungwook Chang ◽

Da Eun Shim ◽

Sung Kyu Lim

Keyword(s):

State Of The Art ◽

Physical Design ◽

Design Flow ◽

Design Tools ◽

Mixed Design ◽

Power Performance ◽

3D Design ◽

Qualitative And Quantitative ◽

3D ICs ◽

Power Delay Product

Studies have shown that monolithic 3D (M3D) ICs outperform the existing through-silicon-via (TSV)-based 3D ICs in terms of power, performance, and area (PPA) metrics, primarily due to the orders of magnitude denser vertical interconnections offered by the nano-scale monolithic inter-tier vias. In order to facilitate faster industry adoption of the M3D technologies, physical design tools and methodologies are essential. Recent academic efforts in developing an EDA algorithm for 3D ICs, mainly targeting placement using TSVs, are inadequate to provide commercial-quality GDS layouts. Lately, pseudo-3D approaches have been devised, which utilize commercial 2D IC EDA engines with tricks that help them operate as an efficient 3D IC CAD tool. In this article, we provide thorough discussions and fair comparisons (both qualitative and quantitative) of the state-of-the-art pseudo-3D design flows, with analysis of limitations in each design flow and solutions to improve their PPA metrics. Moreover, we suggest a hybrid pseudo-3D design flow that achieves both benefits. Our enhancements and the inter-mixed design flow provide up to an additional 26% wirelength, 10% power consumption, and 23% power-delay-product improvement.


FUSION OF SURVEILLANCE IMAGES IN INFRARED AND VISIBLE BAND USING CURVELET, WAVELET AND WAVELET PACKET TRANSFORM

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691310003444 ◽

2010 ◽

Vol 08 (02) ◽

pp. 271-292 ◽

Cited By ~ 27

Author(s):

PARUL SHAH ◽

S. N. MERCHANT ◽

U. B. DESAI

Keyword(s):

Wavelet Packet ◽

Curvelet Transform ◽

Visual Quality ◽

Wavelet Packet Transform ◽

Discrete Wavelet ◽

Discrete Wavelet Packet Transform ◽

Fused Image ◽

Fusion Methods ◽

Visible Images

This paper presents two methods for fusion of infrared (IR) and visible surveillance images. The first method combines the Curvelet Transform (CT) with the Discrete Wavelet Transform (DWT). As wavelets do not represent long edges well while curvelets are challenged by small features, our objective is to combine both to achieve better performance. The second approach uses the Discrete Wavelet Packet Transform (DWPT), which provides multiresolution in the high-frequency band as well and hence helps in handling edges better. The performance of the proposed methods has been extensively tested on a number of multimodal surveillance images and compared with various existing transform-domain fusion methods. Experimental results show that the criteria normally used for evaluation, such as entropy, gradient, and contrast, are not sufficient, as in some cases they are not consistent with the visual quality. The results also demonstrate that the Petrovic and Xydeas image fusion metric is a more appropriate criterion for fusion of IR and visible images, as in all the tested fused images, visual quality agrees with the Petrovic and Xydeas metric evaluation. The analysis shows that there is a significant increase in the quality of the fused image, both visually and quantitatively. The major achievement of the proposed fusion methods is their reduced artifacts, one of the most desired features for fusion used in surveillance applications.
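
Transform-domain fusion of the kind compared above follows a common pattern: decompose both inputs, merge the coefficients, and invert the transform. The sketch below shows that pattern with a one-level Haar wavelet on 1D signals of even length. It is not the CT+DWT or DWPT method of the paper: the Haar basis, the averaging of approximation coefficients, and the maximum-absolute-value rule for detail coefficients are assumptions chosen because they are simple and widely used.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One-level Haar transform of an even-length signal."""
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Exact inverse of haar_forward."""
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return signal


def fuse_haar(infrared, visible):
    """Fuse two signals in the Haar domain (assumed fusion rules)."""
    a_ir, d_ir = haar_forward(infrared)
    a_vis, d_vis = haar_forward(visible)
    # Low-frequency content: average the two approximations.
    approx = [(x + y) / 2 for x, y in zip(a_ir, a_vis)]
    # High-frequency content: keep the coefficient with the larger magnitude.
    detail = [x if abs(x) >= abs(y) else y for x, y in zip(d_ir, d_vis)]
    return haar_inverse(approx, detail)
```

With a flat infrared signal and a textured visible signal, the fused result takes its mean level from both inputs and its local variation from the visible one.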
