Multi-Scale Adversarial Feature Learning for Saliency Detection

Symmetry ◽  
2018 ◽  
Vol 10 (10) ◽  
pp. 457 ◽  
Author(s):  
Dandan Zhu ◽  
Lei Dai ◽  
Ye Luo ◽  
Guokai Zhang ◽  
Xuan Shao ◽  
...  

Previous saliency detection methods usually focused on extracting powerful discriminative features to describe images with complex backgrounds. Recently, the generative adversarial network (GAN) has shown a great ability in feature learning for synthesizing high-quality natural images. Motivated by this superior feature learning ability, we present a new multi-scale adversarial feature learning (MAFL) model for image saliency detection. The model is composed of two convolutional neural network (CNN) modules: a multi-scale G-network that takes natural images as inputs and generates the corresponding synthetic saliency maps, and a D-network containing a novel correlation layer, which determines whether an image is a synthetic or a ground-truth saliency map. Quantitative and qualitative comparisons on several public datasets show the superiority of our approach.
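The abstract does not specify the exact form of the D-network's correlation layer; as a minimal sketch under that caveat, one plausible reading is a per-location normalized correlation between two feature maps (the function name and shapes below are illustrative assumptions, not the paper's definition):

```python
import numpy as np

def correlation_layer(feat_a, feat_b):
    """One plausible correlation layer: cosine similarity of the
    C-dimensional feature vectors at each spatial location.

    feat_a, feat_b: arrays of shape (C, H, W); returns an (H, W) map.
    """
    a = feat_a / (np.linalg.norm(feat_a, axis=0, keepdims=True) + 1e-8)
    b = feat_b / (np.linalg.norm(feat_b, axis=0, keepdims=True) + 1e-8)
    return (a * b).sum(axis=0)
```

A discriminator could feed such a map into its decision head: identical features yield correlation near 1 everywhere, while a synthetic map that diverges from the ground truth produces lower scores.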

Sensors ◽  
2021 ◽  
Vol 21 (15) ◽  
pp. 5125
Author(s):  
Pengcheng Xu ◽  
Zhongyuan Guo ◽  
Lei Liang ◽  
Xiaohang Xu

In the field of surface defect detection, the scale difference of product surface defects is often huge. Existing defect detection methods based on Convolutional Neural Networks (CNNs) are more inclined to express macro, abstract features, and their ability to express local, small defects is insufficient, resulting in an imbalance of feature expression capability. In this paper, a Multi-Scale Feature Learning Network (MSF-Net) based on a Dual Module Feature (DMF) extractor is proposed. The DMF extractor is mainly composed of optimized Concatenated Rectified Linear Units (CReLUs) and optimized Inception feature extraction modules, which increase the diversity of feature receptive fields while reducing the amount of computation; feature maps of the middle layers with different receptive field sizes are merged to increase the richness of the receptive fields of the last layer of feature maps; and residual shortcut connections, batch normalization layers, and an average pooling layer are used in place of fully connected layers to improve training efficiency while making the multi-scale feature learning ability more balanced. Two representative multi-scale defect datasets are used for experiments, and the experimental results verify the effectiveness of the proposed MSF-Net in the detection of surface defects with multi-scale features.
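The CReLU building block mentioned here is a standard construction: it concatenates the ReLU of the input and the ReLU of its negation, doubling the channels while keeping both signs of the response. A minimal numpy sketch (not the paper's optimized variant):

```python
import numpy as np

def crelu(x, axis=0):
    """Concatenated ReLU: stack ReLU(x) and ReLU(-x) along the channel
    axis, so negative activations are preserved in their own channels."""
    return np.concatenate([np.maximum(x, 0), np.maximum(-x, 0)], axis=axis)
```

For an input with C channels the output has 2C channels, which is why CReLU-based extractors can halve the learned filters for the same representational width.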


2019 ◽  
Vol 11 (2) ◽  
pp. 142 ◽  
Author(s):  
Wenping Ma ◽  
Hui Yang ◽  
Yue Wu ◽  
Yunta Xiong ◽  
Tao Hu ◽  
...  

In this paper, a novel change detection approach based on multi-grained cascade forest (gcForest) and multi-scale fusion for synthetic aperture radar (SAR) images is proposed. It detects the changed and unchanged areas of the images by using the well-trained gcForest. Most existing change detection methods need to select the appropriate size of the image block. However, a single-size image block only provides part of the local information, and gcForest cannot achieve a good effect on the image representation learning ability. Therefore, the proposed approach chooses different sizes of image blocks as the input of gcForest, which can learn more image characteristics and reduce the influence of the local information of the image on the classification result as well. In addition, in order to improve the detection accuracy of those pixels whose gray value changes abruptly, the proposed approach combines gradient information of the difference image with the probability map obtained from the well-trained gcForest. Therefore, the image edge information can be enhanced and the accuracy of edge detection can be improved by extracting the image gradient information. Experiments on four data sets indicate that the proposed approach outperforms other state-of-the-art algorithms.
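The abstract does not give the exact fusion rule for combining the difference-image gradient with the gcForest probability map; a minimal sketch, assuming a simple linear blend with an illustrative weight `alpha`, could look like:

```python
import numpy as np

def fuse_gradient_probability(diff_image, prob_map, alpha=0.3):
    """Blend the change-probability map with the normalized gradient
    magnitude of the difference image to enhance edge pixels.

    alpha is an assumed mixing weight, not a value from the paper.
    """
    gy, gx = np.gradient(diff_image.astype(float))
    grad = np.hypot(gx, gy)
    if grad.max() > 0:
        grad = grad / grad.max()  # normalize gradient magnitude to [0, 1]
    return (1 - alpha) * prob_map + alpha * grad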


Sensors ◽  
2020 ◽  
Vol 20 (2) ◽  
pp. 459
Author(s):  
Shaosheng Dai ◽  
Dongyang Li

Aiming at the problems of incomplete saliency detection and unclear boundaries in infrared multi-target images with different target sizes and low signal-to-noise ratio under sky background conditions, this paper proposes a saliency detection method for multiple targets based on multi-saliency fusion. The target areas of the infrared image are mainly bright and the background areas are dark. Using a multi-scale top-hat (Top-hat) transformation, the image is first eroded and dilated to extract the difference between light and dark parts and reconstruct the image, reducing the interference of blurred sky background noise. The image obtained by the multi-scale Top-hat transformation is then transformed from the time domain to the frequency domain, and the spectral residual and phase spectrum are extracted to obtain two kinds of saliency maps by multi-scale Gaussian filtering reconstruction, respectively. In parallel, quaternion features are extracted to transform the phase spectrum, which is then reconstructed by Gaussian filtering to obtain a third saliency map. Finally, the three saliency maps are fused to complete the saliency detection of the infrared image. Experimental analysis of infrared video frames, together with comparison of Receiver Operating Characteristic (ROC) curves and the Area Under the Curve (AUC) index, shows that the saliency maps generated by this method have clear target details and good background suppression, with AUC performance above 99%. The method effectively improves multi-target saliency detection in infrared images under a sky background and benefits subsequent detection and tracking of image targets.
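One of the frequency-domain maps described (spectral residual plus phase) follows the classic spectral-residual recipe: subtract a local average of the log-amplitude spectrum, keep the phase, and transform back. A minimal sketch, with an assumed box-filter size and without the paper's multi-scale Gaussian step:

```python
import numpy as np

def spectral_residual_saliency(img, avg_size=3):
    """Spectral-residual saliency map: the log-amplitude spectrum minus
    its local average, recombined with the original phase."""
    F = np.fft.fft2(img.astype(float))
    log_amp = np.log(np.abs(F) + 1e-8)
    phase = np.angle(F)
    # box-filter the log-amplitude spectrum (simple local average)
    k = avg_size
    pad = np.pad(log_amp, k // 2, mode='edge')
    avg = np.zeros_like(log_amp)
    for dy in range(k):
        for dx in range(k):
            avg += pad[dy:dy + log_amp.shape[0], dx:dx + log_amp.shape[1]]
    avg /= k * k
    residual = log_amp - avg
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    return sal / sal.max()
```

The residual suppresses the statistically expected part of the spectrum, so anomalous (salient) structure dominates the reconstruction.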


Author(s):  
Y. Xun ◽  
W. Q. Yu

Abstract. As one of the important sources of meteorological information, the satellite nephogram is playing an increasingly important role in the detection and forecasting of disastrous weather. Timely predictions of the movement and transformation of clouds can enhance the practicability of satellite nephograms. Based on the generative adversarial network in unsupervised learning, we propose a prediction model for time-series nephograms, which constructs an accurate internal representation of cloud evolution and realizes nephogram prediction for the next several hours. We improve the traditional generative adversarial network by constructing the generator and discriminator with multi-scale convolutional networks. After a scale transform process, convolutions at different scales operate in parallel and their features are then merged. This structure solves the problem of long-term dependence in the traditional network and considers both global and detailed features. According to the network structure and the practical application, we further define a new loss function combined with the adversarial loss to accelerate model convergence and sharpen predictions, which preserves the effectiveness of the predictions. Our method requires neither stacked mathematical calculations nor manual operations, greatly enhancing feasibility and efficiency. The results show that this model can reasonably describe the basic characteristics and evolution trend of cloud clusters, and the predicted nephogram has very high similarity to the ground-truth nephogram.
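The paper's exact loss is not given in the abstract; a hedged sketch of a combined objective in the spirit described (an L1 term, a gradient-difference term that sharpens predictions, and an adversarial term) might be written as follows, with all weights being illustrative assumptions:

```python
import numpy as np

def generator_loss(pred, target, d_score, lam_adv=0.05, lam_gdl=1.0):
    """Combined generator objective: L1 reconstruction + gradient
    difference loss (penalizes blurry edges) + adversarial term.

    d_score is the discriminator's probability that pred is real;
    lam_adv and lam_gdl are assumed weights, not the paper's values.
    """
    l1 = np.abs(pred - target).mean()
    gdl = (np.abs(np.abs(np.diff(pred, axis=0)) - np.abs(np.diff(target, axis=0))).mean()
           + np.abs(np.abs(np.diff(pred, axis=1)) - np.abs(np.diff(target, axis=1))).mean())
    adv = -np.log(d_score + 1e-8)  # generator wants D to output 1
    return l1 + lam_gdl * gdl + lam_adv * adv
```

The gradient-difference term is what counteracts the over-smoothing that a pure L1/L2 loss produces in frame prediction.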


Author(s):  
Ning-Min Shen ◽  
Jing Li ◽  
Pei-Yun Zhou ◽  
Ying Huo ◽  
Yi Zhuang

Co-saliency detection, an emerging research area in saliency detection, aims to extract the common saliency from multiple images. The extracted co-saliency map has been utilized in various applications, such as co-segmentation and co-recognition. With the rapid development of image acquisition technology, original digital images are becoming increasingly clear. Existing co-saliency detection methods processing these images need enormous computer memory along with high computational complexity, which makes it hard to satisfy the demands of real-time user interaction. This paper proposes a fast co-saliency detection method based on image block partition and sparse feature extraction (BSFCoS). Firstly, the images are divided into several uniform blocks, and low-level features are extracted from the Lab and RGB color spaces. In order to maintain the characteristics of the original images while reducing the number of feature points as far as possible, the truncated power method for sparse principal component analysis is employed to extract sparse features. Furthermore, K-means is adopted to cluster the extracted sparse features and calculate the three salient feature weights. Finally, the co-saliency map is acquired by fusing the saliency maps of single and multiple images. The proposed method has been tested and simulated on two benchmark datasets: the Co-saliency Pairs and CMU Cornell iCoseg datasets. Compared with existing co-saliency methods, BSFCoS achieves a significant running time improvement in multi-image processing while preserving detection quality. Lastly, a co-segmentation method based on BSFCoS is also given and shows better co-segmentation performance.
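The first stage, uniform block partition with per-block low-level color features, can be sketched as follows; this toy version uses only mean color per block (the paper additionally uses Lab features and a sparse-PCA step), and the block size is an assumption:

```python
import numpy as np

def block_features(img, block):
    """Partition an (H, W, C) image into uniform non-overlapping blocks
    and return one mean-color feature vector per block, row-major order."""
    h, w, c = img.shape
    feats = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            feats.append(img[y:y + block, x:x + block].reshape(-1, c).mean(axis=0))
    return np.array(feats)
```

Working on block features instead of pixels is what gives the method its running-time advantage: clustering operates on (H/block)*(W/block) vectors rather than H*W pixels.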


2019 ◽  
Vol 9 (18) ◽  
pp. 3786 ◽  
Author(s):  
Yongsong Li ◽  
Zhengzhou Li ◽  
Yong Zhu ◽  
Bo Li ◽  
Weiqi Xiong ◽  
...  

The existing thermal infrared (TIR) ship detection methods may suffer serious performance degradation in situations of heavy sea clutter. To cope with this problem, a novel ship detection method based on morphological reconstruction and multi-feature analysis is proposed in this paper. Firstly, the TIR image is processed by opening- or closing-based gray-level morphological reconstruction (GMR) to smooth intricate background clutter while maintaining the intensity, shape, and contour features of ship targets. Then, considering the intensity and contrast features, a fused saliency detection strategy combining an intensity foreground saliency map (IFSM) and a brightness contrast saliency map (BCSM) is presented to highlight potential ship targets and suppress sea clutter. After that, an effective contour descriptor, namely the average eigenvalue measure of the structure tensor (STAEM), is designed to characterize candidate ship targets, and statistical shape knowledge is introduced to identify true ship targets among residual non-ship targets. Finally, the dual method is adopted to simultaneously detect both bright and dark ship targets in the TIR image. Extensive experiments show that the proposed method outperforms the compared state-of-the-art methods, especially for infrared images with intricate sea clutter. Moreover, the proposed method works stably for ship targets with unknown brightness and variable quantities, sizes, and shapes.
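The GMR building block is standard grayscale morphological reconstruction: iterated geodesic dilation of a marker image under a mask until stability. A minimal sketch with a fixed 3x3 structuring element (the paper's element and marker choice are not specified in the abstract):

```python
import numpy as np

def grayscale_reconstruction(marker, mask):
    """Reconstruction by dilation: repeatedly dilate `marker` with a
    3x3 structuring element and clip under `mask`, until the result
    stops changing. Used as the core of opening-by-reconstruction,
    which smooths clutter while preserving surviving contours."""
    marker = np.minimum(marker, mask).astype(float)
    while True:
        padded = np.pad(marker, 1, mode='edge')
        dilated = marker.copy()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                dilated = np.maximum(
                    dilated,
                    padded[1 + dy:1 + dy + marker.shape[0],
                           1 + dx:1 + dx + marker.shape[1]])
        nxt = np.minimum(dilated, mask)
        if np.array_equal(nxt, marker):
            return nxt
        marker = nxt
```

Starting the marker from an eroded image gives opening-by-reconstruction; the dual (reconstruction by erosion from a dilated marker) gives the closing-based variant the abstract mentions.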


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1686 ◽  
Author(s):  
Feng Yang ◽  
Wentong Li ◽  
Haiwei Hu ◽  
Wanyi Li ◽  
Peng Wang

Accurate and robust detection of multi-class objects in very high resolution (VHR) aerial images plays a significant role in many real-world applications. Traditional detection methods have made remarkable progress with horizontal bounding boxes (HBBs) thanks to CNNs. However, HBB detection methods still exhibit limitations, including missed detections and redundant detection regions, especially for densely distributed and strip-like objects. Besides, large scale variations and diverse backgrounds also bring many challenges. To address these problems, an effective region-based object detection framework named Multi-scale Feature Integration Attention Rotation Network (MFIAR-Net) is proposed for aerial images with oriented bounding boxes (OBBs), which promotes the integration of the inherent multi-scale pyramid features to generate a discriminative feature map. Meanwhile, a double-path feature attention network supervised by the mask information of the ground truth is introduced to guide the network to focus on object regions and suppress irrelevant noise. To boost rotation regression and classification performance, we present a robust Rotation Detection Network, which generates an efficient OBB representation. Extensive experiments and comprehensive evaluations on two publicly available datasets demonstrate the effectiveness of the proposed framework.
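The OBB parameterization the head regresses is commonly (center, size, angle); a minimal sketch of converting that representation to the four corner points (the paper's exact encoding is not given in the abstract, so this is the generic convention):

```python
import numpy as np

def obb_corners(cx, cy, w, h, theta):
    """Oriented bounding box (center cx,cy; size w,h; angle theta in
    radians) -> 4x2 array of corner points, counter-clockwise."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])        # 2D rotation matrix
    half = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0
    return half @ R.T + np.array([cx, cy])
```

For strip-like objects this representation is what removes the redundant background an axis-aligned HBB would enclose.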


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Dazhi Zhang ◽  
Jilei Hou ◽  
Wei Wu ◽  
Tao Lu ◽  
Huabing Zhou

Infrared and visible image fusion needs to preserve both the salient target of the infrared image and the texture details of the visible image. Therefore, an infrared and visible image fusion method based on saliency detection is proposed. Firstly, the saliency map of the infrared image is obtained by saliency detection. Then, a specific loss function and network architecture are designed based on the saliency map to improve the performance of the fusion algorithm. Specifically, the saliency map is normalized to [0, 1] and used as a weight map to constrain the loss function. At the same time, the saliency map is binarized to separate salient regions from nonsalient regions, and a generative adversarial network with dual discriminators is constructed. The two discriminators distinguish the salient regions and the nonsalient regions, respectively, prompting the generator to produce better fusion results. The experimental results show that the fusion results of our method are better than those of existing methods in both subjective and objective aspects.
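The saliency-weighted constraint described here can be sketched directly: normalize the saliency map to [0, 1] and use it as a per-pixel weight that pulls the fused image toward the infrared image in salient regions and toward the visible image elsewhere (the L1 distance and the exact blending are assumptions; the abstract only states the weighting idea):

```python
import numpy as np

def saliency_weighted_loss(fused, ir, vis, saliency):
    """Content loss weighted by a saliency map: salient pixels are
    matched to the infrared image, nonsalient pixels to the visible."""
    s = saliency.astype(float)
    s = (s - s.min()) / (s.max() - s.min() + 1e-8)  # normalize to [0, 1]
    return (s * np.abs(fused - ir) + (1 - s) * np.abs(fused - vis)).mean()
```

A fused image that copies the infrared target where saliency is high and the visible texture elsewhere drives this loss to zero.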


2020 ◽  
Vol 2020 (2) ◽  
pp. 98-1-98-6
Author(s):  
Yuzhong Jiao ◽  
Mark Ping Chan Mok ◽  
Kayton Wai Keung Cheung ◽  
Man Chi Chan ◽  
Tak Wai Shen ◽  
...  

The objective of this paper is to investigate dynamic computation of the Zero-Parallax-Setting (ZPS) for multi-view autostereoscopic displays in order to effectively alleviate blurry 3D vision for images with large disparity. Saliency detection techniques yield a saliency map, a topographic representation of visually dominant locations, from which we can predict what attracts the attention, or the region of interest, of viewers. Recently, deep learning techniques have been applied to saliency detection, and deep learning-based salient object detection methods have the advantage of highlighting most of the salient objects. With the help of a depth map, the spatial distribution of salient objects can be computed. In this paper, we compare two dynamic ZPS techniques based on visual attention: 1) maximum saliency computed by the Graph-Based Visual Saliency (GBVS) algorithm and 2) the spatial distribution of salient objects obtained by a convolutional neural network (CNN)-based model. Experiments prove that both methods can help improve the 3D effect of autostereoscopic displays. Moreover, the dynamic ZPS technique based on the spatial distribution of salient objects achieves better 3D performance than the maximum saliency-based method.
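The paper's exact ZPS rule is not stated in the abstract; as a hypothetical sketch of the underlying idea, one could place the zero-parallax plane at the saliency-weighted mean depth, so the region that attracts attention sits on the display plane (function name and weighting are assumptions):

```python
import numpy as np

def zero_parallax_setting(depth_map, saliency_map):
    """Hypothetical dynamic ZPS: the saliency-weighted mean of the
    depth map, used as the depth of the zero-parallax plane."""
    w = saliency_map.astype(float)
    w = w / (w.sum() + 1e-8)  # normalize weights to sum to 1
    return float((w * depth_map).sum())
```

With a uniform saliency map this degenerates to the plain mean depth; concentrated saliency shifts the plane toward the attended object.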


2018 ◽  
Vol 8 (12) ◽  
pp. 2526 ◽  
Author(s):  
Huiyuan Luo ◽  
Guangliang Han ◽  
Peixun Liu ◽  
Yanfeng Wu

Diffusion-based salient region detection methods have gained great popularity. In most diffusion-based methods, saliency values are ranked on a 2-layer neighborhood graph that connects each node to its neighboring nodes and to the nodes sharing common boundaries with those neighbors. However, by considering only the local relevance between neighbors, the salient region may become heterogeneous and even be wrongly suppressed, especially when the features of the salient object are diverse. To address this issue, we present an effective saliency detection method using a diffusion process on a graph with nonlocal connections. First, a saliency-biased Gaussian model is used to refine the saliency map based on the compactness cue, and the saliency information of compactness is then diffused on a 2-layer sparse graph with nonlocal connections. Second, we obtain the contrast of each superpixel by restricting the reference region to the background; similarly, a saliency-biased Gaussian refinement model is generated and the saliency information based on the uniqueness cue is propagated on the 2-layer sparse graph. We linearly integrate the initial saliency maps based on the compactness and uniqueness cues owing to their complementarity. Finally, to obtain a highlighted and homogeneous saliency map, a single-layer updating and multi-layer integrating scheme is presented. Comprehensive experiments on four benchmark datasets demonstrate that the proposed method performs better in terms of various evaluation metrics.
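The diffusion step on such a graph is typically manifold ranking: given an affinity matrix W and seed scores y, solve f = (D - alpha*W)^-1 y with D the degree matrix. A minimal dense sketch (the paper's graph construction and alpha are not given in the abstract; alpha below is the conventional choice):

```python
import numpy as np

def diffuse_saliency(W, seeds, alpha=0.99):
    """Diffuse seed saliency over a graph by manifold ranking:
    f = (D - alpha * W)^-1 y, then normalize to a [0, 1] map."""
    D = np.diag(W.sum(axis=1))           # degree matrix
    f = np.linalg.solve(D - alpha * W, seeds.astype(float))
    return f / (f.max() + 1e-8)
```

Nonlocal connections simply add extra nonzero entries to W between distant but similar superpixels, letting saliency propagate across a heterogeneous object.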

