A Weakly Supervised Semantic Segmentation Network by Aggregating Seed Cues: The Multi-Object Proposal Generation Perspective

Author(s):  
Junsheng Xiao, Huahu Xu, Honghao Gao, Minjie Bian, Yang Li

Weakly supervised semantic segmentation under image-level annotations is effective for real-world applications. The small, sparse discriminative regions obtained from an image classification network, which typically serve as the initial localization cues for semantic segmentation, also form its bottleneck. Although deep convolutional neural networks (DCNNs) have exhibited promising performance on single-label image classification tasks, real-world images usually contain multiple categories, and classifying them remains an open problem. Consequently, obtaining high-confidence discriminative regions from multi-label classification networks remains unsolved. To address this, this article proposes an innovative three-step framework from the perspective of multi-object proposal generation. First, an image is divided into candidate boxes using an object proposal method, and the candidate boxes are sent to a single-label classification network to obtain discriminative regions. Second, the discriminative regions are aggregated into a high-confidence seed map. Third, the seed cues are grown on the high-level semantic feature maps produced by a backbone segmentation network. Experiments on the PASCAL VOC 2012 dataset verify the effectiveness of the approach, which outperforms other baseline image segmentation methods.
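The aggregation in the second step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper `aggregate_seed_map` and its box/heatmap format are hypothetical assumptions. It averages per-box discriminative scores over the full image and thresholds the result into a binary seed map.

```python
import numpy as np

def aggregate_seed_map(box_heatmaps, image_shape, threshold=0.5):
    """Aggregate per-box discriminative heatmaps into a seed map.

    box_heatmaps: list of (box, heatmap) pairs, box = (y0, x0, y1, x1),
    heatmap a 2-D array of discriminative scores for that crop.
    Returns a binary seed map over the full image.
    """
    votes = np.zeros(image_shape, dtype=np.float64)
    counts = np.zeros(image_shape, dtype=np.float64)
    for (y0, x0, y1, x1), heat in box_heatmaps:
        votes[y0:y1, x0:x1] += heat    # accumulate scores inside the box
        counts[y0:y1, x0:x1] += 1.0    # track how many boxes cover each pixel
    # average score per pixel, leaving uncovered pixels at zero
    mean_score = np.divide(votes, counts, out=np.zeros_like(votes),
                           where=counts > 0)
    return (mean_score >= threshold).astype(np.uint8)
```

Pixels covered only by high-scoring boxes survive the threshold; pixels never covered by any proposal stay outside the seed map.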

Author(s):  
Zaid Al-Huda, Donghai Zhai, Yan Yang, Riyadh Nazar Ali Algburi

Deep convolutional neural networks (DCNNs) trained on pixel-level annotated images have achieved notable improvements in semantic segmentation, but the high cost of labeling training data greatly limits their application. Weakly supervised segmentation approaches can significantly reduce human labeling effort. In this paper, we introduce a new framework to generate high-quality initial pixel-level annotations. Using a hierarchical image segmentation algorithm to predict the boundary map, we select the optimal scale among the high-quality hierarchies. In the initialization step, scribble annotations and the saliency map are combined to construct a graphical model over the optimal-scale segmentation; by solving the minimal-cut problem, information is spread from scribbles to unmarked regions. In the training process, the segmentation network is trained on the initial pixel-level annotations. To iteratively optimize the segmentation, we use a graphical model to refine the segmentation masks and retrain the segmentation network to obtain more precise pixel-level annotations. Experimental results on the PASCAL VOC 2012 dataset demonstrate that the proposed framework outperforms most weakly supervised semantic segmentation methods and achieves state-of-the-art performance of [Formula: see text] mIoU.
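The idea of spreading scribble information to unmarked regions can be illustrated with a much simpler scheme than the paper's minimal-cut solver. The sketch below is a hypothetical weighted label-propagation over segmentation regions, not the authors' graphical model:

```python
import numpy as np

def propagate_scribbles(adjacency, labels, iters=10):
    """Spread scribble labels (-1 = unmarked) to unmarked regions.

    adjacency: (n, n) symmetric weight matrix over segmentation regions.
    labels: length-n array; scribbled regions hold a class id, others -1.
    Each iteration, an unmarked region adopts the class whose labeled
    neighbors carry the largest total edge weight.
    """
    labels = np.asarray(labels).copy()
    n_classes = labels.max() + 1
    for _ in range(iters):
        updated = labels.copy()
        for i in np.where(labels == -1)[0]:
            scores = np.zeros(n_classes)
            for j in range(len(labels)):
                if labels[j] >= 0:
                    scores[labels[j]] += adjacency[i, j]
            if scores.sum() > 0:          # only update once a labeled
                updated[i] = int(np.argmax(scores))  # neighbor is reachable
        labels = updated
    return labels
```

A minimal cut over the same graph would optimize all assignments jointly instead of greedily, which is why the paper's formulation yields more coherent regions.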


2021
Author(s):  
Anthony Bilodeau, Constantin V.L. Delmas, Martin Parent, Paul De Koninck, Audrey Durand, et al.

High-throughput quantitative analysis of microscopy images is challenging due to the complexity of the image content and the difficulty of obtaining precisely annotated datasets. In this paper we introduce a weakly supervised MICRoscopy Analysis neural network (MICRA-Net) that can be trained on a simple main classification task using image-level annotations to solve more complex auxiliary tasks, such as semantic segmentation, detection, and enumeration. MICRA-Net relies on the latent information embedded within a trained model to achieve performance similar to state-of-the-art fully supervised learning. This learnt information is extracted from the network using gradient class activation maps, which are combined to generate detailed feature maps of the biological structures of interest. We demonstrate how MICRA-Net significantly alleviates the expert annotation process on various microscopy datasets and can be used for high-throughput quantitative analysis of microscopy images.
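The gradient class activation maps mentioned above follow the well-known Grad-CAM recipe: weight each feature channel by its spatially averaged gradient, sum the weighted channels, and clip away negative evidence. A minimal NumPy sketch (function name ours):

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Gradient-weighted class activation map (Grad-CAM style).

    feature_maps: (C, H, W) activations from a convolutional layer.
    gradients: (C, H, W) gradients of the class score w.r.t. those maps.
    """
    weights = gradients.mean(axis=(1, 2))              # (C,) channel weights
    cam = np.tensordot(weights, feature_maps, axes=1)  # (H, W) weighted sum
    cam = np.maximum(cam, 0.0)                         # keep positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                          # normalize to [0, 1]
    return cam
```

In MICRA-Net several such maps are combined to produce the detailed structure maps; the sketch shows only the per-layer computation.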


2019
Author(s):  
Wanyi Xie, Dong Liu, Ming Yang, Shaoqing Chen, Benge Wang, et al.

Abstract. Cloud detection and cloud properties have significant applications in weather forecasting, signal attenuation analysis, and other cloud-related fields, and cloud image segmentation is the fundamental step in deriving cloud cover. However, traditional segmentation methods rely on low-level visual features of clouds and often fail to achieve satisfactory performance. Deep convolutional neural networks (CNNs) can extract high-level feature information of objects and have become the dominant approach in many image segmentation fields. Inspired by this, a novel deep CNN model named SegCloud is proposed and applied to accurate cloud segmentation based on ground-based observation. Architecturally, SegCloud has a symmetric encoder-decoder structure: the encoder network combines low-level cloud features into low-resolution, high-level cloud feature maps, and the decoder network restores these feature maps to the same resolution as the input images. A softmax classifier finally performs pixel-wise classification and outputs the segmentation results. SegCloud has powerful cloud discrimination ability and can automatically segment whole-sky images obtained by a ground-based all-sky-view camera. Furthermore, a new database comprising 400 whole-sky images and manually marked labels is built to train and test the SegCloud model. The performance of SegCloud is validated by extensive experiments, which show that SegCloud is effective and accurate for ground-based cloud segmentation and achieves better results than traditional methods. Moreover, the accuracy and practicability of SegCloud are further demonstrated by applying it to cloud cover estimation.
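The final pixel-wise softmax classification step can be sketched in a few lines. This is a generic illustration of the operation, not SegCloud code:

```python
import numpy as np

def pixelwise_softmax_segmentation(logits):
    """Turn decoder logits (C, H, W) into per-pixel class probabilities
    and a hard segmentation mask (H, W)."""
    shifted = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=0, keepdims=True)  # softmax over the class axis
    mask = probs.argmax(axis=0)                   # hard label per pixel
    return probs, mask
```

For cloud segmentation C would be small (e.g. cloud vs. sky), and the mask directly yields the cloud-cover fraction as `mask.mean()` for a binary labeling.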


2019, Vol 11 (16), pp. 1922
Author(s):  
Shichen Guo, Qizhao Jin, Hongzhen Wang, Xuezhi Wang, Yangang Wang, et al.

Semantic segmentation in high-resolution remote-sensing (RS) images is a fundamental task for RS-based urban understanding and planning, but the variety of artificial objects in urban areas makes it quite challenging. Recently, Deep Convolutional Neural Networks (DCNNs) with multiscale information fusion have demonstrated great potential for enhancing performance. Technically, however, existing fusions are usually implemented by summing or concatenating feature maps in a straightforward way; few works consider spatial importance in global-to-local context-information aggregation. This paper proposes a Learnable-Gated CNN (L-GCNN) to address this issue. Methodologically, the Taylor expansion of the information-entropy function is first parameterized to design the gate function, which generates pixelwise weights for coarse-to-fine refinement in the L-GCNN; a Parameterized Gate Module (PGM) is designed to achieve this goal. Then, the single PGM and its densely connected extension are embedded at different levels of the encoder in the L-GCNN to help identify discriminative feature maps at different scales. With these designs, the L-GCNN is organized as a self-cascaded end-to-end architecture that sequentially aggregates context information for fine segmentation. The proposed model was evaluated on two challenging public benchmarks, the ISPRS 2D semantic segmentation challenge Potsdam dataset and the Massachusetts building dataset. The experimental results demonstrate significant improvement over several related segmentation networks, including FCN, SegNet, RefineNet, PSPNet, DeepLab, and GSN. For example, on the Potsdam dataset, our method achieved a 93.65% F1 score and an 88.06% IoU score for the segmentation of tiny cars in high-resolution RS images.
In conclusion, the proposed model shows potential for segmenting buildings, impervious surfaces, low vegetation, trees, and cars in urban RS images, objects which vary largely in size and have confusing appearances.
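How a parameterized entropy surrogate can act as a gate may be illustrated as follows. This is a speculative sketch, not the actual PGM: the default coefficients reproduce 4p(1 - p), a second-order polynomial that, like binary entropy, peaks at p = 0.5, so uncertain pixels receive large weights; in the L-GCNN such coefficients would be learnable.

```python
import numpy as np

def entropy_gate(prob_map, coeffs=(4.0, -4.0)):
    """Pixelwise gate from a polynomial surrogate of binary entropy.

    prob_map: (H, W) foreground probabilities in [0, 1].
    coeffs: polynomial coefficients; the default gives 4p - 4p^2 = 4p(1-p).
    Confident pixels (p near 0 or 1) get weights near 0, uncertain
    pixels (p near 0.5) get weights near 1.
    """
    p = np.clip(prob_map, 0.0, 1.0)
    gate = np.zeros_like(p)
    for order, c in enumerate(coeffs, start=1):
        gate += c * p ** order       # evaluate the polynomial term by term
    return np.clip(gate, 0.0, 1.0)
```

Such a gate lets the network concentrate refinement effort on ambiguous pixels when fusing coarse and fine feature maps.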


2020, Vol 13 (1), pp. 119
Author(s):  
Song Ouyang, Yansheng Li

Although the deep semantic segmentation network (DSSN) has been widely used for remote sensing (RS) image semantic segmentation, it still does not fully exploit the spatial relationship cues between objects when extracting deep visual features through convolutional filters and pooling layers. In fact, the spatial distribution of objects from different classes is strongly correlated; for example, buildings tend to be close to roads. In view of the strong appearance extraction ability of the DSSN and the powerful topological relationship modeling capability of the graph convolutional neural network (GCN), this paper proposes a DSSN-GCN framework for RS image semantic segmentation that combines the advantages of both. To improve appearance extraction, this paper proposes a new DSSN called the attention residual U-shaped network (AttResUNet), which leverages residual blocks to encode feature maps and an attention module to refine the features. For the GCN, a graph is built whose nodes are superpixels and whose edge weights are calculated from the spectral and spatial information of the nodes. The AttResUNet is trained to extract high-level features that initialize the graph nodes; the GCN then combines node features and spatial relationships to perform classification. It is worth noting that the use of spatial relationship knowledge boosts the performance and robustness of the classification module. In addition, by modeling the GCN at the superpixel level, object boundaries are restored to a certain extent and there is less pixel-level noise in the final classification result. Extensive experiments on two publicly open datasets show that the DSSN-GCN model outperforms the competitive baseline (i.e., the DSSN model) and that DSSN-GCN with AttResUNet achieves the best performance, which demonstrates the advantages of our method.
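The superpixel graph construction described above can be sketched with Gaussian affinities over spectral and spatial distances. The function and its parameters are illustrative assumptions, not the paper's exact weighting:

```python
import numpy as np

def build_superpixel_graph(features, centroids, sigma_spec=1.0, sigma_spat=1.0):
    """Edge weights combining spectral and spatial affinity.

    features: (n, d) mean spectral feature per superpixel.
    centroids: (n, 2) superpixel centroid coordinates.
    w_ij = exp(-||f_i - f_j||^2 / sigma_spec) * exp(-||c_i - c_j||^2 / sigma_spat)
    """
    f = np.asarray(features, dtype=np.float64)
    c = np.asarray(centroids, dtype=np.float64)
    d_spec = ((f[:, None, :] - f[None, :, :]) ** 2).sum(-1)  # pairwise spectral
    d_spat = ((c[:, None, :] - c[None, :, :]) ** 2).sum(-1)  # pairwise spatial
    w = np.exp(-d_spec / sigma_spec) * np.exp(-d_spat / sigma_spat)
    np.fill_diagonal(w, 0.0)  # no self-loops
    return w
```

Superpixels that are both spectrally similar and spatially close get strong edges, so the GCN can propagate label evidence along plausible object relationships such as building-road adjacency.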


Author(s):  
Rama A, Kumaravel A, Nalini C

Implementing image processing tools demands that their components produce better results in critical applications such as medical image classification. TensorFlow is an open-source machine learning framework for high performance that operates in heterogeneous environments. It has drawn broad attention to the fine-tuning of parameters in order to obtain better-performing final models. The main aim of this article is to establish the appropriate steps of classification techniques for diagnosing diseases with better accuracy. The proposed convolutional network comprises three convolutional layers followed by average pooling with a size equal to that of the final feature maps. The final layer of this network has two outputs, corresponding to the two classes considered: normal and abnormal. To train and evaluate such Deep Convolutional Neural Networks (DCNNs), a dataset of 2000 lung X-ray images was used, and a comparative analysis between the proposed DCNN and previous methods is also presented.
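The final stage described above, average pooling over the last feature maps followed by a two-output layer, can be sketched as follows. This is a generic illustration with hypothetical names, not the article's TensorFlow code:

```python
import numpy as np

def classify_xray(feature_maps, weights, bias):
    """Two-class head: global average pooling over the final feature maps
    followed by a linear layer with two outputs (normal / abnormal).

    feature_maps: (C, H, W); weights: (2, C); bias: (2,).
    Returns the two class probabilities via softmax.
    """
    pooled = feature_maps.mean(axis=(1, 2))   # pooling size == feature map size
    logits = weights @ pooled + bias          # (2,) class scores
    exp = np.exp(logits - logits.max())       # numerically stable softmax
    return exp / exp.sum()
```

Pooling with a window equal to the feature-map size collapses each channel to one number, so the classifier depends only on channel-wise evidence, not position.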


Author(s):  
Lixiang Ru, Bo Du, Chen Wu

Current weakly supervised semantic segmentation (WSSS) methods with image-level labels mainly adopt class activation maps (CAM) to generate the initial pseudo labels. However, CAM usually identifies only the most discriminative object extents, because the network does not need to discover the integral object to recognize image-level labels. In this work, to tackle this problem, we propose to simultaneously learn image-level labels and local visual-word labels. Specifically, in each forward propagation, the feature maps of the input image are encoded into visual words with a learnable codebook. By enforcing the network to classify the encoded fine-grained visual words, the generated CAM can cover more semantic regions. Besides, we also propose a hybrid spatial pyramid pooling module that preserves both the local maximum and global average values of the feature maps, so that more object details and less background are considered. Based on the proposed methods, we conducted experiments on the PASCAL VOC 2012 dataset. Our method achieved 67.2% mIoU on the val set and 67.3% mIoU on the test set, outperforming recent state-of-the-art methods.
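The two proposed components can be illustrated with minimal NumPy sketches: nearest-codeword assignment for the visual-word branch, and a pooling that blends global maximum and global average. Both functions are hypothetical simplifications of the paper's learnable modules:

```python
import numpy as np

def encode_visual_words(features, codebook):
    """Assign each spatial feature vector to its nearest codeword.

    features: (H, W, d) feature map; codebook: (K, d) codewords
    (learnable in the paper, fixed here for illustration).
    Returns an (H, W) map of visual-word indices.
    """
    flat = features.reshape(-1, features.shape[-1])
    dists = ((flat[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1).reshape(features.shape[:2])

def hybrid_pool(feature_map, alpha=0.5):
    """Blend of global max and global average pooling: the max term keeps
    peak responses (object details), the mean term keeps overall context."""
    return alpha * feature_map.max() + (1 - alpha) * feature_map.mean()
```

Classifying the word map forces the network to explain fine-grained local patterns rather than only the single most discriminative region, which is why the resulting CAM covers more of the object.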

