A Fast Deep Perception Network for Remote Sensing Scene Classification

Ruchan Dong; Dazhuan Xu; Lichen Jiao; Jin Zhao; Jungang An

doi:10.3390/rs12040729

A Fast Deep Perception Network for Remote Sensing Scene Classification

Remote Sensing ◽

10.3390/rs12040729 ◽

2020 ◽

Vol 12 (4) ◽

pp. 729 ◽

Cited By ~ 3

Author(s):

Ruchan Dong ◽

Dazhuan Xu ◽

Lichen Jiao ◽

Jin Zhao ◽

Jungang An

Keyword(s):

Remote Sensing ◽

Feature Fusion ◽

Remote Sensing Image ◽

Learning System ◽

Support Vector ◽

Scene Classification ◽

Deep Convolutional Neural Networks ◽

Directional Information ◽

Series Of Experiments ◽

New Feature

Current scene classification for high-resolution remote sensing images usually uses deep convolutional neural networks (DCNN) to extract extensive features and adopts support vector machine (SVM) as classifier. DCNN can well exploit deep features but ignore valuable shallow features like texture and directional information; and SVM can hardly train a large amount of samples in an efficient way. This paper proposes a fast deep perception network (FDPResnet) that integrates DCNN and Broad Learning System (BLS), a novel effective learning system, to extract both deep and shallow features and encapsulates a designed DPModel to fuse the two kinds of features. FDPResnet first extracts the shallow and the deep scene features of a remote sensing image through a pre-trained model on residual neural network-101 (Resnet101). Then, it inputs the two kinds of features into a designed deep perception module (DPModel) to obtain a new set of feature vectors that can describe both higher-level semantic and lower-level space information of the image. The DPModel is the key module responsible for dimension reduction and feature fusion. Finally, the obtained new feature vector is input into BLS for training and classification, and we can obtain a satisfactory classification result. A series of experiments are conducted on the challenging NWPU-RESISC45 remote sensing image dataset, and the results demonstrate that our approach outperforms some popular state-of-the-art deep learning methods, and present high-accurate scene classification within a shorter running time.

Download Full-text

A Multi-Branch Feature Fusion Strategy Based on an Attention Mechanism for Remote Sensing Image Scene Classification

Remote Sensing ◽

10.3390/rs13101950 ◽

2021 ◽

Vol 13 (10) ◽

pp. 1950

Author(s):

Cuiping Shi ◽

Xin Zhao ◽

Liguo Wang

Keyword(s):

Remote Sensing ◽

Feature Extraction ◽

Classification Accuracy ◽

Feature Fusion ◽

State Of The Art ◽

Rapid Development ◽

Remote Sensing Image ◽

Classification Performance ◽

Attention Mechanism ◽

Scene Classification

In recent years, with the rapid development of computer vision, increasing attention has been paid to remote sensing image scene classification. To improve the classification performance, many studies have increased the depth of convolutional neural networks (CNNs) and expanded the width of the network to extract more deep features, thereby increasing the complexity of the model. To solve this problem, in this paper, we propose a lightweight convolutional neural network based on attention-oriented multi-branch feature fusion (AMB-CNN) for remote sensing image scene classification. Firstly, we propose two convolution combination modules for feature extraction, through which the deep features of images can be fully extracted with multi convolution cooperation. Then, the weights of the feature are calculated, and the extracted deep features are sent to the attention mechanism for further feature extraction. Next, all of the extracted features are fused by multiple branches. Finally, depth separable convolution and asymmetric convolution are implemented to greatly reduce the number of parameters. The experimental results show that, compared with some state-of-the-art methods, the proposed method still has a great advantage in classification accuracy with very few parameters.

Download Full-text

An Efficient and Lightweight Convolutional Neural Network for Remote Sensing Image Scene Classification

Sensors ◽

10.3390/s20071999 ◽

2020 ◽

Vol 20 (7) ◽

pp. 1999 ◽

Cited By ~ 6

Author(s):

Donghang Yu ◽

Qing Xu ◽

Haitao Guo ◽

Chuan Zhao ◽

Yuzhun Lin ◽

...

Keyword(s):

Neural Network ◽

Remote Sensing ◽

Convolutional Neural Network ◽

Visual Recognition ◽

Feature Fusion ◽

Remote Sensing Image ◽

Classification Performance ◽

Image Features ◽

Training Dataset ◽

Scene Classification

Classifying remote sensing images is vital for interpreting image content. Presently, remote sensing image scene classification methods using convolutional neural networks have drawbacks, including excessive parameters and heavy calculation costs. More efficient and lightweight CNNs have fewer parameters and calculations, but their classification performance is generally weaker. We propose a more efficient and lightweight convolutional neural network method to improve classification accuracy with a small training dataset. Inspired by fine-grained visual recognition, this study introduces a bilinear convolutional neural network model for scene classification. First, the lightweight convolutional neural network, MobileNetv2, is used to extract deep and abstract image features. Each feature is then transformed into two features with two different convolutional layers. The transformed features are subjected to Hadamard product operation to obtain an enhanced bilinear feature. Finally, the bilinear feature after pooling and normalization is used for classification. Experiments are performed on three widely used datasets: UC Merced, AID, and NWPU-RESISC45. Compared with other state-of-art methods, the proposed method has fewer parameters and calculations, while achieving higher accuracy. By including feature fusion with bilinear pooling, performance and accuracy for remote scene classification can greatly improve. This could be applied to any remote sensing image classification task.

Download Full-text

Feature Fusion with Deep Supervision for Remote-Sensing Image Scene Classification

2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI) ◽

10.1109/ictai.2018.00046 ◽

2018 ◽

Author(s):

Usman Muhammad ◽

Weiqiang Wang ◽

Abdenour Hadid

Keyword(s):

Remote Sensing ◽

Feature Fusion ◽

Remote Sensing Image ◽

Scene Classification

Download Full-text

FEATURE FUSION FOR CROSS-MODAL SCENE CLASSIFICATION OF REMOTE SENSING IMAGE

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliv-m-3-2021-63-2021 ◽

2021 ◽

Vol XLIV-M-3-2021 ◽

pp. 63-66

Author(s):

W. Geng ◽

W. Zhou ◽

S. Jin

Keyword(s):

Remote Sensing ◽

Feature Fusion ◽

Remote Sensing Image ◽

Aerial Images ◽

Aerial Image ◽

Svm Classifier ◽

Scene Classification ◽

Street View ◽

Modal Model

Abstract. Scene classification plays an important role in remote sensing field. Traditional approaches use high-resolution remote sensing images as data source to extract powerful features. Although these kind of methods are common, the model performance is severely affected by the image quality of the dataset, and the single modal (source) of images tend to cause the mission of some scene semantic information, which eventually degrade the classification accuracy. Nowadays, multi-modal remote sensing data become easy to obtain since the development of remote sensing technology. How to carry out scene classification of cross-modal data has become an interesting topic in the field. To solve the above problems, this paper proposes using feature fusion for cross-modal scene classification of remote sensing image, i.e., aerial and ground street view images, expecting to use the advantages of aerial images and ground street view data to complement each other. Our cross- modal model is based on Siamese Network. Specifically, we first train the cross-modal model by pairing different sources of data with aerial image and ground data. Then, the trained model is used to extract the deep features of the aerial and ground image pair, and the features of the two perspectives are fused to train a SVM classifier for scene classification. Our approach has been demonstrated using two public benchmark datasets, AiRound and CV-BrCT. The preliminary results show that the proposed method achieves state-of-the-art performance compared with the traditional methods, indicating that the information from ground data can contribute to aerial image classification.

Download Full-text

Remote Sensing Image Scene Classification Based on Deep Multi-branch Feature Fusion Network

ACTA PHOTONICA SINICA ◽

10.3788/gzxb20204905.0510002 ◽

2020 ◽

Vol 49 (5) ◽

pp. 510002

Author(s):

张桐 ZHANG Tong ◽

郑恩让 ZHENG En-rang ◽

沈钧戈 SHEN Jun-ge ◽

高安同 GAO An-tong

Keyword(s):

Remote Sensing ◽

Feature Fusion ◽

Remote Sensing Image ◽

Scene Classification

Download Full-text

An End-to-End Local-Global-Fusion Feature Extraction Network for Remote Sensing Image Scene Classification

Remote Sensing ◽

10.3390/rs11243006 ◽

2019 ◽

Vol 11 (24) ◽

pp. 3006 ◽

Cited By ~ 4

Author(s):

Yafei Lv ◽

Xiaohan Zhang ◽

Wei Xiong ◽

Yaqi Cui ◽

Mi Cai

Keyword(s):

Remote Sensing ◽

Feature Extraction ◽

Feature Fusion ◽

Remote Sensing Image ◽

Local Features ◽

Feature Representation ◽

Scene Classification ◽

Global Features ◽

End To End ◽

Fusion Feature

Remote sensing image scene classification (RSISC) is an active task in the remote sensing community and has attracted great attention due to its wide applications. Recently, the deep convolutional neural networks (CNNs)-based methods have witnessed a remarkable breakthrough in performance of remote sensing image scene classification. However, the problem that the feature representation is not discriminative enough still exists, which is mainly caused by the characteristic of inter-class similarity and intra-class diversity. In this paper, we propose an efficient end-to-end local-global-fusion feature extraction (LGFFE) network for a more discriminative feature representation. Specifically, global and local features are extracted from channel and spatial dimensions respectively, based on a high-level feature map from deep CNNs. For the local features, a novel recurrent neural network (RNN)-based attention module is first proposed to capture the spatial layout information and context information across different regions. Gated recurrent units (GRUs) is then exploited to generate the important weight of each region by taking a sequence of features from image patches as input. A reweighed regional feature representation can be obtained by focusing on the key region. Then, the final feature representation can be acquired by fusing the local and global features. The whole process of feature extraction and feature fusion can be trained in an end-to-end manner. Finally, extensive experiments have been conducted on four public and widely used datasets and experimental results show that our method LGFFE outperforms baseline methods and achieves state-of-the-art results.

Download Full-text

Remote sensing image scene classification via multi-feature fusion

2018 Chinese Control And Decision Conference (CCDC) ◽

10.1109/ccdc.2018.8407728 ◽

2018 ◽

Cited By ~ 1

Author(s):

Ruiyao Liu ◽

Xiaoyong Bian ◽

Yuxia Sheng

Keyword(s):

Remote Sensing ◽

Feature Fusion ◽

Remote Sensing Image ◽

Scene Classification

Download Full-text

Rotation Invariance Regularization for Remote Sensing Image Scene Classification with Convolutional Neural Networks

Remote Sensing ◽

10.3390/rs13040569 ◽

2021 ◽

Vol 13 (4) ◽

pp. 569

Author(s):

Kunlun Qi ◽

Chao Yang ◽

Chuli Hu ◽

Yonglin Shen ◽

Shengyu Shen ◽

...

Keyword(s):

Remote Sensing ◽

Neural Networks ◽

Convolutional Neural Networks ◽

Data Augmentation ◽

Remote Sensing Image ◽

Classification Performance ◽

Rotation Invariance ◽

Scene Classification ◽

Deep Convolutional Neural Networks ◽

Convolutional Network

Deep convolutional neural networks (DCNNs) have shown significant improvements in remote sensing image scene classification for powerful feature representations. However, because of the high variance and volume limitations of the available remote sensing datasets, DCNNs are prone to overfit the data used for their training. To address this problem, this paper proposes a novel scene classification framework based on a deep Siamese convolutional network with rotation invariance regularization. Specifically, we design a data augmentation strategy for the Siamese model to learn a rotation invariance DCNN model that is achieved by directly enforcing the labels of the training samples before and after rotating to be mapped close to each other. In addition to the cross-entropy cost function for the traditional CNN models, we impose a rotation invariance regularization constraint on the objective function of our proposed model. The experimental results obtained using three publicly-available scene classification datasets show that the proposed method can generally improve the classification performance by 2~3% and achieves satisfactory classification performance compared with some state-of-the-art methods.

Download Full-text

PCAN—Part-Based Context Attention Network for Thermal Power Plant Detection in Remote Sensing Imagery

Remote Sensing ◽

10.3390/rs13071243 ◽

2021 ◽

Vol 13 (7) ◽

pp. 1243

Author(s):

Wenxin Yin ◽

Wenhui Diao ◽

Peijin Wang ◽

Xin Gao ◽

Ya Li ◽

...

Keyword(s):

Remote Sensing ◽

Power Plants ◽

State Of The Art ◽

Thermal Power ◽

Image Interpretation ◽

Remote Sensing Image ◽

Thermal Power Plants ◽

Average Precision ◽

Deep Convolutional Neural Networks ◽

Multi Scale

The detection of Thermal Power Plants (TPPs) is a meaningful task for remote sensing image interpretation. It is a challenging task, because as facility objects TPPs are composed of various distinctive and irregular components. In this paper, we propose a novel end-to-end detection framework for TPPs based on deep convolutional neural networks. Specifically, based on the RetinaNet one-stage detector, a context attention multi-scale feature extraction network is proposed to fuse global spatial attention to strengthen the ability in representing irregular objects. In addition, we design a part-based attention module to adapt to TPPs containing distinctive components. Experiments show that the proposed method outperforms the state-of-the-art methods and can achieve 68.15% mean average precision.

Download Full-text