Coarse-to-Fine Deep Metric Learning for Remote Sensing Image Retrieval

2020 ◽  
Vol 12 (2) ◽  
pp. 219 ◽  
Author(s):  
Min-Sub Yun ◽  
Woo-Jeoung Nam ◽  
Seong-Whan Lee

Remote sensing image retrieval (RSIR) is the process of searching for identical areas by investigating the similarities between a query image and the database images. RSIR is a challenging task owing to differences in acquisition time, viewpoint, and coverage area depending on the shooting circumstances, which result in variations in image content. In this paper, we propose a novel method based on a coarse-to-fine strategy, which makes a deep network more robust to the variations in remote sensing images. Moreover, we propose a new triangular loss function that considers the whole relation within the tuple. This loss function improves retrieval performance and demonstrates better results in learning the detailed information in complex remote sensing images. To verify our methods, we experimented with the Google Earth South Korea dataset, which contains 40,000 images, using the evaluation metric Recall@n. In all experiments, we obtained better performance than the existing retrieval training methods. Our source code and Google Earth South Korea dataset are available online.
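The abstract does not specify the triangular loss itself; as a point of reference, here is a minimal numpy sketch of the standard triplet margin loss that tuple-based metric-learning losses of this kind typically extend (the margin value is an illustrative assumption):

```python
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: pull the anchor toward the positive
    sample and push it away from the negative sample.

    The paper's triangular loss extends this idea to the whole
    relation within the tuple; this is only the common baseline.
    """
    d_pos = np.linalg.norm(anchor - positive)  # anchor-positive distance
    d_neg = np.linalg.norm(anchor - negative)  # anchor-negative distance
    return max(d_pos - d_neg + margin, 0.0)
```

When the anchor is already closer to the positive than to the negative by more than the margin, the loss is zero and the embedding is left unchanged.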

2021 ◽  
Vol 13 (5) ◽  
pp. 869
Author(s):  
Zheng Zhuo ◽  
Zhong Zhou

In recent years, the amount of remote sensing imagery data has increased exponentially. The ability to quickly and effectively find the required images from massive remote sensing archives is the key to the organization, management, and sharing of remote sensing image information. This paper proposes a high-resolution remote sensing image retrieval method with Gabor-CA-ResNet and a split-based deep feature transform network. The main contributions are twofold. (1) To handle the complex textures, diverse scales, and special viewing angles of remote sensing images, a Gabor-CA-ResNet network is proposed that takes ResNet as the backbone, uses Gabor filters to represent the spatial-frequency structure of images, and applies a channel attention (CA) mechanism to obtain more representative and discriminative deep features. (2) A split-based deep feature transform network is designed to divide the features extracted by the Gabor-CA-ResNet network into several segments and transform them separately, significantly reducing the dimensionality and storage space of the deep features. The experimental results on the UCM, WHU-RS, RSSCN7, and AID datasets show that, compared with the state-of-the-art methods, our method obtains competitive performance, especially for remote sensing images with rare targets and complex textures.
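The filter-bank parameters used by the paper are not given in the abstract; as a sketch, a single real-valued 2-D Gabor kernel of the kind used to represent spatial-frequency structure can be generated in numpy as follows (all parameter values here are illustrative assumptions):

```python
import numpy as np

def gabor_kernel(size=15, sigma=3.0, theta=0.0, lam=6.0, gamma=0.5):
    """Real part of a 2-D Gabor filter: a Gaussian envelope
    modulated by a cosine carrier at orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates by the orientation angle theta
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + gamma**2 * yr**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / lam)
    return envelope * carrier
```

A bank of such kernels at several orientations and wavelengths, convolved with the input image, yields the spatial-frequency responses that a Gabor-based front end would feed into the backbone network.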


2021 ◽  
Vol 13 (4) ◽  
pp. 747
Author(s):  
Yanghua Di ◽  
Zhiguo Jiang ◽  
Haopeng Zhang

Fine-grained visual categorization (FGVC) is an important and challenging problem due to large intra-class differences and small inter-class differences caused by deformation, illumination, angles, etc. Although major advances have been achieved in natural images in the past few years due to the release of popular datasets such as the CUB-200-2011, Stanford Cars and Aircraft datasets, fine-grained ship classification in remote sensing images has been rarely studied because of relative scarcity of publicly available datasets. In this paper, we investigate a large amount of remote sensing image data of sea ships and determine most common 42 categories for fine-grained visual categorization. Based our previous DSCR dataset, a dataset for ship classification in remote sensing images, we collect more remote sensing images containing warships and civilian ships of various scales from Google Earth and other popular remote sensing image datasets including DOTA, HRSC2016, NWPU VHR-10, We call our dataset FGSCR-42, meaning a dataset for Fine-Grained Ship Classification in Remote sensing images with 42 categories. The whole dataset of FGSCR-42 contains 9320 images of most common types of ships. We evaluate popular object classification algorithms and fine-grained visual categorization algorithms to build a benchmark. Our FGSCR-42 dataset is publicly available at our webpages.


2018 ◽  
Vol 10 (6) ◽  
pp. 964 ◽  
Author(s):  
Zhenfeng Shao ◽  
Ke Yang ◽  
Weixun Zhou

Benchmark datasets are essential for developing and evaluating remote sensing image retrieval (RSIR) approaches. However, most of the existing datasets are single-labeled, with each image in these datasets being annotated by a single label representing the most significant semantic content of the image. This is sufficient for simple problems, such as distinguishing between a building and a beach, but multiple labels and sometimes even dense (pixel) labels are required for more complex problems, such as RSIR and semantic segmentation. We therefore extended the existing multi-labeled dataset collected for multi-label RSIR and presented a dense labeling remote sensing dataset termed "DLRSD". DLRSD contains a total of 17 classes, and each pixel of each image is assigned one of the 17 pre-defined labels. We used DLRSD to evaluate the performance of RSIR methods ranging from traditional handcrafted feature-based methods to deep learning-based ones. More specifically, we evaluated the performances of RSIR methods from both single-label and multi-label perspectives. These results demonstrated the advantages of multiple labels over single labels for interpreting complex remote sensing images. DLRSD provides the literature with a benchmark for RSIR and other pixel-based problems such as semantic segmentation.
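The evaluation protocol is not detailed in the abstract, but the contrast between the single-label and multi-label perspectives can be made concrete with a small sketch: under single-label evaluation a retrieved image is either relevant or not, while multi-label relevance can be graded by label overlap (the label names below are hypothetical):

```python
def multilabel_relevance(query_labels, result_labels):
    """Fraction of the query image's labels that a retrieved
    image shares; 1.0 means all query labels are covered."""
    q, r = set(query_labels), set(result_labels)
    return len(q & r) / len(q)
```

For example, a query annotated {trees, pavement} and a result annotated {pavement, cars} share one of two query labels, giving a graded relevance of 0.5 where a single-label protocol would report a hard 0 or 1.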


Author(s):  
Chengming Zhang ◽  
Shujing Wan ◽  
Shuai Gao ◽  
Fan Yu ◽  
Qingdi Wei ◽  
...  

It is very difficult to accurately divide farmland and woodland in Gaofen 2 (GF-2) remote sensing images, because the coverage area of a single plant is very small and their spectra are very similar. The ratio of the spatial resolution to one plant's coverage area must be fully taken into account when designing a Convolutional Neural Network structure for extracting them from GF-2 images. We establish a Convolutional Encode Neural Networks model (CENN). Its first layer has two sets of convolution kernels to learn the characteristics of farmland and woodland respectively, while the second layer is an encoder that encodes these characteristics via a transfer function, mapping the results to the corresponding category number. In the training stage, samples of farmland, woodland, and other categories are used to train the CENN; once training is accomplished, the CENN can accurately extract farmland and woodland from remote sensing images. The final extraction result is obtained by per-pixel segmentation of the images used to train the CENN. The CENN is compared and analyzed against other models such as the Deep Belief Network (DBN), Fully Convolutional Network (FCN), and Deeplab. The experimental results show that the CENN can more accurately mine the characteristics of farmland and woodland, achieving its goal of extracting farmland and woodland with high precision from GF-2 images.
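The final per-pixel segmentation step described above amounts to assigning each pixel the category whose response is highest. A minimal numpy sketch of that decision rule, assuming the network produces a per-category score map (the shapes are illustrative, not taken from the paper):

```python
import numpy as np

def per_pixel_segment(class_scores):
    """Assign each pixel the class with the highest score.

    class_scores: array of shape (num_classes, H, W), e.g. the
    per-category responses of a segmentation network.
    Returns an (H, W) map of class indices.
    """
    return np.argmax(class_scores, axis=0)
```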


2018 ◽  
Vol 22 (1) ◽  
pp. 29-35 ◽  
Author(s):  
Rui Zeng ◽  
Yingyan Wang ◽  
Wanliang Wang

Although scholars have conducted numerous studies on content-based image retrieval and obtained great achievements, little progress has been made in remote sensing image retrieval; both the theory and application systems remain immature. Since remote sensing images are characterized by large data volume, broad coverage, vague themes, and rich semantics, the research results on natural images and medical images cannot be directly used for remote sensing image retrieval. Building a complete content-based remote sensing image retrieval system involves many difficulties: data organization, storage and management, feature description and extraction, similarity measurement, relevance feedback, network service mode, and system structure design and implementation. This paper proposes a remote sensing image retrieval algorithm that combines co-occurrence region based Bayesian network image retrieval with average high-frequency signal strength. Using Bayesian networks, it establishes correspondence relationships between images and semantics, thereby realizing semantic-based retrieval of remote sensing images. In the meantime, integrated region matching is introduced for iterative retrieval, which effectively improves the precision of semantic retrieval.
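Integrated region matching, mentioned above, aggregates region-to-region distances between two segmented images, greedily matching the most similar regions first and weighting each match by the region significance. A simplified numpy sketch of that greedy scheme (the abstract names the technique but gives no formulas, so the details below are a common-form assumption):

```python
import numpy as np

def irm_distance(region_dists, w_query, w_db):
    """Greedy integrated region matching (simplified sketch).

    region_dists: (m, n) matrix of distances between the m regions
    of the query image and the n regions of a database image.
    w_query, w_db: region significance weights, each summing to 1.
    Repeatedly matches the closest pair of unmatched region mass and
    accumulates significance-weighted distances.
    """
    wq, wd = w_query.astype(float).copy(), w_db.astype(float).copy()
    total = 0.0
    for flat in np.argsort(region_dists, axis=None):
        i, j = np.unravel_index(flat, region_dists.shape)
        s = min(wq[i], wd[j])  # matchable significance mass
        if s > 0:
            total += s * region_dists[i, j]
            wq[i] -= s
            wd[j] -= s
    return total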


2019 ◽  
Vol 12 (1) ◽  
pp. 101 ◽  
Author(s):  
Lirong Han ◽  
Peng Li ◽  
Xiao Bai ◽  
Christos Grecos ◽  
Xiaoyu Zhang ◽  
...  

Recently, the demand for remote sensing image retrieval has been growing and attracting the interest of many researchers because of the increasing number of remote sensing images. Hashing, as a method of retrieving images, has been widely applied to remote sensing image retrieval. In order to improve hashing performance, we develop a cohesion intensive deep hashing model for remote sensing image retrieval. The underlying architecture of our deep model is motivated by the state-of-the-art residual net. Residual nets aim to avoid vanishing and exploding gradients when the net reaches a certain depth. However, different from the residual net, which outputs multiple class labels, we present a residual hash net that is terminated by a Heaviside-like function for binarizing remote sensing images. In this scenario, the representational power of the residual net architecture is exploited to establish an end-to-end deep hashing model. The residual hash net is trained subject to a weighted loss strategy that intensifies the cohesiveness of image hash codes within one class. This effectively addresses the data imbalance problem that normally arises in remote sensing image retrieval tasks. Furthermore, we adopted a gradualness optimization method for obtaining optimal model parameters in order to favor accurate binary codes with little quantization error. We conduct comparative experiments on large-scale remote sensing data sets such as UCMerced and AID. The experimental results validate the hypothesis that our method improves the performance of current remote sensing image retrieval.
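The retrieval side of any such hashing scheme is simple to sketch: real-valued network outputs are binarized by a step function, and database items are ranked by Hamming distance to the query code. A minimal numpy illustration (the Heaviside-like activation and the training loss in the paper are more involved; this shows only the generic inference path):

```python
import numpy as np

def binarize(features):
    """Heaviside-like step: map real-valued net outputs to {0, 1} codes."""
    return (features > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query code.

    Returns (ranking, distances): indices of db_codes from nearest
    to farthest, and the per-item Hamming distances.
    """
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable"), dists
```

Because Hamming distance is a bit-count over short binary codes, this comparison is far cheaper than distances over dense float features, which is the main appeal of hashing for large archives.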


Author(s):  
Chippy Babu

Remote sensing image retrieval (RSIR) is a fundamental task in remote sensing. Most content-based remote sensing image retrieval (CBRSIR) approaches use a simple distance as the similarity criterion. In this letter, a retrieval method based on weighted distance and basic features of a Convolutional Neural Network (CNN) is proposed. The method contains two stages. First, in the offline stage, the pretrained CNN is fine-tuned with some labelled images from the target data set, then used to extract CNN features and to label the images in the retrieval data set. Second, in the online stage, we extract features of the query image using the fine-tuned CNN model, calculate the weight of each image class, and apply these weights to compute the distance between the query image and the retrieved images. Experiments are conducted on two remote sensing image retrieval data sets. Compared with the state-of-the-art methods, the proposed method significantly improves retrieval performance.
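The abstract does not say how the class weights enter the distance, so the following is only a plausible sketch: a per-class weight (which would come from the fine-tuned CNN's class scores for the query) scales an ordinary Euclidean feature distance, so that images of classes the query likely belongs to rank earlier:

```python
import numpy as np

def weighted_distance(query_feat, db_feat, class_weight):
    """Euclidean distance scaled by a per-class weight (illustrative).

    A smaller class_weight shrinks the distance, promoting database
    images whose class the query is likely to share.
    """
    return class_weight * np.linalg.norm(query_feat - db_feat)
```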


Author(s):  
Kun Yang ◽  
Anning Pan ◽  
Yang Yang ◽  
Su Zhang ◽  
Sim Heng Ong ◽  
...  

Remote sensing image registration plays an important role in military and civilian fields, such as natural disaster damage assessment, military damage assessment, and ground target identification. However, due to ground relief variations and imaging viewpoint changes, non-rigid geometric distortion occurs between remote sensing images with different viewpoints, which further increases the difficulty of remote sensing image registration. To address this problem, we propose a multi-viewpoint remote sensing image registration method with the following contributions. (i) A finite mixture model based on multiple features is constructed for dealing with different types of image features. (ii) Three features are combined and substituted into the mixture model to form a feature complementation, i.e., the Euclidean distance and shape context are used to measure the similarity of geometric structure, and the SIFT (scale-invariant feature transform) distance, which carries the intensity information, is used to measure the scale-space extrema. (iii) To prevent an ill-posed problem, a geometric constraint term is introduced into the L2E-based energy function to better constrain the non-rigid transformation. We evaluated the performance of the proposed method on three series of remote sensing images obtained from an unmanned aerial vehicle (UAV) and Google Earth, and compared it with five state-of-the-art methods; our method shows the best alignments in most cases.


Author(s):  
Y. Wang ◽  
D. Yu ◽  
S. Ji ◽  
Q. Cheng ◽  
M. Luo

Abstract. Content-based remote sensing image retrieval (CBRSIR) refers to searching for images of interest in a remote sensing image dataset that are similar to a query image, by extracting features (contents) from the images and comparing their similarity. In this work, we propose a lightweight network structure, the joint spatial and radiometric transformer, composed of three modules: a parameter generation network (PGN), spatial conversion, and radiometric conversion. The PGN module learns specific transformation parameters from input images to guide the subsequent spatial and radiometric conversion processes. With these parameters, the spatial and radiometric conversions transform the input images from spatial and spectral perspectives respectively, to increase the intra-class similarity and inter-class difference, both of which are of great importance to CBRSIR. In comparative experiments on multiple remote sensing image retrieval datasets, our proposed joint spatial and radiometric transformer combined with the backbone network ResNet34 achieves the best performance.

