Coarse-to-Fine Deep Metric Learning for Remote Sensing Image Retrieval

2020 ◽  
Vol 12 (2) ◽  
pp. 219 ◽  
Author(s):  
Min-Sub Yun ◽  
Woo-Jeoung Nam ◽  
Seong-Whan Lee

Remote sensing image retrieval (RSIR) is the process of searching for identical areas by investigating the similarities between a query image and the database images. RSIR is a challenging task owing to differences in acquisition time, viewpoint, and coverage area depending on the shooting circumstances, which result in variations in image content. In this paper, we propose a novel method based on a coarse-to-fine strategy, which makes a deep network more robust to the variations in remote sensing images. Moreover, we propose a new triangular loss function that considers the whole relation within the tuple. This loss function improves retrieval performance and demonstrates better results in learning the detailed information in complex remote sensing images. To verify our methods, we experimented with the Google Earth South Korea dataset, which contains 40,000 images, using the evaluation metric Recall@n. In all experiments, we obtained better performance than the existing retrieval training methods. Our source code and Google Earth South Korea dataset are available online.
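The abstract does not specify the triangular loss itself; as a point of reference, here is a minimal numpy sketch of the standard triplet margin loss that tuple-based metric-learning losses of this kind typically extend (the margin value is an illustrative assumption):

```python
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: pull the anchor toward the positive
    sample and push it away from the negative sample.

    The paper's triangular loss extends this idea to the whole
    relation within the tuple; this is only the common baseline.
    """
    d_pos = np.linalg.norm(anchor - positive)  # anchor-positive distance
    d_neg = np.linalg.norm(anchor - negative)  # anchor-negative distance
    return max(d_pos - d_neg + margin, 0.0)
```

When the anchor is already closer to the positive than to the negative by more than the margin, the loss is zero and the embedding is left unchanged.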

2021 ◽  
Vol 13 (5) ◽  
pp. 869
Author(s):  
Zheng Zhuo ◽  
Zhong Zhou

In recent years, the amount of remote sensing imagery data has increased exponentially. The ability to quickly and effectively find the required images from massive remote sensing archives is the key to the organization, management, and sharing of remote sensing image information. This paper proposes a high-resolution remote sensing image retrieval method with Gabor-CA-ResNet and a split-based deep feature transform network. The main contributions are twofold. (1) To handle the complex textures, diverse scales, and special viewing angles of remote sensing images, a Gabor-CA-ResNet network is proposed that takes ResNet as the backbone, uses Gabor filters to represent the spatial-frequency structure of images, and applies a channel attention (CA) mechanism to obtain more representative and discriminative deep features. (2) A split-based deep feature transform network is designed to divide the features extracted by the Gabor-CA-ResNet network into several segments and transform them separately, significantly reducing the dimensionality and storage space of the deep features. The experimental results on the UCM, WHU-RS, RSSCN7, and AID datasets show that, compared with the state-of-the-art methods, our method obtains competitive performance, especially for remote sensing images with rare targets and complex textures.
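The filter-bank parameters used by the paper are not given in the abstract; as a sketch, a single real-valued 2-D Gabor kernel of the kind used to represent spatial-frequency structure can be generated in numpy as follows (all parameter values here are illustrative assumptions):

```python
import numpy as np

def gabor_kernel(size=15, sigma=3.0, theta=0.0, lam=6.0, gamma=0.5):
    """Real part of a 2-D Gabor filter: a Gaussian envelope
    modulated by a cosine carrier at orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates by the orientation angle theta
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + gamma**2 * yr**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / lam)
    return envelope * carrier
```

A bank of such kernels at several orientations and wavelengths, convolved with the input image, yields the spatial-frequency responses that a Gabor-based front end would feed into the backbone network.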


2021 ◽  
Vol 13 (4) ◽  
pp. 747
Author(s):  
Yanghua Di ◽  
Zhiguo Jiang ◽  
Haopeng Zhang

Fine-grained visual categorization (FGVC) is an important and challenging problem due to large intra-class differences and small inter-class differences caused by deformation, illumination, angles, etc. Although major advances have been achieved in natural images in the past few years due to the release of popular datasets such as the CUB-200-2011, Stanford Cars and Aircraft datasets, fine-grained ship classification in remote sensing images has been rarely studied because of relative scarcity of publicly available datasets. In this paper, we investigate a large amount of remote sensing image data of sea ships and determine most common 42 categories for fine-grained visual categorization. Based our previous DSCR dataset, a dataset for ship classification in remote sensing images, we collect more remote sensing images containing warships and civilian ships of various scales from Google Earth and other popular remote sensing image datasets including DOTA, HRSC2016, NWPU VHR-10, We call our dataset FGSCR-42, meaning a dataset for Fine-Grained Ship Classification in Remote sensing images with 42 categories. The whole dataset of FGSCR-42 contains 9320 images of most common types of ships. We evaluate popular object classification algorithms and fine-grained visual categorization algorithms to build a benchmark. Our FGSCR-42 dataset is publicly available at our webpages.


2018 ◽  
Vol 10 (6) ◽  
pp. 964 ◽  
Author(s):  
Zhenfeng Shao ◽  
Ke Yang ◽  
Weixun Zhou

Benchmark datasets are essential for developing and evaluating remote sensing image retrieval (RSIR) approaches. However, most of the existing datasets are single-labeled, with each image in these datasets being annotated by a single label representing the most significant semantic content of the image. This is sufficient for simple problems, such as distinguishing between a building and a beach, but multiple labels and sometimes even dense (pixel) labels are required for more complex problems, such as RSIR and semantic segmentation. We therefore extended the existing multi-labeled dataset collected for multi-label RSIR and presented a dense labeling remote sensing dataset termed "DLRSD". DLRSD contains a total of 17 classes, and each pixel of each image is assigned one of the 17 pre-defined labels. We used DLRSD to evaluate the performance of RSIR methods ranging from traditional handcrafted feature-based methods to deep learning-based ones. More specifically, we evaluated the performances of RSIR methods from both single-label and multi-label perspectives. These results demonstrated the advantages of multiple labels over single labels for interpreting complex remote sensing images. DLRSD provides the literature with a benchmark for RSIR and other pixel-based problems such as semantic segmentation.
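The evaluation protocol is not detailed in the abstract, but the contrast between the single-label and multi-label perspectives can be made concrete with a small sketch: under single-label evaluation a retrieved image is either relevant or not, while multi-label relevance can be graded by label overlap (the label names below are hypothetical):

```python
def multilabel_relevance(query_labels, result_labels):
    """Fraction of the query image's labels that a retrieved
    image shares; 1.0 means all query labels are covered."""
    q, r = set(query_labels), set(result_labels)
    return len(q & r) / len(q)
```

For example, a query annotated {trees, pavement} and a result annotated {pavement, cars} share one of two query labels, giving a graded relevance of 0.5 where a single-label protocol would report a hard 0 or 1.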


Author(s):  
Chengming Zhang ◽  
Shujing Wan ◽  
Shuai Gao ◽  
Fan Yu ◽  
Qingdi Wei ◽  
...  

It is very difficult to accurately divide farmland and woodland in Gaofen 2 (GF-2) remote sensing images, because the coverage area of a single plant is very small and their spectra are very similar. The ratio of the spatial resolution to one plant's coverage area must be fully taken into account when designing a Convolutional Neural Network structure for extracting them from GF-2 images. We establish a Convolutional Encode Neural Networks model (CENN). Its first layer has two sets of convolution kernels to learn the characteristics of farmland and woodland respectively, while the second layer is an encoder that encodes these characteristics via a transfer function, mapping the results to the corresponding category number. In the training stage, samples of farmland, woodland, and other categories are used to train the CENN; once training is accomplished, the CENN can accurately extract farmland and woodland from remote sensing images. The final extraction result is obtained by per-pixel segmentation of the images used to train the CENN. The CENN is compared and analyzed against other models such as the Deep Belief Network (DBN), Fully Convolutional Network (FCN), and Deeplab. The experimental results show that the CENN can more accurately mine the characteristics of farmland and woodland, achieving its goal of extracting farmland and woodland with high precision from GF-2 images.
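The final per-pixel segmentation step described above amounts to assigning each pixel the category whose response is highest. A minimal numpy sketch of that decision rule, assuming the network produces a per-category score map (the shapes are illustrative, not taken from the paper):

```python
import numpy as np

def per_pixel_segment(class_scores):
    """Assign each pixel the class with the highest score.

    class_scores: array of shape (num_classes, H, W), e.g. the
    per-category responses of a segmentation network.
    Returns an (H, W) map of class indices.
    """
    return np.argmax(class_scores, axis=0)
```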


2018 ◽  
Vol 22 (1) ◽  
pp. 29-35 ◽  
Author(s):  
Rui Zeng ◽  
Yingyan Wang ◽  
Wanliang Wang

Although scholars have conducted numerous studies on content-based image retrieval and obtained great achievements, little progress has been made in remote sensing image retrieval; both the theory and application systems remain immature. Since remote sensing images are characterized by large data volume, broad coverage, vague themes, and rich semantics, the research results on natural images and medical images cannot be directly used for remote sensing image retrieval. Building a complete content-based remote sensing image retrieval system involves many difficulties: data organization, storage and management, feature description and extraction, similarity measurement, relevance feedback, network service mode, and system structure design and implementation. This paper proposes a remote sensing image retrieval algorithm that combines co-occurrence region based Bayesian network image retrieval with average high-frequency signal strength. Using Bayesian networks, it establishes correspondence relationships between images and semantics, thereby realizing semantic-based retrieval of remote sensing images. In the meantime, integrated region matching is introduced for iterative retrieval, which effectively improves the precision of semantic retrieval.
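Integrated region matching, mentioned above, aggregates region-to-region distances between two segmented images, greedily matching the most similar regions first and weighting each match by the region significance. A simplified numpy sketch of that greedy scheme (the abstract names the technique but gives no formulas, so the details below are a common-form assumption):

```python
import numpy as np

def irm_distance(region_dists, w_query, w_db):
    """Greedy integrated region matching (simplified sketch).

    region_dists: (m, n) matrix of distances between the m regions
    of the query image and the n regions of a database image.
    w_query, w_db: region significance weights, each summing to 1.
    Repeatedly matches the closest pair of unmatched region mass and
    accumulates significance-weighted distances.
    """
    wq, wd = w_query.astype(float).copy(), w_db.astype(float).copy()
    total = 0.0
    for flat in np.argsort(region_dists, axis=None):
        i, j = np.unravel_index(flat, region_dists.shape)
        s = min(wq[i], wd[j])  # matchable significance mass
        if s > 0:
            total += s * region_dists[i, j]
            wq[i] -= s
            wd[j] -= s
    return total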


2019 ◽  
Vol 12 (1) ◽  
pp. 101 ◽  
Author(s):  
Lirong Han ◽  
Peng Li ◽  
Xiao Bai ◽  
Christos Grecos ◽  
Xiaoyu Zhang ◽  
...  

Recently, the demand for remote sensing image retrieval has been growing and attracting the interest of many researchers because of the increasing number of remote sensing images. Hashing, as a method of retrieving images, has been widely applied to remote sensing image retrieval. In order to improve hashing performance, we develop a cohesion intensive deep hashing model for remote sensing image retrieval. The underlying architecture of our deep model is motivated by the state-of-the-art residual net. Residual nets aim to avoid vanishing and exploding gradients when the net reaches a certain depth. However, different from the residual net, which outputs multiple class labels, we present a residual hash net that is terminated by a Heaviside-like function for binarizing remote sensing images. In this scenario, the representational power of the residual net architecture is exploited to establish an end-to-end deep hashing model. The residual hash net is trained subject to a weighted loss strategy that intensifies the cohesiveness of image hash codes within one class. This effectively addresses the data imbalance problem that normally arises in remote sensing image retrieval tasks. Furthermore, we adopted a gradualness optimization method for obtaining optimal model parameters in order to favor accurate binary codes with little quantization error. We conduct comparative experiments on large-scale remote sensing data sets such as UCMerced and AID. The experimental results validate the hypothesis that our method improves the performance of current remote sensing image retrieval.
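The retrieval side of any such hashing scheme is simple to sketch: real-valued network outputs are binarized by a step function, and database items are ranked by Hamming distance to the query code. A minimal numpy illustration (the Heaviside-like activation and the training loss in the paper are more involved; this shows only the generic inference path):

```python
import numpy as np

def binarize(features):
    """Heaviside-like step: map real-valued net outputs to {0, 1} codes."""
    return (features > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query code.

    Returns (ranking, distances): indices of db_codes from nearest
    to farthest, and the per-item Hamming distances.
    """
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable"), dists
```

Because Hamming distance is a bit-count over short binary codes, this comparison is far cheaper than distances over dense float features, which is the main appeal of hashing for large archives.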


Author(s):  
Chippy Babu

Remote sensing image retrieval (RSIR) is a fundamental task in remote sensing. Most content-based remote sensing image retrieval (CBRSIR) approaches use a simple distance as the similarity criterion. In this letter, a retrieval method based on weighted distance and basic features of a Convolutional Neural Network (CNN) is proposed. The method contains two stages. First, in the offline stage, the pretrained CNN is fine-tuned with some labelled images from the target data set, then used to extract CNN features and to label the images in the retrieval data set. Second, in the online stage, we extract features of the query image using the fine-tuned CNN model, calculate the weight of each image class, and apply these weights to compute the distance between the query image and the retrieved images. Experiments are conducted on two remote sensing image retrieval data sets. Compared with the state-of-the-art methods, the proposed method significantly improves retrieval performance.
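The abstract does not say how the class weights enter the distance, so the following is only a plausible sketch: a per-class weight (which would come from the fine-tuned CNN's class scores for the query) scales an ordinary Euclidean feature distance, so that images of classes the query likely belongs to rank earlier:

```python
import numpy as np

def weighted_distance(query_feat, db_feat, class_weight):
    """Euclidean distance scaled by a per-class weight (illustrative).

    A smaller class_weight shrinks the distance, promoting database
    images whose class the query is likely to share.
    """
    return class_weight * np.linalg.norm(query_feat - db_feat)
```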


Author(s):  
Kun Yang ◽  
Anning Pan ◽  
Yang Yang ◽  
Su Zhang ◽  
Sim Heng Ong ◽  
...  

Remote sensing image registration plays an important role in military and civilian fields, such as natural disaster damage assessment, military damage assessment, and ground target identification. However, due to ground relief variations and imaging viewpoint changes, non-rigid geometric distortion occurs between remote sensing images with different viewpoints, which further increases the difficulty of remote sensing image registration. To address this problem, we propose a multi-viewpoint remote sensing image registration method with the following contributions. (i) A finite mixture model based on multiple features is constructed for dealing with different types of image features. (ii) Three features are combined and substituted into the mixture model to form a feature complementation, i.e., the Euclidean distance and shape context are used to measure the similarity of geometric structure, and the SIFT (scale-invariant feature transform) distance, which carries the intensity information, is used to measure the scale-space extrema. (iii) To prevent an ill-posed problem, a geometric constraint term is introduced into the L2E-based energy function to better constrain the non-rigid transformation. We evaluated the performance of the proposed method on three series of remote sensing images obtained from an unmanned aerial vehicle (UAV) and Google Earth, and compared it with five state-of-the-art methods; our method shows the best alignments in most cases.


Author(s):  
Y. Wang ◽  
D. Yu ◽  
S. Ji ◽  
Q. Cheng ◽  
M. Luo

Abstract. Content-based remote sensing image retrieval (CBRSIR) refers to searching for images of interest in a remote sensing image dataset that are similar to a query image, by extracting features (contents) from the images and comparing their similarity. In this work, we propose a lightweight network structure, the joint spatial and radiometric transformer, composed of three modules: a parameter generation network (PGN), spatial conversion, and radiometric conversion. The PGN module learns specific transformation parameters from input images to guide the subsequent spatial and radiometric conversion processes. With these parameters, the spatial and radiometric conversions transform the input images from spatial and spectral perspectives respectively, to increase the intra-class similarity and inter-class difference, both of which are of great importance to CBRSIR. In comparative experiments on multiple remote sensing image retrieval datasets, our proposed joint spatial and radiometric transformer combined with the backbone network ResNet34 achieves the best performance.

