Neighbor-Based Label Distribution Learning to Model Label Ambiguity for Aerial Scene Classification

2021 ◽  
Vol 13 (4) ◽  
pp. 755
Author(s):  
Jianqiao Luo ◽  
Yihan Wang ◽  
Yang Ou ◽  
Biao He ◽  
Bailin Li

Many aerial images with similar appearances have different but correlated scene labels, which causes label ambiguity. Label distribution learning (LDL) can express label ambiguity by assigning each sample a label distribution. Thus, a sample contributes to the learning of its ground-truth label as well as of correlated labels, which improves data utilization. LDL has succeeded in many fields, such as age estimation, where label ambiguity can be easily modeled on the basis of prior knowledge about local sample similarity and global label correlations. However, LDL has never been applied to scene classification, because no such knowledge about local similarity and label correlations is available, making label ambiguity hard to model. In this paper, we uncover the sample neighbors that cause label ambiguity by jointly capturing local similarity and label correlations, and propose neighbor-based LDL (N-LDL) for aerial scene classification. We define a subspace learning problem that formulates the neighboring relations as a coefficient matrix regularized by a sparse constraint and by label correlations. The sparse constraint yields a few nearest neighbors, which captures local similarity. The label correlations are predefined according to the confusion matrices on validation sets. During subspace learning, the neighboring relations are encouraged to agree with the label correlations, which ensures that the uncovered neighbors have correlated labels. Finally, label propagation among the neighbors forms the label distributions, which leads to label smoothing in terms of label ambiguity. The label distributions are used to train convolutional neural networks (CNNs). Experiments on the aerial image dataset (AID) and NWPU-RESISC45 (NR) datasets demonstrate that using the label distributions clearly improves classification performance by assisting feature learning and mitigating over-fitting, and that our method achieves state-of-the-art performance.
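The label-smoothing step described above, propagating one-hot labels over the uncovered neighbors, can be sketched as follows (the neighbor coefficient matrix and the mixing weight here are illustrative assumptions, not the paper's learned values):

```python
import numpy as np

def neighbor_label_distributions(Y, W, alpha=0.7):
    """Smooth one-hot labels into label distributions via one step of
    propagation over a neighbor coefficient matrix.

    Y     : (n, c) one-hot ground-truth labels
    W     : (n, n) sparse neighbor coefficients (row-stochastic)
    alpha : weight kept on the ground-truth label (illustrative choice)
    """
    D = alpha * Y + (1.0 - alpha) * W @ Y
    return D / D.sum(axis=1, keepdims=True)  # renormalize each row

# Toy example: 3 samples, 2 classes; samples 0 and 1 are mutual neighbors.
Y = np.array([[1., 0.], [0., 1.], [0., 1.]])
W = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0.5, 0.5]])  # hypothetical learned coefficients
D = neighbor_label_distributions(Y, W)
```

Each row of `D` is a valid distribution, with most mass on the ground-truth label and the rest spread over the labels of correlated neighbors.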

Author(s):  
Xiuyi Jia ◽  
Zechao Li ◽  
Xiang Zheng ◽  
Weiwei Li ◽  
Sheng-Jun Huang

Author(s):  
Tingting Ren ◽  
Xiuyi Jia ◽  
Weiwei Li ◽  
Shu Zhao

Label distribution learning (LDL) can be viewed as a generalization of multi-label learning. This novel paradigm focuses on the relative importance of different labels to a particular instance. Most previous LDL methods either ignore the correlations among labels or exploit them only in a global way. In this paper, we utilize both the global and local relevance among labels to provide more information for model training and propose a novel label distribution learning algorithm. In particular, a label correlation matrix based on low-rank approximation is applied to capture the global label correlations. In addition, the label correlations among local samples are adopted to modify the label correlation matrix. Experimental results on real-world data sets show that the proposed algorithm outperforms state-of-the-art LDL methods.
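A minimal sketch of the global ingredient, a low-rank approximation of the label correlation matrix, might look like the following (estimating correlations directly from training label distributions is a simplifying assumption here; the paper learns the matrix within its model):

```python
import numpy as np

def low_rank_label_correlation(D, rank=2):
    """Rank-k approximation of the c x c label correlation matrix,
    estimated from sample label distributions D (n samples x c labels)."""
    C = np.corrcoef(D, rowvar=False)              # global label correlations
    U, s, Vt = np.linalg.svd(C)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]   # keep top-k components

rng = np.random.default_rng(0)
D = rng.dirichlet(np.ones(4), size=50)            # toy label distributions
C_low = low_rank_label_correlation(D, rank=2)
```

The low-rank constraint encodes the assumption that the c labels are driven by a few shared latent factors rather than c independent ones.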


Author(s):  
W. Geng ◽  
W. Zhou ◽  
S. Jin

Abstract. Scene classification plays an important role in the remote sensing field. Traditional approaches use high-resolution remote sensing images as the data source for extracting powerful features. Although such methods are common, their performance is severely affected by the image quality of the dataset, and a single modality (source) of images tends to omit some scene semantic information, which eventually degrades classification accuracy. Nowadays, multi-modal remote sensing data have become easy to obtain thanks to the development of remote sensing technology, and how to carry out scene classification on cross-modal data has become an interesting topic in the field. To address these problems, this paper proposes feature fusion for cross-modal scene classification of remote sensing imagery, i.e., aerial and ground street-view images, expecting the advantages of aerial images and ground street-view data to complement each other. Our cross-modal model is based on a Siamese network. Specifically, we first train the cross-modal model by pairing data from different sources, aerial image with ground data. Then, the trained model is used to extract deep features from each aerial and ground image pair, and the features from the two perspectives are fused to train an SVM classifier for scene classification. Our approach has been demonstrated on two public benchmark datasets, AiRound and CV-BrCT. Preliminary results show that the proposed method achieves state-of-the-art performance compared with traditional methods, indicating that information from ground data can contribute to aerial image classification.
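The late-fusion step, concatenating the two branches' features and training an SVM on the result, can be sketched as below (random vectors stand in for the deep features extracted by the trained Siamese branches; all names and dimensions are illustrative):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Stand-ins for deep features extracted by the two trained branches.
aerial_feats = rng.normal(size=(60, 128))
ground_feats = rng.normal(size=(60, 128))
labels = rng.integers(0, 3, size=60)   # 3 hypothetical scene classes

# Late fusion: concatenate the per-pair features, then fit the SVM.
fused = np.concatenate([aerial_feats, ground_feats], axis=1)
clf = SVC(kernel="linear").fit(fused, labels)
pred = clf.predict(fused)
```

Concatenation is the simplest fusion choice; it lets the SVM weight the aerial and ground dimensions independently rather than forcing a shared representation.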


Author(s):  
Tingting Ren ◽  
Xiuyi Jia ◽  
Weiwei Li ◽  
Lei Chen ◽  
Zechao Li

Label distribution learning (LDL) is a novel machine learning paradigm to deal with label ambiguity issues by placing more emphasis on how relevant each label is to a particular instance. Many LDL algorithms have been proposed and most of them concentrate on the learning models, while few of them focus on the feature selection problem. All existing LDL models are built on a simple feature space in which all features are shared by all the class labels. However, this kind of traditional data representation strategy tends to select features that are distinguishable for all labels, but ignores label-specific features that are pertinent and discriminative for each class label. In this paper, we propose a novel LDL algorithm by leveraging label-specific features. The common features for all labels and specific features for each label are simultaneously learned to enhance the LDL model. Moreover, we also exploit the label correlations in the proposed LDL model. The experimental results on several real-world data sets validate the effectiveness of our method.
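As a toy illustration of label-specific features, one could rank features by their correlation with each label's description degrees (a crude stand-in for the learned selection in the paper; the cutoff `k` is an arbitrary choice):

```python
import numpy as np

def label_specific_features(X, D, k=3):
    """For each label, keep the k features most correlated (in absolute
    value) with that label's description degrees.

    X : (n, d) feature matrix
    D : (n, c) label distributions
    """
    d, c = X.shape[1], D.shape[1]
    specific = {}
    for j in range(c):
        corr = np.abs([np.corrcoef(X[:, f], D[:, j])[0, 1] for f in range(d)])
        specific[j] = np.argsort(corr)[-k:][::-1]   # top-k feature indices
    return specific

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 10))
D = rng.dirichlet(np.ones(3), size=100)   # toy label distributions
spec = label_specific_features(X, D, k=3)
```

Different labels end up with different feature subsets, which is precisely the property the shared-feature-space models discussed above cannot express.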


Aerial images provide a landscape view of the earth's surface that is used to monitor large areas. Each aerial image comprises different scenes for identifying objects on digital maps. Several methodologies have been developed to solve the problem of scene classification from input aerial images, but they do not improve classification performance even when more aerial images are used. In order to improve classification performance, a Tanimoto Gaussian Kernelized Feature Extraction Based Multinomial GentleBoost Classification (TGKFE-MGBC) technique is introduced. The TGKFE-MGBC technique comprises three major processes, namely object-based segmentation, feature extraction, and aerial image scene classification. At first, object-based segmentation partitions the aerial image into several sub-bands; an aerial image with more than two objects is called multi-spectral. The objects in the spectral bands are identified by a Tanimoto pixel similarity measure, which helps to reduce the feature extraction time. Each object has different features such as shape, size, color, and texture. After that, Gaussian kernelized feature extraction is carried out to extract the features from the objects with minimal time. Finally, multinomial GentleBoost classification is applied to categorize the scenes into different classes with the extracted features. GentleBoost is an ensemble technique that uses a multinomial naïve Bayes probabilistic classifier as a weak learner and combines weak learners into a strong one for classifying the scenes. The strong classifier improves the aerial image scene classification accuracy and minimizes the false positive rate. Simulation is conducted using an aerial image database with different factors such as feature extraction time, aerial image scene classification accuracy, and false positive rate. The results show that the TGKFE-MGBC technique effectively improves the aerial image scene classification accuracy and minimizes the feature extraction time as well as the false positive rate.
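The Tanimoto pixel similarity used in the segmentation step has a standard closed form; a minimal sketch on toy pixel vectors (the values are illustrative):

```python
import numpy as np

def tanimoto_similarity(a, b):
    """Tanimoto (extended Jaccard) similarity between two non-negative
    feature vectors: a.b / (|a|^2 + |b|^2 - a.b); equals 1 iff a == b."""
    dot = np.dot(a, b)
    return dot / (np.dot(a, a) + np.dot(b, b) - dot)

p1 = np.array([0.9, 0.2, 0.1])   # toy pixel feature vectors (e.g. RGB)
p2 = np.array([0.8, 0.3, 0.1])
s = tanimoto_similarity(p1, p2)
```

Pixels whose similarity exceeds a chosen threshold would be grouped into the same object; the threshold itself is a tuning parameter not specified above.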


Aerial image scene classification is a key problem to be resolved in image processing. Many research works have been designed to carry out scene classification, but the accuracy of existing scene classification methods remains low. In order to overcome this limitation, a Robust Regressive Feature Extraction Based Relevance Vector Margin Boosting Scene Classification (RRFE-RVMBSC) technique is proposed. The RRFE-RVMBSC technique is designed to improve the classification performance of aerial images with minimal time. It comprises two main processes, namely feature extraction and classification. Initially, the RRFE-RVMBSC technique takes a number of aerial images as input. After taking the input, robust regressive independent component analysis based feature extraction is performed in order to extract the features, i.e. shape, color, texture, and size, from each aerial image. After completing the feature extraction process, the RRFE-RVMBSC technique carries out Ensembled Relevance Vector Margin Boosting Classification (ERVMBC), in which all the input aerial images are classified into multiple classes with higher accuracy. The technique constructs a strong classifier by reducing the training error of weak RVM classifiers for effective aerial image scene categorization. Simulation work is carried out using parameters such as feature extraction time, classification accuracy, and false positive rate with respect to the number of aerial images.
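The boosting idea, reweighting samples so that each successive weak learner focuses on earlier mistakes, can be sketched with threshold stumps standing in for the weak RVM classifiers (an illustrative AdaBoost-style loop, not the paper's exact margin-boosting procedure):

```python
import numpy as np

def adaboost_stumps(X, y, rounds=5):
    """Minimal AdaBoost with threshold stumps on labels in {-1, +1}."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                     # sample weights
    ensemble = []
    for _ in range(rounds):
        best = None
        for f in range(d):                      # exhaustive stump search
            for t in np.unique(X[:, f]):
                for sign in (1, -1):
                    pred = np.where(X[:, f] >= t, sign, -sign)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, f, t, sign)
        err, f, t, sign = best
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        pred = np.where(X[:, f] >= t, sign, -sign)
        w *= np.exp(-alpha * y * pred)          # upweight the mistakes
        w /= w.sum()
        ensemble.append((alpha, f, t, sign))
    return ensemble

def boosted_predict(ensemble, X):
    score = sum(a * np.where(X[:, f] >= t, s, -s) for a, f, t, s in ensemble)
    return np.sign(score)

# Toy separable data: one threshold suffices.
X = np.array([[0.], [1.], [2.], [3.]])
y = np.array([-1, -1, 1, 1])
model = adaboost_stumps(X, y)
```

The weighted vote of the rounds' stumps forms the strong classifier; swapping the stump for any other weak learner (such as an RVM) leaves the loop unchanged.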


2021 ◽  
Author(s):  
Gui-Lin Li ◽  
Heng-Ru Zhang ◽  
Yuan-Yuan Xu ◽  
Ya-Lan Lv ◽  
Fan Min

2019 ◽  
Vol 11 (10) ◽  
pp. 1157 ◽  
Author(s):  
Jorge Fuentes-Pacheco ◽  
Juan Torres-Olivares ◽  
Edgar Roman-Rangel ◽  
Salvador Cervantes ◽  
Porfirio Juarez-Lopez ◽  
...  

Crop segmentation is an important task in Precision Agriculture, where the use of aerial robots with an on-board camera has contributed to the development of new solution alternatives. We address the problem of fig plant segmentation in top-view RGB (Red-Green-Blue) images of a crop grown under difficult open-field circumstances of complex lighting conditions and non-ideal crop maintenance practices defined by local farmers. We present a Convolutional Neural Network (CNN) with an encoder-decoder architecture that classifies each pixel as crop or non-crop using only raw colour images as input. Our approach achieves a mean accuracy of 93.85% despite the complexity of the background and the highly variable visual appearance of the leaves. We make our CNN code available to the research community, as well as the aerial image data set and a hand-made ground-truth segmentation with pixel precision, to facilitate comparison among different algorithms.
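A mean accuracy of the kind reported above is typically computed per class and averaged; a minimal sketch for binary crop/non-crop masks (one common definition, which may differ in detail from the one used in the study):

```python
import numpy as np

def mean_pixel_accuracy(pred, gt):
    """Average of per-class pixel accuracies for binary masks,
    so the dominant background class cannot mask crop errors."""
    accs = []
    for cls in (0, 1):
        mask = gt == cls
        if mask.any():
            accs.append((pred[mask] == cls).mean())
    return float(np.mean(accs))

gt   = np.array([[1, 1], [0, 0]])   # toy ground-truth mask
pred = np.array([[1, 0], [0, 0]])   # toy prediction: one crop pixel missed
acc = mean_pixel_accuracy(pred, gt)
```

Here class 1 is half right and class 0 fully right, so the metric averages to 0.75 rather than the raw pixel accuracy of 0.75 coinciding only because the classes are balanced in this toy mask.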

