Auto Encoder Feature Learning with Utilization of Local Spatial Information and Data Distribution for Classification of PolSAR Image

2019 ◽  
Vol 11 (11) ◽  
pp. 1313
Author(s):  
Biao Hou ◽  
Jianlong Wang ◽  
Licheng Jiao ◽  
Shuang Wang

The distribution of the data plays a key role in the design of a machine learning model. This paper therefore proposes a novel autoencoder network based on the distribution of the polarimetric synthetic aperture radar (PolSAR) data matrix. Designed specifically for the PolSAR data matrix, the proposed mixture autoencoder (MAE) feature learning method defines the data-error term in the loss function according to the data distribution. Instead of the pixel itself, all pixels in its neighborhood are used as input to train the MAE. A corresponding classification network is then obtained by discarding the decoder of the MAE and appending a Softmax classifier. The MAE is trained on unlabeled data, while the classification network is trained with the help of a small number of labeled pixels. To address misclassification in the predicted result image, two post-processing steps operating on the local spatial neighborhood are also given, implemented by two proposed filters. Extensive experiments with four methods, including the proposed classification network, were conducted on three real PolSAR images. The experimental results show that introducing the data distribution into the autoencoder network leads to an average 4% improvement in overall accuracy across the three PolSAR images. Moreover, the post-processing steps with the proposed filters bring a new level of discrimination to the classification of PolSAR images.
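The neighborhood-as-input idea above can be sketched in a few lines of numpy: every pixel's k × k neighborhood is flattened into one training vector for the autoencoder. The (H, W, C) real-valued channel layout of the PolSAR matrix is an assumption made for illustration, not the paper's exact data representation.

```python
import numpy as np

def neighborhood_inputs(cov, k=3):
    """Collect the k x k neighborhood of every pixel as one flat input vector.

    cov: (H, W, C) real-valued representation of the per-pixel PolSAR
    coherency/covariance matrix (C channels). Returns an (H*W, k*k*C) array,
    one row per pixel, mirroring the idea of feeding all neighborhood pixels,
    rather than the pixel alone, to the mixture autoencoder.
    """
    pad = k // 2
    padded = np.pad(cov, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    H, W, C = cov.shape
    rows = []
    for i in range(H):
        for j in range(W):
            patch = padded[i:i + k, j:j + k, :]   # k x k neighborhood window
            rows.append(patch.ravel())
    return np.array(rows)

X = neighborhood_inputs(np.random.rand(8, 8, 6), k=3)
print(X.shape)  # (64, 54)
```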

2014 ◽  
Vol 527 ◽  
pp. 339-342
Author(s):  
Zhi Yuan Liu ◽  
Jin He ◽  
Jin Long Wang ◽  
Fei Zhao

In order to make full use of the spatial information of images in natural scene classification, we use the spatial partition model. However, mechanical space division misuses spatial information, so the spatial partition model must be properly improved so that different categories of images become more distinguishable, thereby improving classification performance. In addition, to further improve performance, we use FAN-SIFT as the local image feature. Experiments on the 8-scenes image dataset and the Caltech101 dataset show that the improved model obtains better classification performance.
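The classic spatial partition model the abstract builds on can be sketched as a spatial pyramid: each level splits the image into an l × l grid and histograms the visual words per cell, so spatial layout survives pooling. The grid sizes and vocabulary size below are illustrative, not the paper's settings.

```python
import numpy as np

def spatial_pyramid_histogram(labels, levels=(1, 2, 4), n_words=20):
    """Concatenate per-cell visual-word histograms over a spatial pyramid.

    labels: (H, W) array of visual-word indices in [0, n_words). Each level l
    partitions the image into an l x l grid; each cell's normalized histogram
    is appended, preserving coarse spatial information.
    """
    H, W = labels.shape
    feats = []
    for l in levels:
        for gi in range(l):
            for gj in range(l):
                cell = labels[gi * H // l:(gi + 1) * H // l,
                              gj * W // l:(gj + 1) * W // l]
                hist, _ = np.histogram(cell, bins=n_words, range=(0, n_words))
                feats.append(hist / max(cell.size, 1))
    return np.concatenate(feats)

f = spatial_pyramid_histogram(np.random.randint(0, 20, (16, 16)))
print(f.shape)  # (420,) = (1 + 4 + 16) cells x 20 words
```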


Sensors ◽  
2021 ◽  
Vol 21 (21) ◽  
pp. 7392
Author(s):  
Danish Nazir ◽  
Muhammad Zeshan Afzal ◽  
Alain Pagani ◽  
Marcus Liwicki ◽  
Didier Stricker

In this paper, we present the idea of self-supervised learning for the shape completion and classification of point clouds. Most 3D shape completion pipelines utilize AutoEncoders to extract features from point clouds for downstream tasks such as classification, segmentation, detection, and other related applications. Our idea is to add contrastive learning to the AutoEncoder to encourage global feature learning across point cloud classes, performed by optimizing a triplet loss. Furthermore, local feature representation learning of the point cloud is performed by adding the Chamfer distance function. To evaluate the performance of our approach, we utilize the PointNet classifier. We also extend the number of evaluation classes from 4 to 10 to show the generalization ability of the learned features. Based on our results, embeddings generated from the contrastive AutoEncoder improve point cloud shape completion and classification performance from 84.2% to 84.9%, achieving state-of-the-art results with 10 classes.
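The two objectives the abstract combines are standard and easy to state in numpy: a hinge triplet loss for global class structure and a symmetric Chamfer distance for local reconstruction. This is a minimal sketch of the loss terms, not the paper's full training pipeline.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on embedding vectors: pull same-class embeddings
    together and push different-class ones at least `margin` apart."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point clouds p (N, 3) and q (M, 3):
    mean nearest-neighbor squared distance in both directions, which drives
    the decoder to reconstruct local point-cloud structure."""
    d = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)  # (N, M)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

cloud = np.random.rand(64, 3)
print(chamfer_distance(cloud, cloud))  # 0.0 for identical clouds
```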


2021 ◽  
Vol 11 (23) ◽  
pp. 11136
Author(s):  
Zenebe Markos Lonseko ◽  
Prince Ebenezer Adjei ◽  
Wenju Du ◽  
Chengsi Luo ◽  
Dingcan Hu ◽  
...  

Gastrointestinal (GI) diseases constitute a leading problem in the human digestive system. Consequently, several studies have explored automatic classification of GI diseases as a means of minimizing the burden on clinicians and improving patient outcomes, for both diagnostic and treatment purposes. The challenge in using deep learning-based (DL) approaches, specifically a convolutional neural network (CNN), is that spatial information is not fully utilized due to the inherent mechanism of CNNs. This paper proposes the application of spatial factors to improve classification performance. Specifically, we propose a deep CNN-based spatial attention mechanism for the classification of GI diseases, implemented with encoder-decoder layers. To overcome the data imbalance problem, we adopt data-augmentation techniques. A total of 12,147 multi-sited, multi-diseased GI images, drawn from publicly available and private sources, were used to validate the proposed approach. Furthermore, a five-fold cross-validation approach was adopted to minimize inconsistencies in intra- and inter-class variability and to ensure that results were robustly assessed. Our results, compared with other state-of-the-art models in terms of mean accuracy (ResNet50 = 90.28, GoogLeNet = 91.38, DenseNets = 91.60, and baseline = 92.84), demonstrated better outcomes (Precision = 92.8, Recall = 92.7, F1-score = 92.8, and Accuracy = 93.19). We also implemented t-distributed stochastic neighbor embedding (t-SNE) and confusion matrix analysis techniques for better visualization and performance validation. Overall, the results showed that the attention mechanism improved the automatic classification of multi-sited GI disease images. By overcoming previous limitations, the proposed method provides a basis for clinical validation, with the goal of further improving automatic classification accuracy in future work.
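A spatial attention gate of the kind described above can be sketched simply: channel-wise pooling produces a per-location score map, a sigmoid turns it into weights in (0, 1), and the weights re-scale the feature map. The pooling recipe below is illustrative; the paper's attention is learned inside encoder-decoder layers.

```python
import numpy as np

def spatial_attention(fmap):
    """Simplified spatial attention over a feature map fmap of shape (H, W, C).

    Channel-wise average and max pooling give two (H, W) maps; their sum is
    squashed by a sigmoid into per-location weights that re-scale all channels
    at that location, emphasizing informative spatial regions.
    """
    avg = fmap.mean(axis=-1)
    mx = fmap.max(axis=-1)
    attn = 1.0 / (1.0 + np.exp(-(avg + mx)))   # sigmoid -> weights in (0, 1)
    return fmap * attn[..., None], attn

out, attn = spatial_attention(np.random.rand(32, 32, 8))
print(out.shape, attn.shape)  # (32, 32, 8) (32, 32)
```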


2020 ◽  
Vol 34 (04) ◽  
pp. 5387-5394
Author(s):  
Hao Peng ◽  
Jianxin Li ◽  
Qiran Gong ◽  
Yuanxin Ning ◽  
Senzhang Wang ◽  
...  

Graph classification is critically important to many real-world applications associated with graph data, such as chemical drug analysis and social network mining. Traditional methods usually require feature engineering to extract the graph features that can help discriminate the graphs of different classes. Although deep learning-based graph embedding approaches have recently been proposed to automatically learn graph features, they mostly use a few vertex arrangements extracted from the graph for feature learning, which may lose structural information. In this work, we present a novel motif-based attentional graph convolution neural network for graph classification, which can learn more discriminative and richer graph features. Specifically, a motif-matching guided subgraph normalization method is developed to better preserve spatial information. A novel subgraph-level self-attention network is also proposed to capture the different impacts or weights of different subgraphs. Experimental results on both bioinformatics and social network datasets show that the proposed models significantly improve graph classification performance over both traditional graph kernel methods and recent deep learning approaches.
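Subgraph-level self-attention can be sketched as a softmax-weighted sum over subgraph embeddings, so that more informative subgraphs contribute more to the graph-level representation. The context vector below is random for illustration; in the paper's attention network the scoring is learned.

```python
import numpy as np

def subgraph_attention(subgraph_embs):
    """Weight subgraph embeddings by softmax attention scores.

    subgraph_embs: (S, D) matrix, one row per extracted subgraph. Scores come
    from a dot product with a context vector followed by a softmax; the
    weighted sum is the graph-level embedding.
    """
    S, D = subgraph_embs.shape
    context = np.random.default_rng(0).standard_normal(D)  # stand-in for a learned vector
    scores = subgraph_embs @ context
    scores -= scores.max()                        # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    graph_emb = weights @ subgraph_embs           # weighted sum -> (D,)
    return graph_emb, weights

emb, w = subgraph_attention(np.random.rand(5, 16))
print(emb.shape, round(float(w.sum()), 6))  # (16,) 1.0
```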


2019 ◽  
Vol 11 (9) ◽  
pp. 1038 ◽  
Author(s):  
Ruichuan Wang ◽  
Yanfei Wang

Polarimetric synthetic aperture radar (PolSAR) has become increasingly popular in the past two decades, for it can derive multichannel features of ground objects that contain more discriminative information than traditional SAR. In this paper, a neural nonlocal stacked sparse autoencoder with virtual adversarial training (NNSSAE-VAT) is proposed for PolSAR image classification. The NNSSAE first extracts nonlocal features by calculating the pairwise similarity of each pixel with its surrounding pixels using a neural network that contains a multiscale feature extractor and a linear embedding layer. This feature extraction process can relieve the negative influence of speckle noise and extract discriminative nonlocal spatial information without carefully designed parameters. Then, the SSAE maps the center pixel and the extracted nonlocal features into a deep latent space, in which a Softmax classifier is utilized to conduct classification. Virtual adversarial training is introduced to regularize the network and keep it from overfitting. The experimental results on three real PolSAR images show that the proposed NNSSAE-VAT method is robust and effective and achieves competitive performance compared with related methods.
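The nonlocal weighting idea can be illustrated in numpy: each neighbor in a window is weighted by its softmax similarity to the center pixel, so similar pixels dominate and speckle outliers are down-weighted. The squared-distance similarity below is a stand-in; the paper computes the similarity with a learned multiscale network.

```python
import numpy as np

def nonlocal_feature(patch):
    """Aggregate a (k, k, C) neighborhood around its center pixel by similarity.

    Each neighbor gets a softmax weight from the negative squared distance to
    the center pixel's feature vector; the output is the weighted mean, a
    speckle-robust nonlocal feature of dimension C.
    """
    k, _, C = patch.shape
    center = patch[k // 2, k // 2]
    flat = patch.reshape(-1, C)
    scores = -np.sum((flat - center) ** 2, axis=1)
    scores -= scores.max()                         # numerical stability
    w = np.exp(scores) / np.exp(scores).sum()
    return w @ flat                                # (C,) nonlocal feature

feat = nonlocal_feature(np.random.rand(5, 5, 6))
print(feat.shape)  # (6,)
```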


2019 ◽  
Vol 11 (5) ◽  
pp. 494 ◽  
Author(s):  
Wei Zhang ◽  
Ping Tang ◽  
Lijun Zhao

Remote sensing image scene classification is one of the most challenging problems in understanding high-resolution remote sensing images. Deep learning techniques, especially the convolutional neural network (CNN), have improved the performance of remote sensing image scene classification due to their powerful feature learning and reasoning. However, several fully connected layers are always added to the end of CNN models, which is not efficient in capturing the hierarchical structure of the entities in the images and does not fully consider the spatial information that is important to classification. Fortunately, the capsule network (CapsNet) has become an active area of classification research in the past two years. CapsNet is a novel network architecture that uses a group of neurons as a capsule, or vector, to replace a single neuron of a traditional neural network, and it can encode the properties and spatial information of features in an image to achieve equivariance. Motivated by this idea, this paper proposes an effective remote sensing image scene classification architecture named CNN-CapsNet to make full use of the merits of these two models. First, a CNN without fully connected layers is used as an initial feature-map extractor; specifically, a deep CNN model pretrained on the ImageNet dataset is selected as the feature extractor in this paper. Then, the initial feature maps are fed into a newly designed CapsNet to obtain the final classification result. The proposed architecture is extensively evaluated on three public challenging benchmark remote sensing image datasets: the UC Merced Land-Use dataset with 21 scene categories, the AID dataset with 30 scene categories, and the NWPU-RESISC45 dataset with 45 challenging scene categories. The experimental results demonstrate that the proposed method achieves competitive classification performance compared with state-of-the-art methods.
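The capsule idea of encoding properties as vector length and direction rests on the standard CapsNet squashing nonlinearity, which maps a capsule's length into [0, 1) while preserving its direction. A numpy sketch:

```python
import numpy as np

def squash(v, eps=1e-9):
    """CapsNet squashing nonlinearity, applied along the last axis:
    squash(v) = (|v|^2 / (1 + |v|^2)) * v / |v|.
    The output keeps v's direction, but its length lies in [0, 1) so it can
    act as the probability that the entity the capsule represents exists."""
    norm2 = np.sum(v ** 2, axis=-1, keepdims=True)
    return (norm2 / (1.0 + norm2)) * v / np.sqrt(norm2 + eps)

caps = squash(np.random.randn(10, 8))          # 10 capsules of dimension 8
lengths = np.linalg.norm(caps, axis=-1)
print(bool(lengths.max() < 1.0))  # True
```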


2019 ◽  
Vol 11 (20) ◽  
pp. 2414 ◽  
Author(s):  
Guangyao Shi ◽  
Hong Huang ◽  
Jiamin Liu ◽  
Zhengying Li ◽  
Lihua Wang

Hyperspectral images (HSI) possess abundant spectral bands and rich spatial information, which can be utilized to discriminate different types of land cover. However, the high-dimensional characteristics of spatial-spectral information commonly cause the Hughes phenomenon. Traditional feature learning methods can reduce the dimensionality of HSI data and preserve the useful intrinsic information, but they ignore the multi-manifold structure in hyperspectral images. In this paper, a novel dimensionality reduction (DR) method called spatial-spectral multiple manifold discriminant analysis (SSMMDA) is proposed for HSI classification. At first, several subsets are obtained from the HSI data according to the prior label information. Then, a spectral-domain intramanifold graph is constructed for each submanifold to preserve the local neighborhood structure, and a spatial-domain intramanifold scatter matrix and a spatial-domain intermanifold scatter matrix are constructed for each submanifold to characterize the within-manifold compactness and the between-manifold separability, respectively. Finally, a spatial-spectral combined objective function is designed for each submanifold to obtain an optimal projection, and the discriminative features on different submanifolds are fused to improve the classification performance of HSI data. SSMMDA can explore spatial-spectral combined information and reveal the intrinsic multi-manifold structure in HSI. Experiments on three public HSI data sets demonstrate that the proposed SSMMDA method achieves better classification accuracies than many state-of-the-art methods.
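The within-manifold compactness and between-manifold separability terms above are built from the classical within-class and between-class scatter matrices of discriminant analysis, which can be computed directly (a sketch of the building blocks, not the full SSMMDA objective):

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class scatter Sw (compactness) and between-class scatter Sb
    (separability) for data X of shape (N, D) with labels y of shape (N,).
    Their sum equals the total scatter matrix St."""
    mean = X.mean(axis=0)
    D = X.shape[1]
    Sw = np.zeros((D, D))
    Sb = np.zeros((D, D))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)             # spread within class c
        diff = (mc - mean)[:, None]
        Sb += Xc.shape[0] * (diff @ diff.T)       # class mean vs global mean
    return Sw, Sb

X = np.random.rand(30, 4)
y = np.repeat([0, 1, 2], 10)
Sw, Sb = scatter_matrices(X, y)
St = (X - X.mean(axis=0)).T @ (X - X.mean(axis=0))
print(np.allclose(Sw + Sb, St))  # True: Sw + Sb = St
```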


2020 ◽  
Vol 64 (3) ◽  
pp. 30503-1-30503-11
Author(s):  
Li Han ◽  
Bing Yu ◽  
Jingyu Piao ◽  
Yuning Tong ◽  
Pengyan Lan ◽  
...  

Abstract In order to solve the issues of inadequate feature description and inefficient feature learning models in current classification methods, this article proposes a multi-channel joint sparse learning model for three-dimensional (3D) non-rigid object classification. First, the authors adopt a multi-level measurement of intrinsic properties to create complementary shape descriptors. Second, they build independent and informative bag-of-features (BoF) representations by embedding these shape descriptors into the visual vocabulary space. Third, a max-dependency and min-redundancy criterion based on mutual information is applied for optimal feature filtering on each BoF dictionary; meanwhile, each dictionary is learned and weighted according to its contribution to the classification task, and a compact multi-channel joint sparse learning model is then constructed. Finally, the authors train the joint sparse learning model followed by a Softmax classifier to implement efficient shape classification. The experimental results show that the proposed method has stronger feature representation ability and greatly improves the discrimination of sparse coding coefficients. Thus, promising classification performance and strong robustness are obtained compared to state-of-the-art methods.
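The max-dependency / min-redundancy (mRMR) filtering criterion can be sketched as a greedy loop: repeatedly pick the feature with the highest mutual information with the labels minus its mean mutual information with already-selected features. The histogram-based MI estimator and greedy form below are a common mRMR sketch, not the paper's exact implementation.

```python
import numpy as np

def mutual_info(a, b, bins=8):
    """Mutual information between two (discretized) feature vectors,
    estimated from their joint histogram."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / (px[:, None] * py[None, :])[nz])))

def mrmr_select(X, y, k):
    """Greedy mRMR: maximize relevance MI(f, y) minus mean redundancy
    MI(f, selected) at each step. X: (N, D) features, y: (N,) labels."""
    relevance = [mutual_info(X[:, j], y) for j in range(X.shape[1])]
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            redundancy = np.mean([mutual_info(X[:, j], X[:, s]) for s in selected])
            if relevance[j] - redundancy > best_score:
                best, best_score = j, relevance[j] - redundancy
        selected.append(best)
    return selected

X = np.random.default_rng(1).random((200, 6))
y = (X[:, 2] > 0.5).astype(float)        # label driven entirely by feature 2
sel = mrmr_select(X, y, 3)
print(sel[0])  # 2: the relevant feature is picked first
```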


Author(s):  
Yuejun Liu ◽  
Yifei Xu ◽  
Xiangzheng Meng ◽  
Xuguang Wang ◽  
Tianxu Bai

Background: Medical imaging plays an important role in the diagnosis of thyroid diseases. In the field of machine learning, multi-dimensional deep learning algorithms are widely used in image classification and recognition and have achieved great success. Objective: A method based on multi-dimensional deep learning is employed for the auxiliary diagnosis of thyroid diseases from SPECT images, and the performances of different deep learning models are evaluated and compared. Methods: Thyroid SPECT images of three types are collected: hyperthyroidism, normal, and hypothyroidism. In pre-processing, the thyroid region of interest is segmented and the data samples are augmented to expand the dataset. Four deep learning models, namely a standard CNN, Inception, VGG16, and RNN, are used to evaluate deep learning methods. Results: The deep learning-based methods show good classification performance, with accuracy of 92.9%-96.2% and AUC of 97.8%-99.6%. The VGG16 model performs best, with an accuracy of 96.2% and an AUC of 99.6%; in particular, the VGG16 model with a decaying learning rate works best. Conclusion: The standard CNN, Inception, VGG16, and RNN deep learning models are effective for the classification of thyroid diseases from SPECT images. The accuracy of the assisted diagnostic method based on deep learning is higher than that of other methods reported in the literature.
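The sample-expansion step in the pre-processing can be illustrated with simple geometric augmentation, a common way to enlarge a small medical imaging dataset before CNN training. The study does not specify its transforms, so the flips and rotation below are illustrative assumptions.

```python
import numpy as np

def augment(images):
    """Expand an image set four-fold with simple geometric transforms:
    the original, horizontal flip, vertical flip, and a 90-degree rotation."""
    out = []
    for img in images:
        out.extend([img,
                    np.fliplr(img),
                    np.flipud(img),
                    np.rot90(img)])
    return out

imgs = [np.random.rand(64, 64) for _ in range(10)]
aug = augment(imgs)
print(len(aug))  # 40: each image yields four samples
```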

