Auto Encoder Feature Learning with Utilization of Local Spatial Information and Data Distribution for Classification of PolSAR Image

2019 ◽  
Vol 11 (11) ◽  
pp. 1313
Author(s):  
Biao Hou ◽  
Jianlong Wang ◽  
Licheng Jiao ◽  
Shuang Wang

The distribution of the data plays a key role in the design of a machine learning model. This paper therefore proposes a novel autoencoder network based on the distribution of the polarimetric synthetic aperture radar (PolSAR) data matrix. Designed specifically for the PolSAR data matrix, the proposed mixture autoencoder (MAE) feature learning method defines the data-error term in the loss function according to the data distribution. Instead of the pixel itself, all pixels in its neighborhood are used as input to train the MAE. A corresponding classification network is then obtained by discarding the decoder of the MAE and appending a Softmax classifier. The MAE is trained on unlabeled data, while the classification network is trained with the help of a small number of labeled pixels. To address misclassification in the predicted result image, two post-processing steps operating on the local spatial neighborhood are also given, implemented by two proposed filters. Extensive experiments with four methods, including the proposed classification network, were conducted on three real PolSAR images. The experimental results show that introducing the data distribution into the autoencoder network leads to an average 4% improvement in overall accuracy across the three PolSAR images. Moreover, the post-processing steps with the proposed filters bring a new level of discrimination to the classification of PolSAR images.
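The neighborhood-as-input idea above can be sketched in a few lines of numpy: every pixel's k × k neighborhood is flattened into one training vector for the autoencoder. The (H, W, C) real-valued channel layout of the PolSAR matrix is an assumption made for illustration, not the paper's exact data representation.

```python
import numpy as np

def neighborhood_inputs(cov, k=3):
    """Collect the k x k neighborhood of every pixel as one flat input vector.

    cov: (H, W, C) real-valued representation of the per-pixel PolSAR
    coherency/covariance matrix (C channels). Returns an (H*W, k*k*C) array,
    one row per pixel, mirroring the idea of feeding all neighborhood pixels,
    rather than the pixel alone, to the mixture autoencoder.
    """
    pad = k // 2
    padded = np.pad(cov, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    H, W, C = cov.shape
    rows = []
    for i in range(H):
        for j in range(W):
            patch = padded[i:i + k, j:j + k, :]   # k x k neighborhood window
            rows.append(patch.ravel())
    return np.array(rows)

X = neighborhood_inputs(np.random.rand(8, 8, 6), k=3)
print(X.shape)  # (64, 54)
```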

2014 ◽  
Vol 527 ◽  
pp. 339-342
Author(s):  
Zhi Yuan Liu ◽  
Jin He ◽  
Jin Long Wang ◽  
Fei Zhao

In order to make full use of the spatial information of images in natural scene classification, we use the spatial partition model. However, mechanical space division misuses spatial information, so the spatial partition model must be properly improved so that different categories of images become more distinguishable, thereby improving classification performance. In addition, to further improve performance, we use FAN-SIFT as the local image feature. Experiments on the 8-scenes image dataset and the Caltech101 dataset show that the improved model obtains better classification performance.
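The classic spatial partition model the abstract builds on can be sketched as a spatial pyramid: each level splits the image into an l × l grid and histograms the visual words per cell, so spatial layout survives pooling. The grid sizes and vocabulary size below are illustrative, not the paper's settings.

```python
import numpy as np

def spatial_pyramid_histogram(labels, levels=(1, 2, 4), n_words=20):
    """Concatenate per-cell visual-word histograms over a spatial pyramid.

    labels: (H, W) array of visual-word indices in [0, n_words). Each level l
    partitions the image into an l x l grid; each cell's normalized histogram
    is appended, preserving coarse spatial information.
    """
    H, W = labels.shape
    feats = []
    for l in levels:
        for gi in range(l):
            for gj in range(l):
                cell = labels[gi * H // l:(gi + 1) * H // l,
                              gj * W // l:(gj + 1) * W // l]
                hist, _ = np.histogram(cell, bins=n_words, range=(0, n_words))
                feats.append(hist / max(cell.size, 1))
    return np.concatenate(feats)

f = spatial_pyramid_histogram(np.random.randint(0, 20, (16, 16)))
print(f.shape)  # (420,) = (1 + 4 + 16) cells x 20 words
```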


Sensors ◽  
2021 ◽  
Vol 21 (21) ◽  
pp. 7392
Author(s):  
Danish Nazir ◽  
Muhammad Zeshan Afzal ◽  
Alain Pagani ◽  
Marcus Liwicki ◽  
Didier Stricker

In this paper, we present the idea of self-supervised learning for the shape completion and classification of point clouds. Most 3D shape completion pipelines utilize AutoEncoders to extract features from point clouds for downstream tasks such as classification, segmentation, detection, and other related applications. Our idea is to add contrastive learning to the AutoEncoder to encourage global feature learning across point cloud classes, performed by optimizing a triplet loss. Furthermore, local feature representation learning of the point cloud is performed by adding the Chamfer distance function. To evaluate the performance of our approach, we utilize the PointNet classifier. We also extend the number of evaluation classes from 4 to 10 to show the generalization ability of the learned features. Based on our results, embeddings generated from the contrastive AutoEncoder improve point cloud shape completion and classification performance from 84.2% to 84.9%, achieving state-of-the-art results with 10 classes.
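The two objectives the abstract combines are standard and easy to state in numpy: a hinge triplet loss for global class structure and a symmetric Chamfer distance for local reconstruction. This is a minimal sketch of the loss terms, not the paper's full training pipeline.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on embedding vectors: pull same-class embeddings
    together and push different-class ones at least `margin` apart."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point clouds p (N, 3) and q (M, 3):
    mean nearest-neighbor squared distance in both directions, which drives
    the decoder to reconstruct local point-cloud structure."""
    d = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)  # (N, M)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

cloud = np.random.rand(64, 3)
print(chamfer_distance(cloud, cloud))  # 0.0 for identical clouds
```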


2021 ◽  
Vol 11 (23) ◽  
pp. 11136
Author(s):  
Zenebe Markos Lonseko ◽  
Prince Ebenezer Adjei ◽  
Wenju Du ◽  
Chengsi Luo ◽  
Dingcan Hu ◽  
...  

Gastrointestinal (GI) diseases constitute a leading problem in the human digestive system. Consequently, several studies have explored automatic classification of GI diseases as a means of minimizing the burden on clinicians and improving patient outcomes, for both diagnostic and treatment purposes. The challenge in using deep learning-based (DL) approaches, specifically a convolutional neural network (CNN), is that spatial information is not fully utilized due to the inherent mechanism of CNNs. This paper proposes the application of spatial factors to improve classification performance. Specifically, we propose a deep CNN-based spatial attention mechanism for the classification of GI diseases, implemented with encoder-decoder layers. To overcome the data imbalance problem, we adopt data-augmentation techniques. A total of 12,147 multi-sited, multi-diseased GI images, drawn from publicly available and private sources, were used to validate the proposed approach. Furthermore, a five-fold cross-validation approach was adopted to minimize inconsistencies in intra- and inter-class variability and to ensure that results were robustly assessed. Our results, compared with other state-of-the-art models in terms of mean accuracy (ResNet50 = 90.28, GoogLeNet = 91.38, DenseNets = 91.60, and baseline = 92.84), demonstrated better outcomes (Precision = 92.8, Recall = 92.7, F1-score = 92.8, and Accuracy = 93.19). We also implemented t-distributed stochastic neighbor embedding (t-SNE) and confusion matrix analysis techniques for better visualization and performance validation. Overall, the results showed that the attention mechanism improved the automatic classification of multi-sited GI disease images. By overcoming previous limitations, the proposed method provides a basis for clinical validation, with the goal of further improving automatic classification accuracy in future work.
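A spatial attention gate of the kind described above can be sketched simply: channel-wise pooling produces a per-location score map, a sigmoid turns it into weights in (0, 1), and the weights re-scale the feature map. The pooling recipe below is illustrative; the paper's attention is learned inside encoder-decoder layers.

```python
import numpy as np

def spatial_attention(fmap):
    """Simplified spatial attention over a feature map fmap of shape (H, W, C).

    Channel-wise average and max pooling give two (H, W) maps; their sum is
    squashed by a sigmoid into per-location weights that re-scale all channels
    at that location, emphasizing informative spatial regions.
    """
    avg = fmap.mean(axis=-1)
    mx = fmap.max(axis=-1)
    attn = 1.0 / (1.0 + np.exp(-(avg + mx)))   # sigmoid -> weights in (0, 1)
    return fmap * attn[..., None], attn

out, attn = spatial_attention(np.random.rand(32, 32, 8))
print(out.shape, attn.shape)  # (32, 32, 8) (32, 32)
```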


2020 ◽  
Vol 34 (04) ◽  
pp. 5387-5394
Author(s):  
Hao Peng ◽  
Jianxin Li ◽  
Qiran Gong ◽  
Yuanxin Ning ◽  
Senzhang Wang ◽  
...  

Graph classification is critically important to many real-world applications associated with graph data, such as chemical drug analysis and social network mining. Traditional methods usually require feature engineering to extract the graph features that can help discriminate the graphs of different classes. Although deep learning-based graph embedding approaches have recently been proposed to automatically learn graph features, they mostly use a few vertex arrangements extracted from the graph for feature learning, which may lose structural information. In this work, we present a novel motif-based attentional graph convolution neural network for graph classification, which can learn more discriminative and richer graph features. Specifically, a motif-matching guided subgraph normalization method is developed to better preserve spatial information. A novel subgraph-level self-attention network is also proposed to capture the different impacts or weights of different subgraphs. Experimental results on both bioinformatics and social network datasets show that the proposed models significantly improve graph classification performance over both traditional graph kernel methods and recent deep learning approaches.
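Subgraph-level self-attention can be sketched as a softmax-weighted sum over subgraph embeddings, so that more informative subgraphs contribute more to the graph-level representation. The context vector below is random for illustration; in the paper's attention network the scoring is learned.

```python
import numpy as np

def subgraph_attention(subgraph_embs):
    """Weight subgraph embeddings by softmax attention scores.

    subgraph_embs: (S, D) matrix, one row per extracted subgraph. Scores come
    from a dot product with a context vector followed by a softmax; the
    weighted sum is the graph-level embedding.
    """
    S, D = subgraph_embs.shape
    context = np.random.default_rng(0).standard_normal(D)  # stand-in for a learned vector
    scores = subgraph_embs @ context
    scores -= scores.max()                        # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    graph_emb = weights @ subgraph_embs           # weighted sum -> (D,)
    return graph_emb, weights

emb, w = subgraph_attention(np.random.rand(5, 16))
print(emb.shape, round(float(w.sum()), 6))  # (16,) 1.0
```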


2019 ◽  
Vol 11 (9) ◽  
pp. 1038 ◽  
Author(s):  
Ruichuan Wang ◽  
Yanfei Wang

Polarimetric synthetic aperture radar (PolSAR) has become increasingly popular in the past two decades, for it can derive multichannel features of ground objects that contain more discriminative information than traditional SAR. In this paper, a neural nonlocal stacked sparse autoencoder with virtual adversarial training (NNSSAE-VAT) is proposed for PolSAR image classification. The NNSSAE first extracts nonlocal features by calculating the pairwise similarity of each pixel with its surrounding pixels using a neural network that contains a multiscale feature extractor and a linear embedding layer. This feature extraction process can relieve the negative influence of speckle noise and extract discriminative nonlocal spatial information without carefully designed parameters. Then, the SSAE maps the center pixel and the extracted nonlocal features into a deep latent space, in which a Softmax classifier is utilized to conduct classification. Virtual adversarial training is introduced to regularize the network and keep it from overfitting. The experimental results on three real PolSAR images show that the proposed NNSSAE-VAT method is robust and effective and achieves competitive performance compared with related methods.
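The nonlocal weighting idea can be illustrated in numpy: each neighbor in a window is weighted by its softmax similarity to the center pixel, so similar pixels dominate and speckle outliers are down-weighted. The squared-distance similarity below is a stand-in; the paper computes the similarity with a learned multiscale network.

```python
import numpy as np

def nonlocal_feature(patch):
    """Aggregate a (k, k, C) neighborhood around its center pixel by similarity.

    Each neighbor gets a softmax weight from the negative squared distance to
    the center pixel's feature vector; the output is the weighted mean, a
    speckle-robust nonlocal feature of dimension C.
    """
    k, _, C = patch.shape
    center = patch[k // 2, k // 2]
    flat = patch.reshape(-1, C)
    scores = -np.sum((flat - center) ** 2, axis=1)
    scores -= scores.max()                         # numerical stability
    w = np.exp(scores) / np.exp(scores).sum()
    return w @ flat                                # (C,) nonlocal feature

feat = nonlocal_feature(np.random.rand(5, 5, 6))
print(feat.shape)  # (6,)
```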


2019 ◽  
Vol 11 (5) ◽  
pp. 494 ◽  
Author(s):  
Wei Zhang ◽  
Ping Tang ◽  
Lijun Zhao

Remote sensing image scene classification is one of the most challenging problems in understanding high-resolution remote sensing images. Deep learning techniques, especially the convolutional neural network (CNN), have improved the performance of remote sensing image scene classification due to their powerful feature learning and reasoning. However, several fully connected layers are always added to the end of CNN models, which is not efficient in capturing the hierarchical structure of the entities in the images and does not fully consider the spatial information that is important to classification. Fortunately, the capsule network (CapsNet) has become an active area of classification research in the past two years. CapsNet is a novel network architecture that uses a group of neurons as a capsule, or vector, to replace a single neuron of a traditional neural network, and it can encode the properties and spatial information of features in an image to achieve equivariance. Motivated by this idea, this paper proposes an effective remote sensing image scene classification architecture named CNN-CapsNet to make full use of the merits of these two models. First, a CNN without fully connected layers is used as an initial feature-map extractor; specifically, a deep CNN model pretrained on the ImageNet dataset is selected as the feature extractor in this paper. Then, the initial feature maps are fed into a newly designed CapsNet to obtain the final classification result. The proposed architecture is extensively evaluated on three public challenging benchmark remote sensing image datasets: the UC Merced Land-Use dataset with 21 scene categories, the AID dataset with 30 scene categories, and the NWPU-RESISC45 dataset with 45 challenging scene categories. The experimental results demonstrate that the proposed method achieves competitive classification performance compared with state-of-the-art methods.
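The capsule idea of encoding properties as vector length and direction rests on the standard CapsNet squashing nonlinearity, which maps a capsule's length into [0, 1) while preserving its direction. A numpy sketch:

```python
import numpy as np

def squash(v, eps=1e-9):
    """CapsNet squashing nonlinearity, applied along the last axis:
    squash(v) = (|v|^2 / (1 + |v|^2)) * v / |v|.
    The output keeps v's direction, but its length lies in [0, 1) so it can
    act as the probability that the entity the capsule represents exists."""
    norm2 = np.sum(v ** 2, axis=-1, keepdims=True)
    return (norm2 / (1.0 + norm2)) * v / np.sqrt(norm2 + eps)

caps = squash(np.random.randn(10, 8))          # 10 capsules of dimension 8
lengths = np.linalg.norm(caps, axis=-1)
print(bool(lengths.max() < 1.0))  # True
```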


2019 ◽  
Vol 11 (20) ◽  
pp. 2414 ◽  
Author(s):  
Guangyao Shi ◽  
Hong Huang ◽  
Jiamin Liu ◽  
Zhengying Li ◽  
Lihua Wang

Hyperspectral images (HSI) possess abundant spectral bands and rich spatial information, which can be utilized to discriminate different types of land cover. However, the high-dimensional characteristics of spatial-spectral information commonly cause the Hughes phenomenon. Traditional feature learning methods can reduce the dimensionality of HSI data and preserve the useful intrinsic information, but they ignore the multi-manifold structure in hyperspectral images. In this paper, a novel dimensionality reduction (DR) method called spatial-spectral multiple manifold discriminant analysis (SSMMDA) is proposed for HSI classification. At first, several subsets are obtained from the HSI data according to the prior label information. Then, a spectral-domain intramanifold graph is constructed for each submanifold to preserve the local neighborhood structure, and a spatial-domain intramanifold scatter matrix and a spatial-domain intermanifold scatter matrix are constructed for each submanifold to characterize the within-manifold compactness and the between-manifold separability, respectively. Finally, a spatial-spectral combined objective function is designed for each submanifold to obtain an optimal projection, and the discriminative features on different submanifolds are fused to improve the classification performance of HSI data. SSMMDA can explore spatial-spectral combined information and reveal the intrinsic multi-manifold structure in HSI. Experiments on three public HSI data sets demonstrate that the proposed SSMMDA method achieves better classification accuracies than many state-of-the-art methods.
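The within-manifold compactness and between-manifold separability terms above are built from the classical within-class and between-class scatter matrices of discriminant analysis, which can be computed directly (a sketch of the building blocks, not the full SSMMDA objective):

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class scatter Sw (compactness) and between-class scatter Sb
    (separability) for data X of shape (N, D) with labels y of shape (N,).
    Their sum equals the total scatter matrix St."""
    mean = X.mean(axis=0)
    D = X.shape[1]
    Sw = np.zeros((D, D))
    Sb = np.zeros((D, D))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)             # spread within class c
        diff = (mc - mean)[:, None]
        Sb += Xc.shape[0] * (diff @ diff.T)       # class mean vs global mean
    return Sw, Sb

X = np.random.rand(30, 4)
y = np.repeat([0, 1, 2], 10)
Sw, Sb = scatter_matrices(X, y)
St = (X - X.mean(axis=0)).T @ (X - X.mean(axis=0))
print(np.allclose(Sw + Sb, St))  # True: Sw + Sb = St
```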


2020 ◽  
Vol 64 (3) ◽  
pp. 30503-1-30503-11
Author(s):  
Li Han ◽  
Bing Yu ◽  
Jingyu Piao ◽  
Yuning Tong ◽  
Pengyan Lan ◽  
...  

Abstract In order to solve the issues of inadequate feature description and inefficient feature learning models in current classification methods, this article proposes a multi-channel joint sparse learning model for three-dimensional (3D) non-rigid object classification. First, the authors adopt a multi-level measurement of intrinsic properties to create complementary shape descriptors. Second, they build independent and informative bag-of-features (BoF) representations by embedding these shape descriptors into the visual vocabulary space. Third, a max-dependency and min-redundancy criterion based on mutual information is applied for optimal feature filtering on each BoF dictionary; meanwhile, each dictionary is learned and weighted according to its contribution to the classification task, and a compact multi-channel joint sparse learning model is then constructed. Finally, the authors train the joint sparse learning model followed by a Softmax classifier to implement efficient shape classification. The experimental results show that the proposed method has stronger feature representation ability and greatly improves the discrimination of sparse coding coefficients. Thus, promising classification performance and strong robustness are obtained compared to state-of-the-art methods.
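The max-dependency / min-redundancy (mRMR) filtering criterion can be sketched as a greedy loop: repeatedly pick the feature with the highest mutual information with the labels minus its mean mutual information with already-selected features. The histogram-based MI estimator and greedy form below are a common mRMR sketch, not the paper's exact implementation.

```python
import numpy as np

def mutual_info(a, b, bins=8):
    """Mutual information between two (discretized) feature vectors,
    estimated from their joint histogram."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / (px[:, None] * py[None, :])[nz])))

def mrmr_select(X, y, k):
    """Greedy mRMR: maximize relevance MI(f, y) minus mean redundancy
    MI(f, selected) at each step. X: (N, D) features, y: (N,) labels."""
    relevance = [mutual_info(X[:, j], y) for j in range(X.shape[1])]
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            redundancy = np.mean([mutual_info(X[:, j], X[:, s]) for s in selected])
            if relevance[j] - redundancy > best_score:
                best, best_score = j, relevance[j] - redundancy
        selected.append(best)
    return selected

X = np.random.default_rng(1).random((200, 6))
y = (X[:, 2] > 0.5).astype(float)        # label driven entirely by feature 2
sel = mrmr_select(X, y, 3)
print(sel[0])  # 2: the relevant feature is picked first
```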


Author(s):  
Yuejun Liu ◽  
Yifei Xu ◽  
Xiangzheng Meng ◽  
Xuguang Wang ◽  
Tianxu Bai

Background: Medical imaging plays an important role in the diagnosis of thyroid diseases. In the field of machine learning, multi-dimensional deep learning algorithms are widely used in image classification and recognition and have achieved great success. Objective: A method based on multi-dimensional deep learning is employed for the auxiliary diagnosis of thyroid diseases from SPECT images, and the performances of different deep learning models are evaluated and compared. Methods: Thyroid SPECT images of three types are collected: hyperthyroidism, normal, and hypothyroidism. In pre-processing, the thyroid region of interest is segmented and the data samples are augmented to expand the dataset. Four deep learning models, namely a standard CNN, Inception, VGG16, and RNN, are used to evaluate deep learning methods. Results: The deep learning-based methods show good classification performance, with accuracy of 92.9%-96.2% and AUC of 97.8%-99.6%. The VGG16 model performs best, with an accuracy of 96.2% and an AUC of 99.6%; in particular, the VGG16 model with a decaying learning rate works best. Conclusion: The standard CNN, Inception, VGG16, and RNN deep learning models are effective for the classification of thyroid diseases from SPECT images. The accuracy of the assisted diagnostic method based on deep learning is higher than that of other methods reported in the literature.
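The sample-expansion step in the pre-processing can be illustrated with simple geometric augmentation, a common way to enlarge a small medical imaging dataset before CNN training. The study does not specify its transforms, so the flips and rotation below are illustrative assumptions.

```python
import numpy as np

def augment(images):
    """Expand an image set four-fold with simple geometric transforms:
    the original, horizontal flip, vertical flip, and a 90-degree rotation."""
    out = []
    for img in images:
        out.extend([img,
                    np.fliplr(img),
                    np.flipud(img),
                    np.rot90(img)])
    return out

imgs = [np.random.rand(64, 64) for _ in range(10)]
aug = augment(imgs)
print(len(aug))  # 40: each image yields four samples
```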

