Scene Recognition Using Deep Softpool Capsule Network Based on Residual Diverse Branch Block

Chunyuan Wang; Yang Wu; Yihan Wang; Yiping Chen

doi:10.3390/s21165575

Sensors ◽

10.3390/s21165575 ◽

2021 ◽

Vol 21 (16) ◽

pp. 5575

Author(s):

Chunyuan Wang ◽

Yang Wu ◽

Yihan Wang ◽

Yiping Chen

Keyword(s):

Image Recognition ◽

Recognition Performance ◽

Spatial Relationship ◽

Classification Performance ◽

Scene Recognition ◽

Feature Representation ◽

The Novel ◽

Multi Scale ◽

Complex Scenes ◽

Imaging Sensors

With the improvement in the quality and resolution of remote sensing (RS) images, scene recognition tasks have come to play an important role in the RS community. However, because imaging sensors acquire images from a bird’s-eye view, it is still challenging to construct a discriminative representation of diverse and complex scenes that improves RS image recognition performance. Capsule networks, which can learn the spatial relationships between the features in an image, achieve good image classification performance. However, the original capsule network is not suitable for images with a complex background. To address these issues, this paper proposes a novel end-to-end capsule network termed DS-CapsNet, which introduces a new multi-scale feature enhancement module and a new Caps-SoftPool method by aggregating the advantageous attributes of the residual convolution architecture, the Diverse Branch Block (DBB), the Squeeze and Excitation (SE) block, and Caps-SoftPool. By using the residual DBB, multi-scale features can be extracted and fused to recover a semantically strong feature representation. By adopting SE, the informative features are emphasized and the less salient features are weakened. The new Caps-SoftPool method reduces the number of parameters required, which helps prevent over-fitting. The novel DS-CapsNet achieves competitive and promising performance for RS image recognition by using a high-quality and robust capsule representation. Extensive experiments on two challenging datasets, AID and NWPU-RESISC45, demonstrate the robustness and superiority of the proposed DS-CapsNet in scene recognition tasks.
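
As a rough illustration of the pooling idea suggested by the name, the sketch below implements a generic one-dimensional SoftPool-style operation in plain Python: each window is reduced to a softmax-weighted sum of its own values, so larger activations dominate without the smaller ones being discarded. It assumes Caps-SoftPool follows this general idea; the abstract does not give the authors' formulation, so this is not their method. The function name `softpool_1d` and its parameters are hypothetical.

```python
import math


def softpool_1d(values, window, stride=None):
    """Generic SoftPool-style pooling over a flat list of activations.

    Each window is replaced by sum_i softmax(a)_i * a_i, i.e. a weighted
    average whose weights grow exponentially with the activation value.
    Illustrative only; not the Caps-SoftPool method from the paper.
    """
    stride = stride or window  # default: non-overlapping windows
    pooled = []
    for start in range(0, len(values) - window + 1, stride):
        region = values[start:start + window]
        peak = max(region)  # subtract the max for numerical stability
        weights = [math.exp(v - peak) for v in region]
        total = sum(weights)
        pooled.append(sum(w * v for w, v in zip(weights, region)) / total)
    return pooled


# A window of equal values pools to that value; otherwise the result
# lies between the window's mean and its maximum.
print(softpool_1d([1.0, 1.0, 0.0, 2.0], window=2))
```

The output has one value per window, so a window of 2 halves the length of the input, which is the sense in which pooling reduces the size of what later layers must process.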

Enhanced biologically inspired model for image recognition based on a novel patch selection method with moment

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691319400071 ◽

2019 ◽

Vol 17 (02) ◽

pp. 1940007

Author(s):

Yanfeng Lu ◽

Lihao Jia ◽

Hong Qiao ◽

Yi Li ◽

Zongshuai Qi

Keyword(s):

Image Recognition ◽

Gabor Filter ◽

Layer Structure ◽

Experimental Studies ◽

Selection Method ◽

Feature Representation ◽

Patch Selection ◽

Biologically Inspired ◽

Multi Scale ◽

Random Method

The biologically inspired model (BIM) for image recognition is a robust computational architecture that has attracted widespread attention. BIM can be described as a four-layer structure based on the mechanisms of the visual cortex. Although BIM performs robustly in image recognition, it selects patches at random, which is a blind process and results in a heavy computational burden. To address this issue, we propose a novel patch selection method with oriented Gaussian–Hermite moment (PSGHM) and enhance the BIM with it; the resulting model is named PBIM. In contrast to the conventional BIM, which randomly selects patches within the feature representation layers processed by multi-scale Gabor filter banks, the proposed PBIM uses PSGHM to extract a small number of representative features while offering promising distinctiveness. To show the effectiveness of the proposed PBIM, experimental studies on object categorization are conducted on the CalTech05, TU Darmstadt (TUD) and GRAZ01 databases. Experimental results demonstrate that PBIM performs significantly better than the conventional BIM.

Rice Disease Image Recognition Based on Improved Multi-scale Stack Autoencoder

Journal of Agricultural Science ◽

10.5539/jas.v13n1p18 ◽

2020 ◽

Vol 13 (1) ◽

pp. 18

Author(s):

Jun Meng ◽

Xingchen Lv ◽

Lifang Fu ◽

Qiufeng Wu

Keyword(s):

Image Recognition ◽

Extraction Method ◽

Recognition Accuracy ◽

Classification Performance ◽

Identification Performance ◽

Feature Extraction Method ◽

Multi Scale ◽

Stacked Autoencoder ◽

Rice Disease ◽

Crop Disease

Recently, deep learning methods have been widely used in rice disease identification. However, because the background of real rice disease images is complex, the classification performance is not ideal. Therefore, this paper proposes a multi-scale feature extraction method based on the stacked autoencoder, named the multi-scale stacked autoencoder (MSSAE), to improve the recognition accuracy of rice diseases. This method extracts the features of complex rice disease images in two steps. First, the images are preprocessed. Then, the MSSAE extracts multi-scale features from the preprocessed rice disease data at different scales. In comparative experiments, the new method achieved greater than 95% precision in the detection of rice diseases, indicating that the MSSAE model has outstanding identification performance for real crop disease image recognition.

Ship Classification Based on Attention Mechanism and Multi-Scale Convolutional Neural Network for Visible and Infrared Images

Electronics ◽

10.3390/electronics9122022 ◽

2020 ◽

Vol 9 (12) ◽

pp. 2022

Author(s):

Yongmei Ren ◽

Jie Yang ◽

Zhiqiang Guo ◽

Qingnian Zhang ◽

Hui Cao

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Classification Performance ◽

Attention Mechanism ◽

Feature Representation ◽

Infrared Images ◽

Visible Image ◽

Multi Scale ◽

Ship Classification ◽

Fusion Features

Visible image quality is very susceptible to changes in illumination, and there are limitations in ship classification using images acquired by a single sensor. This study proposes a ship classification method based on an attention mechanism and a multi-scale convolutional neural network (MSCNN) for visible and infrared images. First, the features of visible and infrared images are extracted by a two-stream symmetric multi-scale convolutional neural network module and then concatenated to make full use of the complementary features present in multi-modal images. After that, the attention mechanism is applied to the concatenated fusion features to emphasize local detail areas in the feature map, aiming to further improve the feature representation capability of the model. Lastly, the attention weights and the original concatenated fusion features are added element by element and fed into fully connected layers and a Softmax output layer for the final classification output. The effectiveness of the proposed method is verified on the visible and infrared spectra (VAIS) dataset, on which it achieves 93.81% classification accuracy. Compared with other state-of-the-art methods, the proposed method extracts features more effectively and has better overall classification performance.

Aggregation-and-Attention Network for brain tumor segmentation

BMC Medical Imaging ◽

10.1186/s12880-021-00639-8 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Chih-Wei Lin ◽

Yu Hong ◽

Jinfu Liu

Keyword(s):

Brain Tumor ◽

Semantic Information ◽

Spatial Relationship ◽

Tumor Segmentation ◽

Computer Assisted ◽

Brain Tumor Segmentation ◽

Attention Network ◽

Multi Scale ◽

Assisted Diagnosis ◽

The Brain

Abstract. Background: Glioma is a malignant brain tumor; its location is complex, and it is difficult to remove surgically. Doctors can precisely diagnose and localize the disease using medical images. However, computer-assisted diagnosis of brain tumors remains a problem, because coarse segmentation of the tumor leads to incorrect internal grading of the tumor. Methods: In this paper, we propose an Aggregation-and-Attention Network for brain tumor segmentation. The proposed network takes the U-Net as the backbone, aggregates multi-scale semantic information, and focuses on crucial information to perform brain tumor segmentation. To this end, we propose an enhanced down-sampling module and an Up-Sampling Layer to compensate for the information loss. The multi-scale connection module constructs the multi-receptive semantic fusion between the encoder and the decoder. Furthermore, we design a dual-attention fusion module that can extract and enhance the spatial relationships in magnetic resonance imaging, and we apply deep supervision in different parts of the proposed network. Results: Experimental results show that the proposed framework performs best on the BraTS2020 dataset compared with state-of-the-art networks. It surpasses all the comparison networks, and its average accuracies on the four indexes are 0.860, 0.885, 0.932, and 1.2325, respectively. Conclusions: The framework and its modules are scientific and practical; they can extract and aggregate useful semantic information and enhance the ability of glioma segmentation.

Novel Features and PRPD Image Denoising Method for Improved Single-Source Partial Discharges Classification in On-Line Hydro-Generators

Energies ◽

10.3390/en14113267 ◽

2021 ◽

Vol 14 (11) ◽

pp. 3267

Author(s):

Ramon C. F. Araújo ◽

Rodrigo M. S. de Oliveira ◽

Fernando S. Brasil ◽

Fabrício J. B. Barros

Keyword(s):

Image Denoising ◽

Recognition Performance ◽

Partial Discharge ◽

Classification Performance ◽

Partial Discharges ◽

Denoising Method ◽

Internal Parameters ◽

On Line ◽

Automatic Removal ◽

The Impact

In this paper, a novel image denoising algorithm and novel input features are proposed. The algorithm is applied to phase-resolved partial discharge (PRPD) diagrams with a single dominant partial discharge (PD) source, preparing them for automatic artificial-intelligence-based classification. It was designed to mitigate several sources of distortions often observed in PRPDs obtained from fully operational hydroelectric generators. The capabilities of the denoising algorithm are the automatic removal of sparse noise and the suppression of non-dominant discharges, including those due to crosstalk. The input features are functions of PD distributions along amplitude and phase, which are calculated in a novel way to mitigate random effects inherent to PD measurements. The impact of the proposed contributions was statistically evaluated and compared to classification performance obtained using formerly published approaches. Higher recognition rates and reduced variances were obtained using the proposed methods, statistically outperforming autonomous classification techniques seen in earlier works. The values of the algorithm’s internal parameters are also validated by comparing the recognition performance obtained with different parameter combinations. All typical PD sources described in hydro-generators PD standards are considered and can be automatically detected.

Learning rich features with hybrid loss for brain tumor segmentation

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01431-y ◽

2021 ◽

Vol 21 (S2) ◽

Author(s):

Daobin Huang ◽

Minghui Wang ◽

Ling Zhang ◽

Haichun Li ◽

Minquan Ye ◽

...

Keyword(s):

Feature Extraction ◽

Brain Tumor ◽

Class Imbalance ◽

Feature Representation ◽

Loss Functions ◽

Radiotherapy Planning ◽

Tumor Segmentation ◽

Brain Tumor Segmentation ◽

Scale Feature ◽

Multi Scale

Abstract. Background: Accurately segmenting the tumor region in MRI images is important for brain tumor diagnosis and radiotherapy planning. At present, manual segmentation is widely adopted in clinical practice, and there is a strong need for an automatic and objective system to alleviate the workload of radiologists. Methods: We propose a parallel multi-scale feature fusing architecture to generate a rich feature representation for accurate brain tumor segmentation. It comprises two parts: (1) a Feature Extraction Network (FEN) for brain tumor feature extraction at different levels and (2) a Multi-scale Feature Fusing Network (MSFFN) for merging all the different scale features in a parallel manner. In addition, we use two hybrid loss functions to optimize the proposed network for the class imbalance issue. Results: We validate our method on BRATS 2015, obtaining Dice scores of 0.86, 0.73 and 0.61 for the three tumor regions (complete, core and enhancing), with a model parameter size of only 6.3 MB. Without any post-processing operations, our method still outperforms published state-of-the-art methods on the segmentation of complete tumor regions and obtains competitive performance in the other two regions. Conclusions: The proposed parallel structure can effectively fuse multi-level features to generate a rich feature representation for high-resolution results. Moreover, the hybrid loss functions can alleviate the class imbalance issue and guide the training process. The proposed method can be used in other medical segmentation tasks.
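
The Dice values reported in this abstract measure the overlap between a predicted mask and a reference mask. A minimal sketch of the standard Dice coefficient on flat binary masks is given below in plain Python; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details taken from the paper.

```python
def dice_score(pred, target):
    """Dice = 2 * |P intersect T| / (|P| + |T|) for flat binary masks (0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count positions where both masks mark the voxel as tumor.
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if denom == 0 else 2.0 * intersection / denom


# One overlapping voxel, two predicted, one in the reference: 2*1 / (2+1)
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

A score of 1.0 means the masks coincide and 0.0 means they do not overlap at all.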

A Dual-Model Architecture with Grouping-Attention-Fusion for Remote Sensing Scene Classification

Remote Sensing ◽

10.3390/rs13030433 ◽

2021 ◽

Vol 13 (3) ◽

pp. 433

Author(s):

Junge Shen ◽

Tong Zhang ◽

Yichen Wang ◽

Ruxin Wang ◽

Qi Wang ◽

...

Keyword(s):

Remote Sensing ◽

Feature Representation ◽

Dual Model ◽

Scene Classification ◽

Remote Sensing Images ◽

Single Model ◽

Fusion Strategy ◽

Multi Scale ◽

The Arts ◽

Scene Representation

Remote sensing images contain complex backgrounds and multi-scale objects, which make scene classification a challenging task. The performance is highly dependent on the capacity of the scene representation as well as the discriminability of the classifier. Although multiple models possess better properties than a single model in these respects, the fusion strategy for these models is a key component in maximizing the final accuracy. In this paper, we construct a novel dual-model architecture with a grouping-attention-fusion strategy to improve the performance of scene classification. Specifically, the model employs two different convolutional neural networks (CNNs) for feature extraction, and the grouping-attention-fusion strategy is used to fuse the features of the CNNs in a fine and multi-scale manner. In this way, the resultant feature representation of the scene is enhanced. Moreover, to address the issue of similar appearances between different scenes, we develop a loss function that encourages small intra-class diversities and large inter-class distances. Extensive experiments are conducted on four scene classification datasets: the UCM land-use dataset, the WHU-RS19 dataset, the AID dataset, and the OPTIMAL-31 dataset. The experimental results demonstrate the superiority of the proposed method in comparison with state-of-the-art methods.

A two-stage clustering-based cold-start method for active learning

Intelligent Data Analysis ◽

10.3233/ida-205393 ◽

2021 ◽

Vol 25 (5) ◽

pp. 1169-1185

Author(s):

Deniu He ◽

Hong Yu ◽

Guoyin Wang ◽

Jie Li

Keyword(s):

Active Learning ◽

Class Imbalance ◽

Imbalanced Data ◽

Cold Start ◽

Classification Performance ◽

The Novel ◽

Two Stage ◽

Minority Class ◽

Novel Method ◽

Multiple Clusters

This paper considers the problem of initializing active learning. In particular, it studies the problem in an imbalanced data scenario, which is called class-imbalance active learning cold-start. The novel method is two-stage clustering-based active learning cold-start (ALCS). In the first stage, to separate the instances of the minority class from those of the majority class, a multi-center clustering is constructed based on a new inter-cluster tightness measure, so that the data are grouped into multiple clusters. Then, in the second stage, the initial training instances are selected from each cluster based on an adaptive mechanism for determining candidate representative instances and a clusters-cyclic instance query mechanism. Comprehensive experiments demonstrate the effectiveness of the proposed method in terms of class coverage, classification performance, and impact on active learning.

RAMS-Trans: Recurrent Attention Multi-scale Transformer for Fine-grained Image Recognition

10.1145/3474085.3475561 ◽

2021 ◽

Author(s):

Yunqing Hu ◽

Xuan Jin ◽

Yin Zhang ◽

Haiwen Hong ◽

Jingfeng Zhang ◽

...

Keyword(s):

Image Recognition ◽

Fine Grained ◽

Multi Scale

A Study of Defect Detection Techniques for Metallographic Images

Sensors ◽

10.3390/s20195593 ◽

2020 ◽

Vol 20 (19) ◽

pp. 5593 ◽

Cited By ~ 2

Author(s):

Wei-Hung Wu ◽

Jen-Chun Lee ◽

Yi-Ming Wang

Keyword(s):

Neural Network ◽

Defect Detection ◽

Network Architecture ◽

Recognition Performance ◽

Metallographic Analysis ◽

Automatic Identification ◽

Detection Techniques ◽

Multi Scale ◽

Modified Method ◽

Metallographic Images

Metallography is the study of the structure of metals and alloys. Metallographic analysis can be regarded as a detection tool to assist in identifying a metal or alloy, to evaluate whether an alloy is processed correctly, to inspect multiple phases within a material, to locate and characterize imperfections such as voids or impurities, or to find the damaged areas of metallographic images. However, the defect detection of metallography is evaluated by human experts, and its automatic identification is still a challenge in almost every real solution. Deep learning has been applied to different problems in computer vision since the proposal of AlexNet in 2012. In this study, we propose a novel convolutional neural network architecture for metallographic analysis based on a modified residual neural network (ResNet). Multi-scale ResNet (M-ResNet), the modified method, improves efficiency by utilizing multi-scale operations for the accurate detection of objects of various sizes, especially small objects. The experimental results show that the proposed method yields an accuracy of 85.7% (mAP) in recognition performance, which is higher than existing methods. As a consequence, we propose a novel system for automatic defect detection as an application for metallographic analysis.