Scene Recognition Using Deep Softpool Capsule Network Based on Residual Diverse Branch Block

Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5575
Author(s):  
Chunyuan Wang ◽  
Yang Wu ◽  
Yihan Wang ◽  
Yiping Chen

With the improvement of the quality and resolution of remote sensing (RS) images, scene recognition tasks have come to play an important role in the RS community. However, due to the special bird’s-eye-view image acquisition mode of imaging sensors, it is still challenging to construct a discriminative representation of diverse and complex scenes to improve RS image recognition performance. Capsule networks, which can learn the spatial relationships between the features in an image, have good image classification performance. However, the original capsule network is not suitable for images with a complex background. To address the above issues, this paper proposes a novel end-to-end capsule network termed DS-CapsNet, in which a new multi-scale feature enhancement module and a new Caps-SoftPool method are advanced by aggregating the advantageous attributes of the residual convolution architecture, Diverse Branch Block (DBB), Squeeze and Excitation (SE) block, and the Caps-SoftPool method. By using the residual DBB, multi-scale features can be extracted and fused to recover a semantically strong feature representation. By adopting SE, the informative features are emphasized and the less salient features are weakened. The new Caps-SoftPool method reduces the number of parameters needed, which helps prevent over-fitting. The novel DS-CapsNet achieves competitive and promising performance for RS image recognition by using a high-quality and robust capsule representation. Extensive experiments on two challenging datasets, AID and NWPU-RESISC45, demonstrate the robustness and superiority of the proposed DS-CapsNet in scene recognition tasks.
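The abstract does not define Caps-SoftPool, so the following is only a minimal sketch of generic softmax-weighted pooling (SoftPool) over a 1-D signal, which the name suggests as a starting point. The function name `softpool_1d`, the window size, and the stride are assumptions made for illustration; this is not the paper's capsule-level method.

```python
import math

def softpool_1d(values, kernel_size=2, stride=2):
    """Softmax-weighted pooling over 1-D windows (illustrative sketch only)."""
    pooled = []
    for start in range(0, len(values) - kernel_size + 1, stride):
        window = values[start:start + kernel_size]
        peak = max(window)  # subtract the max for numerical stability
        weights = [math.exp(v - peak) for v in window]
        total = sum(weights)
        # Each activation contributes in proportion to its softmax weight.
        pooled.append(sum(w * v for w, v in zip(weights, window)) / total)
    return pooled

activations = [1.0, 1.0, 0.0, 4.0]  # hypothetical feature values
print(softpool_1d(activations))
```

In each window the result lies between the mean and the maximum, so strong activations dominate without the weaker ones being discarded entirely, as they would be under max pooling.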

Author(s):  
Yanfeng Lu ◽  
Lihao Jia ◽  
Hong Qiao ◽  
Yi Li ◽  
Zongshuai Qi

The biologically inspired model (BIM) for image recognition is a robust computational architecture that has attracted widespread attention. BIM can be described as a four-layer structure based on the mechanisms of the visual cortex. Although the performance of BIM for image recognition is robust, it selects patches at random, which is blind and results in a heavy computing burden. To address this issue, we propose a novel patch selection method with oriented Gaussian–Hermite moment (PSGHM), and we enhance the BIM with the proposed PSGHM, naming the result PBIM. In contrast to the conventional BIM, which selects patches at random within the feature representation layers processed by multi-scale Gabor filter banks, the proposed PBIM uses PSGHM to extract a small number of representation features while offering promising distinctiveness. To show the effectiveness of the proposed PBIM, experimental studies on object categorization are conducted on the CalTech05, TU Darmstadt (TUD) and GRAZ01 databases. Experimental results demonstrate that PBIM significantly improves on the performance of the conventional BIM.


2020 ◽  
Vol 13 (1) ◽  
pp. 18
Author(s):  
Jun Meng ◽  
Xingchen Lv ◽  
Lifang Fu ◽  
Qiufeng Wu

Recently, deep learning methods have been widely used in the identification of rice diseases. However, because the image backgrounds of real rice disease images are complex, the classification performance is not ideal. Therefore, this paper proposes a multi-scale feature extraction method based on a stacked autoencoder, named the multi-scale stacked autoencoder (MSSAE), to improve the recognition accuracy of rice diseases. The method extracts the features of complex rice disease images in two steps. In the first step, the images are preprocessed. Then, the MSSAE extracts multi-scale features from the preprocessed rice disease data at different scales. In comparative experiments, the new method achieved greater than 95% precision in the detection of rice diseases. This indicates that the MSSAE model has outstanding identification performance for real crop disease image recognition.


Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2022
Author(s):  
Yongmei Ren ◽  
Jie Yang ◽  
Zhiqiang Guo ◽  
Qingnian Zhang ◽  
Hui Cao

Visible image quality is very susceptible to changes in illumination, and there are limitations in ship classification using images acquired by a single sensor. This study proposes a ship classification method based on an attention mechanism and multi-scale convolutional neural network (MSCNN) for visible and infrared images. First, the features of visible and infrared images are extracted by a two-stream symmetric multi-scale convolutional neural network module, and then concatenated to make full use of the complementary features present in multi-modal images. After that, the attention mechanism is applied to the concatenated fusion features to emphasize local detail areas in the feature map, aiming to further improve the feature representation capability of the model. Lastly, the attention weights and the original concatenated fusion features are added element by element and fed into fully connected layers and a Softmax output layer for the final classification output. The effectiveness of the proposed method is verified on the visible and infrared spectra (VAIS) dataset, on which it reaches 93.81% classification accuracy. Compared with other state-of-the-art methods, the proposed method extracts features more effectively and has better overall classification performance.
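The fusion step described above (concatenate the two feature streams, compute attention weights, add them element by element to the concatenated features) can be sketched roughly as follows. How the paper computes its attention weights is not stated in the abstract; here a plain softmax over the concatenated vector stands in as an assumption, and `fuse_with_attention` is a hypothetical name.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_with_attention(visible_feats, infrared_feats):
    """Concatenate two feature vectors and add attention weights element-wise."""
    fused = list(visible_feats) + list(infrared_feats)  # concatenation
    # Assumption: attention weights come from a softmax over the fused vector.
    attention = softmax(fused)
    return [f + a for f, a in zip(fused, attention)]

visible = [0.2, 1.5]   # hypothetical visible-stream features
infrared = [0.7, 0.1]  # hypothetical infrared-stream features
print(fuse_with_attention(visible, infrared))
```

In a real network the output vector would then pass through the fully connected layers and the Softmax classifier mentioned in the abstract.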


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Chih-Wei Lin ◽  
Yu Hong ◽  
Jinfu Liu

Background: Glioma is a malignant brain tumor; its location is complex and it is difficult to remove surgically. Doctors can precisely diagnose and localize the disease using medical images. However, computer-assisted diagnosis of brain tumors remains a problem because rough segmentation of the tumor makes its internal grading incorrect. Methods: In this paper, we propose an Aggregation-and-Attention Network for brain tumor segmentation. The proposed network takes U-Net as the backbone, aggregates multi-scale semantic information, and focuses on crucial information to perform brain tumor segmentation. To this end, we propose an enhanced down-sampling module and an Up-Sampling Layer to compensate for information loss. The multi-scale connection module constructs the multi-receptive semantic fusion between encoder and decoder. Furthermore, we design a dual-attention fusion module that can extract and enhance the spatial relationships in magnetic resonance imaging, and we apply deep supervision in different parts of the proposed network. Results: Experimental results show that the proposed framework performs best on the BraTS2020 dataset compared with state-of-the-art networks. It surpasses all the comparison networks, and its average values on the four indexes are 0.860, 0.885, 0.932, and 1.2325, respectively. Conclusions: The proposed framework and its modules are scientific and practical; they can extract and aggregate useful semantic information and enhance the ability of glioma segmentation.


Energies ◽  
2021 ◽  
Vol 14 (11) ◽  
pp. 3267
Author(s):  
Ramon C. F. Araújo ◽  
Rodrigo M. S. de Oliveira ◽  
Fernando S. Brasil ◽  
Fabrício J. B. Barros

In this paper, a novel image denoising algorithm and novel input features are proposed. The algorithm is applied to phase-resolved partial discharge (PRPD) diagrams with a single dominant partial discharge (PD) source, preparing them for automatic artificial-intelligence-based classification. It was designed to mitigate several sources of distortion often observed in PRPDs obtained from fully operational hydroelectric generators. The denoising algorithm automatically removes sparse noise and suppresses non-dominant discharges, including those due to crosstalk. The input features are functions of PD distributions along amplitude and phase, which are calculated in a novel way to mitigate random effects inherent to PD measurements. The impact of the proposed contributions was statistically evaluated and compared to the classification performance obtained using previously published approaches. Higher recognition rates and reduced variances were obtained using the proposed methods, statistically outperforming autonomous classification techniques seen in earlier works. The values of the algorithm’s internal parameters are also validated by comparing the recognition performance obtained with different parameter combinations. All typical PD sources described in hydro-generator PD standards are considered and can be automatically detected.


2021 ◽  
Vol 21 (S2) ◽  
Author(s):  
Daobin Huang ◽  
Minghui Wang ◽  
Ling Zhang ◽  
Haichun Li ◽  
Minquan Ye ◽  
...  

Background: Accurately segmenting the tumor region of MRI images is important for brain tumor diagnosis and radiotherapy planning. At present, manual segmentation is widely adopted in clinical practice, and there is a strong need for an automatic and objective system to alleviate the workload of radiologists. Methods: We propose a parallel multi-scale feature fusing architecture to generate a rich feature representation for accurate brain tumor segmentation. It comprises two parts: (1) a Feature Extraction Network (FEN) for brain tumor feature extraction at different levels and (2) a Multi-scale Feature Fusing Network (MSFFN) for merging all the different scale features in a parallel manner. In addition, we use two hybrid loss functions to optimize the proposed network for the class imbalance issue. Results: We validate our method on BRATS 2015, with 0.86, 0.73 and 0.61 in Dice for the three tumor regions (complete, core and enhancing), and the model parameter size is only 6.3 MB. Without any post-processing operations, our method still outperforms published state-of-the-art methods on the segmentation results of complete tumor regions and obtains competitive performance in the other two regions. Conclusions: The proposed parallel structure can effectively fuse multi-level features to generate a rich feature representation for high-resolution results. Moreover, the hybrid loss functions can alleviate the class imbalance issue and guide the training process. The proposed method can be used in other medical segmentation tasks.
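The Dice scores quoted above measure overlap between a predicted mask and a reference mask as 2|A∩B| / (|A| + |B|). A minimal sketch on flattened binary masks is shown here for orientation; the function name and the convention of returning 1.0 for two empty masks are assumptions, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two equal-length binary masks (flattened)."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if size_sum == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / size_sum

predicted = [1, 1, 0, 0]  # hypothetical predicted tumor mask
reference = [1, 0, 0, 0]  # hypothetical ground-truth mask
print(dice_coefficient(predicted, reference))  # 2*1 / (2+1)
```

Because the score depends only on foreground pixels, it is less dominated by the large background region than plain pixel accuracy, which is one reason it is common in class-imbalanced segmentation.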


2021 ◽  
Vol 13 (3) ◽  
pp. 433
Author(s):  
Junge Shen ◽  
Tong Zhang ◽  
Yichen Wang ◽  
Ruxin Wang ◽  
Qi Wang ◽  
...  

Remote sensing images contain complex backgrounds and multi-scale objects, which make scene classification a challenging task. The performance is highly dependent on the capacity of the scene representation as well as the discriminability of the classifier. Although multiple models possess better properties than a single model in these respects, the fusion strategy for these models is a key component in maximizing the final accuracy. In this paper, we construct a novel dual-model architecture with a grouping-attention-fusion strategy to improve the performance of scene classification. Specifically, the model employs two different convolutional neural networks (CNNs) for feature extraction, where the grouping-attention-fusion strategy is used to fuse the features of the CNNs in a fine and multi-scale manner. In this way, the resultant feature representation of the scene is enhanced. Moreover, to address the issue of similar appearances between different scenes, we develop a loss function that encourages small intra-class diversities and large inter-class distances. Extensive experiments are conducted on four scene classification datasets, including the UCM land-use dataset, the WHU-RS19 dataset, the AID dataset, and the OPTIMAL-31 dataset. The experimental results demonstrate the superiority of the proposed method in comparison with the state of the art.


2021 ◽  
Vol 25 (5) ◽  
pp. 1169-1185
Author(s):  
Deniu He ◽  
Hong Yu ◽  
Guoyin Wang ◽  
Jie Li

This paper considers the problem of initializing active learning. Specifically, it studies the problem in an imbalanced data scenario, which is called class-imbalance active learning cold-start. The novel method is a two-stage clustering-based active learning cold-start (ALCS). In the first stage, to separate the instances of the minority class from those of the majority class, a multi-center clustering is constructed based on a new inter-cluster tightness measure, so that the data are grouped into multiple clusters. Then, in the second stage, the initial training instances are selected from each cluster based on an adaptive candidate representative instances determination mechanism and a clusters-cyclic instance query mechanism. Comprehensive experiments demonstrate the effectiveness of the proposed method in terms of class coverage, classification performance, and impact on active learning.
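One plausible reading of a clusters-cyclic query is a round-robin pass over the clusters, taking one candidate from each in turn until the labeling budget is spent. The sketch below illustrates only that reading: the function name `cyclic_query`, the pre-ranked candidate lists, and the budget parameter are all assumptions, and the paper's adaptive candidate determination is not modeled.

```python
def cyclic_query(clusters, budget):
    """Pick instances round-robin across clusters until the budget is used."""
    # Each cluster is assumed to be a list of candidates, best first.
    queues = [list(c) for c in clusters]
    selected = []
    while len(selected) < budget and any(queues):
        for queue in queues:
            if queue and len(selected) < budget:
                selected.append(queue.pop(0))
    return selected

clusters = [["a1", "a2", "a3"], ["b1"], ["c1", "c2"]]  # hypothetical clusters
print(cyclic_query(clusters, budget=4))  # ['a1', 'b1', 'c1', 'a2']
```

Cycling through clusters rather than sampling uniformly means that a small cluster, which may correspond to a minority class, contributes to the initial training set as early as a large one.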


2021 ◽  
Author(s):  
Yunqing Hu ◽  
Xuan Jin ◽  
Yin Zhang ◽  
Haiwen Hong ◽  
Jingfeng Zhang ◽  
...  

Sensors ◽  
2020 ◽  
Vol 20 (19) ◽  
pp. 5593 ◽  
Author(s):  
Wei-Hung Wu ◽  
Jen-Chun Lee ◽  
Yi-Ming Wang

Metallography is the study of the structure of metals and alloys. Metallographic analysis can be regarded as a detection tool to assist in identifying a metal or alloy, to evaluate whether an alloy is processed correctly, to inspect multiple phases within a material, to locate and characterize imperfections such as voids or impurities, or to find the damaged areas of metallographic images. However, the defect detection of metallography is evaluated by human experts, and its automatic identification is still a challenge in almost every real solution. Deep learning has been applied to different problems in computer vision since the proposal of AlexNet in 2012. In this study, we propose a novel convolutional neural network architecture for metallographic analysis based on a modified residual neural network (ResNet). Multi-scale ResNet (M-ResNet), the modified method, improves efficiency by utilizing multi-scale operations for the accurate detection of objects of various sizes, especially small objects. The experimental results show that the proposed method yields an accuracy of 85.7% (mAP) in recognition performance, which is higher than existing methods. As a consequence, we propose a novel system for automatic defect detection as an application for metallographic analysis.

