Compact Spatial Pyramid Pooling Deep Convolutional Neural Network Based Hand Gestures Decoder

Akm Ashiquzzaman; Hyunmin Lee; Kwangki Kim; Hye-Young Kim; Jaehyung Park; Jinsul Kim

doi:10.3390/app10217898

Applied Sciences ◽

10.3390/app10217898 ◽

2020 ◽

Vol 10 (21) ◽

pp. 7898

Author(s):

Akm Ashiquzzaman ◽

Hyunmin Lee ◽

Kwangki Kim ◽

Hye-Young Kim ◽

Jaehyung Park ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

High Performance ◽

Fixed Number ◽

Hand Gesture ◽

Computing Power ◽

Classical Models ◽

Spatial Pyramid Pooling ◽

Gesture Input ◽

Spatial Pyramid

Current deep convolutional neural network (DCNN)-based hand gesture detectors achieve high precision but demand very high-performance computing power. Although DCNN-based detectors classify accurately, the computing power they require makes them very difficult to run on low-power hardware in remote environments. Moreover, classical DCNN architectures accept only a fixed input dimension, which forces preprocessing and makes them impractical for real-world applications. In this research, a practical DCNN with an architecture optimized by filter/node pruning is proposed, and spatial pyramid pooling (SPP) is introduced to make the model invariant to input dimension. This compact SPP-DCNN module uses 65% fewer parameters than traditional classifiers and operates almost 3× faster than classical models. Moreover, the proposed algorithm, which decodes gestures or sign language finger-spelling from videos, achieved the highest benchmark accuracy with the fastest processing speed. The proposed method paves the way for various practical hand gesture input-based human-computer interaction (HCI) applications.
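
The input-size invariance that SPP provides can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `spp_max_pool`, the pyramid levels (1, 2, 4), and the toy feature maps are assumptions chosen for illustration. Each level splits the feature map into an n × n grid and keeps the maximum of every cell, so the output length depends only on the levels, not on the map's height or width.

```python
import math

def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Pool a 2-D feature map (list of rows) into a fixed-length vector.

    For each pyramid level n the map is split into an n x n grid and the
    maximum of each cell is kept, so the output length is the sum of n * n
    over all levels, whatever the input height and width are.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor/ceil bin edges: cells cover the whole map and are never empty.
            r0, r1 = (i * height) // n, math.ceil((i + 1) * height / n)
            for j in range(n):
                c0, c1 = (j * width) // n, math.ceil((j + 1) * width / n)
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled

# Two hypothetical feature maps of different sizes.
small = [[r * 5 + c for c in range(5)] for r in range(4)]     # 4 x 5
large = [[r * 13 + c for c in range(13)] for r in range(9)]   # 9 x 13
print(len(spp_max_pool(small)), len(spp_max_pool(large)))     # both 1 + 4 + 16 = 21
```

Because both vectors have the same length, a fully connected classifier placed after such a layer can accept frames of differing resolution without a resizing step.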

Convolutional neural network with spatial pyramid pooling for hand gesture recognition

Neural Computing and Applications ◽

10.1007/s00521-020-05337-0 ◽

2020 ◽

Author(s):

Yong Soon Tan ◽

Kian Ming Lim ◽

Connie Tee ◽

Chin Poo Lee ◽

Cheng Yaw Low

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Gesture Recognition ◽

Hand Gesture Recognition ◽

Hand Gesture ◽

Spatial Pyramid Pooling ◽

Spatial Pyramid

Detection of Algorithmically Generated Domain Names Using the Recurrent Convolutional Neural Network with Spatial Pyramid Pooling

Entropy ◽

10.3390/e22091058 ◽

2020 ◽

Vol 22 (9) ◽

pp. 1058

Author(s):

Zhanghui Liu ◽

Yudong Zhang ◽

Yuzhong Chen ◽

Xinwen Fan ◽

Chen Dong

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Network Traffic ◽

Contextual Information ◽

Recall Rate ◽

Domain Name ◽

Sample Distribution ◽

Domain Names ◽

Spatial Pyramid Pooling ◽

Spatial Pyramid

Domain generation algorithms (DGAs) use specific parameters as random seeds to generate a large number of random domain names in order to evade malicious domain name detection. This greatly increases the difficulty of detecting and defending against botnets and malware. Traditional models for detecting algorithmically generated domain names generally rely on manually extracting statistical characteristics from the domain names or network traffic and then employing classifiers to distinguish the algorithmically generated domain names. These models always require labor-intensive manual feature engineering. In contrast, most state-of-the-art models based on deep neural networks are sensitive to imbalance in the sample distribution and cannot fully exploit the discriminative class features in domain names or network traffic, leading to decreased detection accuracy. To address these issues, we employ the borderline synthetic minority over-sampling algorithm (SMOTE) to improve sample balance. We also propose a recurrent convolutional neural network with spatial pyramid pooling (RCNN-SPP) to extract discriminative and distinctive class features. The recurrent convolutional neural network combines a convolutional neural network (CNN) and a bi-directional long short-term memory network (Bi-LSTM) to extract both the semantic and contextual information from domain names. We then employ the spatial pyramid pooling strategy to refine the contextual representation by capturing multi-scale contextual information from domain names. The experimental results on different domain name datasets demonstrate that our model achieves 92.36% accuracy, an 89.55% recall rate, a 90.46% F1-score, and 95.39% AUC in distinguishing DGA from legitimate domain names, and 92.45% accuracy, a 90.12% recall rate, a 90.86% F1-score, and 96.59% AUC in multi-classification problems. It achieves significant improvement over existing models in terms of accuracy and robustness.
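
The over-sampling step can be sketched as follows. This is a simplified illustration, not the paper's code: it shows the plain SMOTE interpolation between a minority sample and one of its nearest minority neighbours, whereas the borderline variant named in the abstract additionally restricts the base points to minority samples lying near the class boundary. The function name `smote_like_samples`, its parameters, and the toy points are hypothetical.

```python
import random

def smote_like_samples(minority, k=2, n_new=4, seed=0):
    """Create synthetic minority points by interpolating towards near neighbours.

    Simplification: every minority point may serve as a base point; the
    borderline variant would first filter for points near the class boundary.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        others = [p for i, p in enumerate(minority) if i != base_idx]
        # k nearest minority neighbours of the base point (squared Euclidean distance).
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, mate)))
    return synthetic

# Hypothetical 2-D feature vectors for the under-represented class.
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
for point in smote_like_samples(minority):
    print(point)
```

Each synthetic point lies on a line segment between two real minority samples, so the new samples stay inside the region the minority class already occupies.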

An optimized convolutional neural network with bottleneck and spatial pyramid pooling layers for classification of foods

Pattern Recognition Letters ◽

10.1016/j.patrec.2017.12.007 ◽

2018 ◽

Vol 105 ◽

pp. 50-58 ◽

Cited By ~ 9

Author(s):

Elnaz Jahani Heravi ◽

Hamed Habibi Aghdam ◽

Domenec Puig

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Spatial Pyramid Pooling ◽

Spatial Pyramid

Manchu Word Recognition Based on Convolutional Neural Network with Spatial Pyramid Pooling

2018 11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI) ◽

10.1109/cisp-bmei.2018.8633131 ◽

2018 ◽

Author(s):

Min Li ◽

Ruirui Zheng ◽

Shuang Xu ◽

Yu Fu ◽

Di Huang

Keyword(s):

Neural Network ◽

Word Recognition ◽

Convolutional Neural Network ◽

Spatial Pyramid Pooling ◽

Spatial Pyramid

DeepScene: Scene classification via convolutional neural network with spatial pyramid pooling

Expert Systems with Applications ◽

10.1016/j.eswa.2021.116382 ◽

2021 ◽

pp. 116382

Author(s):

Pui Sin Yee ◽

Kian Ming Lim ◽

Chin Poo Lee

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Scene Classification ◽

Spatial Pyramid Pooling ◽

Spatial Pyramid

A Spatial Pyramid Pooling Convolutional Neural Network for Smoky Vehicle Detection

2018 37th Chinese Control Conference (CCC) ◽

10.23919/chicc.2018.8483521 ◽

2018 ◽

Author(s):

Yichao Cao ◽

Chang Lu ◽

Xiaobo Lu ◽

Xue Xia

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Vehicle Detection ◽

Spatial Pyramid Pooling ◽

Smoky Vehicle Detection ◽

Spatial Pyramid

Grasp Detection Based on Light-Weight Hierarchical Fusion Convolutional Neural Network

Journal of Physics: Conference Series ◽

10.1088/1742-6596/2083/4/042030 ◽

2021 ◽

Vol 2083 (4) ◽

pp. 042030

Author(s):

Ziang Xu

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Feature Fusion ◽

Light Weight ◽

Cornell University ◽

Spatial Pyramid Pooling ◽

Feature Information ◽

Unknown Objects ◽

Spatial Pyramid

This paper presents a light-weight Hierarchical Fusion Convolutional Neural Network (HF-CNN) for grasp detection. The network mainly employs residual structures, atrous spatial pyramid pooling (ASPP), and encoder-decoder-based feature fusion. Compared with conventional grasp detection methods, the network greatly improves robustness and generalizability on detection tasks by extensively extracting feature information from the images. In our test on the Cornell University dataset, we achieve 85% accuracy when detecting unknown objects.
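
The idea behind atrous (dilated) pyramid pooling can be sketched in one dimension. This is an illustrative simplification rather than the HF-CNN design: the names `dilated_conv1d` and `aspp_1d`, the rates (1, 2, 4), and the shared kernel are assumptions, and a real ASPP module would learn separate 2-D weights for each branch. The sketch shows how the same number of kernel taps covers a wider context as the dilation rate grows.

```python
def dilated_conv1d(signal, kernel, rate):
    """Zero-padded 'same' 1-D cross-correlation with taps spaced `rate` samples apart."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate
            if 0 <= j < len(signal):  # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out

def aspp_1d(signal, kernel, rates=(1, 2, 4)):
    """Apply the kernel at several dilation rates and stack the branches per position."""
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    return [tuple(branch[i] for branch in branches) for i in range(len(signal))]

# A single impulse in the middle of a hypothetical 9-sample signal.
impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
for position, features in enumerate(aspp_1d(impulse, [1.0, 1.0, 1.0])):
    print(position, features)
```

With the impulse at index 4, the rate-1 branch responds at indices 3 to 5, the rate-2 branch at 2, 4, and 6, and the rate-4 branch at 0, 4, and 8, which is the multi-scale context that the stacked features expose to later layers.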

Surface EMG-Based Instantaneous Hand Gesture Recognition Using Convolutional Neural Network with the Transfer Learning Method

Sensors ◽

10.3390/s21072540 ◽

2021 ◽

Vol 21 (7) ◽

pp. 2540

Author(s):

Zhipeng Yu ◽

Jianghai Zhao ◽

Yucheng Wang ◽

Linglong He ◽

Shaonan Wang

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Transfer Learning ◽

Gesture Recognition ◽

Recognition System ◽

Surface EMG ◽

Hand Gesture Recognition ◽

Hand Gesture ◽

Training Time ◽

Generalization Ability

In recent years, surface electromyography (sEMG)-based human–computer interaction has been developed to improve the quality of life for people. Gesture recognition based on the instantaneous values of sEMG has the advantages of accurate prediction and low latency. However, the low generalization ability of the hand gesture recognition method limits its application to new subjects and new hand gestures, and brings a heavy training burden. For this reason, based on a convolutional neural network, a transfer learning (TL) strategy for instantaneous gesture recognition is proposed to improve the generalization performance of the target network. CapgMyo and NinaPro DB1 are used to evaluate the validity of our proposed strategy. Compared with the non-transfer learning (non-TL) strategy, our proposed strategy improves the average accuracy of new subject and new gesture recognition by 18.7% and 8.74%, respectively, when up to three repeated gestures are employed. The TL strategy reduces the training time by a factor of three. Experiments verify the transferability of spatial features and the validity of the proposed strategy in improving the recognition accuracy of new subjects and new gestures, and reducing the training burden. The proposed TL strategy provides an effective way of improving the generalization ability of the gesture recognition system.

BengaliNet: A Low-Cost Novel Convolutional Neural Network for Bengali Handwritten Characters Recognition

Applied Sciences ◽

10.3390/app11156845 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6845

Author(s):

Abu Sayeed ◽

Jungpil Shin ◽

Md. Al Mehedi Hasan ◽

Azmain Yakin Srizon ◽

Md. Mehedi Hasan

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Network Architecture ◽

Character Recognition ◽

High Performance ◽

Low Cost ◽

Traditional Learning ◽

Neural Network Architecture ◽

Handwritten Character Recognition ◽

Handwritten Character

Because Bengali is the seventh most-spoken language and fifth most-spoken native language in the world, Bengali handwritten character recognition has fascinated researchers for decades. Although other popular languages, e.g., English, Chinese, Hindi, and Spanish, have received many contributions in the area of handwritten character recognition, Bengali has not received many noteworthy contributions in this domain because of the complex curvatures and similar writing styles of Bengali characters. Previous studies used different approaches based on traditional learning and deep learning. In this research, we propose a low-cost novel convolutional neural network architecture for the recognition of Bengali characters with only 2.24 to 2.43 million parameters, depending on the number of output classes. We considered 8 different formations of the CMATERdb datasets, based on previous studies, for the training phase. Through experimental analysis, we show that our proposed system outperforms previous works by a noteworthy margin on all 8 datasets. Moreover, we tested our trained models on other available Bengali character datasets such as Ekush, BanglaLekha, and NumtaDB. Our proposed architecture achieved 96–99% overall accuracy on these datasets as well. We believe our contributions will be beneficial for developing an automated high-performance recognition tool for Bengali handwritten characters.

High Performance Kernel Architecture for Convolutional Neural Network Acceleration

Journal of Circuits, Systems and Computers ◽

10.1142/s0218126621502662 ◽

2021 ◽

Author(s):

Anakhi Hazarika ◽

Soumyajit Poddar ◽

Hafizur Rahaman

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

High Performance
