Video-Based Human Action Recognition Using Spatial Pyramid Pooling and 3D Densely Convolutional Networks

2018 ◽ Vol 10 (12) ◽ pp. 115
Author(s): Wanli Yang, Yimin Chen, Chen Huang, Mingke Gao

In recent years, the application of deep neural networks to human action recognition has become a hot topic. Although remarkable achievements have been made in image recognition, many problems remain open in the video domain. It is well known that convolutional neural networks require a fixed-size input, which not only constrains the network structure but also affects recognition accuracy. While this problem has been solved for images, it has not yet been overcome for video. To address the fixed-size frame input problem in video recognition, we propose a three-dimensional (3D) densely connected convolutional network based on spatial pyramid pooling (3D-DenseNet-SPP). As the name implies, the network structure is mainly composed of three parts: 3D CNN, DenseNet, and SPPNet. Our models were evaluated on the KTH and UCF101 datasets separately. The experimental results show that our model outperforms existing models in video-based action recognition.
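The core trick the abstract describes, pooling a variable-size feature map into a fixed-length vector, can be illustrated with a minimal NumPy sketch. This is a generic 2D spatial pyramid pooling illustration, not the paper's actual 3D implementation (their variant adds a temporal dimension and sits inside a DenseNet); the function name and pyramid levels here are illustrative choices.

```python
import numpy as np

def spatial_pyramid_pool(features, levels=(1, 2, 4)):
    """Max-pool a feature map at several pyramid levels and concatenate.

    features: array of shape (channels, height, width).
    Returns a vector of length channels * sum(l*l for l in levels),
    independent of the input height/width -- this is what lets the
    network accept frames of arbitrary size.
    """
    c, h, w = features.shape
    pooled = []
    for level in levels:
        # Split the spatial grid into level x level roughly equal bins.
        h_edges = np.linspace(0, h, level + 1).astype(int)
        w_edges = np.linspace(0, w, level + 1).astype(int)
        for i in range(level):
            for j in range(level):
                bin_ = features[:, h_edges[i]:h_edges[i + 1],
                                   w_edges[j]:w_edges[j + 1]]
                pooled.append(bin_.max(axis=(1, 2)))  # per-channel max over the bin
    return np.concatenate(pooled)

# Two inputs with different spatial sizes map to vectors of identical length.
v1 = spatial_pyramid_pool(np.random.rand(8, 13, 17))
v2 = spatial_pyramid_pool(np.random.rand(8, 24, 24))
```

Because the output length depends only on the channel count and the pyramid levels (here 8 × (1 + 4 + 16) = 168), the fully connected layers after the pooling stage never see a size mismatch.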

2021 ◽ Vol 58 (2) ◽ pp. 0210007
Author(s): 张文强 Zhang Wenqiang, 王增强 Wang Zengqiang, 张良 Zhang Liang

2020 ◽ Vol 34 (03) ◽ pp. 2669-2676
Author(s): Wei Peng, Xiaopeng Hong, Haoyu Chen, Guoying Zhao

Human action recognition from skeleton data, fuelled by the Graph Convolutional Network (GCN) with its powerful capability of modeling non-Euclidean data, has attracted much attention. However, many existing GCNs use a pre-defined graph structure and share it throughout the entire network, which can lose implicit joint correlations, especially for higher-level features. Besides, the mainstream spectral GCN is approximated by a first-order hop, so that higher-order connections are not well captured. All of this requires huge effort to design a better GCN architecture. To address these problems, we turn to Neural Architecture Search (NAS) and propose the first automatically designed GCN for this task. Specifically, we explore the spatial-temporal correlations between nodes and build a search space with multiple dynamic graph modules. Besides, we introduce multiple-hop modules, expecting to break the limitation of representational capacity caused by the first-order approximation. Moreover, a corresponding sampling- and memory-efficient evolution strategy is proposed to search this space. The resulting architecture proves the effectiveness of the higher-order approximation and the layer-wise dynamic graph modules. To evaluate the performance of the searched model, we conduct extensive experiments on two very large-scale skeleton-based action recognition datasets. The results show that our model achieves state-of-the-art results in terms of the given metrics.
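The two ideas the abstract contrasts, the standard first-order spectral GCN layer and multi-hop connectivity, can be sketched in NumPy. This is a generic textbook GCN propagation rule (H' = D^(-1/2)(A+I)D^(-1/2)HW) applied to a toy skeleton graph, not the searched architecture from the paper; the function names and the 3-joint chain are illustrative assumptions.

```python
import numpy as np

def gcn_layer(x, adj, weight):
    """One first-order spectral GCN layer: relu(D^-1/2 (A+I) D^-1/2 X W).

    x: node (joint) features, shape (num_joints, in_dim).
    adj: binary skeleton adjacency, shape (num_joints, num_joints).
    weight: projection matrix, shape (in_dim, out_dim).
    """
    a_hat = adj + np.eye(adj.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]
    return np.maximum(norm @ x @ weight, 0.0)          # ReLU activation

def k_hop_mask(adj, k):
    """Joints reachable within k hops: the support of (A+I)^k.

    A single first-order layer only mixes 1-hop neighbours; a k-hop
    module widens the receptive field without stacking k layers.
    """
    a_hat = adj + np.eye(adj.shape[0])
    return np.linalg.matrix_power(a_hat, k) > 0

# Toy 3-joint chain (e.g. shoulder - elbow - wrist).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
x = np.random.rand(3, 4)
out = gcn_layer(x, adj, np.random.rand(4, 8))          # shape (3, 8)
```

In this chain, the shoulder and wrist are not adjacent, so a single first-order layer never mixes their features directly; `k_hop_mask(adj, 2)` connects them, which is the representational gap the multiple-hop modules are meant to close.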

