Learning Long-Term Temporal Features With Deep Neural Networks for Human Action Recognition

Abstract: Human action detection is one of the most demanding and complex tasks for deep neural networks. Human gesture recognition is treated here as equivalent to human action recognition. A gesture is a series of bodily motions that communicates a message. Gestures are a more natural and preferable way for humans to interact with computers, thereby bridging the gap between humans and robots. Human action recognition is also a valuable communication platform for people who are deaf or unable to speak. In this work, we propose a hand gesture identification system that recognizes hand movements, computes hand characteristics such as peaks and angles, and then converts gesture images into text.

Index Terms: Human action recognition, Deaf and dumb, CNN.

A Survey on Deep Neural Networks for Human Action Recognition based on Skeleton Information

Recent Developments in Intelligent Systems and Interactive Applications - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-319-49568-2_47 ◽

2016 ◽

pp. 329-336

Author(s):

Hongyu Wang

Keyword(s):

Neural Networks ◽

Action Recognition ◽

Deep Neural Networks ◽

Human Action Recognition ◽

Human Action

Human Action Recognition Using Deep Neural Networks

2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4) ◽

10.1109/worlds450073.2020.9210345 ◽

2020 ◽

Author(s):

Rashmi R. Koli ◽

Tanveer I. Bagban

Keyword(s):

Neural Networks ◽

Action Recognition ◽

Deep Neural Networks ◽

Human Action Recognition ◽

Human Action

Skeleton geometric transformation for human action recognition using convolutional neural networks

Proceedings of the 13th ACM International Conference on PErvasive Technologies Related to Assistive Environments ◽

10.1145/3389189.3397653 ◽

2020 ◽

Author(s):

Antonios Papadakis ◽

Ioannis Vernikos ◽

Eirini Mathe ◽

Evaggelos Spyrou

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Geometric Transformation

HUMAN ACTION RECOGNITION FROM VIDEOS USING NEURAL NETWORKS

CHERKASY UNIVERSITY BULLETIN: APPLIED MATHEMATICS. INFORMATICS ◽

10.31651/2076-5886-2019-2-59-72 ◽

2020 ◽

pp. 59-72

Author(s):

Yana HONTARENKO ◽

Nataliia KRASNOSHLYK

Keyword(s):

Neural Networks ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action

Human Action Recognition Using Action Bank Features and Convolutional Neural Networks

Computer Vision - ACCV 2014 Workshops - Lecture Notes in Computer Science ◽

10.1007/978-3-319-16628-5_24 ◽

2015 ◽

pp. 328-339

Author(s):

Earnest Paul Ijjina ◽

C. Krishna Mohan

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action

I3D-Shufflenet Based Human Action Recognition

Algorithms ◽

10.3390/a13110301 ◽

2020 ◽

Vol 13 (11) ◽

pp. 301

Author(s):

Guocheng Liu ◽

Caixia Zhang ◽

Qingyang Xu ◽

Ruoshi Cheng ◽

Yong Song ◽

...

Keyword(s):

Neural Network ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Recognition Algorithm ◽

Convolution Kernel ◽

Histogram Of Oriented Gradients ◽

Temporal Features ◽

Convolution Kernels

Optical-flow-based human action recognition is difficult to apply in practice because of its heavy computational cost. To address this, the I3D-shufflenet model is proposed, combining the advantages of the I3D neural network and the lightweight shufflenet model. The 5 × 5 convolution kernel of I3D is replaced by two 3 × 3 convolution kernels, which reduces the amount of computation. A shuffle layer is adopted to achieve feature exchange. Human actions are then recognized and classified with the trained I3D-shufflenet model. The experimental results show that the shuffle layer improves the composition of features in each channel, which promotes the use of useful information. Histogram of Oriented Gradients (HOG) spatial-temporal features of the object are extracted for training, which significantly improves the expression of human actions and reduces the cost of feature extraction. I3D-shufflenet is tested on the UCF101 dataset and compared with other models. The final result shows that I3D-shufflenet reaches an accuracy of 96.4%, higher than the original I3D.
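
The two ideas in this abstract, kernel factorization and channel shuffling, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `conv_weight_count` and `channel_shuffle`, the channel width of 64, and the group count are all assumptions chosen for illustration, and the shuffle follows the usual ShuffleNet-style reshape-transpose-flatten pattern on a plain list of channel labels.

```python
def conv_weight_count(kernel_size, in_channels, out_channels, dims=2):
    """Number of weights (bias excluded) in one dense convolution layer.

    dims=2 matches the "5 x 5" / "3 x 3" notation of the abstract;
    dims=3 gives the count for a cubic spatio-temporal kernel.
    """
    return (kernel_size ** dims) * in_channels * out_channels


def channel_shuffle(channels, groups):
    """Interleave channels across groups (ShuffleNet-style shuffle).

    Equivalent to reshaping to (groups, channels_per_group),
    transposing, and flattening again.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# Hypothetical layer width, used only to make the comparison concrete.
C = 64
one_5x5 = conv_weight_count(5, C, C)       # 25 * C * C weights
two_3x3 = 2 * conv_weight_count(3, C, C)   # 18 * C * C weights
saving = 1 - two_3x3 / one_5x5             # fraction of weights removed

# Six channels in two groups: [0, 1, 2 | 3, 4, 5] become interleaved.
shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)
```

Two stacked 3 × 3 kernels cover the same 5 × 5 receptive field with 18 instead of 25 weights per input-output channel pair, which is where the reduced computation comes from. The shuffle lets each group in the next layer see channels from every group in the previous one.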

Distinct Two-Stream Convolutional Networks for Human Action Recognition in Videos Using Segment-Based Temporal Modeling

Data ◽

10.3390/data5040104 ◽

2020 ◽

Vol 5 (4) ◽

pp. 104

Author(s):

Ashok Sarabu ◽

Ajit Kumar Santra

Keyword(s):

Action Recognition ◽

Data Augmentation ◽

Main Idea ◽

Human Action Recognition ◽

Human Action ◽

Great Success ◽

Temporal Modeling ◽

Convolutional Networks ◽

Temporal Features ◽

Augmentation Techniques

The two-stream convolutional neural network (CNN) has proven very successful for action recognition in videos. The main idea is to train two CNNs to learn spatial and temporal features separately and then combine their two scores to obtain the final scores. In the literature, we observed that most methods use similar CNNs for the two streams. In this paper, we design a two-stream CNN architecture that uses different CNNs for the two streams to learn spatial and temporal features. Temporal Segment Networks (TSN) are applied to capture long-range temporal features and to differentiate similar sub-actions in videos. Data augmentation techniques are employed to prevent over-fitting. Advanced cross-modal pre-training is discussed and introduced into the proposed architecture to enhance the accuracy of action recognition. The proposed two-stream model is evaluated on two challenging action recognition datasets: HMDB-51 and UCF-101. The results show that the proposed architecture yields a significant performance increase and outperforms existing methods.
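
The segment-based sampling and score fusion described in this abstract can be sketched as follows. This is an illustrative outline, not the paper's code: the function names, the averaging consensus, and the `temporal_weight` value of 1.5 are assumptions, since the abstract does not state how segments are aggregated or how the two streams are weighted.

```python
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Pick one frame index from each of num_segments equal-length segments."""
    if num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random()
    indices = []
    for k in range(num_segments):
        # Integer arithmetic keeps segment boundaries exact.
        start = k * num_frames // num_segments
        end = (k + 1) * num_frames // num_segments
        indices.append(rng.randrange(start, end))
    return indices


def segment_consensus(snippet_scores):
    """Average per-class scores over all sampled snippets (assumed consensus)."""
    n = len(snippet_scores)
    return [sum(class_scores) / n for class_scores in zip(*snippet_scores)]


def fuse_streams(spatial_scores, temporal_scores, temporal_weight=1.5):
    """Late fusion: weighted sum of spatial and temporal class scores.

    temporal_weight is a hypothetical value, not taken from the paper.
    """
    return [s + temporal_weight * t
            for s, t in zip(spatial_scores, temporal_scores)]


# One snippet from each third of a 30-frame clip.
frame_ids = sample_segment_indices(30, 3, rng=random.Random(0))

# Two snippets, two classes: consensus is the per-class mean.
video_scores = segment_consensus([[1.0, 3.0], [3.0, 5.0]])

# Combine the two streams and take the highest-scoring class.
fused = fuse_streams([1.0, 2.0], [2.0, 0.0])
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Sampling one snippet per segment keeps the cost per video fixed while still covering the whole clip, which is how a segment-based model sees long-range temporal structure without processing every frame.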
