Deep Learning of Fuzzy Weighted Multi-Resolution Depth Motion Maps with Spatial Feature Fusion for Action Recognition

Mahmoud Al-Faris; John Chiverton; Yanyan Yang; David Ndzi

doi:10.3390/jimaging5100082

Journal of Imaging ◽

10.3390/jimaging5100082 ◽

2019 ◽

Vol 5 (10) ◽

pp. 82 ◽

Cited By ~ 2

Author(s):

Mahmoud Al-Faris ◽

John Chiverton ◽

Yanyan Yang ◽

David Ndzi

Keyword(s):

Action Recognition ◽

Spatial Information ◽

Feature Fusion ◽

Human Action Recognition ◽

Human Action ◽

Single Type ◽

Weight Functions ◽

Motion Model ◽

Temporal Dimension ◽

Depth Motion Maps

Human action recognition (HAR) is an important yet challenging task. This paper presents a novel method. First, fuzzy weight functions are used in the computation of depth motion maps (DMMs). Motion information over multiple lengths is also used. These features are referred to as fuzzy weighted multi-resolution DMMs (FWMDMMs). This formulation allows various aspects of individual actions to be emphasized. It also helps to characterise the importance of the temporal dimension, which helps to overcome, e.g., variations in the time over which a single type of action might be performed. A deep convolutional neural network (CNN) motion model is created and trained to extract discriminative and compact features. Transfer learning is also used to extract spatial information from RGB and depth data using the AlexNet network. Different late fusion techniques are then investigated to fuse the deep motion model with the spatial network. The result is a spatial temporal HAR model. The developed approach is capable of recognising both human action and human–object interaction. Three public domain datasets are used to evaluate the proposed solution. The experimental results demonstrate the robustness of this approach compared with state-of-the-art algorithms.
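The idea of weighting a depth motion map over time can be illustrated with a minimal sketch. The abstract does not give the weight functions, so the triangular membership function below, its `peak` and `width` parameters, and the function names are all assumptions chosen for illustration; frames are plain nested lists standing in for projected depth maps.

```python
def triangular_weight(t, peak, width=1.0):
    """Assumed fuzzy membership: 1 at `peak`, falling linearly to 0 at distance `width`."""
    return max(0.0, 1.0 - abs(t - peak) / width)


def weighted_dmm(frames, peak, width=1.0):
    """Sum weighted absolute differences between consecutive depth frames.

    `frames` is a list of 2D lists of equal size. Each frame difference is
    weighted by its normalised position in time (0 = start, 1 = end).
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    dmm = [[0.0] * cols for _ in range(rows)]
    n_diffs = len(frames) - 1
    for i in range(n_diffs):
        t = i / (n_diffs - 1) if n_diffs > 1 else 0.0
        w = triangular_weight(t, peak, width)
        for r in range(rows):
            for c in range(cols):
                dmm[r][c] += w * abs(frames[i + 1][r][c] - frames[i][r][c])
    return dmm


# Three tiny 1x2 "depth frames": motion in pixel 0 happens early, in pixel 1 late.
frames = [[[0, 0]], [[1, 0]], [[1, 2]]]
early = weighted_dmm(frames, peak=0.0)   # emphasises the start of the action
late = weighted_dmm(frames, peak=1.0)    # emphasises the end of the action
middle = weighted_dmm(frames, peak=0.5)  # weights both differences equally
```

Moving `peak` along the time axis gives several maps from one sequence, each stressing a different phase of the action; how the paper combines such maps with multiple sequence lengths is not specified in the abstract.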

Human Action Recognition Based on Multi-level Feature Fusion

2020 IEEE 6th International Conference on Computer and Communications (ICCC) ◽

10.1109/iccc51575.2020.9344943 ◽

2020 ◽

Author(s):

Xi Cai ◽

Wan Su ◽

Guang Han

Keyword(s):

Action Recognition ◽

Feature Fusion ◽

Human Action Recognition ◽

Human Action ◽

Multi Level

Exploring 3D Human Action Recognition Using STACOG on Multi-View Depth Motion Maps Sequences

Sensors ◽

10.3390/s21113642 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3642

Author(s):

Mohammad Farhad Bulbul ◽

Sadiya Tabussum ◽

Hazrat Ali ◽

Wenli Zheng ◽

Mi Young Lee ◽

...

Keyword(s):

Action Recognition ◽

Depth Map ◽

Human Action Recognition ◽

Human Action ◽

Collaborative Representation ◽

Auto Correlation ◽

Time Operation ◽

Real Time Operation ◽

Benchmark Datasets ◽

Depth Motion Maps

This paper proposes an action recognition framework for depth map sequences using the 3D Space-Time Auto-Correlation of Gradients (STACOG) algorithm. First, each depth map sequence is split into two sets of sub-sequences, each set with a different frame length. Second, a number of Depth Motion Map (DMM) sequences are generated from each set and fed into STACOG to find an auto-correlation feature vector. For the two distinct sets of sub-sequences, two auto-correlation feature vectors are obtained and applied in turn to an L2-regularized Collaborative Representation Classifier (L2-CRC) to compute a pair of sets of residual values. Next, the Logarithmic Opinion Pool (LOGP) rule is used to combine the two outcomes of L2-CRC and to assign an action label to the depth map sequence. Finally, the proposed framework is evaluated on three benchmark datasets: the MSR-action 3D dataset, the DHA dataset, and the UTD-MHAD dataset. We compare the experimental results of the proposed framework with state-of-the-art approaches to demonstrate its effectiveness. The computational efficiency of the framework is also analyzed on all the datasets to assess whether it is suitable for real-time operation.
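The fusion step can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes each classifier's residuals are turned into class probabilities with a softmax over negative residuals and that the two classifiers receive uniform weights, neither of which is stated in the abstract. The function names are hypothetical.

```python
import math


def residuals_to_log_probs(residuals):
    """Map class residuals to log-probabilities (assumed: p_k proportional to exp(-residual_k))."""
    shift = min(residuals)  # shift for numerical stability
    log_norm = math.log(sum(math.exp(-(r - shift)) for r in residuals))
    return [-(r - shift) - log_norm for r in residuals]


def logp_fuse(residual_sets, weights=None):
    """Logarithmic opinion pool: weighted sum of log-probabilities, then argmax."""
    n = len(residual_sets)
    weights = weights or [1.0 / n] * n
    n_classes = len(residual_sets[0])
    scores = [0.0] * n_classes
    for w, residuals in zip(weights, residual_sets):
        for k, log_p in enumerate(residuals_to_log_probs(residuals)):
            scores[k] += w * log_p
    return max(range(n_classes), key=scores.__getitem__)


# Residuals from two classifiers over three action classes (made-up numbers).
residuals_short = [0.9, 0.2, 0.8]  # classifier on one sub-sequence length
residuals_long = [0.5, 0.4, 0.1]   # classifier on the other sub-sequence length
label = logp_fuse([residuals_short, residuals_long])
```

Taken alone, the second classifier would pick class 2, but the pooled score favours class 1 because the first classifier is much more confident about it.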

Human Action Recognition Based On Multi-level Feature Fusion

Proceedings of the International Conference on Computer Information Systems and Industrial Applications ◽

10.2991/cisia-15.2015.96 ◽

2015 ◽

Author(s):

Y.Y Xu ◽

G.Q Xiao ◽

X.Q Tang

Keyword(s):

Action Recognition ◽

Feature Fusion ◽

Human Action Recognition ◽

Human Action ◽

Multi Level

Human action recognition using Adaptive Hierarchical Depth Motion Maps and Gabor filter

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2017.7952393 ◽

2017 ◽

Cited By ~ 11

Author(s):

Hong Liu ◽

Qinqin He ◽

Mengyuan Liu

Keyword(s):

Action Recognition ◽

Gabor Filter ◽

Human Action Recognition ◽

Human Action ◽

Depth Motion Maps

Using a Multilearner to Fuse Multimodal Features for Human Action Recognition

Mathematical Problems in Engineering ◽

10.1155/2020/4358728 ◽

2020 ◽

Vol 2020 ◽

pp. 1-18

Author(s):

Chao Tang ◽

Huosheng Hu ◽

Wenjian Wang ◽

Wei Li ◽

Hua Peng ◽

...

Keyword(s):

Action Recognition ◽

Feature Fusion ◽

Human Action Recognition ◽

Human Action ◽

Human Motion ◽

Image Features ◽

Depth Image ◽

Good Ability ◽

Multimodal Features ◽

Feature Based

The representation and selection of action features directly affect the recognition performance of human action recognition methods. A single feature is often affected by human appearance, environment, camera settings, and other factors. To address the problem that existing multimodal feature fusion methods cannot effectively measure the contribution of different features, this paper proposes a human action recognition method based on RGB-D image features, which makes full use of the multimodal information provided by RGB-D sensors to extract effective human action features. Three kinds of human action features with different modal information are proposed: the RGB-HOG feature, based on RGB image information, which has good geometric scale invariance; the D-STIP feature, based on the depth image, which maintains the dynamic characteristics of human motion and has local invariance; and the S-JRPF feature, based on skeleton information, which describes the spatial structure of motion well. In addition, multiple K-nearest neighbor classifiers with better generalization ability are used for integrated decision-making classification. The experimental results show that the algorithm achieves ideal recognition results on the public G3D and CAD60 datasets.
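A minimal sketch of decision-level fusion with several K-nearest neighbor classifiers is given below. It assumes one classifier per modality and a plain majority vote; the paper's multilearner may weight the classifiers differently, and the feature vectors here are made-up 2D points rather than RGB-HOG, D-STIP, or S-JRPF descriptors.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority label among its k nearest training samples."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def multi_knn_vote(train_by_modality, query_by_modality, k=3):
    """Run one k-NN per modality and return the majority decision (assumed fusion rule)."""
    votes = [
        knn_predict(train_by_modality[name], query_by_modality[name], k)
        for name in train_by_modality
    ]
    return Counter(votes).most_common(1)[0][0]


# Toy training set reused for every modality: two "wave" and two "kick" samples.
samples = [((0, 0), "wave"), ((0, 1), "wave"), ((5, 5), "kick"), ((5, 6), "kick")]
train = {"rgb": samples, "depth": samples, "skeleton": samples}

# The depth feature is misleading here; the other two modalities outvote it.
query = {"rgb": (0.2, 0.1), "depth": (5.0, 5.5), "skeleton": (0.0, 0.5)}
decision = multi_knn_vote(train, query)
```

With an odd number of modalities and two classes the vote cannot tie; other settings would need an explicit tie-breaking rule.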

Improved Collaborative Representation Classifier Based on l2-Regularized for Human Action Recognition

Journal of Electrical and Computer Engineering ◽

10.1155/2017/8191537 ◽

2017 ◽

Vol 2017 ◽

pp. 1-6

Author(s):

Shirui Huo ◽

Tianrui Hu ◽

Ce Li

Keyword(s):

Action Recognition ◽

Human Action Recognition ◽

Test Sample ◽

Human Action ◽

Superior Performance ◽

Depth Image ◽

Collaborative Representation ◽

Depth Images ◽

Spatiotemporal Information ◽

Depth Motion Maps

Human action recognition is an important and challenging task. Projecting depth images onto three depth motion maps (DMMs) and extracting deep convolutional neural network (DCNN) features yields discriminative descriptors that characterize the spatiotemporal information of a specific action in a sequence of depth images. In this paper, a unified improved collaborative representation framework is proposed in which the probability that a test sample belongs to the collaborative subspace of all classes can be well defined and calculated. The improved collaborative representation classifier (ICRC) based on l2 regularization is presented to maximize the likelihood that a test sample belongs to each class; theoretical investigation into ICRC then shows that it obtains a final classification by computing the likelihood for each class. Coupled with the DMM and DCNN features, experiments on depth image-based action recognition, including the MSRAction3D and MSRGesture3D datasets, demonstrate that the proposed approach, which uses a distance-based representation classifier, achieves superior performance over state-of-the-art methods, including SRC, CRC, and SVM.
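For orientation, the standard l2-regularized collaborative representation classifier (the CRC baseline named above, not the improved ICRC) can be sketched as follows. The test sample is coded over all training samples by ridge regression, and the class whose samples reconstruct it with the smallest residual wins. The regularization value, the toy data, and the helper names are assumptions; some CRC variants also divide each residual by the norm of the class coefficients, which is omitted here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def crc_classify(samples, labels, query, lam=0.01):
    """Code `query` over all samples with ridge regression, then compare class residuals."""
    n = len(samples)
    # Normal equations: (X^T X + lam * I) alpha = X^T y, with samples as columns of X.
    gram = [[dot(samples[i], samples[j]) + (lam if i == j else 0.0) for j in range(n)]
            for i in range(n)]
    alpha = solve(gram, [dot(s, query) for s in samples])
    residuals = {}
    for cls in set(labels):
        recon = [0.0] * len(query)
        for coef, sample, label in zip(alpha, samples, labels):
            if label == cls:
                recon = [r + coef * v for r, v in zip(recon, sample)]
        residuals[cls] = sum((q - r) ** 2 for q, r in zip(query, recon)) ** 0.5
    return min(residuals, key=residuals.get), residuals


samples = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = ["a", "a", "b", "b"]
predicted, residuals = crc_classify(samples, labels, [1.0, 0.05])
```

The query lies close to the class "a" samples, so its reconstruction from those samples alone leaves a much smaller residual than the reconstruction from class "b".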

Human action recognition based on multiple feature fusion

Advances in Modelling and Analysis B ◽

10.18280/ama_b.600102 ◽

2017 ◽

Vol 60 (1) ◽

pp. 25-42

Author(s):

R.J. Ma ◽

H.S. Zhang

Keyword(s):

Action Recognition ◽

Feature Fusion ◽

Human Action Recognition ◽

Human Action ◽

Multiple Feature ◽

Multiple Feature Fusion

Depth Motion Maps and Log-Gabor Features Based Human Action Recognition Using Support Vector Machine

Journal of Mathematics and Informatics ◽

10.22457/jmi.149av17a7 ◽

2019 ◽

Vol 17 ◽

pp. 73-89

Author(s):

Biplab Madhu ◽

Md. Zahidul Islam ◽

Lasker Ershad Ali

Keyword(s):

Support Vector Machine ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Support Vector ◽

Gabor Features ◽

Depth Motion Maps ◽

Log Gabor

Depth Motion Maps and Log-Gabor Features Based Human Action Recognition Using Support Vector Machine

Journal of Mathematics and Informatics ◽

10.22457/jmi.146av17a7 ◽

2019 ◽

Vol 17 ◽

pp. 73-79

Author(s):

Biplab Madhu ◽

Md. Zahidul Islam ◽

Lasker Ershad Ali

Keyword(s):

Support Vector Machine ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Support Vector ◽

Gabor Features ◽

Depth Motion Maps ◽

Log Gabor

Two-stream spatiotemporal feature fusion for human action recognition

The Visual Computer ◽

10.1007/s00371-020-01940-3 ◽

2020 ◽

Author(s):

Amany Abdelbaky ◽

Saleh Aly

Keyword(s):

Action Recognition ◽

Feature Fusion ◽

Human Action Recognition ◽

Human Action ◽

Spatiotemporal Feature
