Complex Human–Object Interactions Analyzer Using a DCNN and SVM Hybrid Approach

Cho Nilar Phyo; Thi Thi Zin; Pyke Tin

doi:10.3390/app9091869


Applied Sciences ◽

10.3390/app9091869 ◽

2019 ◽

Vol 9 (9) ◽

pp. 1869 ◽

Cited By ~ 3

Author(s):

Cho Nilar Phyo ◽

Thi Thi Zin ◽

Pyke Tin

Keyword(s):

Hybrid Approach ◽

Human Action Recognition ◽

Cost Effective ◽

The Elderly ◽

Human Action ◽

Daily Activities ◽

Support Vector ◽

Deep Convolutional Neural Networks ◽

Human Object ◽

Object Interactions

With the emergence of sophisticated electronic devices, daily human activities are becoming increasingly complex. At the same time, research has begun on reliable, cost-effective sensors, patient monitoring systems, and other systems that make daily life more comfortable for the elderly. In computer vision, human action recognition (HAR) has drawn much attention because of its potential for numerous cost-effective applications. Although much research has investigated HAR, most of it has dealt with simple, basic actions in simplified environments; little work has been done in more complex, real-world environments. A need therefore exists for a system that can recognize complex daily activities in a variety of realistic environments. In this paper, we propose a system for recognizing such activities, in which humans interact with various objects, that combines object-oriented activity information, deep convolutional neural networks, and a multi-class support vector machine (multi-class SVM). The experiments are performed on the publicly available Cornell Activity Dataset CAD-120, a dataset of human–object interactions featuring ten high-level daily activities. The results show that the proposed system achieves an accuracy of 93.33%, which is higher than that of other state-of-the-art methods, and has great potential for applications that recognize complex daily activities.
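The abstract does not say how the multi-class SVM is built on top of the network features. One common arrangement, assumed here purely for illustration, is a one-vs-rest linear decision rule: each class has its own weight vector and bias, and the class with the highest score wins. The labels, weights, and feature values below are made up; in a real system the feature would come from the DCNN and the parameters from SVM training.

```python
from typing import Dict, List, Tuple

# Hypothetical one-vs-rest linear SVM: one (weights, bias) pair per activity.
Model = Dict[str, Tuple[List[float], float]]


def decision_scores(model: Model, feature: List[float]) -> Dict[str, float]:
    """Return the signed score w . x + b of every class for one feature vector."""
    return {
        label: sum(w * x for w, x in zip(weights, feature)) + bias
        for label, (weights, bias) in model.items()
    }


def predict(model: Model, feature: List[float]) -> str:
    """Pick the class whose one-vs-rest score is largest."""
    scores = decision_scores(model, feature)
    return max(scores, key=scores.get)


# Toy parameters and a toy 3-dimensional "deep feature" (all values invented).
toy_model: Model = {
    "activity_0": ([1.0, -0.5, 0.0], 0.1),
    "activity_1": ([-0.2, 0.8, 0.3], -0.1),
    "activity_2": ([0.0, 0.1, 0.9], 0.0),
}
toy_feature = [0.2, 0.9, 0.1]

if __name__ == "__main__":
    print(decision_scores(toy_model, toy_feature))
    print(predict(toy_model, toy_feature))
```

Other multi-class schemes (for example one-vs-one voting) would change only the `predict` step, not the feature extraction that precedes it.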


Group Sparse Regression-Based Learning Model for Real-Time Depth-Based Human Action Prediction

Mathematical Problems in Engineering ◽

10.1155/2018/8201509 ◽

2018 ◽

Vol 2018 ◽

pp. 1-7 ◽

Cited By ~ 3

Author(s):

Meng Li ◽

Liang Yan ◽

Qianying Wang

Keyword(s):

Real Time ◽

Human Action Recognition ◽

Human Action ◽

Group Sparsity ◽

Human Actions ◽

Depth Data ◽

Human Object ◽

Benchmark Datasets ◽

Object Interactions ◽

Sparse Set

This paper addresses the problem of predicting human actions in depth videos. Due to the complex spatiotemporal structure of human actions, it is difficult to infer ongoing human actions before they are fully executed. To handle this challenging issue, we first propose two new depth-based features, called pairwise relative joint orientations (PRJOs) and depth patch motion maps (DPMMs), to represent the relative movements between each pair of joints and human-object interactions, respectively. The two proposed depth-based features are suitable for recognizing and predicting human actions in real time. Then, we propose a regression-based learning approach with a group-sparsity-inducing regularizer to learn an action predictor based on the combination of PRJOs and DPMMs for a sparse set of joints. Experimental results on benchmark datasets demonstrate that our proposed approach significantly outperforms existing methods for real-time human action recognition and prediction from depth data.
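The abstract does not give the exact form of the regularizer. A widely used group-sparsity penalty is the L2,1 norm (the sum of the Euclidean norms of each coefficient group), whose proximal operator shrinks every group and sets weak groups to exactly zero. The sketch below assumes that form, with each group standing for the coefficients tied to one joint; the function names are hypothetical.

```python
import math
from typing import List

Groups = List[List[float]]


def group_l21_norm(groups: Groups) -> float:
    """Sum of the Euclidean norms of the coefficient groups (L2,1 penalty)."""
    return sum(math.sqrt(sum(w * w for w in group)) for group in groups)


def group_soft_threshold(groups: Groups, threshold: float) -> Groups:
    """Proximal step for threshold * L2,1: scale each group toward zero and
    zero it out entirely when its norm does not exceed the threshold."""
    shrunk = []
    for group in groups:
        norm = math.sqrt(sum(w * w for w in group))
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        shrunk.append([scale * w for w in group])
    return shrunk


if __name__ == "__main__":
    coefficients = [[3.0, 4.0], [0.3, 0.4]]  # two toy groups (e.g. two joints)
    print(group_l21_norm(coefficients))
    print(group_soft_threshold(coefficients, 1.0))
```

Because whole groups are zeroed together, only a few groups keep non-zero coefficients, which matches the abstract's idea of selecting a sparse set of joints.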


Human action recognition using support vector machines and 3D convolutional neural networks

International Journal of Advances in Intelligent Informatics ◽

10.26555/ijain.v3i1.89 ◽

2017 ◽

Vol 3 (1) ◽

pp. 47 ◽

Cited By ~ 10

Author(s):

Majd Latah

Keyword(s):

Neural Networks ◽

Support Vector Machines ◽

Convolutional Neural Networks ◽

Action Recognition ◽

Recognition Task ◽

Human Action Recognition ◽

Human Action ◽

Support Vector ◽

Deep Convolutional Neural Networks ◽

Vector Machines

Recently, deep learning has been widely used to improve recognition accuracy in different application areas. In this paper, both deep convolutional neural networks (CNNs) and support vector machines were employed for the human action recognition task. First, a 3D CNN was used to extract spatial and temporal features from adjacent video frames. Then, support vector machines were used to classify each instance based on the previously extracted features. Both the number of CNN layers and the resolution of the input frames were reduced to meet memory constraints. The proposed architecture was trained and evaluated on the KTH action recognition dataset and achieved good performance.


2D Information Space Based Action Recognition

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a1060.1191s19 ◽

2019 ◽

Vol 9 (1S) ◽

pp. 295-298

Keyword(s):

Convex Hull ◽

Action Recognition ◽

Video Sequence ◽

Feature Vector ◽

Human Action Recognition ◽

Human Action ◽

New Approach ◽

Human Object ◽

Background Data ◽

2D Data

Video-based human action recognition has attracted growing attention from researchers and is a prominent topic in computer vision and pattern recognition. In this paper, we present a new approach that suppresses the background data and extracts 2D data of the foreground human object from the video sequence. A combination of convex hull area, convex hull perimeter, solidity, and eccentricity is used to form the feature vector. Experiments are conducted on the Weizmann video dataset to assess the system's performance. The discriminative nature of the feature vectors supports accurate action recognition.
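Three of the four shape features named above can be sketched in a few lines of plain Python. The example assumes the foreground silhouette is already available as an ordered list of contour vertices (how the paper extracts it is not described in the abstract), and takes solidity to be silhouette area divided by convex hull area, which is the usual definition. Eccentricity is omitted for brevity. All function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices: List[Point]) -> float:
    """Shoelace formula for a simple polygon given as ordered vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(vertices: List[Point]) -> float:
    """Sum of edge lengths around the closed polygon."""
    return sum(
        math.dist(a, b) for a, b in zip(vertices, vertices[1:] + vertices[:1])
    )


def hull_features(contour: List[Point]) -> Dict[str, float]:
    """Convex hull area, hull perimeter, and solidity of one silhouette contour."""
    hull = convex_hull(contour)
    hull_area = polygon_area(hull)
    return {
        "hull_area": hull_area,
        "hull_perimeter": polygon_perimeter(hull),
        "solidity": polygon_area(contour) / hull_area if hull_area else 0.0,
    }


if __name__ == "__main__":
    # Toy L-shaped silhouette standing in for a foreground human region.
    l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
    print(hull_features(l_shape))
```

Computing these values per frame and stacking them over time gives one possible per-video feature vector; how the paper aggregates frames is not stated in the abstract.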


Driving Posture Recognition by Joint Application of Motion History Image and Pyramid Histogram of Oriented Gradients

International Journal of Vehicular Technology ◽

10.1155/2014/719413 ◽

2014 ◽

Vol 2014 ◽

pp. 1-11 ◽

Cited By ~ 8

Author(s):

Chao Yan ◽

Frans Coenen ◽

Bailing Zhang

Keyword(s):

Intelligent Transportation System ◽

Human Action Recognition ◽

Human Action ◽

Support Vector ◽

Histogram Of Oriented Gradients ◽

Motion History Image ◽

Joint Application ◽

A Cell ◽

Posture Recognition ◽

Driving Posture

In the field of intelligent transportation systems (ITS), automatic interpretation of a driver’s behavior is an urgent and challenging topic. This paper studies vision-based driving posture recognition in the human action recognition framework. A driving action dataset was prepared using a side-mounted camera looking at a driver’s left profile. The driving actions, including operating the shift lever, talking on a cell phone, eating, and smoking, are first decomposed into a number of predefined action primitives, that is, interaction with shift lever, operating the shift lever, interaction with head, and interaction with dashboard. A global grid-based representation for the action primitives was emphasized, which first generates the silhouette shape from the motion history image, followed by application of the pyramid histogram of oriented gradients (PHOG) for more discriminating characterization. The random forest (RF) classifier was then exploited to classify the action primitives, together with comparisons to other commonly applied classifiers such as kNN, multilayer perceptron, and support vector machine. Classification accuracy is over 94% for the RF classifier in holdout and cross-validation experiments on the four manually decomposed driving actions.
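The motion history image mentioned above is commonly built with a simple per-pixel rule: a pixel that moves in the current frame is set to a maximum value tau, and every other pixel decays by one toward zero. The sketch below follows that standard rule and detects motion by thresholded frame differencing; the threshold, tau, and the tiny frames are assumptions for illustration, not values from the paper.

```python
from typing import List

Image = List[List[int]]


def motion_mask(previous: Image, current: Image, threshold: int) -> List[List[bool]]:
    """Mark pixels whose absolute intensity change exceeds the threshold."""
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


def update_mhi(mhi: Image, mask: List[List[bool]], tau: int) -> Image:
    """Set moving pixels to tau and decay all other pixels by one, floored at zero."""
    return [
        [tau if moving else max(0, value - 1) for value, moving in zip(mhi_row, mask_row)]
        for mhi_row, mask_row in zip(mhi, mask)
    ]


def motion_history_image(frames: List[Image], tau: int, threshold: int) -> Image:
    """Fold a frame sequence into a single motion history image."""
    mhi = [[0 for _ in row] for row in frames[0]]
    for previous, current in zip(frames, frames[1:]):
        mhi = update_mhi(mhi, motion_mask(previous, current, threshold), tau)
    return mhi


if __name__ == "__main__":
    # A bright edge sweeping left to right across a 1x3 toy image.
    toy_frames = [[[0, 0, 0]], [[9, 0, 0]], [[9, 9, 0]], [[9, 9, 9]]]
    print(motion_history_image(toy_frames, tau=3, threshold=5))
```

Brighter values mark more recent motion, so the resulting image encodes both where and in what order movement occurred, which is what a shape descriptor such as PHOG would then summarize.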


Human action recognition with group lasso regularized-support vector machine

Journal of Electronic Imaging ◽

10.1117/1.jei.25.3.033015 ◽

2016 ◽

Vol 25 (3) ◽

pp. 033015 ◽

Cited By ~ 2

Author(s):

Huiwu Luo ◽

Huanzhang Lu ◽

Yabei Wu ◽

Fei Zhao

Keyword(s):

Support Vector Machine ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Group Lasso ◽

Support Vector


Hybrid Feature Vector-Assisted Action Representation for Human Action Recognition Using Support Vector Machines

Methodologies and Applications of Computational Statistics for Machine Intelligence - Advances in Systems Analysis, Software Engineering, and High Performance Computing ◽

10.4018/978-1-7998-7701-1.ch001 ◽

2021 ◽

pp. 1-22

Author(s):

L. Nirmala Devi ◽

A. Nageswar Rao

Keyword(s):

Action Recognition ◽

Feature Vector ◽

Learning Algorithm ◽

Gabor Filter ◽

Principal Component ◽

Human Action Recognition ◽

Human Action ◽

Visual Surveillance ◽

Support Vector ◽

Significant Research

Human action recognition (HAR) is one of the most significant research topics, and it has attracted the attention of many researchers. Automatic HAR systems are applied in several fields, such as visual surveillance, data retrieval, and healthcare. With this motivation, in this chapter, the authors propose a new HAR model that takes an image as input and analyzes and identifies the action present in it. In the analysis phase, they implement two different feature extraction methods using a rotation-invariant Gabor filter and an edge-adaptive wavelet filter. For every action image, a new vector called the composite feature vector is formulated and then subjected to dimensionality reduction through principal component analysis (PCA). Finally, the authors employ a popular supervised machine learning algorithm, the support vector machine (SVM), for classification. Simulation is done on two standard datasets, KTH and Weizmann, and the performance is measured through an accuracy metric.


Feature Fusion of Deep Spatial Features and Handcrafted Spatiotemporal Features for Human Action Recognition

Sensors ◽

10.3390/s19071599 ◽

2019 ◽

Vol 19 (7) ◽

pp. 1599 ◽

Cited By ~ 6

Author(s):

Md Uddin ◽

Young-Koo Lee

Keyword(s):

Action Recognition ◽

State Of The Art ◽

Human Action Recognition ◽

Human Action ◽

Support Vector ◽

Feature Descriptor ◽

Weber's Law ◽

Spatiotemporal Features ◽

Spatial Features

Human action recognition plays a significant part in the research community due to its emerging applications. A variety of approaches have been proposed to resolve this problem; however, several issues still need to be addressed. In action recognition, effectively extracting and aggregating spatial-temporal information plays a vital role in describing a video. In this research, we propose a novel approach to recognizing human actions that considers both deep spatial features and handcrafted spatiotemporal features. First, we extract the deep spatial features by employing a state-of-the-art deep convolutional network, namely Inception-Resnet-v2. Second, we introduce a novel handcrafted feature descriptor, namely the Weber’s law based Volume Local Gradient Ternary Pattern (WVLGTP), which brings out the spatiotemporal features. It also considers shape information by using a gradient operation. Furthermore, a Weber’s law based threshold value and a ternary pattern based on an adaptive local threshold are presented to effectively handle noisy center pixel values. In addition, a multi-resolution approach for WVLGTP based on an averaging scheme is presented. Afterward, both sets of extracted features are concatenated and fed to a support vector machine to perform the classification. Lastly, extensive experimental analysis shows that the proposed method outperforms state-of-the-art approaches in terms of accuracy.
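The fusion step described above is a plain concatenation of the two feature vectors. A minimal sketch follows; scaling each vector to unit L2 norm before concatenating is an assumption added here so that neither feature type dominates by magnitude alone, and is not stated in the abstract. The function names and toy values are hypothetical.

```python
import math
from typing import List


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit Euclidean length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return list(vector) if norm == 0 else [v / norm for v in vector]


def fuse_features(deep: List[float], handcrafted: List[float]) -> List[float]:
    """Concatenate the normalized deep and handcrafted descriptors into one vector."""
    return l2_normalize(deep) + l2_normalize(handcrafted)


if __name__ == "__main__":
    deep_feature = [3.0, 4.0]              # stand-in for a CNN descriptor
    handcrafted_feature = [0.0, 5.0, 0.0]  # stand-in for a handcrafted descriptor
    print(fuse_features(deep_feature, handcrafted_feature))
```

The fused vector has the combined length of both inputs and is what a classifier such as an SVM would receive.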


A Set of New Hermite Kernel Functions in Kernel Extreme Learning Machine and Application in Human Action Recognition

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001419550140 ◽

2019 ◽

Vol 33 (12) ◽

pp. 1955014 ◽

Cited By ~ 1

Author(s):

Xueping Liu ◽

Xingzuo Yue

Keyword(s):

Extreme Learning Machine ◽

Action Recognition ◽

Structural Information ◽

Image Data ◽

Human Action Recognition ◽

Human Action ◽

Kernel Functions ◽

Support Vector ◽

Learning Speed ◽

Learning Machine

The kernel function has been successfully utilized in the extreme learning machine (ELM), where it provides stable, generalized performance and greatly reduces the computational complexity. However, the selection and optimization of the parameters of the most common kernel functions are tedious and time-consuming. In this study, a set of new Hermite kernel functions derived from the generalized Hermite polynomials is proposed. The proposed kernel has only one parameter, selected from a small set of natural numbers; thus, the parameter optimization is greatly facilitated and excessive structural information of the sample data is retained. Consequently, the new kernel functions can be used as optimal alternatives to other common kernel functions for ELM at a rapid learning speed. The experimental results showed that the proposed kernel ELM method tends to have similar or better robustness and generalized performance at a faster learning speed than other common kernel ELM and support vector machine methods. When applied to human action recognition from depth video sequences, the method also achieves excellent performance, demonstrating its time-based advantage on video image data.
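The abstract does not define the proposed kernels, so the following is background only, not the authors' construction. It evaluates the classical (physicists') Hermite polynomials with the standard recurrence H_{k+1}(x) = 2x H_k(x) - 2k H_{k-1}(x), and then builds a hypothetical kernel for scalar inputs as the sum of products H_k(x) H_k(y) up to degree n. Such a sum is a valid kernel because it is an inner product of explicit feature vectors, and like the kernels described above it has a single natural-number parameter.

```python
from typing import List


def hermite_values(x: float, degree: int) -> List[float]:
    """Return [H_0(x), ..., H_degree(x)] for the physicists' Hermite polynomials."""
    values = [1.0]
    if degree >= 1:
        values.append(2.0 * x)
    for k in range(1, degree):
        # Standard three-term recurrence: H_{k+1} = 2x H_k - 2k H_{k-1}
        values.append(2.0 * x * values[k] - 2.0 * k * values[k - 1])
    return values


def hermite_sum_kernel(x: float, y: float, degree: int) -> float:
    """Hypothetical illustrative kernel: sum over k <= degree of H_k(x) * H_k(y)."""
    return sum(
        hx * hy
        for hx, hy in zip(hermite_values(x, degree), hermite_values(y, degree))
    )


if __name__ == "__main__":
    print(hermite_values(1.0, 3))
    print(hermite_sum_kernel(1.0, 0.5, 2))
```

Tuning such a kernel means trying a handful of integer degrees, which is the kind of simplified parameter search the abstract contrasts with tuning continuous kernel parameters.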


Human Action Recognition Based on Foreground Trajectory and Motion Difference Descriptors

Applied Sciences ◽

10.3390/app9102126 ◽

2019 ◽

Vol 9 (10) ◽

pp. 2126 ◽

Cited By ~ 1

Author(s):

Suge Dong ◽

Daidi Hu ◽

Ruijun Li ◽

Mingtao Ge

Keyword(s):

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Support Vector ◽

Recognition Method ◽

Foreground Region ◽

Dense Trajectory ◽

Trajectory Method ◽

Direction Information ◽

Action Category

To address the high trajectory redundancy and susceptibility to background interference of traditional dense-trajectory behavior recognition methods, a human action recognition method based on foreground trajectory and motion difference descriptors is proposed. First, the motion magnitude of each frame is estimated by optical flow, and the foreground region is determined according to the motion magnitude of each pixel; trajectories are extracted only from behavior-related foreground regions. Second, in order to better describe the relative temporal information between different actions, a motion difference descriptor is introduced to describe the foreground trajectory, and the direction histogram of the motion difference is constructed by calculating the direction information of the motion difference per unit time of each trajectory point. Finally, a Fisher vector (FV) is used to encode the histogram features to obtain video-level action features, and a support vector machine (SVM) is utilized to classify the action category. Experimental results show that this method can better extract action-related trajectories and can improve the recognition accuracy by 7% compared to the traditional dense trajectory method.
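One plausible reading of the motion difference descriptor can be sketched as follows; the exact definition is not given in the abstract, so the details are assumptions. Here the motion of a trajectory point is its displacement between consecutive frames, the motion difference is the change between consecutive displacements, and the descriptor is a normalized histogram of the directions of those differences. The bin count and function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def motion_differences(trajectory: List[Point]) -> List[Point]:
    """Change in per-frame displacement along one tracked trajectory."""
    displacements = [
        (x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(trajectory, trajectory[1:])
    ]
    return [
        (dx2 - dx1, dy2 - dy1)
        for (dx1, dy1), (dx2, dy2) in zip(displacements, displacements[1:])
    ]


def direction_histogram(trajectory: List[Point], bins: int = 4) -> List[float]:
    """Normalized histogram of motion-difference directions over [0, 2*pi)."""
    counts = [0] * bins
    bin_width = 2.0 * math.pi / bins
    for dx, dy in motion_differences(trajectory):
        if dx == 0 and dy == 0:
            continue  # no change in motion, so no direction to record
        angle = math.atan2(dy, dx) % (2.0 * math.pi)
        counts[int(angle / bin_width) % bins] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


if __name__ == "__main__":
    toy_trajectory = [(0, 0), (1, 0), (3, 1), (4, 4), (4, 6)]
    print(motion_differences(toy_trajectory))
    print(direction_histogram(toy_trajectory, bins=4))
```

Histograms like this, computed per trajectory, are the kind of local features that an encoding such as a Fisher vector would then pool into a single video-level representation.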


Depth Motion Maps and Log-Gabor Features Based Human Action Recognition Using Support Vector Machine

Journal of Mathematics and Informatics ◽

10.22457/jmi.149av17a7 ◽

2019 ◽

Vol 17 ◽

pp. 73-89

Author(s):

Biplab Madhu ◽

Md. Zahidul Islam ◽

Lasker Ershad Ali

Keyword(s):

Support Vector Machine ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Support Vector ◽

Gabor Features ◽

Depth Motion Maps ◽

Log Gabor
