Human Action Recognition Algorithm Based on Improved ResNet and Skeletal Keypoints in Single Image

2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Yixue Lin ◽  
Wanda Chi ◽  
Wenxue Sun ◽  
Shicai Liu ◽  
Di Fan

Human action recognition is an important part of enabling computers to understand the behavior of people in pictures or videos. A single image offers no context information for recognition, so accuracy on single images still needs to be greatly improved. In this paper, a single-image human action recognition method based on an improved ResNet and skeletal keypoints is proposed. We modified the backbone network ResNet-50 and CPN and constructed a multitask network suited to the human action recognition task, which improves accuracy while balancing the total number of parameters, addressing the problem of a large network and slow operation. The improvements to ResNet-50, to CPN, and to the whole network are each tested separately. The results show that the method can accurately identify human actions across different human movements, different background lighting, and occlusion. Compared with the original network and the main human action recognition algorithms, our method shows certain advantages in accuracy.

2013 ◽  
Vol 18 (2-3) ◽  
pp. 49-60 ◽  
Author(s):  
Damian Dudziński ◽  
Tomasz Kryjak ◽  
Zbigniew Mikrut

In this paper, a human action recognition algorithm is described that uses background generation with shadow elimination, silhouette description based on simple geometrical features, and a finite state machine for recognizing particular actions. The tests performed indicate that this approach obtains an 81% correct recognition rate while allowing real-time processing of a 360 × 288 video stream.
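As a rough illustration of how a finite state machine can turn per-frame silhouette features into an action label, here is a minimal sketch. The single feature (bounding-box height/width ratio), the thresholds, the posture names, and the transition table are all assumptions made for this example; the abstract does not specify them.

```python
# Minimal sketch: a finite state machine driven by one simple geometric
# feature of the silhouette. All thresholds and state names are hypothetical.

def posture(height, width):
    """Map a silhouette bounding box to a coarse posture symbol."""
    ratio = height / width
    if ratio > 1.5:
        return "upright"
    if ratio < 0.7:
        return "flat"
    return "bent"

# (current state, observed posture) -> next state
TRANSITIONS = {
    ("start", "upright"): "standing",
    ("standing", "upright"): "standing",
    ("standing", "bent"): "bending",
    ("bending", "bent"): "bending",
    ("bending", "upright"): "standing",
    ("bending", "flat"): "lying",
    ("lying", "flat"): "lying",
}

def run_fsm(boxes):
    """Feed a sequence of (height, width) boxes through the machine.

    Observations with no matching transition leave the state unchanged,
    which makes the machine tolerant of isolated noisy frames.
    """
    state = "start"
    for height, width in boxes:
        state = TRANSITIONS.get((state, posture(height, width)), state)
    return state

if __name__ == "__main__":
    print(run_fsm([(180, 60), (120, 100), (50, 170)]))
```

Because each frame costs one table lookup, a machine of this kind adds very little work on top of silhouette extraction, which fits the real-time setting described above.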


Author(s):  
Mohammad Farhad Bulbul ◽  
Yunsheng Jiang ◽  
Jinwen Ma

Emerging cost-effective depth sensors have significantly facilitated the action recognition task. In this paper, the authors address the action recognition problem using depth video sequences by combining three discriminative features. More specifically, the authors generate three Depth Motion Maps (DMMs) over the entire video sequence, corresponding to the front, side, and top projection views. Contourlet-based Histogram of Oriented Gradients (CT-HOG), Local Binary Patterns (LBP), and Edge Oriented Histograms (EOH) are then computed from the DMMs. To merge these features, the authors consider decision-level fusion, where a soft decision-fusion rule, Logarithmic Opinion Pool (LOGP), is used to combine the classification outcomes from multiple classifiers, each with an individual set of features. Experimental results on two datasets reveal that the fusion scheme achieves better action recognition performance than using each feature individually.
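The logarithmic opinion pool mentioned above is commonly written as a weighted product of the per-classifier posteriors, renormalized over classes. The sketch below assumes that standard form with uniform weights by default; the function name, the `eps` floor, and the example probabilities are illustrative and not taken from the paper.

```python
import math

def logp_fusion(posteriors, weights=None, eps=1e-12):
    """Fuse class posteriors from several classifiers with a log opinion pool.

    posteriors: one probability list per classifier, all of equal length.
    weights:    one weight per classifier (assumed uniform if omitted).
    Returns fused probabilities proportional to prod_q p_q(c) ** w_q.
    """
    n_classifiers = len(posteriors)
    if weights is None:
        weights = [1.0 / n_classifiers] * n_classifiers
    n_classes = len(posteriors[0])
    # Work in the log domain; eps guards against log(0).
    log_scores = [
        sum(w * math.log(max(p[c], eps)) for w, p in zip(weights, posteriors))
        for c in range(n_classes)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores)
    unnormalized = [math.exp(s - top) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

if __name__ == "__main__":
    # Hypothetical outputs of three classifiers (e.g. one per feature type).
    fused = logp_fusion([[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.5, 0.4, 0.1]])
    print(fused, fused.index(max(fused)))
```

Unlike a plain average of probabilities, the product form lets a single classifier that assigns a very low probability to a class strongly suppress that class in the fused decision.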


Algorithms ◽  
2020 ◽  
Vol 13 (11) ◽  
pp. 301
Author(s):  
Guocheng Liu ◽  
Caixia Zhang ◽  
Qingyang Xu ◽  
Ruoshi Cheng ◽  
Yong Song ◽  
...  

Because optical-flow-based human action recognition is difficult to apply owing to its large amount of calculation, a human action recognition model, I3D-shufflenet, is proposed that combines the advantages of the I3D neural network and the lightweight model shufflenet. The 5 × 5 convolution kernel of I3D is replaced by two 3 × 3 convolution kernels, which reduces the amount of calculation. A shuffle layer is adopted to achieve feature exchange. Human actions are recognized and classified with the trained I3D-shufflenet model. The experimental results show that the shuffle layer improves the composition of features in each channel, which promotes the use of useful information. Histogram of Oriented Gradients (HOG) spatial-temporal features of the object are extracted for training, which can significantly improve the expression of human actions and reduce the calculation needed for feature extraction. I3D-shufflenet is tested on the UCF101 dataset and compared with other models. The final result shows that I3D-shufflenet has higher accuracy than the original I3D, with an accuracy of 96.4%.
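Two of the ideas above lend themselves to a small worked example: the weight saving from replacing one 5 × 5 kernel with two stacked 3 × 3 kernels, and the channel shuffle operation. The sketch below assumes equal input and output channel counts, ignores biases, and represents channels as a flat list; the helper names and the channel count of 64 are hypothetical.

```python
def conv_weights(kernel_size, c_in, c_out, dims=2):
    """Number of weights in one convolution layer (bias ignored)."""
    return (kernel_size ** dims) * c_in * c_out

def channel_shuffle(channels, groups):
    """Interleave channels across groups, as in ShuffleNet-style layers.

    Equivalent to reshaping to (groups, channels_per_group), transposing,
    and flattening again.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]

if __name__ == "__main__":
    c = 64  # assumed channel count, for illustration only
    one_5x5 = conv_weights(5, c, c)
    two_3x3 = 2 * conv_weights(3, c, c)
    print(one_5x5, two_3x3, two_3x3 / one_5x5)
    print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))
```

With equal channel counts, the two-layer 3 × 3 stack needs 18/25 of the weights of a single 5 × 5 layer while covering the same 5 × 5 receptive field. Passing `dims=3` gives the corresponding comparison for cubic 3D kernels, 54/125.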


2019 ◽  
Vol 9 (10) ◽  
pp. 2126 ◽  
Author(s):  
Suge Dong ◽  
Daidi Hu ◽  
Ruijun Li ◽  
Mingtao Ge

To address the problems of high trajectory redundancy and susceptibility to background interference in traditional dense trajectory behavior recognition methods, a human action recognition method based on foreground trajectories and motion difference descriptors is proposed. First, the motion magnitude of each frame is estimated by optical flow, and the foreground region is determined according to the motion magnitude of each pixel; trajectories are extracted only from behavior-related foreground regions. Second, in order to better describe the relative temporal information between different actions, a motion difference descriptor is introduced to describe the foreground trajectory, and the direction histogram of the motion difference is constructed by calculating the direction of the motion difference per unit time at each trajectory point. Finally, a Fisher vector (FV) is used to encode the histogram features to obtain video-level action features, and a support vector machine (SVM) is used to classify the action category. Experimental results show that this method extracts action-related trajectories more effectively and improves recognition accuracy by 7% compared to the traditional dense trajectory method.
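One plausible reading of the direction histogram of the motion difference is sketched below: displacements between consecutive trajectory points are differenced once more, and the directions of those differences are binned. The bin count, the magnitude weighting, and the normalization are assumptions for illustration; the abstract does not give the exact descriptor definition.

```python
import math

def motion_difference_histogram(points, bins=8):
    """Direction histogram of second-order motion along one trajectory.

    points: list of (x, y) positions sampled at equal time steps.
    Returns a list of `bins` values that sums to 1, or all zeros when
    the trajectory has constant velocity (no motion difference).
    """
    # Per-step displacement between consecutive trajectory points.
    disp = [(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]
    # Motion difference: change of displacement per unit time.
    diffs = [(u1 - u0, v1 - v0)
             for (u0, v0), (u1, v1) in zip(disp, disp[1:])]
    hist = [0.0] * bins
    for du, dv in diffs:
        magnitude = math.hypot(du, dv)
        if magnitude == 0.0:
            continue  # no change in motion, so no direction to record
        angle = math.atan2(dv, du) % (2 * math.pi)
        index = min(int(angle / (2 * math.pi) * bins), bins - 1)
        hist[index] += magnitude  # magnitude weighting is an assumption
    total = sum(hist)
    return [h / total for h in hist] if total else hist

if __name__ == "__main__":
    # A point accelerating along +x puts all mass in the first bin.
    print(motion_difference_histogram([(0, 0), (1, 0), (3, 0), (6, 0)]))
```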


2014 ◽  
Vol 989-994 ◽  
pp. 2731-2734
Author(s):  
Hai Long Jia ◽  
Kun Cao

The choice of motion features directly affects the result of a human action recognition method. Factors such as the appearance of the human body, the environment, and the video camera often influence any single feature differently, so the accuracy of action recognition based on a single feature is limited. On the basis of studying the representation and recognition of human actions, and giving full consideration to the advantages and disadvantages of different features, this paper proposes a mixed feature that combines a global silhouette feature and a local optical flow feature. This combined representation is used for human action recognition.


2013 ◽  
Vol 631-632 ◽  
pp. 1303-1308
Author(s):  
He Jin Yuan

A novel human action recognition algorithm based on key postures is proposed in this paper. In the method, the mesh features of each image in the human action sequences are first calculated; the key postures are then generated from the mesh features through the k-medoids clustering algorithm; and the motion sequences are thus represented as vectors of key postures. Each component of the vector is the number of occurrences of the corresponding posture in the action. For human action recognition, the observed action is first converted into a key posture vector; then its correlation coefficients with the training samples are calculated, and the action that best matches the observed sequence is chosen as the final category. Experiments on the Weizmann dataset demonstrate that the method is effective for human action recognition. The average recognition accuracy can exceed 90%.
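The representation and matching steps described above can be sketched as follows, assuming the key postures (medoids) have already been found by k-medoids clustering. The Euclidean nearest-medoid assignment and the use of the Pearson correlation coefficient are assumptions, and the feature vectors and labels are toy values rather than real mesh features.

```python
import math

def key_posture_vector(frames, medoids):
    """Count how often each key posture is the nearest one to a frame."""
    counts = [0] * len(medoids)
    for frame in frames:
        nearest = min(range(len(medoids)),
                      key=lambda k: math.dist(frame, medoids[k]))
        counts[nearest] += 1
    return counts

def pearson(a, b):
    """Pearson correlation coefficient; 0.0 if either vector is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return cov / (norm_a * norm_b)

def classify(observed, training):
    """Return the label of the training vector best correlated with observed.

    training: list of (key_posture_vector, label) pairs.
    """
    return max(training, key=lambda item: pearson(observed, item[0]))[1]

if __name__ == "__main__":
    medoids = [(0, 0), (10, 0), (0, 10)]         # hypothetical key postures
    frames = [(1, 0), (0, 1), (9, 1), (1, 9)]    # toy per-frame features
    vector = key_posture_vector(frames, medoids)
    training = [([5, 1, 0], "walk"), ([0, 1, 5], "wave")]
    print(vector, classify(vector, training))
```

Because the vector only counts posture occurrences, it discards the order of frames, which keeps matching cheap but means two actions made of the same postures in a different order would look alike.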

