3D Human Pose Estimation in Vietnamese Traditional Martial Art Videos

2019 ◽  
Vol 3 (3) ◽  
pp. 471
Author(s):  
Tuong Thanh Nguyen ◽  
Van-Hung Le ◽  
Duy-Long Duong ◽  
Thanh-Cong Pham ◽  
Dung Le

Preserving, maintaining, and teaching traditional martial arts are important activities in social life: they help preserve national culture and give practitioners exercise and self-defense skills. However, traditional martial arts involve many different postures, and the movements of the body and its parts are diverse. Estimating the actions of the human body still poses many challenges, such as accuracy and occlusion. In this paper, we survey several strong recent studies on 3-D human pose estimation. We compile statistical tables by year and summarize the typical results of these studies on the Human3.6M dataset. We also present a comparative study of 3-D human pose estimation from a single image. The compared methods use a Convolutional Neural Network (CNN) for 2-D pose estimation and then use a 3-D pose library to map the 2-D results into 3-D space. The CNN models are trained on benchmark datasets such as the MSCOCO Keypoints Challenge dataset [1], Human3.6M [2], the MPII dataset [3], and LSP [4], [5]. Finally, we publish a dataset of Vietnamese traditional martial arts from Binh Dinh province for evaluating 3-D human pose estimation. Quantitative results are presented and evaluated. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium provided the original work is properly cited.

Author(s):  
Nguyễn Tường Thành ◽  
Lê Văn Hùng ◽  
Phạm Thành Công

Preserving, maintaining, and teaching traditional martial arts are important activities in social life: they help individuals preserve national culture, exercise, and practice self-defense. However, traditional martial arts have many different postures as well as varied movements of the body and body parts. Estimating the actions of the human body still poses many challenges, such as accuracy and occlusion. This paper begins with a review of several methods of 2-D human pose estimation on RGB images, among which the methods based on Convolutional Neural Network (CNN) models have outstanding advantages in processing time and accuracy. In this work we built a small dataset and used a CNN to estimate the keypoints and joints of actions in traditional martial arts videos. Next, we applied three measurements (length of joints, deviation angle of joints, and deviation of keypoints) to evaluate pose estimation in 2-D and 3-D spaces. The estimator was trained on the classic MSCOCO Keypoints Challenge dataset, and the results were evaluated on the well-known Martial Arts, Dancing, and Sports dataset. The results were quantitatively evaluated and reported in this paper.
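
As an illustration of the three measurements named in this abstract, the following minimal Python sketch computes a limb-length error, a joint-angle deviation, and a mean keypoint deviation for a single pose. The definitions are assumptions (Euclidean distances, angle measured at the middle keypoint), and the shoulder, elbow, and wrist coordinates are hypothetical; the paper's exact formulas may differ.

```python
import math

def bone_length(a, b):
    """Euclidean distance between two keypoints (2-D or 3-D tuples)."""
    return math.dist(a, b)

def joint_angle(a, b, c):
    """Angle in degrees at keypoint b, between segments b->a and b->c."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("degenerate joint: zero-length segment")
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos))

def keypoint_deviation(pred, gt):
    """Mean Euclidean distance between matching predicted and reference keypoints."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

# Hypothetical shoulder, elbow, wrist keypoints (reference vs. estimate).
gt = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]    # arm bent at a right angle
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # arm estimated as straight

length_error = abs(bone_length(pred[1], pred[2]) - bone_length(gt[1], gt[2]))
angle_error = abs(joint_angle(*pred) - joint_angle(*gt))
mean_deviation = keypoint_deviation(pred, gt)
```

In this toy case the forearm length is estimated correctly while the elbow angle is not, which suggests why the measurements are reported separately. The same functions accept 3-D tuples unchanged.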


2017 ◽  
Vol 61 ◽  
pp. 22-39 ◽  
Author(s):  
Weichen Zhang ◽  
Zhiguang Liu ◽  
Liuyang Zhou ◽  
Howard Leung ◽  
Antoni B. Chan

2017 ◽  
Vol 11 (6) ◽  
pp. 426-433 ◽  
Author(s):  
Manuel I. López‐Quintero ◽  
Manuel J. Marín‐Jiménez ◽  
Rafael Muñoz‐Salinas ◽  
Rafael Medina‐Carnicer

2021 ◽  
pp. 1-11
Author(s):  
Min Zhang ◽  
Haijie Yang ◽  
Pengfei Li ◽  
Ming Jiang

Human pose estimation remains a challenging task in computer vision; camera view changes, joint occlusion, and overlapping joints make it especially difficult. Most existing methods pass the input through a network that typically consists of high-to-low resolution sub-networks connected in series. During the up-sampling process, however, spatial relationships and details might be lost. This paper designs a parallel atrous convolutional network with body structure constraints (PAC-BCNet) to address the problem. The parallel atrous convolution (PAC) deals with scale changes by connecting multiple different atrous convolution sub-networks in parallel, so that features are extracted at different scales without reducing the resolution. The body structure constraints (BC), which enhance the correlation between keypoints, are constructed to obtain better spatial relationships of the body by designing keypoint constraint sets and improving the loss function. Comparative experiments on serial versus parallel atrous convolution, together with an ablation study with and without body structure constraints, support the effectiveness of the approach. The model is evaluated on two widely used human pose estimation benchmarks (MPII and LSP), and the method achieves better performance on both datasets.
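
To make the idea of parallel atrous convolution concrete, here is a minimal 1-D sketch in plain Python. It is not the PAC-BCNet architecture: the kernel, the dilation rates (1, 2, 4), and the stacking of branch outputs are assumptions chosen only to show that each branch sees a different receptive field while the output keeps the input resolution.

```python
def atrous_conv1d(signal, kernel, rate):
    """1-D atrous (dilated) convolution in the cross-correlation form used by
    CNNs, with zero padding so the output length equals the input length.
    Assumes an odd-length kernel centred on the current position."""
    half = (len(kernel) // 2) * rate
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i - half + k * rate  # taps are spaced `rate` samples apart
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out

def parallel_atrous(signal, kernel, rates=(1, 2, 4)):
    """Run one branch per dilation rate on the same input and stack the
    branch outputs per position (one feature per rate)."""
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [list(values) for values in zip(*branches)]

# A unit impulse shows which positions each branch can "see".
signal = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
kernel = [1.0, 1.0, 1.0]
features = parallel_atrous(signal, kernel)
```

With the impulse at index 4, the rate-1 branch responds at positions 3 to 5, the rate-2 branch at 2, 4, and 6, and the rate-4 branch at 0, 4, and 8, all without any down-sampling.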


2018 ◽  
Author(s):  
Guanghan Ning

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] The task of human pose estimation in natural scenes is to determine the precise pixel locations of body keypoints. It is very important for many high-level computer vision tasks, including action and activity recognition, human-computer interaction, motion capture, and animation. We cover two approaches to this task: the top-down approach and the bottom-up approach. In the top-down approach, we propose a human tracking method called ROLO that localizes each person. We then propose a state-of-the-art single-person human pose estimator that predicts the body keypoints of each individual. In the bottom-up approach, we propose an efficient multi-person pose estimator, with which we participated in a PoseTrack challenge [11]. On top of these, we propose to employ adversarial training to further boost the performance of the single-person human pose estimator while generating synthetic images. We also propose a novel PoSeg network that jointly estimates multi-person human poses and semantically segments the portraits of these persons at pixel level. Lastly, we extend some of the proposed methods for human pose estimation and portrait segmentation to the task of human parsing, a more fine-grained computer vision perception of humans.


Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2464
Author(s):  
Zhe Zhang ◽  
Chunyu Wang ◽  
Wenhu Qin

Multiple-camera systems can expand coverage and mitigate occlusion problems. However, temporal synchronization remains a problem for budget cameras and capture devices. We propose an out-of-the-box framework that temporally synchronizes multiple cameras using semantic human pose estimation from the videos. Human pose predictions are obtained with an off-the-shelf pose estimator for each camera. First, our method calibrates each pair of cameras by minimizing an energy function related to epipolar distances. We also propose a simple yet effective multiple-person association algorithm across cameras and a score-regularized energy function for improved performance. Second, we integrate the synchronized camera pairs into a graph and derive the optimal temporal displacement configuration for the multiple-camera system. We evaluate our method on four public benchmark datasets and demonstrate robust sub-frame synchronization accuracy on all of them.
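
The pairwise step can be pictured with a simplified sketch: given a keypoint track in each of two cameras, try candidate frame offsets and keep the one with the lowest mean epipolar distance. Everything here is an assumption made for illustration: the fundamental matrix `F` is taken as known (a rectified pair, where matching points share the same image row), only one keypoint is tracked, and the search covers integer offsets only. The paper's sub-frame estimation, person association, and score regularization are not reproduced.

```python
import math

def epipolar_distance(F, p1, p2):
    """Distance from point p2 (camera 2) to the epipolar line F @ [x1, y1, 1]
    induced by point p1 (camera 1)."""
    x, y = p1
    la, lb, lc = (row[0] * x + row[1] * y + row[2] for row in F)
    norm = math.hypot(la, lb)
    if norm == 0:
        raise ValueError("degenerate epipolar line")
    return abs(la * p2[0] + lb * p2[1] + lc) / norm

def sync_energy(F, track_a, track_b, offset):
    """Mean epipolar distance when frame t of camera A is paired with
    frame t + offset of camera B."""
    pairs = [(track_a[t], track_b[t + offset])
             for t in range(len(track_a))
             if 0 <= t + offset < len(track_b)]
    if not pairs:
        return math.inf
    return sum(epipolar_distance(F, p, q) for p, q in pairs) / len(pairs)

def best_offset(F, track_a, track_b, max_offset):
    """Grid search over integer frame offsets in [-max_offset, max_offset]."""
    return min(range(-max_offset, max_offset + 1),
               key=lambda d: sync_energy(F, track_a, track_b, d))

# Hypothetical rectified stereo pair: epipolar lines are the rows y2 = y1.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
# Synthetic keypoint track; camera B records the same motion 3 frames late.
track_a = [(float(t), float(t * t)) for t in range(20)]
track_b = [(float(s) + 5.0, float((s - 3) ** 2)) for s in range(20)]

offset = best_offset(F, track_a, track_b, max_offset=5)
```

On this synthetic data the energy is zero only at the true offset of 3 frames. Real detections are noisy, so the minimum would be approximate rather than exact.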


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Shili Niu ◽  
Weihua Ou ◽  
Shihua Feng ◽  
Jianping Gou ◽  
Fei Long ◽  
...  

Existing methods for human pose estimation usually use a large intermediate tensor, leading to a high computational load, which is detrimental to resource-limited devices. To solve this problem, we propose a low-computational-cost pose estimation network, MobilePoseNet, which comprises an encoder, a decoder, and a parallel non-maximum suppression (NMS) operation. Specifically, we design a lightweight upsampling block to replace transposed convolution in the decoder and use a lightweight network as the downsampling part. We then choose the high-resolution features as the input for upsampling to reduce the number of model parameters. Finally, we propose a parallel OKS-NMS, which significantly outperforms conventional NMS in terms of accuracy and speed. Experimental results on the benchmark datasets show that MobilePoseNet obtains results almost comparable to state-of-the-art methods with a low computational load. Compared to SimpleBaseline, MobilePoseNet uses only 4% of the parameters, while its estimation accuracy reaches 98%.
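
For readers unfamiliar with OKS-based suppression, the sketch below shows a conventional greedy OKS-NMS in plain Python. It is not the parallel variant the abstract proposes. The similarity follows the COCO-style object keypoint similarity (a Gaussian falloff of keypoint distance, scaled by object area and a per-keypoint constant), while the constants `k`, the use of the kept pose's area, the threshold, and the example poses are assumptions for illustration.

```python
import math

def oks(pose_a, pose_b, area, k):
    """COCO-style object keypoint similarity between two poses.
    pose_*: lists of (x, y); area: object scale squared;
    k: per-keypoint falloff constants (hypothetical values here)."""
    total = 0.0
    for (xa, ya), (xb, yb), ki in zip(pose_a, pose_b, k):
        d2 = (xa - xb) ** 2 + (ya - yb) ** 2
        total += math.exp(-d2 / (2.0 * area * ki ** 2))
    return total / len(k)

def oks_nms(poses, scores, areas, k, threshold):
    """Greedy NMS: visit poses by descending score and keep a pose only if
    its OKS with every already-kept pose does not exceed the threshold."""
    order = sorted(range(len(poses)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(oks(poses[i], poses[j], areas[j], k) <= threshold for j in keep):
            keep.append(i)
    return keep

# Hypothetical two-keypoint poses: pose 1 nearly duplicates pose 0.
poses = [
    [(10.0, 10.0), (20.0, 20.0)],
    [(10.5, 10.0), (20.0, 20.5)],
    [(100.0, 100.0), (110.0, 110.0)],
]
scores = [0.9, 0.8, 0.7]
areas = [400.0, 400.0, 400.0]
k = [0.1, 0.1]

keep = oks_nms(poses, scores, areas, k, threshold=0.9)
```

Here the near-duplicate pose is suppressed and the distant one survives. The greedy loop is inherently sequential, which is presumably the cost a parallel formulation aims to avoid.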

