A Baseline for Cross-Database 3D Human Pose Estimation

Sensors, 2021, Vol. 21 (11), 3769
Authors: Michał Rapczyński, Philipp Werner, Sebastian Handrich, Ayoub Al-Hamadi

Vision-based 3D human pose estimation approaches are typically evaluated on datasets that are limited in diversity regarding many factors, e.g., subjects, poses, cameras, and lighting. However, for real-life applications, it would be desirable to create systems that work under arbitrary conditions ("in-the-wild"). To advance towards this goal, we investigated the commonly used datasets HumanEva-I, Human3.6M, and Panoptic Studio, discussed their biases (that is, their limitations in diversity), and illustrated them in cross-database experiments (which we used as a surrogate for roughly estimating in-the-wild performance). For this purpose, we first harmonized the differing skeleton joint definitions of the datasets, reducing the biases and systematic test errors in cross-database experiments. We further proposed a scale normalization method that significantly improved generalization across camera viewpoints, subjects, and datasets. In additional experiments, we investigated the effect of using more or fewer cameras, training with multiple datasets, applying a proposed anatomy-based pose validation step, and using OpenPose as the basis for the 3D pose estimation. The experimental results showed the usefulness of joint harmonization, scale normalization, and virtual camera augmentation in significantly improving cross-database and in-database generalization. At the same time, the experiments showed that there were dataset biases that could not be compensated for and that call for new datasets covering more diversity. We discussed our results and promising directions for future work.
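The scale normalization and anatomy-based validation steps mentioned in the abstract can be illustrated with a short sketch. The following is a minimal, hypothetical illustration, not the authors' implementation: it root-centers a predicted 3D pose and rescales it so that its total bone length matches a fixed reference skeleton, removing subject- and dataset-specific scale from the regression target, and it shows a toy plausibility check on bone lengths. The joint indices, bone list, limits, and function names are assumptions made for the example.

```python
import numpy as np

# Illustrative 5-joint skeleton; real datasets such as Human3.6M use 17+
# joints with dataset-specific definitions (hence the harmonization step).
BONES = [(0, 1), (1, 2), (2, 3), (2, 4)]  # (parent, child) index pairs

def skeleton_length(pose, bones=BONES):
    """Sum of bone lengths of a (num_joints, 3) pose in arbitrary units."""
    return sum(np.linalg.norm(pose[c] - pose[p]) for p, c in bones)

def scale_normalize(pose, reference_length, bones=BONES):
    """Root-center a 3D pose and rescale it so its total bone length
    matches a fixed reference skeleton (removes subject/dataset scale)."""
    root_centered = pose - pose[0]  # put the pelvis/root joint at the origin
    scale = reference_length / skeleton_length(root_centered, bones)
    return root_centered * scale

def anatomically_plausible(pose, bone_limits, bones=BONES):
    """Toy anatomy-based validation: accept a pose only if every bone
    length lies within a plausible (min, max) range."""
    for (p, c), (lo, hi) in zip(bones, bone_limits):
        length = np.linalg.norm(pose[c] - pose[p])
        if not (lo <= length <= hi):
            return False
    return True

# Example: normalize a random pose to a 4.0-unit reference skeleton.
pose = np.random.rand(5, 3)
normalized = scale_normalize(pose, reference_length=4.0)
print(skeleton_length(normalized))  # ~4.0 regardless of the input scale
```

Harmonizing the differing joint definitions across datasets would, in the same spirit, amount to a per-dataset mapping from each dataset's joints onto a common joint set before normalization.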

Symmetry, 2020, Vol. 12 (7), 1116
Authors: Jun Sun, Mantao Wang, Xin Zhao, Dejun Zhang

In this paper, we study the problem of monocular 3D human pose estimation based on deep learning. Due to its single-view limitation, monocular human pose estimation cannot avoid the inherent occlusion problem. A common remedy is multi-view 3D pose estimation; however, single-view images cannot be used directly in multi-view methods, which greatly limits practical applications. To address these issues, we propose a novel end-to-end network for monocular 3D human pose estimation. First, we propose a multi-view pose generator that predicts multi-view 2D poses from the 2D pose in a single view. Secondly, we propose a simple but effective data augmentation method for generating multi-view 2D pose annotations, since existing datasets (e.g., Human3.6M) do not contain a large number of 2D pose annotations from different views. Thirdly, we employ a graph convolutional network to infer a 3D pose from the multi-view 2D poses. Experiments conducted on public datasets verify the effectiveness of our method, and ablation studies show that it improves the performance of existing 3D pose estimation networks.
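A plain graph-convolution step of the kind the abstract describes can be sketched as follows. This is a minimal, hypothetical NumPy illustration, not the authors' network: the generated multi-view 2D joint coordinates are stacked as per-joint node features, one graph-convolution layer H = ReLU(Â X W) propagates information along the skeleton's adjacency, and a linear readout regresses 3D coordinates. The skeleton size, edge list, feature widths, and weights are placeholders.

```python
import numpy as np

# Toy skeleton with 5 joints; edges follow the kinematic tree.
NUM_JOINTS, NUM_VIEWS = 5, 4
edges = [(0, 1), (1, 2), (2, 3), (2, 4)]

# Symmetrically normalized adjacency with self-loops:
# A_hat = D^{-1/2} (A + I) D^{-1/2}
A = np.eye(NUM_JOINTS)
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_hat = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(X, W):
    """One graph convolution: propagate features along A_hat, then ReLU."""
    return np.maximum(A_hat @ X @ W, 0.0)

# Node features: each joint's (x, y) coordinates from all generated views.
X = np.random.randn(NUM_JOINTS, 2 * NUM_VIEWS)  # (joints, 2 * views)

# Randomly initialized weights stand in for trained parameters.
W1 = np.random.randn(2 * NUM_VIEWS, 64) * 0.1
W2 = np.random.randn(64, 64) * 0.1
W_out = np.random.randn(64, 3) * 0.1            # linear 3D readout

H = gcn_layer(gcn_layer(X, W1), W2)
pose_3d = H @ W_out                              # (joints, 3) estimate
print(pose_3d.shape)                             # (5, 3)
```

The skeleton adjacency is what distinguishes this from a plain fully connected regressor: each joint's 3D estimate is informed primarily by its kinematic neighbors across the generated views.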

