Using DeepLabCut for 3D markerless pose estimation across species and behaviors

2018 ◽  
Author(s):  
Tanmay Nath ◽  
Alexander Mathis ◽  
An Chi Chen ◽  
Amir Patel ◽  
Matthias Bethge ◽  
...  

Noninvasive behavioral tracking of animals during experiments is crucial to many scientific pursuits. Extracting the poses of animals without markers is often essential for measuring behavioral effects in biomechanics, genetics, ethology, and neuroscience. Yet extracting detailed poses without markers in dynamically changing backgrounds has been challenging. We recently introduced an open-source toolbox called DeepLabCut that builds on a state-of-the-art human pose estimation algorithm and allows a user to train a deep neural network with limited training data to precisely track user-defined features, matching human labeling accuracy. Here we provide an updated toolbox, self-contained within a Python package, that includes new features such as graphical user interfaces and active-learning-based network refinement. Lastly, we provide a step-by-step guide for using DeepLabCut.
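The active-learning-based refinement mentioned above amounts to selecting the frames where the trained network is least confident and returning them to a human for relabeling. A minimal numpy sketch of that selection step (the function and data are hypothetical illustrations, not the DeepLabCut API):

```python
import numpy as np

def select_frames_for_refinement(confidences, k):
    """Pick the k frames where the network is least confident,
    so a human can relabel them (active learning)."""
    order = np.argsort(confidences)  # indices, ascending confidence
    return sorted(order[:k].tolist())

# Toy per-frame mean keypoint confidences after a first training round.
conf = np.array([0.95, 0.40, 0.88, 0.55, 0.99, 0.30])
frames = select_frames_for_refinement(conf, 2)  # frames 1 and 5
```

After relabeling the selected frames, the augmented training set is used to retrain the network, and the cycle repeats until accuracy is satisfactory.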

Author(s):  
Jaymie Strecker ◽  
Atif M. Memon

This chapter describes the state of the art in testing GUI-based software. Traditionally, GUI testing has been performed manually or semimanually, with the aid of capture-replay tools. Since this process may be too slow and ineffective to meet the demands of today’s developers and users, recent research in GUI testing has pushed toward automation. Model-based approaches are being used to generate and execute test cases, implement test oracles, and perform regression testing of GUIs automatically. This chapter shows how research to date has addressed the difficulties of testing GUIs in today’s rapidly evolving technological world, and it points to the many challenges that lie ahead.
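Model-based test-case generation as described here typically walks a model of the GUI, such as an event-flow graph, to enumerate event sequences. A toy sketch (the graph and function names are hypothetical, not any specific tool's API):

```python
# A toy event-flow graph: which GUI events may follow which.
follows = {
    "open_menu":  ["click_save", "click_exit"],
    "click_save": ["open_menu"],
    "click_exit": [],
}

def generate_test_cases(start, max_len):
    """Enumerate event sequences of length <= max_len by walking the
    event-flow graph, the way model-based GUI testing derives test cases."""
    cases, frontier = [], [[start]]
    while frontier:
        seq = frontier.pop()
        cases.append(seq)
        if len(seq) < max_len:
            for nxt in follows[seq[-1]]:
                frontier.append(seq + [nxt])
    return cases

cases = generate_test_cases("open_menu", 3)  # 4 sequences
```

Each generated sequence is then replayed against the GUI, with a test oracle checking the resulting widget states.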


Author(s):  
Sha Xin Wei

Since 1984, graphical user interfaces have typically relied on visual icons that mimic physical objects like the folder, button, and trash can, or on canonical geometric elements like menus and spreadsheet cells. GUIs leverage our intuition about the physical environment. But the world can be thought of as being made of stuff as well as things. Making interfaces from this point of view requires a way to simulate the physics of stuff in real-time response to continuous gesture, driven by behavior logic that can be understood by the user and the designer. The author argues for leveraging the corporeal intuition that people learn from birth about heat flow, water, and smoke to develop interfaces at the density of matter that leverage, in turn, the state of the art in computational physics.


2020 ◽  
Vol 34 (07) ◽  
pp. 11924-11931
Author(s):  
Zhongwei Qiu ◽  
Kai Qiu ◽  
Jianlong Fu ◽  
Dongmei Fu

Multi-person pose estimation aims to detect human keypoints in images with multiple persons. Bottom-up methods for multi-person pose estimation have attracted extensive attention owing to their good balance between efficiency and accuracy. Recent bottom-up methods usually follow the principle of keypoint localization and grouping, where the relations between keypoints are the key to grouping them. These relations spontaneously construct a graph of keypoints, where the edges represent the relations between two nodes (i.e., keypoints). Existing bottom-up methods mainly define relations by empirically picking out edges from this graph, while omitting edges that may contain useful semantic relations. In this paper, we propose a novel Dynamic Graph Convolutional Module (DGCM) to model rich relations in the keypoint graph. Specifically, we take into account all relations (all edges of the graph) and construct dynamic graphs to tolerate large variations of human pose. The DGCM is quite lightweight, which allows it to be stacked like a pyramid architecture and to learn structural relations from multi-level features. Our network with a single DGCM based on ResNet-50 achieves relative gains of 3.2% and 4.8% over state-of-the-art bottom-up methods on the COCO keypoint and MPII datasets, respectively.
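The graph convolution underlying modules like the DGCM can be illustrated with a single fixed-graph GCN layer over keypoint features; the paper's module builds dynamic graphs on top of this idea. A minimal numpy sketch (the shapes and the three-keypoint chain are illustrative assumptions):

```python
import numpy as np

def gcn_layer(X, A, W):
    """One graph-convolution step, H = ReLU(D^-1 (A + I) X W), over a
    keypoint graph: each keypoint mixes features from its neighbours."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))  # row-normalise by degree
    return np.maximum(D_inv @ A_hat @ X @ W, 0.0)

# 3 keypoints in a chain (e.g. shoulder-elbow-wrist), 2-dim features.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
W = np.eye(2)  # identity weights, so the mixing is easy to read off
H = gcn_layer(X, A, W)
```

With identity weights, each output row is simply the average of a keypoint's own features and its neighbours'; learning W (and, in the DGCM, the graph itself) is what makes the module expressive.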


2020 ◽  
Vol 34 (07) ◽  
pp. 10631-10638
Author(s):  
Yu Cheng ◽  
Bo Yang ◽  
Bo Wang ◽  
Robby T. Tan

Estimating 3D poses from a monocular video is still a challenging task, despite the significant progress made in recent years. Generally, the performance of existing methods drops when the target person is too small/large, or the motion is too fast/slow, relative to the scale and speed of the training data. Moreover, to our knowledge, many of these methods are not explicitly designed or trained to handle severe occlusion, which compromises their performance when occlusion occurs. Addressing these problems, we introduce a spatio-temporal network for robust 3D human pose estimation. As humans in videos may appear at different scales and move at various speeds, we apply multi-scale spatial features for 2D joint or keypoint prediction in each individual frame, and multi-stride temporal convolutional networks (TCNs) to estimate 3D joints or keypoints. Furthermore, we design a spatio-temporal discriminator based on body structures as well as limb motions to assess whether the predicted pose forms a valid pose and a valid movement. During training, we explicitly mask out some keypoints to simulate occlusion cases from minor to severe, so that our network learns better and becomes robust to various degrees of occlusion. As 3D ground truth data are limited, we further utilize 2D video data to inject a semi-supervised learning capability into our network. Experiments on public datasets validate the effectiveness of our method, and our ablation studies show the strengths of our network's individual submodules.
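The occlusion-simulating training step described above can be sketched as a simple keypoint-masking augmentation: randomly zero out keypoints and flag them as occluded. A minimal numpy illustration (the array layout and drop probability are assumptions, not the paper's exact procedure):

```python
import numpy as np

def mask_keypoints(keypoints, drop_prob, rng):
    """Simulate occlusion during training: randomly zero out some
    2D keypoints and set their confidence to 0 (occluded)."""
    kps = keypoints.copy()                     # shape (N, 3): x, y, confidence
    occluded = rng.random(len(kps)) < drop_prob
    kps[occluded] = 0.0
    return kps, occluded

rng = np.random.default_rng(0)
kps = np.array([[10.0, 20.0, 0.9],
                [30.0, 40.0, 0.8],
                [50.0, 60.0, 0.7]])
masked, occ = mask_keypoints(kps, 0.5, rng)
```

Varying `drop_prob` over training exposes the network to everything from minor to severe occlusion, which is the point of the augmentation.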


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Zhengtuo Wang ◽  
Yuetong Xu ◽  
Guanhua Xu ◽  
Jianzhong Fu ◽  
Jiongyan Yu ◽  
...  

Purpose In this work, the authors aim to provide a set of convenient methods for generating training data, and then develop a deep learning method based on point clouds to estimate the pose of a target for robot grasping. Design/methodology/approach This work presents PointSimGrasp, a deep learning method on point clouds for robot grasping. In PointSimGrasp, a point cloud emulator is introduced to generate training data, and a deep-learning-based pose estimation algorithm is designed. After training with the emulated data set, the pose estimation algorithm can estimate the pose of a target. Findings In the experiments, an experimental platform is built, which contains a six-axis industrial robot, a binocular structured-light sensor and a base platform with adjustable inclination. A data set that contains three subsets is set up on the experimental platform. After training with the emulated data set, PointSimGrasp is tested on the experimental data set, and an average translation error of about 2–3 mm and an average rotation error of about 2–5 degrees are obtained. Originality/value The contributions are as follows: first, a deep learning method on point clouds is proposed to estimate the 6D pose of a target; second, a convenient training method for the pose estimation algorithm is presented, and a point cloud emulator is introduced to generate training data; finally, an experimental platform is built, and PointSimGrasp is tested on it.
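The reported translation and rotation errors are standard 6D-pose metrics: the Euclidean distance between translation vectors, and the angle of the relative rotation between the estimated and ground-truth rotation matrices. A small numpy sketch of how such errors might be computed (PointSimGrasp's own evaluation details are not specified here):

```python
import numpy as np

def pose_errors(R_est, t_est, R_gt, t_gt):
    """Translation error (Euclidean, same units as t) and rotation
    error (angle, in degrees, of the relative rotation R_est R_gt^T)."""
    t_err = np.linalg.norm(t_est - t_gt)
    R_delta = R_est @ R_gt.T
    cos_theta = np.clip((np.trace(R_delta) - 1.0) / 2.0, -1.0, 1.0)
    return t_err, np.degrees(np.arccos(cos_theta))

# Ground truth: identity rotation, zero translation.
# Estimate: 3 degrees about z, displaced 2 mm along x.
theta = np.radians(3.0)
R_est = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
t_err, r_err = pose_errors(R_est, np.array([2.0, 0.0, 0.0]),
                           np.eye(3), np.zeros(3))
```

The trace formula follows from the axis-angle representation: for a rotation by angle θ, tr(R) = 1 + 2 cos θ.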


Author(s):  
Jielu Yan ◽  
MingLiang Zhou ◽  
Jinli Pan ◽  
Meng Yin ◽  
Bin Fang

3D human pose estimation refers to estimating the 3D articulated structure of a person from an image or a video. The technology has massive potential because it enables tracking people and analyzing motion in real time. Recently, much research has been conducted to optimize human pose estimation, but few works have focused on reviewing 3D human pose estimation. In this paper, we offer a comprehensive survey of state-of-the-art methods for 3D human pose estimation, covering pose estimation solutions, implementations on images or videos that contain different numbers of people, and advanced 3D human pose estimation techniques. Furthermore, the different kinds of algorithms are subdivided into sub-categories and compared in light of their different methodologies. To the best of our knowledge, this is the first such comprehensive survey of the recent progress of 3D human pose estimation, and it will hopefully facilitate the completion, refinement and applications of 3D human pose estimation.


Author(s):  
Wei Feng ◽  
Wentao Liu ◽  
Tong Li ◽  
Jing Peng ◽  
Chen Qian ◽  
...  

Human-object interaction (HOI) recognition and pose estimation are two closely related tasks. Human pose is an essential cue for recognizing actions and localizing the interacted objects. Meanwhile, human actions and the localization of their interacted objects provide guidance for pose estimation. In this paper, we propose a turbo learning framework to perform HOI recognition and pose estimation simultaneously. First, two modules are designed to enforce message passing between the tasks, i.e., a pose-aware HOI recognition module and an HOI-guided pose estimation module. Then, the two modules form a closed loop to utilize the complementary information iteratively, and they can be trained in an end-to-end manner. The proposed method achieves state-of-the-art performance on two public benchmarks, the Verbs in COCO (V-COCO) and HICO-DET datasets.
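The closed-loop structure of turbo learning, where each module repeatedly refines its estimate using the other's latest output, can be sketched abstractly; the stand-in modules below are toy scalar updates, not the paper's networks:

```python
def turbo_loop(pose0, hoi0, refine_pose, refine_hoi, n_iters=3):
    """Closed-loop message passing: each module refines its estimate
    using the other's latest output, as in turbo learning."""
    pose, hoi = pose0, hoi0
    for _ in range(n_iters):
        hoi = refine_hoi(pose, hoi)    # pose-aware HOI recognition
        pose = refine_pose(hoi, pose)  # HOI-guided pose estimation
    return pose, hoi

# Toy stand-ins: each "module" nudges a scalar estimate toward agreement.
pose, hoi = turbo_loop(0.0, 0.0,
                       refine_pose=lambda h, p: p + 0.5 * (h - p),
                       refine_hoi=lambda p, h: h + 0.5 * (1.0 - h))
```

In the real framework, both modules are neural networks and the whole loop is unrolled and trained end-to-end, so gradients flow through every iteration.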


Author(s):  
Daniel Groos ◽  
Heri Ramampiaro ◽  
Espen AF Ihlen

Single-person human pose estimation facilitates markerless movement analysis in sports, as well as in clinical applications. Still, state-of-the-art models for human pose estimation generally do not meet the requirements of real-life applications. The proliferation of deep learning techniques has resulted in the development of many advanced approaches. However, with progress in the field, more complex and inefficient models have also been introduced, causing tremendous increases in computational demands. To cope with these complexity and inefficiency challenges, we propose a novel convolutional neural network architecture, called EfficientPose, which exploits the recently proposed EfficientNets to deliver efficient and scalable single-person pose estimation. EfficientPose is a family of models harnessing an effective multi-scale feature extractor and computationally efficient detection blocks built on mobile inverted bottleneck convolutions, while at the same time ensuring that the precision of the pose configurations is still improved. Owing to its low complexity and efficiency, EfficientPose enables real-world applications on edge devices by limiting the memory footprint and computational cost. The results of our experiments on the challenging MPII single-person benchmark show that the proposed EfficientPose models substantially outperform the widely used OpenPose model in terms of both accuracy and computational efficiency. In particular, our top-performing model achieves state-of-the-art accuracy on single-person MPII with low-complexity ConvNets.
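One reason mobile inverted bottleneck (MBConv) blocks are cheap is that they build on depthwise-separable convolutions rather than standard ones. A back-of-the-envelope parameter count (a generic illustration, not EfficientPose's actual layer sizes):

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(c_in, c_out, k):
    """Depthwise k x k conv followed by a pointwise 1 x 1 conv,
    the factorisation used inside MBConv blocks."""
    return k * k * c_in + c_in * c_out

std = standard_conv_params(128, 128, 3)   # 147456 weights
sep = separable_conv_params(128, 128, 3)  # 1152 + 16384 = 17536 weights
ratio = std / sep                         # roughly 8x fewer parameters
```

The saving grows with channel count, which is why such blocks dominate architectures aimed at edge devices.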

