Real-time 6D Racket Pose Estimation and Classification for Table Tennis Robots

Author(s):  
Yapeng Gao

For table tennis robots, understanding the opponent's movements and returning the ball accordingly with high performance is a significant challenge: the robot has to cope with various ball speeds and spins resulting from different stroke types. In this paper, we propose a real-time 6D racket pose detection method and classify racket movements into five stroke categories with a neural network. Using two monocular cameras, we extract the racket's contours and choose special points in image coordinates as feature points. With the 3D geometrical information of the racket, a wide-baseline stereo matching method is proposed to find the corresponding feature points and compute the 3D position and orientation of the racket by triangulation and plane fitting. A Kalman filter is then adopted to track the racket pose, and a multilayer perceptron (MLP) neural network is used to classify the pose movements. We conduct two experiments to evaluate the accuracy of racket pose detection and classification: the average errors in position and orientation are around 7.8 mm and 7.2°, respectively, compared with the ground truth from a KUKA robot. The classification accuracy is 98%, the same as a human pose estimation method based on Convolutional Pose Machines (CPMs).
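The plane-fitting step above recovers the racket's orientation from its triangulated 3D feature points. A minimal sketch of that idea, assuming NumPy and synthetic planar points (the abstract does not give the authors' implementation, so the function name and data are illustrative):

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane fit: the right singular vector with the smallest
    singular value of the centered points is the plane normal."""
    centroid = points.mean(axis=0)
    _, _, vh = np.linalg.svd(points - centroid)
    normal = vh[-1]
    return centroid, normal / np.linalg.norm(normal)

# Synthetic planar "racket contour" points lying in the z = 0 plane
rng = np.random.default_rng(0)
pts = np.column_stack([rng.uniform(-0.1, 0.1, 50),
                       rng.uniform(-0.1, 0.1, 50),
                       np.zeros(50)])
position, normal = fit_plane(pts)
```

For these points the recovered normal is (up to sign) the z-axis, so the racket's orientation angles could be read off the normal direction.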

2019 ◽  
Vol 2019 ◽  
pp. 1-11
Author(s):  
Daoyong Fu ◽  
Wei Li ◽  
Songchen Han ◽  
Xinyan Zhang ◽  
Zhaohuan Zhan ◽  
...  

The pose estimation of aircraft at an airport plays an important role in preventing collisions and constructing a real-time scene of the airport. However, current airport surveillance methods regard the aircraft as a point, neglecting the importance of pose estimation. Inspired by human pose estimation, this paper presents an aircraft pose estimation method based on a convolutional neural network that reconstructs the two-dimensional skeleton of an aircraft. Firstly, the key points of an aircraft and the matching relationships between them are defined to design a 2D aircraft skeleton. Secondly, a convolutional neural network is designed to predict all key points and components of the aircraft, encoded in confidence maps and Correlation Fields, respectively. Thirdly, all key points are coarsely matched based on the matching relationships and then refined through the Correlation Fields. Finally, the 2D skeleton of the aircraft is reconstructed. To overcome the lack of a benchmark dataset, airport surveillance video and Autodesk 3ds Max are used to build two datasets. Experimental results show that the proposed method achieves better performance in terms of accuracy and efficiency than other related methods.
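Reading key-point locations out of per-joint confidence maps is typically done by taking the peak of each channel. A minimal sketch of that decoding step, assuming NumPy and a `(K, H, W)` map tensor (the decoding details in the paper itself are not given in the abstract):

```python
import numpy as np

def decode_keypoints(conf_maps):
    """conf_maps: (K, H, W) confidence maps, one per key point.
    Returns a (K, 2) array of (row, col) peak locations."""
    K, H, W = conf_maps.shape
    flat = conf_maps.reshape(K, -1).argmax(axis=1)
    return np.stack([flat // W, flat % W], axis=1)

# Three toy maps, each with a single bright peak
maps = np.zeros((3, 8, 8))
maps[0, 2, 5] = 1.0
maps[1, 7, 1] = 1.0
maps[2, 0, 0] = 1.0
kps = decode_keypoints(maps)
```

The decoded peaks would then be paired up using the matching relationships and the Correlation Fields described above.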


2021 ◽  
Author(s):  
Dengqing Tang ◽  
Lincheng Shen ◽  
Xiaojiao Xiang ◽  
Han Zhou ◽  
Tianjiang Hu

We propose a learning-type anchors-driven real-time pose estimation method for autolanding fixed-wing unmanned aerial vehicles (UAVs). The proposed method enables online tracking of both position and attitude by a ground stereo vision system in Global Navigation Satellite System-denied environments. A pipeline of convolutional neural network (CNN)-based UAV anchor detection and anchor-driven UAV pose estimation is employed. To realize robust and accurate anchor detection, we design and implement a Block-CNN architecture to reduce the impact of outliers. On the basis of the anchors, monocular and stereo vision-based filters are established to update the UAV position and attitude. To expand the training dataset without extra outdoor experiments, we develop a parallel system containing outdoor and simulated systems with the same configuration. Simulated and outdoor experiments demonstrate a remarkable pose estimation accuracy improvement over the conventional Perspective-n-Point solution. In addition, the experiments validate the feasibility of the proposed architecture and algorithm in terms of the accuracy and real-time requirements of fixed-wing autolanding UAVs.
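The vision-based filters that update the UAV position from noisy anchor detections are not specified in the abstract; a common choice for this kind of tracking is a constant-velocity Kalman filter. A minimal single-axis sketch, assuming NumPy (the class name and noise parameters are illustrative, not the authors' design):

```python
import numpy as np

class KalmanCV:
    """Constant-velocity Kalman filter for one position axis (illustrative)."""
    def __init__(self, dt=0.05, q=1e-3, r=1e-2):
        self.x = np.zeros(2)                      # state: [position, velocity]
        self.P = np.eye(2)                        # state covariance
        self.F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity dynamics
        self.Q = q * np.eye(2)                    # process noise
        self.H = np.array([[1.0, 0.0]])           # we observe position only
        self.R = np.array([[r]])                  # measurement noise
    def step(self, z):
        # predict
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # update with measurement z
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + (K @ y)
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x[0]

kf = KalmanCV()
for _ in range(200):
    estimate = kf.step(1.0)   # feed a constant position measurement
```

With a constant measurement the estimate settles near the measured value, which is the smoothing behavior the filter contributes to the pose pipeline.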


Sensors ◽  
2020 ◽  
Vol 20 (10) ◽  
pp. 2828
Author(s):  
Mhd Rashed Al Koutayni ◽  
Vladimir Rybalkin ◽  
Jameel Malik ◽  
Ahmed Elhayek ◽  
Christian Weis ◽  
...  

The estimation of human hand pose has become the basis for many vital applications where the user depends mainly on hand pose as a system input. Virtual reality (VR) headsets, shadow dexterous hands, and in-air signature verification are a few examples of applications that require tracking hand movements in real time. State-of-the-art 3D hand pose estimation methods are based on convolutional neural networks (CNNs). These methods are implemented on graphics processing units (GPUs), mainly due to their extensive computational requirements. However, GPUs are not suitable for practical application scenarios where low power consumption is crucial. Furthermore, the difficulty of embedding a bulky GPU into a small device prevents the portability of such applications to mobile devices. The goal of this work is to provide an energy-efficient solution for an existing depth-camera-based hand pose estimation algorithm. First, we compress the deep neural network model by applying dynamic quantization techniques to different layers to achieve maximum compression without compromising accuracy. Afterwards, we design a custom hardware architecture. We selected an FPGA as the target platform because FPGAs provide high energy efficiency and can be integrated into portable devices. Our solution implemented on a Xilinx UltraScale+ MPSoC FPGA is 4.2× faster and 577.3× more energy efficient than the original implementation of the hand pose estimation algorithm on an NVIDIA GeForce GTX 1070.
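The per-layer quantization step described above maps floating-point weights to low-bit integers with a per-tensor scale. A minimal sketch of symmetric int8 quantization, assuming NumPy (the authors' exact scheme is not given in the abstract; this is the generic technique):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: scale maps max |w| to 127."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.array([0.5, -1.27, 0.01, 1.0])
q, scale = quantize_int8(w)
dequantized = q.astype(np.float32) * scale   # recover approximate weights
```

On hardware like an FPGA, the int8 weights and a single scale per layer replace 32-bit floats, which is where the memory and energy savings come from.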


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 191542-191550
Author(s):  
Ali Rohan ◽  
Mohammed Rabah ◽  
Tarek Hosny ◽  
Sung-Ho Kim

2019 ◽  
Vol 52 (7-8) ◽  
pp. 855-868 ◽  
Author(s):  
Guo-Qin Gao ◽  
Qian Zhang ◽  
Shu Zhang

Because of complex image backgrounds, inconspicuous end-effector features, and uneven illumination, the speed and accuracy of binocular-vision-based pose detection for parallel robots cannot meet the requirements of closed-loop control. Therefore, a pose detection method based on an improved RANSAC algorithm is presented. First, considering that the image of a parallel robot is rigid and has multiple corner points, the Harris-Scale Invariant Feature Transform algorithm is adopted for image prematching: feature points are extracted by Harris and matched by Scale Invariant Feature Transform to achieve good accuracy and real-time performance. Second, to handle mismatches from prematching, an improved RANSAC algorithm is proposed to refine the prematching results. This improved algorithm overcomes the mismatching and time-consumption disadvantages of the conventional RANSAC algorithm by selecting feature points from separate grid cells of the images and pre-detecting to validate the provisional model. The improved RANSAC algorithm was applied to a self-developed novel 3-degrees-of-freedom parallel robot to verify its validity. The experimental results show that, compared with the conventional algorithm, the average matching time decreases by 63.45%, the average matching accuracy increases by 15.66%, and the average deviations of pose detection in the Y direction, Z direction, and roll angle decrease by 0.871 mm, 0.82 mm, and 0.704°, respectively, when the improved algorithm is used to refine the prematching results. The real-time performance and accuracy of pose detection for parallel robots are thereby improved.
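The improved algorithm builds on the standard RANSAC hypothesize-and-verify loop: sample a minimal set, fit a provisional model, and count inliers. A minimal sketch of that loop for 2D line fitting, assuming NumPy (the grid-based sampling and pre-detection refinements from the paper are not reproduced here):

```python
import numpy as np

def ransac_line(points, iters=200, thresh=0.05, rng=None):
    """Minimal RANSAC: sample 2 points, count inliers to the implied line."""
    rng = rng or np.random.default_rng(0)
    best_inliers, best = 0, None
    for _ in range(iters):
        i, j = rng.choice(len(points), 2, replace=False)
        p, q = points[i], points[j]
        d = q - p
        n = np.array([-d[1], d[0]])        # normal of the candidate line
        norm = np.linalg.norm(n)
        if norm < 1e-12:
            continue                        # degenerate sample
        n = n / norm
        dist = np.abs((points - p) @ n)     # point-to-line distances
        inliers = int((dist < thresh).sum())
        if inliers > best_inliers:
            best_inliers, best = inliers, (p, n)
    return best, best_inliers

# 40 points on the line y = x plus 10 gross outliers at y = 5
t = np.linspace(0, 1, 40)
data = np.vstack([np.column_stack([t, t]),
                  np.column_stack([np.linspace(0, 1, 10), np.full(10, 5.0)])])
(_, normal), count = ransac_line(data)
```

The paper's improvement replaces the uniform random sampling here with grid-separated sampling and an early validation check, cutting both mismatches and iteration time.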


2015 ◽  
Vol 2015 ◽  
pp. 1-15
Author(s):  
Huan Liu ◽  
Kuangrong Hao ◽  
Yongsheng Ding ◽  
Chunjuan Ouyang

Stereo feature matching is a technique that finds, in two images, an optimal match originating from the same entity in the three-dimensional world. The stereo correspondence problem is formulated as an optimization task in which an energy function representing the constraints on the solution is to be minimized. A novel intelligent biological network (Bio-Net), which incorporates the human B-cell/T-cell immune system into a neural network, is proposed in this study to learn the robust relationship between input feature points and output matched points. A model is established from input-output data (left reference point to right target point). In the experiments, abdomen reconstructions for mannequins of different shapes are performed by means of the proposed method. The final results are compared and analyzed, demonstrating that the proposed approach greatly outperforms a single neural network and the conventional matching algorithm in precision. In terms of time cost and efficiency in particular, the proposed method shows significant promise. Hence, it can be considered an effective and feasible alternative for stereo matching.
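The conventional matching baseline that Bio-Net is compared against minimizes a per-pixel cost; the simplest form is winner-take-all block matching with a sum-of-absolute-differences (SAD) cost along a scanline. A minimal 1D sketch, assuming NumPy (illustrative of the baseline, not of Bio-Net itself):

```python
import numpy as np

def match_scanline(left, right, max_disp=4, win=1):
    """Per-pixel SAD over a small window; disparity = argmin cost."""
    w = len(left)
    disp = np.zeros(w, dtype=int)
    for x in range(w):
        costs = []
        for d in range(max_disp + 1):
            if x - d < 0:
                costs.append(np.inf)       # candidate falls off the image
                continue
            lo, hi = max(x - win, 0), min(x + win + 1, w)
            c = 0.0
            for xx in range(lo, hi):
                if xx - d < 0:
                    c += 255.0             # penalize out-of-bounds pixels
                else:
                    c += abs(float(left[xx]) - float(right[xx - d]))
            costs.append(c)
        disp[x] = int(np.argmin(costs))
    return disp

left = np.array([0, 0, 10, 50, 90, 10, 0, 0])
right = np.roll(left, -2)     # scene shifted by 2 pixels: true disparity is 2
disp = match_scanline(left, right)
```

In textured regions the recovered disparity equals the true shift; in flat regions the cost is ambiguous, which is exactly the failure mode that energy-based and learned matchers aim to fix.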


2021 ◽  
Author(s):  
Zhimin Zhang ◽  
Jianzhong Qiao ◽  
Shukuan Lin ◽  
...  

Depth and pose information are basic issues in robotics, autonomous driving, and virtual reality, and are also focal and difficult issues in computer vision research. Supervised monocular depth and pose estimation is not feasible in environments where labeled data is not abundant. Self-supervised monocular video methods can learn using only photometric constraints, without expensive ground-truth depth labels, but this results in an inefficient training process and suboptimal estimation accuracy. To solve these problems, a monocular weakly supervised depth and pose estimation method based on multi-information fusion is proposed in this paper. First, we design a high-precision stereo matching method to generate depth and pose data as "Ground Truth" labels, solving the problem that true ground-truth labels are difficult to obtain. Then, we construct a multi-information fusion network model based on the "Ground Truth" labels, the video sequence, and IMU information to improve estimation accuracy. Finally, we design a loss function combining supervised cues based on the "Ground Truth" labels with self-supervised cues to optimize our model. In the testing phase, the network model outputs high-precision depth and pose data from a monocular video sequence alone. The resulting model outperforms mainstream monocular depth and pose estimation methods, as well as a partial stereo matching method, on the challenging KITTI dataset while using only a small amount of real training data (200 pairs).
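The loss described above mixes a self-supervised photometric term with a supervised term against the stereo-generated pseudo-labels. A minimal sketch of such a combination, assuming NumPy (the weighting and exact terms are illustrative; the paper's loss is not fully specified in the abstract):

```python
import numpy as np

def photometric_l1(target, warped):
    """Self-supervised cue: mean absolute intensity difference between the
    target frame and the view synthesized from predicted depth and pose."""
    return float(np.mean(np.abs(target - warped)))

def combined_loss(target, warped, depth_pred, depth_pseudo, w_sup=0.5):
    """Weakly supervised total: photometric term plus pseudo-label depth term."""
    sup = float(np.mean(np.abs(depth_pred - depth_pseudo)))
    return (1.0 - w_sup) * photometric_l1(target, warped) + w_sup * sup

img = np.full((4, 4), 0.5)
loss = combined_loss(img, img + 0.1,            # photometric error of 0.1
                     np.ones((4, 4)),           # predicted depth
                     np.ones((4, 4)) * 1.2)     # stereo pseudo-label depth
```

The supervised term gives the network a dense training signal even where photometric consistency is uninformative (textureless or occluded regions).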


Sensors ◽  
2018 ◽  
Vol 18 (9) ◽  
pp. 3083 ◽  
Author(s):  
Hao Li ◽  
Qibing Zhu ◽  
Min Huang ◽  
Ya Guo ◽  
Jianwei Qin

The spatial pose of fruit is necessary for accurate detachment in automatic harvesting. This study presents a novel pose estimation method for sweet pepper detachment. In this method, the normal to the local plane at each point in the sweet-pepper point cloud was first calculated. The point cloud was divided by a number of candidate planes, and the score of each plane was then calculated using a scoring strategy. The plane with the lowest score was selected as the symmetry plane of the point cloud. The symmetry axis was finally calculated from the selected symmetry plane, and the pose of the sweet pepper in space was obtained from the symmetry axis. The performance of the proposed method was evaluated with simulated tests and a sweet-pepper point cloud dataset test. In the simulated test, the average angle error between the calculated and real symmetry axes was approximately 6.5°. In the point cloud dataset test, the average error was approximately 7.4° when the peduncle was removed; when the peduncle of the sweet pepper was complete, the average error was approximately 6.9°. These results suggest that the proposed method is suitable for pose estimation of sweet peppers and could be adapted for use with other fruits and vegetables.
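Scoring a candidate symmetry plane can be done by reflecting the cloud across the plane and measuring how far each reflected point lands from the original cloud; a good symmetry plane yields a low score. A minimal sketch, assuming NumPy (the paper's actual scoring strategy is not detailed in the abstract, so this reflection-distance score is illustrative):

```python
import numpy as np

def reflect(points, p0, n):
    """Reflect points across the plane through p0 with unit normal n."""
    d = (points - p0) @ n
    return points - 2.0 * np.outer(d, n)

def plane_score(points, p0, n):
    """Mean distance from each reflected point to its nearest original point."""
    ref = reflect(points, p0, n)
    d2 = ((ref[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return float(np.sqrt(d2.min(axis=1)).mean())

# A toy cloud that is mirror-symmetric about the x = 0 plane
pts = np.array([[1.0, 0, 0], [-1.0, 0, 0], [2.0, 1, 0], [-2.0, 1, 0]])
good = plane_score(pts, np.zeros(3), np.array([1.0, 0, 0]))  # true symmetry plane
bad = plane_score(pts, np.zeros(3), np.array([0.0, 1, 0]))   # wrong plane
```

Selecting the lowest-scoring candidate plane and intersecting it with the cloud's principal direction then yields the symmetry axis used for the pepper's pose.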


2018 ◽  
Vol 103 ◽  
pp. 1-12 ◽  
Author(s):  
Byungtae Ahn ◽  
Dong-Geol Choi ◽  
Jaesik Park ◽  
In So Kweon
