The Estimation of Hand Pose Based on Mean-Shift Tracking Using the Fusion of Color and Depth Information for Marker-less Augmented Reality

2012 ◽  
Vol 17 (7) ◽  
pp. 155-166 ◽  
Author(s):  
Sun-Hyoung Lee ◽  
Hern-Soo Hahn ◽  
Young-Joon Han
10.29007/72d4 ◽  
2018 ◽  
Author(s):  
He Liu ◽  
Edouard Auvinet ◽  
Joshua Giles ◽  
Ferdinando Rodriguez Y Baena

Computer Aided Surgery (CAS) is helpful, but it clutters an already overcrowded operating theatre and tends to disrupt the workflow of conventional surgery. In order to provide seamless computer assistance with improved immersion and a more natural surgical workflow, we propose an augmented reality-based navigation system for CAS. Here, we focus on the proximal femoral anatomy, which we register to a plan by processing depth information of the surgical site captured by a commercial depth camera. Intra-operative three-dimensional surgical guidance is then provided to the surgeon through a commercial augmented reality headset to drill a pilot hole in the femoral head, so that the user can perform the operation without additional physical guides. The user can interact intuitively with the system through simple gestures and voice commands, resulting in a more natural workflow. To assess the surgical accuracy of the proposed setup, 30 pilot hole drilling experiments were performed on femur phantoms. The position and orientation of the drilled guide holes were measured and compared with the preoperative plan; the mean errors were within 2 mm and 2°, results that are in line with today's commercial computer-assisted orthopedic systems.
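As an illustration of the reported accuracy measure, a minimal Python sketch (hypothetical variable names, not the authors' evaluation code) of how the position and angular error of a drilled guide hole could be computed against the planned drill axis:

import numpy as np

def drill_axis_errors(planned_entry, planned_dir, measured_entry, measured_dir):
    """Position error (mm) between entry points and angular error (degrees)
    between the planned and measured drill directions."""
    planned_dir = planned_dir / np.linalg.norm(planned_dir)
    measured_dir = measured_dir / np.linalg.norm(measured_dir)
    position_error = np.linalg.norm(measured_entry - planned_entry)
    cos_angle = np.clip(np.dot(planned_dir, measured_dir), -1.0, 1.0)
    angular_error = np.degrees(np.arccos(cos_angle))
    return position_error, angular_error

# Illustrative values in millimetres; directions are unit vectors after normalization.
pos_err, ang_err = drill_axis_errors(
    planned_entry=np.array([10.0, 25.0, 40.0]),
    planned_dir=np.array([0.0, 0.0, 1.0]),
    measured_entry=np.array([11.2, 24.4, 40.3]),
    measured_dir=np.array([0.02, 0.01, 1.0]),
)
print(f"position error: {pos_err:.2f} mm, angular error: {ang_err:.2f} deg")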


Sensors ◽  
2021 ◽  
Vol 21 (18) ◽  
pp. 6095
Author(s):  
Xiaojing Sun ◽  
Bin Wang ◽  
Longxiang Huang ◽  
Qian Zhang ◽  
Sulei Zhu ◽  
...  

Despite recent successes in hand pose estimation from RGB images or depth maps, inherent challenges remain. RGB-based methods suffer from heavy self-occlusion and depth ambiguity, while depth sensors depend heavily on distance and can only be used indoors, which limits the practical application of depth-based methods. These challenges have inspired us to combine the two modalities so that each offsets the shortcomings of the other. In this paper, we propose CrossFuNet, a novel RGB and depth information fusion network that improves the accuracy of 3D hand pose estimation. Specifically, the RGB image and the paired depth map are fed into two separate subnetworks, and their feature maps are combined in a fusion module in which we propose a completely new approach to merging the information from the two modalities. The 3D keypoints are then regressed from heatmaps in the standard way. We validate our model on two public datasets, and the results show that it outperforms state-of-the-art methods.
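The described layout, two modality-specific subnetworks whose feature maps are merged before heatmap-based keypoint regression, can be illustrated with a minimal PyTorch sketch; the fusion here is reduced to concatenation plus a 1x1 convolution and is only a stand-in for the paper's fusion module, not the CrossFuNet architecture itself:

import torch
import torch.nn as nn

class TwoBranchFusionNet(nn.Module):
    """Illustrative two-branch network: RGB and depth encoders, feature fusion,
    and per-keypoint heatmap regression."""
    def __init__(self, num_keypoints=21):
        super().__init__()
        self.rgb_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.depth_encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Stand-in for the fusion module: concatenate the modalities and mix channels.
        self.fusion = nn.Conv2d(128, 128, 1)
        self.heatmap_head = nn.Conv2d(128, num_keypoints, 1)

    def forward(self, rgb, depth):
        f_rgb = self.rgb_encoder(rgb)          # (B, 64, H/4, W/4)
        f_depth = self.depth_encoder(depth)    # (B, 64, H/4, W/4)
        fused = self.fusion(torch.cat([f_rgb, f_depth], dim=1))
        return self.heatmap_head(fused)        # one heatmap per keypoint

model = TwoBranchFusionNet()
heatmaps = model(torch.randn(2, 3, 256, 256), torch.randn(2, 1, 256, 256))
print(heatmaps.shape)  # torch.Size([2, 21, 64, 64])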


Sensors ◽  
2021 ◽  
Vol 21 (20) ◽  
pp. 6747
Author(s):  
Yang Liu ◽  
Jie Jiang ◽  
Jiahao Sun ◽  
Xianghan Wang

Hand pose estimation from RGB images has always been a difficult task, owing to the incompleteness of the depth information. Moon et al. improved the accuracy of hand pose estimation with InterNet, a network of their own unique design; still, the network has room for improvement. Based on the architectures of MobileNet v3 and MoGA, we redesigned the feature extractor to incorporate recent advances in computer vision, such as the ACON activation function and a new attention mechanism module. Used effectively, these modules allow our network architecture to better extract global features from an RGB image of the hand, leading to a greater performance improvement compared to InterNet and other similar networks.
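For reference, a short PyTorch sketch of the ACON-C activation as it is commonly published (learnable per-channel p1, p2, and beta); this illustrates the activation function only and is not the authors' feature extractor:

import torch
import torch.nn as nn

class AconC(nn.Module):
    """ACON-C activation as commonly described:
    f(x) = (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x,
    with per-channel learnable p1, p2, and beta."""
    def __init__(self, channels):
        super().__init__()
        self.p1 = nn.Parameter(torch.randn(1, channels, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.ones(1, channels, 1, 1))

    def forward(self, x):
        dp = (self.p1 - self.p2) * x
        return dp * torch.sigmoid(self.beta * dp) + self.p2 * x

act = AconC(channels=64)
out = act(torch.randn(2, 64, 32, 32))
print(out.shape)  # torch.Size([2, 64, 32, 32])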


Author(s):  
Yue Wang ◽  
Shusheng Zhang ◽  
Xiaoliang Bai

To improve the robustness and applicability of 3D tracking and registration in an augmented reality (AR)-aided mechanical assembly system, a 3D registration and tracking method based on point clouds and visual features is proposed. First, the reference model point cloud is used to define the absolute tracking coordinate system, which determines the locating datum of the virtual assembly guidance information. Then, visual feature matching is added to the iterative closest point (ICP) registration process to improve the robustness of tracking and registration. To obtain a sufficient number of visual feature matching points in this process, a visual feature matching strategy based on orientation vector consistency is proposed. Finally, loop closure detection and global pose optimization from key frames are added to the tracking and registration process. The experimental results show that the proposed method has good real-time performance and accuracy, with a running speed of 30 frames per second. Moreover, it remains robust when the camera moves fast and the depth information is inaccurate, and its overall performance is better than that of the KinectFusion method.
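Both ICP and the added visual feature matches ultimately feed matched 3D point pairs into a rigid-transform fit. A minimal Python sketch of that closed-form SVD (Kabsch) step, with hypothetical names and not the authors' pipeline:

import numpy as np

def fit_rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) aligning matched 3D points src -> dst,
    via the SVD (Kabsch) solution used inside each ICP iteration."""
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = dst_centroid - R @ src_centroid
    return R, t

# Correspondences could come from nearest neighbours (ICP) or from visual
# feature matches lifted to 3D with the depth data.
src = np.random.rand(100, 3)
t_true = np.array([0.1, -0.2, 0.3])
dst = src + t_true
R, t = fit_rigid_transform(src, dst)
print(np.allclose(R, np.eye(3), atol=1e-6), np.allclose(t, t_true, atol=1e-6))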


2021 ◽  
Vol 9 ◽  
Author(s):  
Yunpeng Liu ◽  
Xingpeng Yan ◽  
Xinlei Liu ◽  
Xi Wang ◽  
Tao Jing ◽  
...  

In this paper, an optical field coding method for the fusion of real and virtual scenes is proposed to implement an augmented reality (AR)-based holographic stereogram. The occlusion relationship between the real and virtual scenes is analyzed, and a fusion strategy based on instance segmentation and depth determination is proposed. A real three-dimensional (3D) scene sampling system is built, and the foreground contour of each sampled perspective image is extracted by the Mask R-CNN instance segmentation algorithm. The virtual 3D scene is rendered by a computer to obtain the virtual sampled images as well as their depth maps. According to the occlusion relation of the fused scenes, the pseudo-depth map of the real scene is derived, and the fusion coding of the real and virtual 3D scene information is implemented by comparing the depth information. The optical experiment indicates that the AR-based holographic stereogram fabricated by our coding method can reconstruct the fused real and virtual 3D scenes with correct occlusion and depth cues under full parallax.
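The depth-comparison step of such fusion coding can be illustrated with a minimal NumPy sketch: at each pixel the content nearer to the camera is kept, with the foreground mask assumed to come from the instance segmentation. Names and details are illustrative, not the authors' implementation:

import numpy as np

def fuse_views(real_rgb, real_depth, virtual_rgb, virtual_depth, real_mask):
    """Per-pixel occlusion-aware fusion of a real and a virtual perspective image.
    real_mask: boolean foreground mask (e.g. from instance segmentation);
    outside the mask the real pseudo-depth is treated as infinitely far."""
    real_d = np.where(real_mask, real_depth, np.inf)
    real_in_front = real_d <= virtual_depth            # the nearer surface wins
    return np.where(real_in_front[..., None], real_rgb, virtual_rgb)

h, w = 240, 320
fused = fuse_views(
    real_rgb=np.random.rand(h, w, 3),
    real_depth=np.full((h, w), 1.5),
    virtual_rgb=np.random.rand(h, w, 3),
    virtual_depth=np.full((h, w), 2.0),
    real_mask=np.ones((h, w), dtype=bool),
)
print(fused.shape)  # (240, 320, 3)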


Author(s):  
Anna L. Roethe ◽  
Judith Rösler ◽  
Martin Misch ◽  
Peter Vajkoczy ◽  
Thomas Picht

Background: Augmented reality (AR) has the potential to support complex neurosurgical interventions by including visual information seamlessly. This study examines intraoperative visualization parameters and the clinical impact of AR in brain tumor surgery.

Methods: Fifty-five intracranial lesions, operated either with an AR-navigated microscope (n = 39) or conventional neuronavigation (n = 16) after randomization, have been included prospectively. Surgical resection time, duration/type/mode of AR, displayed objects (n, type), pointer-based navigation checks (n), usability of control, quality indicators, and overall surgical usefulness of AR have been assessed.

Results: The AR display has been used in 44.4% of resection time. The predominant AR type was navigation view (75.7%), followed by target volumes (20.1%). The predominant AR mode was picture-in-picture (PiP) (72.5%), followed by overlay display (23.3%). In 43.6% of cases, vision of important anatomical structures has been partially or entirely blocked by AR information. A total of 7.7% of cases used MRI navigation only, 30.8% used one, 23.1% used two, and 38.5% used three or more object segmentations in AR navigation. A total of 66.7% of surgeons found AR visualization helpful in the individual surgical case. AR depth information and accuracy have been rated acceptable (median 3.0 vs. median 5.0 in conventional neuronavigation). The mean utilization of the navigation pointer was 2.6×/resection hour (AR) vs. 9.7×/resection hour (neuronavigation); navigation effort was significantly reduced with AR (P < 0.001).

Conclusions: The main benefit of HUD-based AR visualization in brain tumor surgery is the integrated continuous display allowing for pointer-less navigation. Navigation view (PiP) provides the highest usability while blocking the operative field less frequently. Visualization quality will benefit from improvements in registration accuracy and depth impression. German clinical trials registration number: DRKS00016955.


2009 ◽  
Vol 21 (5) ◽  
pp. 509-521 ◽  
Author(s):  
Jiejie Zhu ◽  
Zhigeng Pan ◽  
Chao Sun ◽  
Wenzhi Chen

2021 ◽  
Author(s):  
Digang Sun ◽  
Ping Zhang ◽  
Mingxuan Chen ◽  
Jiaxin Chen

With an increasing number of robots employed in manufacturing, a human-robot interaction method that can teach robots in a natural, accurate, and rapid manner is needed. In this paper, we propose a novel human-robot interface based on the combination of static hand gestures and hand poses. In our proposed interface, the pointing direction of the index finger and the orientation of the whole hand are extracted to indicate the moving direction and orientation of the robot in a fast-teaching mode. A set of hand gestures is designed according to their usage in daily life and recognized to control the position and orientation of the robot in a fine-teaching mode. We employ the feature extraction ability of the hand pose estimation network via transfer learning and utilize attention mechanisms to improve the performance of the hand gesture recognition network. The inputs to the hand pose estimation and hand gesture recognition networks are monocular RGB images, making our method independent of depth information and applicable to more scenarios. In regular shape reconstruction experiments on the UR3 robot, the mean error of the reconstructed shape is less than 1 mm, which demonstrates the effectiveness and efficiency of our method.
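As an illustration of the fast-teaching mode, a minimal Python sketch of extracting an index-finger pointing direction from estimated 3D hand keypoints; the 21-keypoint indexing used here is a common convention and an assumption, not the authors' code:

import numpy as np

# Common 21-keypoint hand layout: 0 = wrist, 5..8 = index MCP/PIP/DIP/tip.
INDEX_MCP, INDEX_TIP = 5, 8

def pointing_direction(keypoints_3d):
    """Unit vector from the index-finger MCP joint to the fingertip,
    which could serve as the commanded moving direction for the robot."""
    v = keypoints_3d[INDEX_TIP] - keypoints_3d[INDEX_MCP]
    return v / np.linalg.norm(v)

keypoints = np.random.rand(21, 3)  # placeholder output of a hand pose estimator
print(pointing_direction(keypoints))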


