A New Filtering System for Using a Consumer Depth Camera at Close Range

Sensors ◽  
2019 ◽  
Vol 19 (16) ◽  
pp. 3460 ◽  
Author(s):  
Yuanxing Dai ◽  
Yanming Fu ◽  
Baichun Li ◽  
Xuewei Zhang ◽  
Tianbiao Yu ◽  
...  

Using a consumer depth camera at close range yields a higher surface resolution of the object, but it also introduces more severe noise. This noise tends to be located at, or spread over large areas along, the edges of the real surface, which is an obstacle for real-time applications that cannot rely on point cloud post-processing. To fill this gap, by analyzing the noise regions by position and shape, we propose a composite filtering system for using consumer depth cameras at close range. The system consists of three main modules, each eliminating a different type of noise region. Taking the human hand depth image as an example, the proposed filtering system eliminates most of the noise regions. None of the algorithms in the system is based on window smoothing, and all are accelerated on the GPU. A large number of comparative experiments with Kinect v2 and SR300 show that the system achieves good results with extremely high real-time performance, so it can serve as a preprocessing step for real-time human-computer interaction, real-time 3D reconstruction, and further filtering.

2021 ◽  
Vol 40 (3) ◽  
pp. 1-12
Author(s):  
Hao Zhang ◽  
Yuxiao Zhou ◽  
Yifei Tian ◽  
Jun-Hai Yong ◽  
Feng Xu

Reconstructing hand-object interactions is a challenging task due to strong occlusions and complex motions. This article proposes a real-time system that uses a single depth stream to simultaneously reconstruct hand poses, object shape, and rigid/non-rigid motions. To achieve this, we first train a joint learning network to segment the hand and object in a depth image and to predict the 3D keypoints of the hand. With most layers shared between the two tasks, computation cost is reduced, supporting real-time performance. A hybrid dataset is constructed to train the network with real data (to learn real-world distributions) and synthetic data (to cover variations of objects, motions, and viewpoints). Next, the depths of the two targets and the keypoints are used in a unified optimization to reconstruct the interacting motions. Benefiting from a novel tangential contact constraint, the system not only resolves the remaining ambiguities but also maintains real-time performance. Experiments show that our system handles different hand and object shapes, various interactive motions, and moving cameras.


2013 ◽  
Vol 765-767 ◽  
pp. 2826-2829 ◽  
Author(s):  
Song Lin ◽  
Rui Min Hu ◽  
Yu Lian Xiao ◽  
Li Yu Gong

In this paper, we propose a novel real-time 3D hand gesture recognition algorithm based on depth information. We segment the hand region from the depth image and convert it to a point cloud. Then, 3D moment invariant features are computed from the point cloud. Finally, a support vector machine (SVM) is employed to classify the hand shape into different categories. We collected a benchmark dataset using Microsoft Kinect for Xbox and tested the proposed algorithm on it. Experimental results demonstrate the robustness of the proposed algorithm.
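The pipeline described above (segmented point cloud → moment invariant features → SVM) can be sketched as follows. The eigenvalue-based features below are a simple rotation/translation/scale-invariant stand-in for the paper's 3D moment invariants, and the toy "flat" vs. "elongated" clouds are illustrative, not the authors' benchmark data:

```python
import numpy as np
from sklearn.svm import SVC

def moment_features(points):
    """Invariant features from second-order central moments of a 3D point
    cloud (a simple stand-in for full 3D moment invariants)."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)        # second-order moments
    eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1] # rotation invariant
    return eigvals / eigvals.sum()                   # scale invariant

# Toy training data: flat "open palm" vs elongated "pointing" clouds.
rng = np.random.default_rng(0)
X, y = [], []
for _ in range(40):
    X.append(moment_features(rng.normal(size=(200, 3)) * [5, 5, 0.5]))  # flat
    y.append(0)
    X.append(moment_features(rng.normal(size=(200, 3)) * [8, 1, 1]))    # elongated
    y.append(1)

clf = SVC(kernel="rbf").fit(X, y)
pred_flat = clf.predict([moment_features(rng.normal(size=(200, 3)) * [5, 5, 0.5])])[0]
```

The normalized covariance eigenvalues separate flat from elongated shapes cleanly, which is what makes even this crude feature usable for coarse gesture categories.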


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Lei Yu

Interactive projection systems based on depth images are usually disturbed by mixed noise. Generally, several filtering methods are used in combination to resolve this problem. Although such a hybrid filter can guarantee the accuracy of the image, the algorithm is complex and time-consuming, which affects the real-time performance of the interactive projection system. In this paper, the switching system method is introduced into filtering for the first time, and an arbitrary switching filter algorithm is proposed and applied to a depth image filtering system based on the Kinect sensor. The experimental results demonstrate that the proposed switching filter algorithm not only removes noise effectively but also ensures the real-time performance of tracking and achieves good target tracking performance, making it applicable in a variety of image filtering systems.


Electronics ◽  
2020 ◽  
Vol 9 (5) ◽  
pp. 836 ◽  
Author(s):  
Young-Hoon Jin ◽  
In-Tae Hwang ◽  
Won-Hyung Lee

Augmented reality (AR) is a useful visualization technology that displays information by adding virtual images to the real world. In AR systems that require three-dimensional information, point cloud data is easy to use after real-time acquisition; however, it is difficult to measure and visualize real-world objects in real time due to the large amount of data and the matching process. In this paper, we explore a method of estimating pipes from point cloud data and visualizing them in real time through augmented reality devices. In general, pipe estimation in a point cloud uses a Hough transform and is performed after preprocessing such as noise filtering, normal estimation, or segmentation. However, this has the disadvantage of slow execution due to a large amount of computation. Therefore, for real-time visualization on augmented reality devices, a fast cylinder matching method using random sample consensus (RANSAC) is required. In this paper, we propose parallel processing, multiple frames, an adjustable scale, and error correction for real-time visualization. The method obtains a depth image from the sensor and constructs a uniform point cloud using a voxel grid algorithm. The constructed data is then analyzed with the fast RANSAC-based cylinder matching method. With the spread of various AR devices, this real-time visualization method is expected to be used to identify problems, such as the sagging of pipes, through real-time measurements at plant sites.
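The voxel grid step mentioned above (turning a raw depth-derived cloud into a uniform point cloud) can be sketched as a centroid-per-voxel downsampler. The function name and the 0.1 m voxel size are illustrative assumptions, not the paper's parameters:

```python
import numpy as np

def voxel_grid_downsample(points, voxel_size):
    """Replace all points falling in the same voxel by their centroid,
    producing a uniform-density cloud (sketch of a voxel grid filter)."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    # Group points by voxel key and average each group.
    _, inverse, counts = np.unique(keys, axis=0,
                                   return_inverse=True, return_counts=True)
    sums = np.zeros((len(counts), 3))
    np.add.at(sums, inverse, points)      # accumulate points per voxel
    return sums / counts[:, None]         # centroid of each occupied voxel

pts = np.array([[0.01, 0.02, 0.0], [0.03, 0.01, 0.0],   # same 0.1 m voxel
                [0.51, 0.50, 0.0]])                      # a different voxel
down = voxel_grid_downsample(pts, voxel_size=0.1)
# down has one centroid per occupied voxel (2 points here)
```

Downsampling like this both bounds the per-frame point count and evens out the density bias of a perspective depth sensor, which is what makes the subsequent RANSAC fitting tractable in real time.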


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 685
Author(s):  
Xuan Gong ◽  
Zichun Le ◽  
Yukun Wu ◽  
Hui Wang

This paper explores a pragmatic approach to studying the real-time performance of a multiway concurrent multi-object tracking (MOT) system. At present, most research has focused on tracking single image sequences, but in practical applications, multiway video streams need to be processed in parallel by MOT systems, and there have been few studies of the real-time performance of such concurrent systems. In this paper, we propose a new MOT framework to handle the multiway concurrency scenario based on a tracking-by-detection (TBD) model. The framework focuses on concurrency and real-time performance under limited computing and storage resources, while also considering algorithm accuracy. For the former, three aspects were studied: (1) expanded width and depth of the tracking-by-detection model: in terms of width, the MOT system can process multiway video sequences at the same time; in terms of depth, image collectors and bounding box collectors were introduced to support batch processing. (2) Considering real-time performance and multiway concurrency, we propose a real-time MOT algorithm based on directly driven detection. (3) System-level optimization: we utilize the inference optimization features of NVIDIA TensorRT to accelerate the deep neural network (DNN) in the tracking algorithm. To preserve algorithm accuracy, a negative sample (false detection) filter was designed. Meanwhile, the factors that affect system real-time performance and concurrency were studied. The experimental results show that our method performs well in processing multiple concurrent real-time video streams.


Author(s):  
Le Wang ◽  
Shengquan Xie ◽  
Wenjun Xu ◽  
Bitao Yao ◽  
Jia Cui ◽  
...  

Abstract In a complex industrial human-robot collaboration (HRC) environment, obstacles in the shared working space can occlude the operator, and the industrial robot will threaten the operator's safety if it cannot obtain the complete human spatial point cloud. This paper proposes a real-time human point cloud inpainting method based on a deep generative model. The method recovers the human point cloud occluded by obstacles in the shared working space to ensure the safety of the operator. The proposed method can be divided into three parts: (i) real-time obstacle detection, which detects obstacle locations in real time and generates an image of the obstacles; (ii) application of the deep generative model, a fully convolutional neural network (CNN) trained with a generative adversarial loss, which generates the missing depth data of the operator at arbitrary positions in the human depth image; and (iii) spatial mapping of the depth image, in which the depth image is mapped to a point cloud by coordinate system conversion. The effectiveness of the method is verified by filling holes in the human point cloud occluded by obstacles in an industrial HRC environment. The experimental results show that the proposed method can accurately generate the occluded human point cloud in real time and ensure the safety of the operator.


Author(s):  
Sara Greenberg ◽  
John McPhee ◽  
Alexander Wong

Fitting a kinematic model of the human body to an image without the use of markers is a method of pose estimation that is useful for tracking and posture evaluation. This model-fitting is challenging due to the variation in human physique and the large number of possible poses. One type of modeling is to represent the human body as a set of rigid body volumes. These volumes can be registered to a target point cloud acquired from a depth camera using the Iterative Closest Point (ICP) algorithm. The speed of ICP registration is inversely proportional to the number of points in the model and the target point clouds, and using the entire target point cloud in this registration is too slow for real-time applications. This work proposes the use of data-driven Monte Carlo methods to select a subset of points from the target point cloud that maintains or improves the accuracy of the point cloud registration for joint localization in real time. For this application, we investigate curvature of the depth image as the driving variable to guide the sampling, and compare it with benchmark random sampling techniques.
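The curvature-driven sampling idea can be sketched as weighted random selection of depth pixels, using a discrete Laplacian magnitude as a crude curvature proxy. Both the Laplacian choice and the toy step-edge image are illustrative assumptions, not the authors' exact estimator or data:

```python
import numpy as np

def curvature_weighted_sample(depth, n_samples, rng=None):
    """Draw pixel indices from a depth image with probability proportional
    to a curvature proxy (Laplacian magnitude), so high-curvature regions
    are sampled more densely than flat ones."""
    if rng is None:
        rng = np.random.default_rng()
    lap = np.abs(
        -4 * depth
        + np.roll(depth, 1, 0) + np.roll(depth, -1, 0)
        + np.roll(depth, 1, 1) + np.roll(depth, -1, 1)
    )
    weights = lap.ravel() + 1e-9          # avoid zero-probability pixels
    weights /= weights.sum()
    flat_idx = rng.choice(depth.size, size=n_samples, replace=False, p=weights)
    return np.unravel_index(flat_idx, depth.shape)

depth = np.zeros((32, 32))
depth[16:, :] = 1.0                       # one depth step (high curvature)
rows, cols = curvature_weighted_sample(depth, n_samples=50,
                                       rng=np.random.default_rng(1))
# samples concentrate on the step rows (and the wrap-around border rows)
```

Concentrating samples on high-curvature regions keeps the informative geometry (joints, silhouette edges) in the subset that ICP sees, which is the premise behind preferring this over uniform random sampling.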


2019 ◽  
Vol 4 (2) ◽  
pp. 177
Author(s):  
Ervin Yohannes ◽  
Fitri Utaminingrum ◽  
Timothy K. Shih

In recent years, depth images have become a popular research topic in image processing, especially in the field of clustering. Depth images can be captured by depth cameras such as the Kinect, Intel RealSense, and Leap Motion. Many objects and methods can be studied in clustering; one popular object is the human hand, since it has many functions and is an important part of the human body in daily routines. Clustering methods have been developed for many goals and are even combined with other methods. One clustering method is Density-Based Spatial Clustering of Applications with Noise (DBSCAN), an automatic method parameterized by a minimum number of points and an epsilon. Defining the epsilon in DBSCAN is important, since the result depends on it. We look for the best epsilon for clustering the human hand in depth images, selecting epsilon values from 5 to 100 to find the best clustering results. Moreover, these epsilons are tested at three distances to obtain accurate results.
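An epsilon sweep of the kind described can be sketched with scikit-learn's DBSCAN. The toy 2D blobs below stand in for hand pixel coordinates, and the reported statistics (cluster count, noise ratio) are illustrative choices for judging each epsilon:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def sweep_epsilon(points, eps_values, min_samples=5):
    """Run DBSCAN over a range of eps values and report, for each, the
    number of clusters and the fraction of points labeled noise."""
    results = {}
    for eps in eps_values:
        labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
        n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
        noise_ratio = float(np.mean(labels == -1))
        results[eps] = (n_clusters, noise_ratio)
    return results

# Two well-separated toy blobs standing in for hand pixel coordinates.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0, 0), scale=2.0, size=(100, 2))
blob_b = rng.normal(loc=(50, 50), scale=2.0, size=(100, 2))
pts = np.vstack([blob_a, blob_b])
report = sweep_epsilon(pts, eps_values=[1, 5, 100])
# a mid-range eps recovers the 2 blobs; a huge eps merges everything
```

The sweep makes the trade-off visible: too small an epsilon fragments the hand into many clusters and noise, while too large an epsilon merges the hand with neighboring structures.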


2013 ◽  
Vol 765-767 ◽  
pp. 2822-2825 ◽  
Author(s):  
Lin Song ◽  
Rui Min Hu ◽  
Yu Lian Xiao ◽  
Li Yu Gong

In this paper, we propose a depth-image-based real-time 3D hand tracking method. Our method is based on the fact that the human hand is an end point of the human body. Therefore, we locate the hand by finding the end point, starting from a position predicted from the hand position in the previous frame. We iteratively grow a region around the predicted position; the end point on the major axis of the region that stops moving as the region grows is selected as the final hand position. Experiments on sequences captured with Microsoft Kinect for Xbox show the effectiveness and efficiency of the proposed method.
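A simplified version of this end-point search can be sketched as region growing on a depth image. The depth-continuity tolerance and the "farthest grown pixel" criterion below are simplifications of the paper's major-axis test, and the toy arm image is illustrative:

```python
import numpy as np
from collections import deque

def find_end_point(depth, seed, depth_tol=0.05):
    """Grow a region from a predicted hand position over depth-continuous
    pixels; return the grown pixel farthest from the seed as the end point."""
    h, w = depth.shape
    visited = np.zeros_like(depth, dtype=bool)
    queue = deque([seed])
    visited[seed] = True
    end_point, best_dist = seed, 0.0
    while queue:
        r, c = queue.popleft()
        dist = np.hypot(r - seed[0], c - seed[1])
        if dist > best_dist:
            best_dist, end_point = dist, (r, c)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w and not visited[nr, nc]
                    and abs(depth[nr, nc] - depth[r, c]) < depth_tol):
                visited[nr, nc] = True      # grow only along continuous depth
                queue.append((nr, nc))
    return end_point

# Toy arm: a depth-continuous strip in front of the background.
depth = np.full((20, 20), 2.0)              # background at 2 m
depth[10, 2:15] = 1.0                       # arm at 1 m, columns 2..14
tip = find_end_point(depth, seed=(10, 5))   # tip lands at the strip's far end
```

Because only depth-continuous pixels are grown, the search stays on the arm and never leaks into the background, which is what makes an end-point criterion meaningful.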


2020 ◽  
Vol 10 (23) ◽  
pp. 8534
Author(s):  
Haozhe Yang ◽  
Zhiling Wang ◽  
Linglong Lin ◽  
Huawei Liang ◽  
Weixin Huang ◽  
...  

The perception system has become a topic of great importance for autonomous vehicles, as high accuracy and real-time performance can ensure safety in complex urban scenarios. Clustering is a fundamental step in parsing point clouds due to the extensive input data (over 100,000 points) covering a wide variety of complex objects. It is still challenging to achieve high-precision real-time performance with limited vehicle-mounted computing resources, which requires balancing accuracy and processing time. We propose a method based on a Two-Layer-Graph (TLG) structure, which can be applied in a real autonomous vehicle in urban scenarios. TLG describes the point cloud hierarchically: a range graph represents the points and a set graph represents point cloud sets, which reduces both processing time and memory consumption. In the range graph, the Euclidean distance and the angle at the sensor position between two adjacent vectors (computed from consecutive points in different directions) are used as the segmentation criteria, using local concave features to distinguish different objects close to each other. In the set graph, the start and end positions express a whole set of continuous points concisely, and an improved Breadth-First-Search (BFS) algorithm is designed to update the categories of point cloud sets across different channels. The method was evaluated on real vehicles and major datasets. The results show that TLG achieves real-time performance (less than 20 ms per frame) and a high segmentation accuracy (93.64%) for traffic objects on urban roads.
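The set-graph step (BFS over per-channel point sets) can be sketched by treating each continuous-point set as a (start, end) interval and merging overlapping intervals in adjacent channels. The interval-overlap test is a stand-in for the paper's distance and concavity criteria, and the data layout is an illustrative assumption:

```python
from collections import deque

def cluster_sets(channels):
    """Assign cluster ids to per-channel point sets, each given as a
    (start, end) interval, by BFS over overlapping intervals in
    adjacent channels (sketch of a set-graph BFS update)."""
    nodes = [(ch, i) for ch, sets in enumerate(channels) for i in range(len(sets))]
    labels = {}
    next_label = 0
    for node in nodes:
        if node in labels:
            continue
        labels[node] = next_label
        queue = deque([node])
        while queue:
            ch, i = queue.popleft()
            s0, e0 = channels[ch][i]
            for nch in (ch - 1, ch + 1):             # adjacent channels only
                if not (0 <= nch < len(channels)):
                    continue
                for j, (s1, e1) in enumerate(channels[nch]):
                    if (nch, j) not in labels and s0 <= e1 and s1 <= e0:
                        labels[(nch, j)] = next_label  # overlapping sets merge
                        queue.append((nch, j))
        next_label += 1
    return labels

# Three channels: one object spans channels 0-2; another sits apart in channel 1.
channels = [[(0, 10)], [(5, 12), (40, 50)], [(8, 15)]]
labels = cluster_sets(channels)
```

Working on per-channel intervals instead of individual points is what gives the set graph its speed: the BFS touches one node per continuous run, not one per point.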

