scholarly journals Real-Time Height Measurement for Moving Pedestrians

Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-15
Author(s):  
Wenju Zhou ◽  
Fulong Yao ◽  
Wei Feng ◽  
Haikuan Wang

Height measurement for moving pedestrians is quite significant in many scenarios, such as pedestrian positioning, criminal suspect tracking, and virtual reality. Although some existing height measurement methods can detect the height of the static people, it is hard to measure height accurately for moving pedestrians. Considering the height fluctuations in dynamic situation, this paper proposes a real-time height measurement based on the Time-of-Flight (TOF) camera. Depth images in a continuous sequence are addressed to obtain the real-time height of the pedestrian with moving. Firstly, a normalization equation is presented to convert the depth image into the grey image for a lower time cost and better performance. Secondly, a difference-particle swarm optimization (D-PSO) algorithm is proposed to remove the complex background and reduce the noises. Thirdly, a segmentation algorithm based on the maximally stable extremal regions (MSERs) is introduced to extract the pedestrian head region. Then, a novel multilayer iterative average algorithm (MLIA) is developed for obtaining the height of dynamic pedestrians. Finally, Kalman filtering is used to improve the measurement accuracy by combining the current measurement and the height at the last moment. In addition, the VICON system is adopted as the ground truth to verify the proposed method, and the result shows that our method can accurately measure the real-time height of moving pedestrians.

2021 ◽  
Vol 40 (3) ◽  
pp. 1-12
Author(s):  
Hao Zhang ◽  
Yuxiao Zhou ◽  
Yifei Tian ◽  
Jun-Hai Yong ◽  
Feng Xu

Reconstructing hand-object interactions is a challenging task due to strong occlusions and complex motions. This article proposes a real-time system that uses a single depth stream to simultaneously reconstruct hand poses, object shape, and rigid/non-rigid motions. To achieve this, we first train a joint learning network to segment the hand and object in a depth image, and to predict the 3D keypoints of the hand. With most layers shared by the two tasks, computation cost is saved for the real-time performance. A hybrid dataset is constructed here to train the network with real data (to learn real-world distributions) and synthetic data (to cover variations of objects, motions, and viewpoints). Next, the depth of the two targets and the keypoints are used in a uniform optimization to reconstruct the interacting motions. Benefitting from a novel tangential contact constraint, the system not only solves the remaining ambiguities but also keeps the real-time performance. Experiments show that our system handles different hand and object shapes, various interactive motions, and moving cameras.


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1299
Author(s):  
Honglin Yuan ◽  
Tim Hoogenkamp ◽  
Remco C. Veltkamp

Deep learning has achieved great success on robotic vision tasks. However, when compared with other vision-based tasks, it is difficult to collect a representative and sufficiently large training set for six-dimensional (6D) object pose estimation, due to the inherent difficulty of data collection. In this paper, we propose the RobotP dataset consisting of commonly used objects for benchmarking in 6D object pose estimation. To create the dataset, we apply a 3D reconstruction pipeline to produce high-quality depth images, ground truth poses, and 3D models for well-selected objects. Subsequently, based on the generated data, we produce object segmentation masks and two-dimensional (2D) bounding boxes automatically. To further enrich the data, we synthesize a large number of photo-realistic color-and-depth image pairs with ground truth 6D poses. Our dataset is freely distributed to research groups by the Shape Retrieval Challenge benchmark on 6D pose estimation. Based on our benchmark, different learning-based approaches are trained and tested by the unified dataset. The evaluation results indicate that there is considerable room for improvement in 6D object pose estimation, particularly for objects with dark colors, and photo-realistic images are helpful in increasing the performance of pose estimation algorithms.


2020 ◽  
Author(s):  
Rui Fan ◽  
Hengli Wang ◽  
Bohuan Xue ◽  
Huaiyang Huang ◽  
Yuan Wang ◽  
...  

Over the past decade, significant efforts have been made to improve the trade-off between speed and accuracy of surface normal estimators (SNEs). This paper introduces an accurate and ultrafast SNE for structured range data. The proposed approach computes surface normals by simply performing three filtering operations, namely, two image gradient filters (in horizontal and vertical directions, respectively) and a mean/median filter, on an inverse depth image or a disparity image. Despite the simplicity of the method, no similar method already exists in the literature. In our experiments, we created three large-scale synthetic datasets (easy, medium and hard) using 24 3-dimensional (3D) mesh models. Each mesh model is used to generate 1800--2500 pairs of 480x640 pixel depth images and the corresponding surface normal ground truth from different views. The average angular errors with respect to the easy, medium and hard datasets are 1.6 degrees, 5.6 degrees and 15.3 degrees, respectively. Our C++ and CUDA implementations achieve a processing speed of over 260 Hz and 21 kHz, respectively. Our proposed SNE achieves a better overall performance than all other existing computer vision-based SNEs. Our datasets and source code are publicly available at: sites.google.com/view/3f2n.


2020 ◽  
Author(s):  
Rui Fan ◽  
Hengli Wang ◽  
Bohuan Xue ◽  
Huaiyang Huang ◽  
Yuan Wang, ◽  
...  

Over the past decade, significant efforts have been made to improve the trade-off between speed and accuracy of surface normal estimators (SNEs). This paper introduces an accurate and ultrafast SNE for structured range data. The proposed approach computes surface normals by simply performing three filtering operations, namely, two image gradient filters (in horizontal and vertical directions, respectively) and a mean/median filter, on an inverse depth image or a disparity image. Despite the simplicity of the method, no similar method already exists in the literature. In our experiments, we created three large-scale synthetic datasets (easy, medium and hard) using 24 3-dimensional (3D) mesh models. Each mesh model is used to generate 1800--2500 pairs of 480x640 pixel depth images and the corresponding surface normal ground truth from different views. The average angular errors with respect to the easy, medium and hard datasets are 1.6 degrees, 5.6 degrees and 15.3 degrees, respectively. Our C++ and CUDA implementations achieve a processing speed of over 260 Hz and 21 kHz, respectively. Our proposed SNE achieves a better overall performance than all other existing computer vision-based SNEs. Our datasets and source code are publicly available at: sites.google.com/view/3f2n.


Author(s):  
Jiazhen Pang ◽  
Yuan Li ◽  
Jie Zhang ◽  
Jianfeng Yu

Abstract Manual work is a weak link within the intelligent manufacturing, however, it plays an important role in the highly customized and multi-variety assembling. Assisted by intelligent assembling technology such as augmented reality, a manual worker can integrate into the cyber-physics system to improve efficiency and reduce errors, which is of great engineering significance in the assembling field of industry 4.0. Assembly recognition is the initial part of progress analysis and it has predictable changing progress stages which can be matched with the digital model for recognition constraints. Therefore, based on the similarity between spatial increment information and part model, a real-time assembly recognition method is proposed in this paper. Firstly, the depth images from the multi-camera system were used to capture the assembling scene. Then, compared with the previous assembling scene, the spatial incremental information was used to quantitatively represent the assembled part. The spatial increment information and digital model are described with distance distribution. Finally, based on Earth mover’s distance algorithm, the matching between the spatial increment information and the part model indicates the part which had been assembled to realize the real-time assembly recognition. In the case study, an assembling process for 3D printing assembly which corresponded with the digital model was used to approve the feasibility of the real-time assembly recognition method.


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Lei Yu

The interactive projection systems based on deep images are usually disturbed by the mixed noise. Generally, several filtering methods are used in combination to resolve this problem. Although the hybrid filter can guarantee the accuracy of the image, but the algorithm is complex and time-consuming, which affects the real-time performance of the interactive projection system. In this paper, the switching system method is introduced into the filter for the first time, and an arbitrary switching filter algorithm is proposed and applied to the depth image filtering system based on Kinect sensor. The experimental results demonstrate and validate that the proposed switching filter algorithm not only effectively removes the noise but also ensures the real-time performance of tracking and achieves good target tracking performance, which makes it applicable in various image filtering processing systems.


Electronics ◽  
2020 ◽  
Vol 9 (5) ◽  
pp. 836 ◽  
Author(s):  
Young-Hoon Jin ◽  
In-Tae Hwang ◽  
Won-Hyung Lee

Augmented reality (AR) is a useful visualization technology that displays information by adding virtual images to the real world. In AR systems that require three-dimensional information, point cloud data is easy to use after real-time acquisition, however, it is difficult to measure and visualize real-time objects due to the large amount of data and a matching process. In this paper we explored a method of estimating pipes from point cloud data and visualizing them in real-time through augmented reality devices. In general, pipe estimation in a point cloud uses a Hough transform and is performed through a preprocessing process, such as noise filtering, normal estimation, or segmentation. However, there is a disadvantage in that the execution time is slow due to a large amount of computation. Therefore, for the real-time visualization in augmented reality devices, the fast cylinder matching method using random sample consensus (RANSAC) is required. In this paper, we proposed parallel processing, multiple frames, adjustable scale, and error correction for real-time visualization. The real-time visualization method through the augmented reality device obtained a depth image from the sensor and configured a uniform point cloud using a voxel grid algorithm. The constructed data was analyzed according to the fast cylinder matching method using RANSAC. The real-time visualization method through augmented reality devices is expected to be used to identify problems, such as the sagging of pipes, through real-time measurements at plant sites due to the spread of various AR devices.


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 685
Author(s):  
Xuan Gong ◽  
Zichun Le ◽  
Yukun Wu ◽  
Hui Wang

This paper explored a pragmatic approach to research the real-time performance of a multiway concurrent multiobject tracking (MOT) system. At present, most research has focused on the tracking of single-image sequences, but in practical applications, multiway video streams need to be processed in parallel by MOT systems. There have been few studies on the real-time performance of multiway concurrent MOT systems. In this paper, we proposed a new MOT framework to solve multiway concurrency scenario based on a tracking-by-detection (TBD) model. The new framework mainly focuses on concurrency and real-time based on limited computing and storage resources, while considering the algorithm performance. For the former, three aspects were studied: (1) Expanded width and depth of tracking-by-detection model. In terms of width, the MOT system can support the process of multiway video sequence at the same time; in terms of depth, image collectors and bounding box collectors were introduced to support batch processing. (2) Considering the real-time performance and multiway concurrency ability, we proposed one kind of real-time MOT algorithm based on directly driven detection. (3) Optimization of system level—we also utilized the inference optimization features of NVIDIA TensorRT to accelerate the deep neural network (DNN) in the tracking algorithm. To trade off the performance of the algorithm, a negative sample (false detection sample) filter was designed to ensure tracking accuracy. Meanwhile, the factors that affect the system real-time performance and concurrency were studied. The experiment results showed that our method has a good performance in processing multiple concurrent real-time video streams.


2021 ◽  
Author(s):  
Rui Fan ◽  
Hengli Wang ◽  
Bohuan Xue ◽  
Huaiyang Huang ◽  
Yuan Wang ◽  
...  

This paper proposes three-filters-to-normal (3F2N), an accurate and ultrafast surface normal estimator (SNE), which is designed for structured range sensor data, e.g., depth/disparity images. 3F2N SNE computes surface normals by simply performing three filtering operations (two image gradient filters in horizontal and vertical directions, respectively, and a mean/median filter) on an inverse depth image or a disparity image. Despite the simplicity of 3F2N SNE, no similar method already exists in the literature. To evaluate the performance of our proposed SNE, we created three large-scale synthetic datasets (easy, medium and hard) using 24 3D mesh models, each of which is used to generate 1800--2500 pairs of depth images (resolution: 480X640 pixels) and the corresponding ground-truth surface normal maps from different views. 3F2N SNE demonstrates the state-of-the-art performance, outperforming all other existing geometry-based SNEs, where the average angular errors with respect to the easy, medium and hard datasets are 1.66 degrees, 5.69 degrees and 15.31 degrees, respectively. Furthermore, our C++ and CUDA implementations achieve a processing speed of over 260 Hz and 21 kHz, respectively. Our datasets and source code are publicly available at sites.google.com/view/3f2n.


Sensors ◽  
2020 ◽  
Vol 20 (23) ◽  
pp. 6725
Author(s):  
Longyu Zhang ◽  
Hao Xia ◽  
Yanyou Qiao

A depth camera is a kind of sensor that can directly collect distance information between an object and the camera. The RealSense D435i is a low-cost depth camera that is currently in widespread use. When collecting data, an RGB image and a depth image are acquired simultaneously. The quality of the RGB image is good, whereas the depth image typically has many holes. In a lot of applications using depth images, these holes can lead to serious problems. In this study, a repair method of depth images was proposed. The depth image is repaired using the texture synthesis algorithm with the RGB image, which is segmented through a multi-scale object-oriented method. The object difference parameter is added to the process of selecting the best sample block. In contrast with previous methods, the experimental results show that the proposed method avoids the error filling of holes, the edge of the filled holes is consistent with the edge of RGB images, and the repair accuracy is better. The root mean square error, peak signal-to-noise ratio, and structural similarity index measure from the repaired depth images and ground-truth image were better than those obtained by two other methods. We believe that the repair of the depth image can improve the effects of depth image applications.


Sign in / Sign up

Export Citation Format

Share Document