Object-Level Semantic Map Construction for Dynamic Scenes

2021 ◽  
Vol 11 (2) ◽  
pp. 645
Author(s):  
Xujie Kang ◽  
Jing Li ◽  
Xiangtao Fan ◽  
Hongdeng Jian ◽  
Chen Xu

Visual simultaneous localization and mapping (SLAM) is challenging in dynamic environments, as moving objects can impair camera pose tracking and mapping. This paper introduces a method for robust dense object-level SLAM in dynamic environments that takes a live stream of RGB-D frames as input, detects moving objects, and segments the scene into distinct objects while simultaneously tracking and reconstructing their 3D structures. The approach provides a new method of dynamic object detection that integrates prior knowledge from a pre-constructed object model database, object-oriented 3D tracking against the camera pose, and the association between instance segmentation results on the current frame and the object database to identify dynamic objects in the current frame. By leveraging the 3D static model for frame-to-model alignment, together with dynamic object culling, camera motion estimation reduces the overall drift. Based on the estimated camera poses and the instance segmentation results, an object-level semantic map representation is constructed for the world map. Experimental results on the TUM RGB-D dataset, comparing the proposed method to related state-of-the-art approaches, demonstrate that our method achieves similar performance in static scenes and improved accuracy and robustness in dynamic scenes.
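The dynamic object culling step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the point format, the use of a plain Euclidean residual, and the 5 cm threshold are all assumptions.

```python
import numpy as np

def cull_dynamic_points(model_pts, obs_pts, T_cam, thresh=0.05):
    """Flag observed 3D points whose position disagrees with the static model.

    model_pts : (N, 3) matched points in the world/model frame
    obs_pts   : (N, 3) the same points observed in the current camera frame
    T_cam     : (4, 4) world-to-camera transform from the previous estimate
    thresh    : residual (metres) above which a point is treated as dynamic
    Returns a boolean mask, True for points kept as static.
    """
    homo = np.hstack([model_pts, np.ones((len(model_pts), 1))])
    predicted = (T_cam @ homo.T).T[:, :3]   # where static points should appear
    residual = np.linalg.norm(predicted - obs_pts, axis=1)
    return residual < thresh                # moving-object points exceed thresh

# Static points agree with the model; a displaced point is culled.
model = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 2.0]])
obs = model.copy()
obs[1] += [0.5, 0.0, 0.0]                   # simulate a moving object
mask = cull_dynamic_points(model, obs, np.eye(4))
```

Only the points surviving this mask would then feed the frame-to-model alignment, so the pose estimate is not dragged by the mover.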

Robotica ◽  
2019 ◽  
Vol 38 (2) ◽  
pp. 256-270 ◽  
Author(s):  
Jiyu Cheng ◽  
Yuxiang Sun ◽  
Max Q.-H. Meng

Visual simultaneous localization and mapping (visual SLAM) has developed considerably in recent decades. To facilitate tasks such as path planning and exploration, traditional visual SLAM systems provide mobile robots with a geometric map, which overlooks semantic information. To address this problem, inspired by the recent success of deep neural networks, we combine them with a visual SLAM system to conduct semantic mapping. Both geometric and semantic information are projected into 3D space to generate a 3D semantic map. We also use an optical-flow-based method to deal with moving objects, so that our method works robustly in dynamic environments. We performed experiments on the public TUM dataset and on our own recorded office dataset. Experimental results demonstrate the feasibility and strong performance of the proposed method.
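One common way to realize an optical-flow-based moving-object check like the one mentioned above is to treat features whose flow deviates strongly from the dominant (ego-motion) flow as dynamic. The sketch below assumes a simple median/MAD-style outlier test; the paper's actual criterion may differ.

```python
import numpy as np

def flow_outlier_mask(flow_vecs, k=3.0):
    """Mark feature tracks whose optical flow deviates from the dominant motion.

    flow_vecs : (N, 2) per-feature flow vectors between consecutive frames.
    Features far from the median flow (camera ego-motion) are treated as
    belonging to moving objects.
    """
    median = np.median(flow_vecs, axis=0)
    dev = np.linalg.norm(flow_vecs - median, axis=1)
    scale = np.median(dev) + 1e-6            # robust spread estimate
    return dev > k * scale                   # True = likely moving object

# Nine features follow the camera; the tenth sits on a moving object.
flows = np.tile([1.0, 0.0], (10, 1))
offsets = 0.01 * np.array([[1, 0], [0, 1], [-1, 0], [0, -1],
                           [1, 1], [-1, -1], [1, -1], [-1, 1], [0, 0]])
flows[:9] += offsets
flows[9] = [8.0, 5.0]
moving = flow_outlier_mask(flows)
```

Features flagged as moving would be excluded both from pose estimation and from the projected semantic map.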


2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Chongben Tao ◽  
Yufeng Jin ◽  
Feng Cao ◽  
Zufeng Zhang ◽  
Chunguang Li ◽  
...  

Existing Visual SLAM (VSLAM) algorithms suffer from low localization accuracy and low label classification accuracy when constructing semantic maps of indoor environments with sparse feature points. This paper proposes a 3D semantic VSLAM algorithm called BMASK-RCNN, based on Mask Scoring RCNN. First, feature points are extracted from images by the Binary Robust Invariant Scalable Keypoints (BRISK) algorithm. Second, map points of the reference keyframe are projected onto the current frame for feature matching and pose estimation, and an inverse depth filter estimates the scene depth of each newly created keyframe to obtain camera pose changes. To achieve object detection and semantic segmentation for both static and dynamic objects in indoor environments, and then construct a dense 3D semantic map with the VSLAM algorithm, a Mask Scoring RCNN with a partially adjusted structure is used, trained via transfer learning on a TUM RGB-D SLAM dataset. The semantic information of independent targets in the scene, including object categories, not only supports high localization accuracy but also enables probabilistic updates of the semantic estimates by marking movable objects, thereby reducing the impact of moving objects on real-time mapping. Simulations and real-world comparisons against three other algorithms show that the proposed algorithm is more robust and that the semantic information used in 3D semantic mapping can be obtained accurately.
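The inverse depth filter mentioned above typically maintains a Gaussian estimate per feature and fuses each new triangulated observation into it. The sketch below shows one such Gaussian-fusion update under that assumption; the specific variances are illustrative, not from the paper.

```python
import numpy as np

def fuse_inverse_depth(mu, sigma2, obs, obs_sigma2):
    """One Gaussian-fusion update of an inverse-depth filter.

    mu, sigma2      : current inverse-depth estimate and its variance
    obs, obs_sigma2 : new inverse-depth observation (e.g. from triangulation)
    Returns the fused estimate and its reduced variance.
    """
    new_sigma2 = (sigma2 * obs_sigma2) / (sigma2 + obs_sigma2)
    new_mu = new_sigma2 * (mu / sigma2 + obs / obs_sigma2)
    return new_mu, new_sigma2

# Refine a rough prior with one observation of a point at depth 2 m
# (inverse depth 0.5); the variance shrinks with every fusion step.
mu, var = 0.6, 0.04
mu, var = fuse_inverse_depth(mu, var, 0.5, 0.04)
```

Once the variance drops below a convergence threshold, the feature's depth is considered reliable and can be inserted into the map.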


Sensors ◽  
2019 ◽  
Vol 19 (17) ◽  
pp. 3714 ◽  
Author(s):  
Guihua Liu ◽  
Weilin Zeng ◽  
Bo Feng ◽  
Feng Xu

Although many impressive SLAM systems have achieved exceptional accuracy in real environments, most of them are verified only in static environments. For mobile robots and autonomous driving, however, dynamic objects in the scene can cause tracking failure or large deviations in pose estimation. In this paper, a general visual SLAM system for dynamic scenes with multiple sensors, called DMS-SLAM, is proposed. First, the combination of GMS feature matching and a sliding window is used to initialize the system, which eliminates the influence of dynamic objects and constructs a static initial 3D map. Then, the corresponding 3D points of the current frame in the local map are obtained by reprojection. These points are combined with a constant-velocity model or a reference-frame model to estimate the pose of the current frame and update the 3D map points in the local map. Finally, the keyframes selected by the tracking module are combined with the GMS feature matching algorithm to add static 3D map points to the local map. DMS-SLAM performs pose tracking, loop closure detection and relocalization based on the static 3D map points of the local map, and supports monocular, stereo and RGB-D sensors in dynamic scenes. Exhaustive evaluation on the public TUM and KITTI datasets demonstrates that DMS-SLAM outperforms state-of-the-art visual SLAM systems in both accuracy and speed in dynamic scenes.
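The constant-velocity model used for pose prediction above has a compact form on 4x4 homogeneous transforms: reapply the last inter-frame motion. A minimal sketch (not DMS-SLAM's code):

```python
import numpy as np

def predict_pose(T_prev, T_curr):
    """Constant-velocity motion model: apply the last inter-frame motion again.

    T_prev, T_curr : (4, 4) camera poses of the two most recent frames.
    Returns the predicted (4, 4) pose of the next frame.
    """
    velocity = np.linalg.inv(T_prev) @ T_curr   # relative motion of last step
    return T_curr @ velocity                    # extrapolate one step forward

# A camera translating 0.1 m along x per frame keeps doing so.
def trans_x(x):
    T = np.eye(4)
    T[0, 3] = x
    return T

T_pred = predict_pose(trans_x(0.0), trans_x(0.1))
```

The predicted pose seeds the reprojection search for map-point correspondences; if too few matches are found, the system falls back to the reference-frame model.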


2020 ◽  
Author(s):  
Guoliang Liu

Visual simultaneous localization and mapping (SLAM) is the core of intelligent robot navigation systems. Many traditional SLAM algorithms assume that the scene is static. When a dynamic object appears in the environment, the accuracy of visual SLAM can degrade due to interference from the dynamic features of moving objects. This strong assumption limits SLAM applications for service robots or driverless cars in real dynamic environments. In this paper, a dynamic object removal algorithm that combines object recognition and optical flow techniques is proposed within a visual SLAM framework for dynamic scenes. Experimental results show that the new method can detect moving objects effectively and improves SLAM performance compared with state-of-the-art methods.
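Combining object recognition with optical flow, as proposed above, is usually done so that features are removed only when both cues agree: the feature lies on a recognizably movable class *and* its flow disagrees with the ego-motion. The sketch below illustrates that combination; the class list, threshold, and ego-flow input are assumptions, not the paper's specifics.

```python
import numpy as np

MOVABLE = {"person", "car", "dog"}          # classes assumed potentially movable

def dynamic_feature_mask(labels, flow_vecs, ego_flow, thresh=1.0):
    """Combine object recognition with an optical-flow consistency check.

    A feature is discarded only if it lies on a recognized movable object AND
    its flow disagrees with the expected ego-motion flow, so parked cars or
    standing people are kept.
    labels    : list of N class names, one per feature
    flow_vecs : (N, 2) measured flow; ego_flow : (2,) expected flow
    """
    semantic = np.array([c in MOVABLE for c in labels])
    residual = np.linalg.norm(flow_vecs - ego_flow, axis=1)
    return semantic & (residual > thresh)   # True = discard feature

labels = ["wall", "car", "car"]             # second car is parked, third drives
flows = np.array([[1.0, 0.0], [1.1, 0.0], [6.0, 2.0]])
drop = dynamic_feature_mask(labels, flows, ego_flow=np.array([1.0, 0.0]))
```

Requiring both cues avoids the over-culling that a purely semantic mask causes in scenes full of stationary "movable" objects.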



Author(s):  
SHENGPING ZHANG ◽  
HONGXUN YAO ◽  
SHAOHUI LIU

Traditional background subtraction methods perform poorly when scenes contain dynamic backgrounds such as waving tree branches, spouting fountains, illumination changes, and camera jitter. In this paper, from the view of spatial context, we present a novel and effective dynamic background subtraction method with three contributions. First, we present a novel local dependency descriptor, called the local dependency histogram (LDH), to effectively model the spatial dependencies between a pixel and its neighboring pixels. These spatial dependencies contain substantial evidence for differentiating dynamic background regions from moving objects of interest. Second, based on the proposed LDH, an effective approach to dynamic background subtraction is proposed, in which each pixel is modeled as a group of weighted LDHs. A pixel is labeled as foreground or background by comparing the LDH computed in the current frame against its model LDHs, and the model LDHs are adaptively updated by the current LDH. Finally, unlike traditional approaches that use a fixed threshold to judge whether a pixel matches its model, an adaptive thresholding technique is proposed. Experimental results on a diverse set of dynamic scenes validate that the proposed method significantly outperforms traditional methods for dynamic background subtraction.
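The pixel-model comparison described above can be illustrated with a heavily simplified stand-in for the LDH: a histogram of centre-to-neighbour intensity differences, matched by histogram intersection. The paper's actual descriptor quantization, weighting, and adaptive threshold are more elaborate; everything below is an assumption for illustration only.

```python
import numpy as np

def local_dependency_histogram(patch, bins=8):
    """A simplified LDH stand-in: histogram of intensity differences between
    the centre pixel and its 8 neighbours (the paper's exact quantization
    differs -- this is illustrative only).
    patch : (3, 3) grayscale patch around the pixel.
    """
    centre = patch[1, 1]
    diffs = np.delete(patch.ravel(), 4) - centre      # 8 neighbour differences
    hist, _ = np.histogram(diffs, bins=bins, range=(-255, 255))
    return hist / hist.sum()                          # normalized histogram

def matches_model(hist, model_hist, thresh=0.7):
    """Histogram intersection: high overlap => pixel matches its model."""
    return np.minimum(hist, model_hist).sum() >= thresh

flat = np.full((3, 3), 30.0)                          # static background patch
pattern = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
textured = flat + 200.0 * pattern                     # foreground-like patch
h_bg = local_dependency_histogram(flat)
h_fg = local_dependency_histogram(textured)
```

In the full method, each pixel keeps several weighted model LDHs and the best-matching one is updated; a non-matching current LDH marks the pixel as foreground.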


Sensors ◽  
2021 ◽  
Vol 21 (1) ◽  
pp. 230
Author(s):  
Xiangwei Dang ◽  
Zheng Rong ◽  
Xingdong Liang

Accurate localization and reliable mapping are essential for the autonomous navigation of robots. As one of the core technologies for autonomous navigation, Simultaneous Localization and Mapping (SLAM) has attracted widespread attention in recent decades. Based on vision or LiDAR sensors, great efforts have been devoted to achieving real-time SLAM that can support a robot’s state estimation. However, most mature SLAM methods work under the assumption that the environment is static, while in dynamic environments they yield degraded performance or even fail. In this paper, we first quantitatively evaluate the performance of state-of-the-art LiDAR-based SLAM methods, taking into account different patterns of moving objects in the environment. Through semi-physical simulation, we observed that the shape, size, and distribution of moving objects can all significantly impact SLAM performance, and we obtained instructive results from a quantitative comparison between LOAM and LeGO-LOAM. Second, based on this investigation, a novel approach named EMO for eliminating moving objects in SLAM by fusing LiDAR and mmW-radar is proposed, aimed at improving the accuracy and robustness of state estimation. The method fully exploits the complementary characteristics of the two sensors to fuse their information at two different resolutions. Moving objects are efficiently detected by radar based on the Doppler effect, accurately segmented and localized by LiDAR, and then filtered out of the point clouds through data association, with accurate synchronization in time and space. Finally, the point clouds representing the static environment are used as the input to SLAM. The proposed approach is evaluated through experiments using both semi-physical simulation and real-world datasets.
The results demonstrate the effectiveness of the method at improving SLAM performance in accuracy (at least a 30% decrease in absolute position error) and robustness in dynamic environments.
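The radar-LiDAR association step described above can be sketched as a simple gating operation: LiDAR points falling near a radar detection with significant Doppler speed are dropped. This is a minimal illustration assuming already-synchronized sensor frames and a fixed gating radius; EMO's actual segmentation is more refined.

```python
import numpy as np

def filter_moving_points(lidar_pts, radar_dets, doppler_thresh=0.5, radius=1.5):
    """Remove LiDAR points near radar detections with significant Doppler speed.

    lidar_pts  : (N, 3) LiDAR point cloud (sensor frame, synchronized)
    radar_dets : (M, 4) radar detections as [x, y, z, radial_speed]
    Points within `radius` metres of a moving radar detection are dropped;
    detections below `doppler_thresh` m/s are treated as static clutter.
    """
    keep = np.ones(len(lidar_pts), dtype=bool)
    for x, y, z, v in radar_dets:
        if abs(v) < doppler_thresh:          # static detection: keep its points
            continue
        dist = np.linalg.norm(lidar_pts - np.array([x, y, z]), axis=1)
        keep &= dist > radius                # cull points on the moving target
    return lidar_pts[keep]

pts = np.array([[10.0, 0.0, 0.0], [10.2, 0.1, 0.0], [3.0, 5.0, 0.0]])
dets = np.array([[10.0, 0.0, 0.0, 4.0]])     # one moving target at (10, 0, 0)
static_pts = filter_moving_points(pts, dets)
```

The surviving static cloud is what feeds the downstream SLAM front end, which is where the reported accuracy gain comes from.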


2020 ◽  
Vol 17 (4) ◽  
pp. 172988142094727
Author(s):  
Wenlong Zhang ◽  
Xiaoliang Sun ◽  
Qifeng Yu

Due to cluttered background motion, accurate moving object segmentation in unconstrained videos remains a significant open problem, especially for slow-moving objects. This article proposes an accurate moving object segmentation method based on robust seed selection. Seed pixels of the object and the background are selected robustly using optical flow cues. First, the method detects the moving object’s rough contour according to local differences in the weighted orientation cues of the optical flow. Then, the detected rough contour guides object and background seed pixel selection. The object seed pixels in the previous frame are propagated to the current frame according to the optical flow, improving the robustness of the seed selection. Finally, we adopt the random walker algorithm to segment the moving object accurately according to the selected seed pixels. Experiments on publicly available datasets indicate that the proposed method segments moving objects accurately in unconstrained videos.
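The seed propagation step above has a direct implementation: move each previous-frame seed along its flow vector and discard seeds that leave the image. A minimal sketch under the assumption of dense per-pixel flow in (row, col) order:

```python
import numpy as np

def propagate_seeds(seeds, flow, shape):
    """Propagate object seed pixels to the next frame along the optical flow.

    seeds : (N, 2) integer (row, col) seed coordinates in the previous frame
    flow  : (H, W, 2) per-pixel flow, flow[r, c] = (dr, dc)
    shape : (H, W) image size; propagated seeds falling outside are dropped.
    """
    moved = seeds + flow[seeds[:, 0], seeds[:, 1]]
    moved = np.rint(moved).astype(int)
    inside = ((moved >= 0) & (moved < np.array(shape))).all(axis=1)
    return moved[inside]

H, W = 4, 4
flow = np.zeros((H, W, 2))
flow[..., 1] = 1.0                      # everything shifts one column right
seeds = np.array([[1, 1], [2, 3]])      # second seed will exit the frame
new_seeds = propagate_seeds(seeds, flow, (H, W))
```

The propagated object seeds, together with background seeds taken outside the rough contour, are then handed to the random walker solver for the final segmentation.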

