PLD-SLAM: A New RGB-D SLAM Method with Point and Line Features for Indoor Dynamic Scene

Chenyang Zhang; Teng Huang; Rongchun Zhang; Xuefeng Yi

doi:10.3390/ijgi10030163

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10030163 ◽

2021 ◽

Vol 10 (3) ◽

pp. 163

Author(s):

Chenyang Zhang ◽

Teng Huang ◽

Rongchun Zhang ◽

Xuefeng Yi

Keyword(s):

Deep Learning ◽

Clustering Algorithm ◽

Ground Truth ◽

Depth Information ◽

Dynamic Features ◽

Dynamic Scenes ◽

Performance Point ◽

Localization And Mapping ◽

Camera Pose ◽

Line Features

RGB-D SLAM (Simultaneous Localization and Mapping) generally performs well in static environments. In dynamic scenes, however, dynamic features often cause wrong data associations, which degrade accuracy and robustness. To address this problem, this paper proposes PLD-SLAM, a new RGB-D SLAM method for dynamic scenes based on point and line features. First, to avoid the under- and over-segmentation caused by deep learning, PLD-SLAM combines deep-learning-based semantic segmentation with a K-Means clustering algorithm that takes depth information into account to detect potentially dynamic features. Next, two consistency check strategies are used to check and filter out the dynamic features more reliably. Then, to obtain better practical performance, both point and line features are used to calculate the camera pose, unlike most published dynamic SLAM algorithms, which rely on point features alone; an optimization model with point and line features is constructed and used to calculate the camera pose with higher accuracy. Finally, extensive experiments on the public TUM RGB-D dataset and in real-world scenes are conducted to verify the localization accuracy and performance of PLD-SLAM. We compare our experimental results with several state-of-the-art dynamic SLAM methods in terms of average localization errors and the visual difference between the estimated and ground-truth trajectories. These comparisons show that PLD-SLAM achieves comparable or better performance in dynamic scenes. Moreover, a comparison with a variant of PLD-SLAM that uses only point features demonstrates the feasibility of camera pose estimation based on both point and line features.
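The idea of refining a semantic mask with depth-aware K-Means can be illustrated with a small sketch. This is not the authors' implementation: the function names, the choice of k = 2, and the rule that the nearer depth cluster is the potentially dynamic one are all assumptions made for illustration.

```python
def kmeans_1d(values, k=2, iters=20):
    """Cluster scalar depth values (metres) into k groups with plain K-Means."""
    lo, hi = min(values), max(values)
    # Spread the initial centres evenly over the depth range (deterministic init).
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest centre for every depth sample.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: mean of each cluster; an empty cluster keeps its old centre.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
    return labels, centres


def refine_dynamic_mask(depths_in_mask):
    """Flag only the depth cluster closest to the camera as potentially dynamic.

    Assumption (not from the paper): the moving object is nearer than the
    background pixels that leaked into an over-segmented mask.
    """
    labels, centres = kmeans_1d(depths_in_mask, k=2)
    near = min(range(len(centres)), key=lambda c: centres[c])
    return [lab == near for lab in labels]


# Hypothetical depths (m) of feature points inside a "person" mask:
# four on the person (about 1.2 m) and two on the wall behind (about 3.5 m).
depths = [1.18, 1.22, 1.25, 1.20, 3.48, 3.55]
flags = refine_dynamic_mask(depths)
print(flags)
```

In this toy input the two wall points fall into the far cluster and would be kept as static features, which is the kind of over-segmentation repair the abstract describes.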

Visual SLAM Based on Dynamic Object Removal

10.36227/techrxiv.11687190.v1 ◽

2020 ◽

Author(s):

Guoliang Liu

Keyword(s):

Moving Objects ◽

State Of The Art ◽

Service Robot ◽

Visual Slam ◽

Dynamic Object ◽

Dynamic Features ◽

Dynamic Scenes ◽

The Core ◽

Localization And Mapping ◽

Strong Hypothesis

Visual simultaneous localization and mapping (SLAM) is the core of an intelligent robot navigation system. Many traditional SLAM algorithms assume that the scene is static. When a dynamic object appears in the environment, the accuracy of visual SLAM can degrade due to interference from the dynamic features of moving objects. This strong hypothesis limits SLAM applications for service robots or driverless cars in real dynamic environments. In this paper, a dynamic object removal algorithm that combines object recognition and optical flow techniques is proposed in the visual SLAM framework for dynamic scenes. The experimental results show that our new method can detect moving objects effectively and improve SLAM performance compared to state-of-the-art methods.
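A minimal sketch of how object recognition and optical flow might be combined to reject dynamic features is given below. The abstract does not specify the rule, so everything here is an assumption: detector boxes are taken as given, the median flow of features outside the boxes stands in for camera-induced motion, and a feature is dropped only if it lies in a box and its flow deviates from that median by more than a pixel threshold.

```python
from statistics import median


def inside(box, pt):
    """True if point (x, y) lies in box (x_min, y_min, x_max, y_max)."""
    x, y = pt
    return box[0] <= x <= box[2] and box[1] <= y <= box[3]


def remove_dynamic_features(points, flows, boxes, thresh=2.0):
    """Return indices of features kept as static.

    points: (x, y) keypoints in the previous frame
    flows:  (dx, dy) optical-flow vectors of those keypoints
    boxes:  detector boxes of movable classes (hypothetical input)
    thresh: pixel deviation from the background flow treated as real motion
    """
    in_box = [any(inside(b, p) for b in boxes) for p in points]
    background = [f for f, hit in zip(flows, in_box) if not hit]
    if not background:
        # No reference motion available: conservatively drop every boxed feature.
        return [i for i, hit in enumerate(in_box) if not hit]
    # The median background flow approximates the image motion caused by the camera.
    bx = median(f[0] for f in background)
    by = median(f[1] for f in background)
    keep = []
    for i, (f, hit) in enumerate(zip(flows, in_box)):
        residual = ((f[0] - bx) ** 2 + (f[1] - by) ** 2) ** 0.5
        # A boxed feature is discarded only if it also moves against the background.
        if not (hit and residual > thresh):
            keep.append(i)
    return keep


points = [(10, 10), (200, 40), (120, 120), (130, 140), (300, 200)]
flows = [(1.0, 0.0), (1.1, 0.1), (6.0, -3.0), (1.0, 0.1), (0.9, 0.0)]
boxes = [(100, 100, 160, 180)]
static_idx = remove_dynamic_features(points, flows, boxes)
print(static_idx)
```

Feature 2 sits inside the box and moves differently from the background, so it is removed; feature 3 is also inside the box but moves with the background and is kept.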

Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning

Sensors ◽

10.3390/s21144735 ◽

2021 ◽

Vol 21 (14) ◽

pp. 4735

Author(s):

Sumin Zhang ◽

Shouyi Lu ◽

Rui He ◽

Zhipeng Bao

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Depth Map ◽

Ground Truth ◽

Visual Odometry ◽

Vital Role ◽

Positioning Accuracy ◽

Multiview Geometry ◽

Localization And Mapping ◽

Unsupervised Deep Learning

Visual simultaneous localization and mapping (VSLAM) plays a vital role in the field of positioning and navigation. At the heart of VSLAM is visual odometry (VO), which uses continuous images to estimate the camera’s ego-motion. However, because the classical VO system relies on many assumptions, robots can hardly operate in challenging environments. To address this challenge, we combine the multiview geometry constraints of the classical stereo VO system with the robustness of deep learning to present an unsupervised pose correction network for the classical stereo VO system. The pose correction network regresses a pose correction that compensates for the positioning error caused by violations of modeling assumptions, making the classical stereo VO positioning more accurate. The network does not rely on a dataset with ground-truth poses for training, and it simultaneously generates a depth map and an explainability mask. Extensive experiments on the KITTI dataset show that the pose correction network can significantly improve the positioning accuracy of the classical stereo VO system. Notably, the corrected classical stereo VO system’s average absolute trajectory error, average translational relative pose error, and average translational root-mean-square drift over lengths of 100–800 m in the KITTI dataset are 13.77 cm, 0.038 m, and 1.08%, respectively. Therefore, the improved stereo VO system has almost reached the state of the art.
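For readers unfamiliar with the absolute trajectory error quoted above, the following sketch computes its root-mean-square form for two position sequences. It assumes the trajectories are already matched by timestamp and expressed in the same frame; a full evaluation would first align them with a rigid-body fit, which is omitted here. The function name and sample values are illustrative only.

```python
import math


def ate_rmse(estimated, ground_truth):
    """Root-mean-square translational error between time-matched positions."""
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and equally long")
    # Squared Euclidean distance between each estimated and true position.
    sq = [
        sum((e - g) ** 2 for e, g in zip(p, q))
        for p, q in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(sq) / len(sq))


gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
est = [(0.0, 0.0, 0.0), (1.0, 0.3, 0.0), (2.0, 0.0, 0.4)]
error = ate_rmse(est, gt)
print(round(error, 4))
```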

JD-SLAM: Joint camera pose estimation and moving object segmentation for simultaneous localization and mapping in dynamic scenes

International Journal of Advanced Robotic Systems ◽

10.1177/1729881421994447 ◽

2021 ◽

Vol 18 (1) ◽

pp. 172988142199444

Author(s):

Yujia Zhai ◽

Baoli Lu ◽

Weijun Li ◽

Jian Xu ◽

Shuangyi Ma

Keyword(s):

Object Detection ◽

Real Time ◽

Object Segmentation ◽

Three Dimensional ◽

Simultaneous Localization And Mapping ◽

Processing Unit ◽

Dynamic Scenes ◽

The Real ◽

Localization And Mapping ◽

Camera Pose

The static-scene hypothesis, a fundamental assumption in simultaneous localization and mapping (SLAM), can hardly be fulfilled in indoor or outdoor navigation and localization applications. Recent work on SLAM in dynamic scenes commonly uses heavy pixel-level segmentation networks to distinguish dynamic objects, which incurs a large computational cost and limits the real-time performance of the system. This restricts the application of SLAM on mobile terminals. In this article, we present a lightweight system for monocular SLAM in dynamic scenes that runs in real time on a central processing unit (CPU) and generates a semantic probability map. The pixel-wise semantic segmentation network is replaced with a lightweight object detection network combined with three-dimensional segmentation based on motion clustering. A framework integrating an improved weighted random sample consensus (RANSAC) solver is proposed to jointly solve the camera pose and perform three-dimensional object segmentation, which enables high accuracy and efficiency. In addition, prior information from the generated map and the object detection results is introduced for better estimation. Experiments on the public data set and in the real world demonstrate that our method obtains an outstanding improvement in both accuracy and speed compared to state-of-the-art methods.
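The weighted random sample consensus idea can be shown on a deliberately simplified problem. The sketch below estimates a 2D translation between matched points instead of a full camera pose, and the per-match weights (a prior probability that a match lies on static structure, for example lowered inside detection boxes) are a hypothetical input. It is a toy stand-in for illustration, not the solver described in the article.

```python
import random


def weighted_ransac_translation(src, dst, weights, iters=50, tol=0.5, seed=0):
    """Estimate a 2D translation from matches, sampling by per-match weight.

    Returns the translation with the largest weighted inlier score and the
    indices of its inliers; the remaining matches are candidates for moving
    objects.
    """
    rng = random.Random(seed)
    indices = list(range(len(src)))
    best_t, best_inliers, best_score = None, [], -1.0
    for _ in range(iters):
        # Minimal sample: a single match fixes a pure translation.
        i = rng.choices(indices, weights=weights, k=1)[0]
        t = (dst[i][0] - src[i][0], dst[i][1] - src[i][1])
        inliers = [
            j for j in indices
            if abs(src[j][0] + t[0] - dst[j][0]) <= tol
            and abs(src[j][1] + t[1] - dst[j][1]) <= tol
        ]
        # Score hypotheses by summed weight rather than by plain inlier count.
        score = sum(weights[j] for j in inliers)
        if score > best_score:
            best_t, best_inliers, best_score = t, inliers, score
    return best_t, best_inliers


src = [(0, 0), (1, 0), (0, 1), (2, 2), (5, 5), (6, 5)]
dst = [(1, 0), (2, 0), (1, 1), (3, 2), (5, 8), (6, 8)]  # last two move differently
weights = [0.9, 0.9, 0.9, 0.9, 0.2, 0.2]
t, inliers = weighted_ransac_translation(src, dst, weights)
print(t, inliers)
```

Weighting both the sampling and the scoring means that matches believed to be static dominate the hypothesis search, while the outliers of the winning model fall out as a by-product, which mirrors the joint pose and segmentation idea at a very small scale.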

Visual SLAM Based on Dynamic Object Removal

10.36227/techrxiv.11687190 ◽

2020 ◽

Author(s):

Guoliang Liu

Keyword(s):

Moving Objects ◽

State Of The Art ◽

Service Robot ◽

Visual Slam ◽

Dynamic Object ◽

Dynamic Features ◽

Dynamic Scenes ◽

The Core ◽

Localization And Mapping ◽

Strong Hypothesis

OC-SLAM: Steadily Tracking and Mapping in Dynamic Environments

Frontiers in Energy Research ◽

10.3389/fenrg.2021.803631 ◽

2021 ◽

Vol 9 ◽

Author(s):

Zhenyu Wu ◽

Xiangyu Deng ◽

Shengming Li ◽

Yingshun Li

Keyword(s):

Object Detection ◽

Point Cloud ◽

Point Clouds ◽

Dynamic Environments ◽

Detection Algorithm ◽

Dynamic Features ◽

Dynamic Scenes ◽

Original Algorithm ◽

Localization And Mapping ◽

Dense Point

Visual Simultaneous Localization and Mapping (SLAM) systems are mainly used for real-time localization and mapping by robots in various complex environments, but traditional monocular vision algorithms struggle to cope with weak texture and dynamic scenes. To solve these problems, this work presents an object detection and clustering assisted SLAM algorithm (OC-SLAM), which adopts a faster object detection algorithm to add semantic information to the image and applies geometric constraints to the dynamic keypoints in the prediction box to optimize the camera pose. It also uses an RGB-D camera to perform dense point cloud reconstruction with the dynamic objects rejected, and applies Euclidean clustering to the dense point clouds, in combination with the object detection algorithm, to jointly eliminate dynamic features. Experiments on the TUM dataset indicate that OC-SLAM improves localization accuracy in dynamic environments compared with the original algorithm and can build a more precise dense point cloud map in dynamic scenes.
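Distance-based clustering of a point cloud, usually called Euclidean cluster extraction, can be sketched as a region-growing search. The version below uses a brute-force neighbour search for clarity (practical systems normally use a k-d tree), and the radius, minimum cluster size, and sample points are illustrative assumptions rather than values from the paper.

```python
from collections import deque


def euclidean_clusters(points, radius=0.3, min_size=2):
    """Group 3D points that are chained together by neighbours within `radius`."""
    r2 = radius * radius
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        start = min(unvisited)  # deterministic choice of the next seed point
        unvisited.remove(start)
        queue, cluster = deque([start]), [start]
        while queue:
            i = queue.popleft()
            # Collect all still-unvisited points within the radius of point i.
            near = [
                j for j in unvisited
                if sum((a - b) ** 2 for a, b in zip(points[i], points[j])) <= r2
            ]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                cluster.append(j)
        # Very small clusters are treated as noise and dropped.
        if len(cluster) >= min_size:
            clusters.append(sorted(cluster))
    return clusters


cloud = [
    (0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (0.2, 0.1, 1.0),  # object A
    (2.0, 0.0, 3.0), (2.1, 0.1, 3.0),                    # object B
    (5.0, 5.0, 5.0),                                     # isolated noise point
]
clusters = euclidean_clusters(cloud)
print(clusters)
```

Once the cloud is split into clusters, a cluster that overlaps a detected movable object can be rejected as a whole, which is one plausible way to combine clustering with detection.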

Proposal of a Monitoring System to Determine the Possibility of Contact with Confirmed Infectious Diseases Using K-means Clustering Algorithm and Deep Learning Based Crowd Counting

Korean Institute of Smart Media ◽

10.30693/smj.2020.9.3.122 ◽

2020 ◽

Vol 9 (3) ◽

pp. 122-129

Author(s):

Dongsu Lee ◽

ASHIQUZZAMAN A K M ◽

Yeonggwang Kim ◽

혜주 신 ◽

Jinsul Kim

Keyword(s):

Deep Learning ◽

Infectious Diseases ◽

Monitoring System ◽

Clustering Algorithm ◽

Crowd Counting

Comparison of different SLAM approaches for a driverless race car

tm - Technisches Messen ◽

10.1515/teme-2021-0004 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Nick Le Large ◽

Frank Bieder ◽

Martin Lauer

Keyword(s):

Kalman Filter ◽

Extended Kalman Filter ◽

Ground Truth ◽

Time Constraints ◽

Differential Gps ◽

Real World Data ◽

Race Car ◽

Localization And Mapping ◽

Slam Algorithm ◽

Computational Resources

For the application of an automated, driverless race car, we aim to assure high map and localization quality for successful driving on previously unknown, narrow race tracks. To achieve this goal, it is essential to choose an algorithm that fulfills the requirements in terms of accuracy, computational resources and run time. We propose both a filter-based and a smoothing-based Simultaneous Localization and Mapping (SLAM) algorithm and evaluate them using real-world data collected by a Formula Student Driverless race car. The accuracy is measured by comparing the SLAM-generated map to a ground truth map which was acquired using high-precision Differential GPS (DGPS) measurements. The results of the evaluation show that both algorithms meet required time constraints thanks to a parallelized architecture, with GraphSLAM draining the computational resources much faster than Extended Kalman Filter (EKF) SLAM. However, the analysis of the maps generated by the algorithms shows that GraphSLAM outperforms EKF SLAM in terms of accuracy.

Geometric property-based convolutional neural network for indoor object detection

International Journal of Advanced Robotic Systems ◽

10.1177/1729881421993323 ◽

2021 ◽

Vol 18 (1) ◽

pp. 172988142199332

Author(s):

Xintao Ding ◽

Boquan Li ◽

Jinbao Wang

Keyword(s):

Neural Network ◽

Object Detection ◽

Convolutional Neural Network ◽

Geometric Property ◽

Ground Truth ◽

Geometric Constraints ◽

Depth Information ◽

Training Set ◽

Object Knowledge ◽

The Mean

Indoor object detection is a very demanding and important task for robot applications. Object knowledge, such as two-dimensional (2D) shape and depth information, may be helpful for detection. In this article, we focus on the region-based convolutional neural network (CNN) detector and propose a geometric property-based Faster R-CNN method (GP-Faster) for indoor object detection. GP-Faster incorporates geometric properties into Faster R-CNN to improve detection performance. In detail, we first use mesh grids that are the intersections of direct and inverse proportion functions to generate appropriate anchors for indoor objects. After the anchors are regressed to the regions of interest produced by a region proposal network (RPN-RoIs), we then use 2D geometric constraints to refine the RPN-RoIs, in which the 2D constraint of every class is a convex hull region enclosing the width and height coordinates of the ground-truth boxes in the training set. Comparison experiments are conducted on two indoor datasets, SUN2012 and NYUv2. Since depth information is available in NYUv2, we add depth constraints to GP-Faster and propose a 3D geometric property-based Faster R-CNN (DGP-Faster) on NYUv2. The experimental results show that both GP-Faster and DGP-Faster improve the mean average precision.
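The 2D geometric constraint described above, a convex hull around the (width, height) pairs of a class's ground-truth boxes, can be sketched as follows. The hull is built with Andrew's monotone chain algorithm and a region of interest is accepted only if its size falls inside the hull. The class name and all sizes are made-up values, and how the article applies the constraint inside Faster R-CNN is not reproduced here.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def in_hull(hull, p):
    """True if p is inside or on the boundary of a counter-clockwise convex polygon."""
    n = len(hull)
    if n < 3:
        return False  # degenerate hull: no area to test against
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


# Hypothetical (width, height) pairs of ground-truth "chair" boxes, in pixels.
chair_sizes = [(40, 60), (80, 120), (60, 150), (100, 90), (70, 100)]
hull = convex_hull(chair_sizes)
keep_roi = in_hull(hull, (75, 110))  # plausible chair proportions
drop_roi = in_hull(hull, (300, 20))  # far too wide and flat
print(hull, keep_roi, drop_roi)
```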

Spectroscopic and deep learning-based approaches to identify and quantify cerebral microhemorrhages

Scientific Reports ◽

10.1038/s41598-021-88236-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Christian Crouzet ◽

Gwangjin Jeong ◽

Rachel H. Chae ◽

Krystal T. LoPresti ◽

Cody E. Dunn ◽

...

Keyword(s):

Deep Learning ◽

Prussian Blue ◽

Processing Speed ◽

Digital Pathology ◽

Ground Truth ◽

Individual Variability ◽

Rgb Images ◽

Cerebral Microhemorrhages ◽

Phasor Analysis ◽

Better Than

Cerebral microhemorrhages (CMHs) are associated with cerebrovascular disease, cognitive impairment, and normal aging. One method to study CMHs is to analyze histological sections (5–40 μm) stained with Prussian blue. Currently, users manually and subjectively identify and quantify Prussian blue-stained regions of interest, which is prone to inter-individual variability and can lead to significant delays in data analysis. To improve this labor-intensive process, we developed and compared three digital pathology approaches to identify and quantify CMHs from Prussian blue-stained brain sections: (1) ratiometric analysis of RGB pixel values, (2) phasor analysis of RGB images, and (3) deep learning using a mask region-based convolutional neural network. We applied these approaches to a preclinical mouse model of inflammation-induced CMHs. One-hundred CMHs were imaged using a 20× objective and RGB color camera. To determine the ground truth, four users independently annotated Prussian blue-labeled CMHs. The deep learning and ratiometric approaches performed better than the phasor analysis approach compared to the ground truth. The deep learning approach had the most precision of the three methods. The ratiometric approach has the most versatility and maintained accuracy, albeit with less precision. Our data suggest that implementing these methods to analyze CMH images can drastically increase the processing speed while maintaining precision and accuracy.

Detection of Malicious Software by Analyzing Distinct Artifacts Using Machine Learning and Deep Learning Algorithms

Electronics ◽

10.3390/electronics10141694 ◽

2021 ◽

Vol 10 (14) ◽

pp. 1694

Author(s):

Mathew Ashik ◽

A. Jyothish ◽

S. Anandaram ◽

P. Vinod ◽

Francesco Mercaldo ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Support Vector ◽

Malware Analysis ◽

Learning Approaches ◽

Dynamic Features ◽

System Calls ◽

Prevention Methods ◽

Structural Aspects

Malware is one of the most significant threats in today’s computing world since the number of websites distributing malware is increasing at a rapid rate. Malware analysis and prevention methods are increasingly becoming necessary for computer systems connected to the Internet. Such software exploits the system’s vulnerabilities to steal valuable information without the user’s knowledge and stealthily sends it to remote servers controlled by attackers. Traditionally, anti-malware products use signatures for detecting known malware. However, the signature-based method does not scale in detecting obfuscated and packed malware. Considering that the cause of a problem is often best understood by studying the structural aspects of a program, such as mnemonics, instruction opcodes, and API calls, in this paper we investigate the relevance of these features in unpacked malicious and benign executables to identify features that classify the executable. Prominent features are extracted using Minimum Redundancy and Maximum Relevance (mRMR) and Analysis of Variance (ANOVA). Experiments were conducted on four datasets using machine learning and deep learning approaches such as Support Vector Machine (SVM), Naïve Bayes, J48, Random Forest (RF), and XGBoost. In addition, we evaluate the performance of a collection of deep neural networks, namely a Deep Dense network, a One-Dimensional Convolutional Neural Network (1D-CNN), and a CNN-LSTM, in classifying unknown samples, and we observed promising results using APIs and system calls. On combining APIs/system calls with static features, a marginal performance improvement was attained compared with models trained only on dynamic features. Moreover, to improve accuracy, we implemented our solution using distinct deep learning methods and demonstrated a fine-tuned deep neural network that resulted in an F1-score of 99.1% and 98.48% on Dataset-2 and Dataset-3, respectively.
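ANOVA-based feature ranking scores each feature by the one-way F-statistic of its values across classes: the larger the ratio of between-class to within-class variance, the more discriminative the feature. The sketch below computes that statistic in plain Python. The two feature names and their call counts are hypothetical examples, not data from the paper, and the mRMR step is not covered.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for one feature, given its values per class."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Variation of the samples around their own class mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical per-sample API call counts: (malware samples, benign samples).
features = {
    "CreateRemoteThread": ([9, 10, 11], [1, 2, 3]),
    "GetTickCount": ([4, 6, 5], [5, 4, 6]),
}
scores = {name: anova_f(list(groups)) for name, groups in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

A feature whose counts differ sharply between the two classes receives a high score, while one with the same distribution in both classes scores zero and would be discarded by a top-k selection.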
