Using Unsupervised Deep Learning Technique for Monocular Visual Odometry

IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 18076-18088 ◽  
Author(s):  
Qiang Liu ◽  
Ruihao Li ◽  
Huosheng Hu ◽  
Dongbing Gu

2020 ◽  
Vol 10 (16) ◽  
pp. 5426 ◽  
Author(s):  
Qiang Liu ◽  
Haidong Zhang ◽  
Yiming Xu ◽  
Li Wang

Recently, deep learning frameworks have been deployed in visual odometry systems and have achieved results comparable to those of traditional feature-matching-based systems. However, most deep learning-based frameworks require labeled data as ground truth for training. In addition, monocular odometry systems cannot recover absolute scale, so external or prior information has to be introduced for scale recovery. To solve these problems, we present a novel deep learning-based RGB-D visual odometry system. Our two main contributions are: (i) during network training and pose estimation, depth images are fed into the network alongside the RGB images, forming a dual-stream deep neural network; and (ii) the system adopts an unsupervised end-to-end training method, so the labor-intensive data labeling task is not required. We have tested our system on the KITTI dataset, and the results show that the proposed RGB-D visual odometry (VO) system has clear advantages over other state-of-the-art systems in terms of both translation and rotation errors.
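The scale problem mentioned in this abstract can be illustrated with a small pinhole-camera sketch. It is not the authors' network; it only shows why a monocular image cannot distinguish a scene from a uniformly scaled copy of it, and why a depth measurement removes that ambiguity. The intrinsics (`fx`, `fy`, `cx`, `cy`), the point, and the motion are made-up values chosen for illustration, and the camera is assumed not to rotate.

```python
import math

# Hypothetical pinhole intrinsics, chosen only for illustration.
FX, FY, CX, CY = 700.0, 700.0, 600.0, 180.0

def project(point, translation):
    """Project a 3D point into a camera shifted by `translation` (no rotation)."""
    x, y, z = (p - t for p, t in zip(point, translation))
    return (FX * x / z + CX, FY * y / z + CY)

def backproject(pixel, depth):
    """Recover a metric 3D point from a pixel and its measured depth."""
    u, v = pixel
    return ((u - CX) * depth / FX, (v - CY) * depth / FY, depth)

point = (2.0, -1.0, 10.0)   # scene point in the first camera frame
motion = (0.5, 0.0, 1.0)    # camera translation between two frames
scale = 3.0                 # arbitrary unknown scale factor

# Scaling the scene and the motion together leaves the pixel unchanged,
# so RGB images alone cannot reveal the absolute scale.
pixel_true = project(point, motion)
pixel_scaled = project(tuple(scale * p for p in point),
                       tuple(scale * t for t in motion))
same_pixel = all(math.isclose(a, b) for a, b in zip(pixel_true, pixel_scaled))
print("pixels identical under scaling:", same_pixel)

# With a depth value for the pixel, the metric 3D point is recovered directly.
recovered = backproject(project(point, (0.0, 0.0, 0.0)), depth=point[2])
print("recovered point:", recovered)
```

Feeding depth images to the network, as the abstract describes, gives it access to this metric information without external scale priors.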


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4735
Author(s):  
Sumin Zhang ◽  
Shouyi Lu ◽  
Rui He ◽  
Zhipeng Bao

Visual simultaneous localization and mapping (VSLAM) plays a vital role in the field of positioning and navigation. At the heart of VSLAM is visual odometry (VO), which uses continuous images to estimate the camera’s ego-motion. However, because the classical VO system relies on many assumptions, robots can hardly operate in challenging environments. To address this, we combine the multiview geometry constraints of the classical stereo VO system with the robustness of deep learning to present an unsupervised pose correction network for the classical stereo VO system. The pose correction network regresses a pose correction that compensates for the positioning error caused by violations of modeling assumptions, making the classical stereo VO positioning more accurate. The network does not rely on a dataset with ground truth poses for training. It also simultaneously generates a depth map and an explainability mask. Extensive experiments on the KITTI dataset show that the pose correction network can significantly improve the positioning accuracy of the classical stereo VO system. Notably, the corrected classical stereo VO system’s average absolute trajectory error, average translational relative pose error, and average translational root-mean-square drift over path lengths of 100–800 m in the KITTI dataset are 13.77 cm, 0.038 m, and 1.08%, respectively. Therefore, the improved stereo VO system has almost reached the state of the art.
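The idea of correcting a classical VO estimate with a regressed pose can be sketched as a composition of rigid transforms. The abstract does not state how the correction is applied, so right-multiplying the VO pose by the correction is an assumption here, and every pose value below is invented for illustration rather than taken from the paper.

```python
import math

def make_pose(yaw, tx, ty, tz):
    """Build a 4x4 homogeneous transform from a yaw angle and a translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, tx],
            [s,  c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def compose(a, b):
    """Matrix product a @ b for 4x4 transforms stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation_error(estimate, reference):
    """Euclidean distance between the translation parts of two poses."""
    return math.sqrt(sum((estimate[i][3] - reference[i][3]) ** 2
                         for i in range(3)))

# Illustrative values only: a reference pose, a slightly wrong classical
# VO estimate, and a small correction such as a network might regress.
reference = make_pose(0.10, 1.00, 0.00, 0.0)
vo_estimate = make_pose(0.08, 0.95, 0.02, 0.0)
correction = make_pose(0.02, 0.05, -0.02, 0.0)  # hypothetical network output

# Assumed convention: the correction is applied in the VO frame.
corrected = compose(vo_estimate, correction)

error_before = translation_error(vo_estimate, reference)
error_after = translation_error(corrected, reference)
print("translation error before correction:", error_before)
print("translation error after correction:", error_after)
```

In this toy case the correction is small relative to the VO estimate, which matches the abstract's framing of the network as a refinement on top of the classical stereo pipeline rather than a replacement for it.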


2021 ◽  
pp. 1-12
Author(s):  
Gaurav Sarraf ◽  
Anirudh Ramesh Srivatsa ◽  
MS Swetha

With the ever-rising threat to security, multiple industries are always in search of safer communication techniques for data both at rest and in transit. Multiple security institutions agree that any system's security can be modeled around three major concepts: Confidentiality, Availability, and Integrity. We try to reduce the holes in these concepts by developing a deep learning-based steganography technique. In our study, we found that data compression has to be at the heart of any sound steganography system. In this paper, we show that it is possible to compress and encode data efficiently to solve critical problems of steganography. The deep learning technique, which comprises an auto-encoder with a Convolutional Neural Network as its building block, not only compresses the secret file but also learns how to hide the compressed data in the cover file efficiently. The proposed technique can encode secret files of the same size as the cover file or, in some sporadic cases, even larger files. We also show that the same model architecture can theoretically be applied to any file type. Finally, we show that our proposed technique surreptitiously evades all popular steganalysis techniques.
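The claim that compression belongs at the heart of a steganography system can be illustrated without a neural network. The sketch below swaps in two classical stand-ins, `zlib` compression and least-significant-bit (LSB) embedding, in place of the learned auto-encoder described in the abstract; it is not the authors' method. The cover bytes, the secret, and the 4-byte length header are all assumptions made for this example.

```python
import zlib

def embed(cover: bytes, secret: bytes) -> bytearray:
    """Compress `secret`, then hide it in the lowest bit of each cover byte."""
    payload = zlib.compress(secret)
    header = len(payload).to_bytes(4, "big")  # assumed 4-byte length prefix
    bits = [(byte >> shift) & 1
            for byte in header + payload
            for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover too small for compressed payload")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the lowest bit
    return stego

def extract(stego: bytes) -> bytes:
    """Read the length header, then the payload bits, and decompress."""
    def read_bytes(start_bit: int, count: int) -> bytes:
        out = bytearray()
        for b in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (stego[start_bit + 8 * b + k] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return zlib.decompress(read_bytes(32, length))

secret = b"visual odometry " * 64   # 1024 bytes of repetitive data
cover = bytes(range(256)) * 8       # 2048 bytes standing in for a cover file

stego = embed(cover, secret)
recovered = extract(stego)
print("secret recovered:", recovered == secret)
print("raw secret would need", len(secret) * 8, "cover bytes; cover has", len(cover))
```

Without compression, this secret would need one cover byte per bit and would not fit in the cover at all, which is the kind of capacity problem the abstract says compression addresses.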

