Deep Ordinal Regression Network for Monocular Depth Estimation

CORNet: Context-based Ordinal Regression Network for Monocular Depth Estimation

IEEE Transactions on Circuits and Systems for Video Technology ◽

10.1109/tcsvt.2021.3128505 ◽

2021 ◽

pp. 1-1

Author(s):

Xuyang Meng ◽

Chunxiao Fan ◽

Yue Ming ◽

Hui Yu

Keyword(s):

Depth Estimation ◽

Ordinal Regression ◽

Monocular Depth

Download Full-text

Non-Uniform Discretization-based Ordinal Regression for Monocular Depth Estimation of an Indoor Drone

Electronics ◽

10.3390/electronics9111767 ◽

2020 ◽

Vol 9 (11) ◽

pp. 1767

Author(s):

Xiangzhu Zhang ◽

Lijia Zhang ◽

Frank L. Lewis ◽

Hailong Pei

Keyword(s):

Deep Learning ◽

Binary Classification ◽

Depth Estimation ◽

Ordinal Regression ◽

Classification Model ◽

Security Requirements ◽

Data Set ◽

Decision Algorithm ◽

Decision Area ◽

Monocular Depth

At present, the main methods of solving the monocular depth estimation for indoor drones are the simultaneous localization and mapping (SLAM) algorithm and the deep learning algorithm. SLAM requires the construction of a depth map of the unknown environment, which is slow to calculate and generally requires expensive sensors, whereas current deep learning algorithms are mostly based on binary classification or regression. The output of the binary classification model gives the decision algorithm relatively rough control over the unmanned aerial vehicle. The regression model solves the problem of the binary classification, but it carries out the same processing for long and short distances, resulting in a decline in short-range prediction performance. In order to solve the above problems, according to the characteristics of the strong order correlation of the distance value, we propose a non-uniform spacing-increasing discretization-based ordinal regression algorithm (NSIDORA) to solve the monocular depth estimation for indoor drone tasks. According to the security requirements of this task, the distance label of the data set is discretized into three major areas—the dangerous area, decision area, and safety area—and the decision area is discretized based on spacing-increasing discretization. Considering the inconsistency of ordinal regression, a new distance decoder is produced. Experimental evaluation shows that the root-mean-square error (RMSE) of NSIDORA in the decision area is 33.5% lower than that of non-uniform discretization (NUD)-based ordinal regression methods. Although it is higher overall than that of the state-of-the-art two-stream regression algorithm, the RMSE of the NSIDORA in the top 10 categories of the decision area is 21.8% lower than that of the two-stream regression algorithm. The inference speed of NSIDORA is 3.4 times faster than that of two-stream ordinal regression. Furthermore, the effectiveness of the decoder has been proved through ablation experiments.

Download Full-text

Unsupervised Monocular Depth Estimation for Autonomous Driving

Proceedings of the International Display Workshops ◽

10.36463/idw.2019.3dsap2_3dp2-2 ◽

2019 ◽

pp. 128

Author(s):

Chih-Shuan Huang ◽

Wan-Nung Tsung ◽

Wei-Jong Yang ◽

Chin-Hsing Chen

Keyword(s):

Depth Estimation ◽

Autonomous Driving ◽

Monocular Depth

Download Full-text

On the Uncertainty of Self-Supervised Monocular Depth Estimation

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) ◽

10.1109/cvpr42600.2020.00329 ◽

2020 ◽

Cited By ~ 1

Author(s):

Matteo Poggi ◽

Filippo Aleotti ◽

Fabio Tosi ◽

Stefano Mattoccia

Keyword(s):

Depth Estimation ◽

Monocular Depth

Download Full-text

Constant Velocity Constraints for Self-Supervised Monocular Depth Estimation

European Conference on Visual Media Production ◽

10.1145/3429341.3429355 ◽

2020 ◽

Author(s):

Hang Zhou ◽

David Greenwood ◽

Sarah Taylor ◽

Han Gong

Keyword(s):

Constant Velocity ◽

Depth Estimation ◽

Monocular Depth ◽

Velocity Constraints

Download Full-text

Hierarchical Object Relationship Constrained Monocular Depth Estimation.

Pattern Recognition ◽

10.1016/j.patcog.2021.108116 ◽

2021 ◽

pp. 108116

Author(s):

Shuai Li ◽

Jiaying Shi ◽

Wenfeng Song ◽

Aimin Hao ◽

Hong Qin

Keyword(s):

Depth Estimation ◽

Monocular Depth ◽

Object Relationship

Download Full-text

Monocular Depth Estimation with Joint Attention Feature Distillation and Wavelet-Based Loss Function

Sensors ◽

10.3390/s21010054 ◽

2020 ◽

Vol 21 (1) ◽

pp. 54

Author(s):

Peng Liu ◽

Zonghua Zhang ◽

Zhaozong Meng ◽

Nan Gao

Keyword(s):

Joint Attention ◽

Loss Function ◽

Depth Estimation ◽

Depth Information ◽

3D Vision ◽

Network Training ◽

Crucial Component ◽

Benchmark Datasets ◽

Ill Posed ◽

Monocular Depth

Depth estimation is a crucial component in many 3D vision applications. Monocular depth estimation is gaining increasing interest due to flexible use and extremely low system requirements, but inherently ill-posed and ambiguous characteristics still cause unsatisfactory estimation results. This paper proposes a new deep convolutional neural network for monocular depth estimation. The network applies joint attention feature distillation and wavelet-based loss function to recover the depth information of a scene. Two improvements were achieved, compared with previous methods. First, we combined feature distillation and joint attention mechanisms to boost feature modulation discrimination. The network extracts hierarchical features using a progressive feature distillation and refinement strategy and aggregates features using a joint attention operation. Second, we adopted a wavelet-based loss function for network training, which improves loss function effectiveness by obtaining more structural details. The experimental results on challenging indoor and outdoor benchmark datasets verified the proposed method’s superiority compared with current state-of-the-art methods.

Download Full-text

Time- and Resource-Efficient Time-to-Collision Forecasting for Indoor Pedestrian Obstacles Avoidance

Journal of Imaging ◽

10.3390/jimaging7040061 ◽

2021 ◽

Vol 7 (4) ◽

pp. 61

Author(s):

David Urban ◽

Alice Caplier

Keyword(s):

Neural Network ◽

Autonomous Vehicles ◽

Depth Estimation ◽

Video Camera ◽

Obstacle Detection ◽

Navigation Systems ◽

Time To Collision ◽

Static Data ◽

Monocular Depth ◽

Fully Connected

As difficult vision-based tasks like object detection and monocular depth estimation are making their way in real-time applications and as more light weighted solutions for autonomous vehicles navigation systems are emerging, obstacle detection and collision prediction are two very challenging tasks for small embedded devices like drones. We propose a novel light weighted and time-efficient vision-based solution to predict Time-to-Collision from a monocular video camera embedded in a smartglasses device as a module of a navigation system for visually impaired pedestrians. It consists of two modules: a static data extractor made of a convolutional neural network to predict the obstacle position and distance and a dynamic data extractor that stacks the obstacle data from multiple frames and predicts the Time-to-Collision with a simple fully connected neural network. This paper focuses on the Time-to-Collision network’s ability to adapt to new sceneries with different types of obstacles with supervised learning.

Download Full-text