Depth Estimation from Light Field Geometry Using Convolutional Neural Networks

Sensors ◽  
2021 ◽  
Vol 21 (18) ◽  
pp. 6061
Author(s):  
Lei Han ◽  
Xiaohua Huang ◽  
Zhan Shi ◽  
Shengnan Zheng

Depth estimation based on light field imaging is a new methodology that has succeeded traditional binocular stereo matching and depth from monocular images. Significant progress has been made in light-field depth estimation; nevertheless, the balance between computational time and the accuracy of depth estimation is still worth exploring. The geometry of light field imaging is the basis of depth estimation, and the abundant light-field data makes it convenient to apply deep learning algorithms. The Epipolar Plane Image (EPI) generated from the light-field data has a line texture containing geometric information: the slope of each line is proportional to the depth of the corresponding object. Treating light-field depth estimation as a spatially dense prediction task, we design a convolutional neural network (ESTNet) to estimate accurate depth quickly. Inspired by the strong feature-extraction ability of convolutional neural networks, especially for texture images, we propose generating EPI synthetic images from the light-field data as the input of ESTNet to improve feature extraction and depth estimation. The architecture of ESTNet is characterized by three input streams, an encoding-decoding structure, and skip-connections. The three input streams receive the horizontal EPI synthetic image (EPIh), the vertical EPI synthetic image (EPIv), and the central view image (CV), respectively. EPIh and EPIv contain rich texture and depth cues, while CV provides pixel-position association information. ESTNet consists of two stages: encoding and decoding. The encoding stage comprises several convolution modules; correspondingly, the decoding stage comprises several transposed-convolution modules. In addition to the forward propagation of ESTNet, skip-connections are added between each convolution module and the corresponding transposed-convolution module to fuse shallow local features with deep semantic features. ESTNet is trained on one part of a synthetic light-field dataset and then tested on another part of that dataset as well as on a real light-field dataset. Ablation experiments show that the ESTNet structure is reasonable. Experiments on both the synthetic and the real light-field datasets show that ESTNet balances depth-estimation accuracy and computational time.
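The EPI geometry this abstract relies on can be sketched numerically: a scene point traces a straight line across the horizontal sub-aperture views, and fitting that line recovers the point's disparity and hence its depth. The focal length, baseline, and depth below are toy values chosen for illustration; this is not the paper's ESTNet.

```python
import numpy as np

# Hypothetical camera-array parameters (assumptions, not from the paper):
f, b = 50.0, 0.1          # focal length (px) and baseline between views (m)
true_depth = 2.0          # scene point depth (m)
d = f * b / true_depth    # disparity: horizontal shift per view step (px)

views = np.arange(9)              # 9 horizontal sub-aperture views
x_positions = 40.0 + d * views    # the point's image column in each view

# In the EPI (view index vs. image column), the point is a line whose
# slope equals its disparity; fitting the line recovers the depth.
slope, _ = np.polyfit(views, x_positions, 1)
depth_est = f * b / slope
```

A learning-based method such as ESTNet effectively replaces this explicit line fitting with convolutional feature extraction over the EPI texture.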

2020 ◽  
Vol 86 (7) ◽  
pp. 443-456
Author(s):  
Changkun Yang ◽  
Zhaoqin Liu ◽  
Kaichang Di ◽  
Changqing Hu ◽  
Yexin Wang ◽  
...  

With the development of light-field imaging technology, depth estimation using light-field cameras has become a hot topic in recent years. Even though many algorithms achieve good performance for depth estimation with light-field cameras, removing the influence of occlusion, especially multi-occlusion, remains a challenging task. The photo-consistency assumption does not hold in the presence of occlusions, which makes most depth estimation from light-field imaging unreliable. In this article, a novel method to handle complex occlusion in light-field depth estimation is proposed. The method effectively identifies occluded pixels using a refocusing algorithm, accurately selects unoccluded views using an adaptive unoccluded-view identification algorithm, and then improves the depth estimate by computing the cost volumes over the unoccluded views only. Experimental results demonstrate the advantages of the proposed algorithm over conventional state-of-the-art algorithms on both synthetic and real light-field data sets.
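The core idea, that photo-consistency breaks under occlusion and recovers once the occluded views are excluded, can be sketched with a toy cost. Below, photo-consistency is measured as the variance of one pixel's intensity across angular views, and a crude median-distance rule stands in for the paper's adaptive unoccluded-view identification (the numbers and the selection rule are illustrative assumptions, not the article's algorithm).

```python
import numpy as np

rng = np.random.default_rng(0)
true_val = 0.6                                   # pixel value at the true depth
views = true_val + rng.normal(0, 0.01, 25)       # 5x5 angular views, slight noise
views[:6] = 0.1                                  # six views see an occluder instead

# Photo-consistency cost over ALL views: large, because occluded views disagree.
cost_all = np.var(views)

# Exclude views far from the median (stand-in for unoccluded-view selection),
# then recompute the cost: it drops back to noise level at the true depth.
occluded = np.abs(views - np.median(views)) > 0.2
cost_unocc = np.var(views[~occluded])
```

The cost computed over unoccluded views is orders of magnitude smaller, which is why restricting the cost volume to those views makes the depth estimate reliable again.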


2018 ◽  
Vol 140 (8) ◽  
Author(s):  
Xing Huang ◽  
Hong Qi ◽  
Xiao-Luo Zhang ◽  
Ya-Tao Ren ◽  
Li-Ming Ruan ◽  
...  

Combined with the light-field imaging technique, the Landweber method is applied to the reconstruction of three-dimensional (3D) temperature distributions in absorbing media, both theoretically and experimentally. In the theoretical research, exit radiation intensities on the boundary of the absorbing media, simulated according to the light-field computing model, are employed as inputs for the inverse analysis. Compared with the commonly used iterative methods, i.e., the least-squares QR decomposition method and the algebraic reconstruction technique (ART), the Landweber method produces reconstruction results with better quality and less computational time. Based on the numerical study, an experimental investigation is conducted to validate the suitability of the proposed method. The temperature distribution of an ethylene diffusion flame is reconstructed by the Landweber method from a flame image captured by a light-field camera. Good agreement is found between the reconstructed temperature distribution and the temperature data measured by a thermocouple. All the experimental results demonstrate that the temperature distribution of an ethylene flame can be reconstructed reasonably by the Landweber method combined with the light-field imaging technique, which shows potential for noncontact measurement of temperature distributions in practical engineering applications.
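The Landweber method itself is a simple gradient-type iteration for a linear inverse problem A x = b: x_{k+1} = x_k + λ Aᵀ(b − A x_k), convergent for 0 < λ < 2/‖A‖₂². A minimal sketch on a toy system follows; the matrix here is random, standing in for the radiative-transfer forward model mapping the 3D temperature field to boundary intensities.

```python
import numpy as np

def landweber(A, b, lam, n_iter):
    """Landweber iteration: repeated steps along A^T times the residual."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x + lam * A.T @ (b - A @ x)
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(20, 5))       # toy forward model (assumption, not RTE)
x_true = rng.normal(size=5)        # "temperature field" to recover
b = A @ x_true                     # simulated boundary intensities

# Convergence requires 0 < lam < 2 / ||A||_2^2 (spectral norm).
lam = 1.0 / np.linalg.norm(A, 2) ** 2
x_rec = landweber(A, b, lam, 2000)
```

In practice the iteration count acts as a regularizer: stopping early suppresses noise amplification, which is one reason the method compares favorably with LSQR and ART on noisy measurement data.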


2019 ◽  
Vol 2019 (3) ◽  
pp. 636-1-636-6
Author(s):  
H. Harlyn Baker ◽  
Gregorij Kurillo ◽  
Allan Miller ◽  
Alessandro Temil ◽  
Tom Defanti ◽  
...  

Mathematics ◽  
2021 ◽  
Vol 9 (6) ◽  
pp. 624
Author(s):  
Stefan Rohrmanstorfer ◽  
Mikhail Komarov ◽  
Felix Mödritscher

With the ever-increasing amount of image data, it has become a necessity to automatically find and process the information in these images. As fashion is captured in images, the fashion sector provides the perfect foundation for a service or application built on an image classification model. In this article, the state of the art in image classification is analyzed and discussed. Based on this knowledge, four different approaches are implemented to extract features from fashion data. For this purpose, a human-worn fashion dataset of 2567 images was created and then significantly enlarged through image operations. The results show that convolutional neural networks are the undisputed standard for classifying images, and that TensorFlow is the best library to build them. Moreover, through the introduction of dropout layers, data augmentation, and transfer learning, model overfitting was successfully prevented, and the validation accuracy on the created dataset was incrementally improved from an initial 69% to a final 84%. Distinctive apparel such as trousers, shoes, and hats was classified more accurately than other upper-body clothes.
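Of the regularizers the article credits, dropout is the simplest to illustrate. The sketch below implements inverted dropout in plain numpy: during training, each activation is kept with probability keep_prob and rescaled so its expected value is unchanged; at inference the layer is the identity. This is an illustrative stand-in, not the article's TensorFlow code.

```python
import numpy as np

def dropout(x, keep_prob, training, rng):
    """Inverted dropout: zero units at random during training, rescale the rest."""
    if not training:
        return x                       # inference: identity, no rescaling needed
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob        # divide by keep_prob to preserve E[x]

rng = np.random.default_rng(42)
acts = np.ones(10000)
out = dropout(acts, keep_prob=0.8, training=True, rng=rng)
# Roughly 20% of units are zeroed, yet the mean activation stays near 1.0.
```

Because the surviving units are scaled up at training time, no correction is needed at test time, which is why the trained network's predictions remain calibrated when dropout is switched off.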

