Unsupervised Learning for Depth, Ego-Motion, and Optical Flow Estimation Using Coupled Consistency Conditions

Sensors ◽  
2019 ◽  
Vol 19 (11) ◽  
pp. 2459 ◽  
Author(s):  
Ji-Hun Mun ◽  
Moongu Jeon ◽  
Byung-Geun Lee

Herein, we propose an unsupervised learning architecture under coupled consistency conditions to estimate depth, ego-motion, and optical flow. Previous learning techniques in computer vision relied on large ground truth datasets for network training. A ground truth dataset that includes depth and optical flow collected from the real world requires tremendous pre-processing effort because it is exposed to noise artifacts. In this paper, we propose a framework that trains networks using a different type of data, with combined losses derived from a coupled consistency structure. The core concept is composed of two parts. First, we compare the optical flows estimated from the depth plus ego-motion networks and from the flow estimation network. To limit the effect of artifacts from occluded regions in the estimated optical flow, we then compute flow local consistency along the forward–backward directions. Second, synthesis consistency enables the exploration of the geometric correlation between the spatial and temporal domains in a stereo video. We perform extensive experiments on depth, ego-motion, and optical flow estimation on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset. We verify that the flow local consistency loss improves the optical flow accuracy in occluded regions. Furthermore, we show that the view-synthesis-based photometric loss enhances the depth and ego-motion accuracy via scene projection. The experimental results exhibit competitive performance for the estimated depth and optical flow; moreover, the induced ego-motion is comparable to that obtained from other unsupervised methods.
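A forward–backward consistency check of the kind mentioned in this abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name `forward_backward_mask`, the nearest-neighbour lookup, and the thresholds `alpha` and `beta` are all assumptions made for the example.

```python
import numpy as np

def forward_backward_mask(flow_fw, flow_bw, alpha=0.01, beta=0.5):
    """Return a boolean mask that is True where forward and backward flow agree.

    flow_fw, flow_bw: float arrays of shape (H, W, 2) holding (dx, dy) per pixel.
    alpha, beta: hypothetical thresholds chosen for illustration only.
    """
    h, w, _ = flow_fw.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Position each pixel lands on in the second frame (nearest-neighbour lookup).
    xt = np.rint(xs + flow_fw[..., 0]).astype(int)
    yt = np.rint(ys + flow_fw[..., 1]).astype(int)
    inside = (xt >= 0) & (xt < w) & (yt >= 0) & (yt < h)
    xt_c = np.clip(xt, 0, w - 1)
    yt_c = np.clip(yt, 0, h - 1)
    # Backward flow sampled at the forward-warped position.
    bw_at_target = flow_bw[yt_c, xt_c]
    # For a consistent pixel, forward flow plus sampled backward flow is near zero.
    diff = np.sum((flow_fw + bw_at_target) ** 2, axis=-1)
    mag = np.sum(flow_fw ** 2, axis=-1) + np.sum(bw_at_target ** 2, axis=-1)
    return inside & (diff < alpha * mag + beta)
```

Pixels where the mask is False are candidates for occlusion (or for leaving the frame), and a loss could down-weight them; how the paper actually uses its consistency term is not specified in the abstract.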

2020 ◽  
Vol 34 (07) ◽  
pp. 10713-10720
Author(s):  
Mingyu Ding ◽  
Zhe Wang ◽  
Bolei Zhou ◽  
Jianping Shi ◽  
Zhiwu Lu ◽  
...  

A major challenge for video semantic segmentation is the lack of labeled data. In most benchmark datasets, only one frame of a video clip is annotated, which makes most supervised methods fail to utilize information from the rest of the frames. To exploit the spatio-temporal information in videos, many previous works use pre-computed optical flows, which encode the temporal consistency to improve the video segmentation. However, the video segmentation and optical flow estimation are still considered as two separate tasks. In this paper, we propose a novel framework for joint video semantic segmentation and optical flow estimation. Semantic segmentation brings semantic information to handle occlusion for more robust optical flow estimation, while the non-occluded optical flow provides accurate pixel-level temporal correspondences to guarantee the temporal consistency of the segmentation. Moreover, our framework is able to utilize both labeled and unlabeled frames in the video through joint training, while no additional calculation is required in inference. Extensive experiments show that the proposed model makes the video semantic segmentation and optical flow estimation benefit from each other and outperforms existing methods under the same settings in both tasks.


Author(s):  
Shuaicheng Liu ◽  
Kunming Luo ◽  
Nianjin Ye ◽  
Chuan Wang ◽  
Jue Wang ◽  
...  

2012 ◽  
Vol 24 (4) ◽  
pp. 686-698 ◽  
Author(s):  
Lei Chen ◽  
Hua Yang ◽  
Takeshi Takaki ◽  
Idaku Ishii

In this paper, we propose a novel method for accurate optical flow estimation in real time for both high-speed and low-speed moving objects based on High-Frame-Rate (HFR) videos. We introduce a multiframe-straddling function to select several pairs of images with different frame intervals from an HFR image sequence, even when the estimated optical flow must be output at standard video rates (NTSC at 30 fps and PAL at 25 fps). The multiframe-straddling function can remarkably improve the measurable range of velocities in optical flow estimation without heavy computation by adaptively selecting a small frame interval for high-speed objects and a large frame interval for low-speed objects. On the basis of the relationship between the frame intervals and the accuracies of the optical flows estimated by the Lucas–Kanade method, we devise a method to determine multiple frame intervals in optical flow estimation and select an optimal frame interval from these intervals according to the amplitude of the estimated optical flow. Our method was implemented in software on a high-speed vision platform, IDP Express. The estimated optical flows were accurately output at intervals of 40 ms in real time by using three pairs of 512×512 images; these images were selected by frame-straddling a 2000-fps video with intervals of 0.5, 1.5, and 5 ms. Several experiments were performed for high-speed movements to verify that our method can remarkably improve the measurable range of velocities in optical flow estimation, compared to optical flows estimated for 25-fps videos with the Lucas–Kanade method.
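The idea of choosing a frame interval per pixel from the flow amplitude can be illustrated with a short NumPy sketch. The selection rule here (take the largest interval whose displacement stays below a limit `max_disp`) and the names `select_velocity` and `max_disp` are assumptions for illustration; the abstract does not state the authors' exact criterion.

```python
import numpy as np

def select_velocity(displacements, intervals_ms, max_disp=1.0):
    """Pick, per pixel, the largest frame interval with a small enough displacement.

    displacements: array of shape (K, H, W, 2), flow in pixels for each of K
        frame pairs, ordered by ascending frame interval.
    intervals_ms: length-K sequence of frame intervals in milliseconds.
    max_disp: hypothetical limit (pixels) on the displacement treated as reliable.
    Returns (velocity in pixels/ms of shape (H, W, 2), chosen index of shape (H, W)).
    """
    displacements = np.asarray(displacements, dtype=float)
    intervals_ms = np.asarray(intervals_ms, dtype=float)
    k, h, w, _ = displacements.shape
    mag = np.linalg.norm(displacements, axis=-1)      # (K, H, W)
    ok = mag <= max_disp
    idx = np.arange(k).reshape(k, 1, 1)
    # Largest acceptable interval index; falls back to 0 (smallest interval).
    choice = np.where(ok, idx, 0).max(axis=0)         # (H, W)
    ys, xs = np.mgrid[0:h, 0:w]
    chosen = displacements[choice, ys, xs]            # (H, W, 2)
    velocity = chosen / intervals_ms[choice][..., None]
    return velocity, choice
```

With the intervals quoted in the abstract (0.5, 1.5, and 5 ms), a slow pixel would be measured over the 5 ms pair and a fast pixel over the 0.5 ms pair, and both end up expressed in the same units.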


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1150
Author(s):  
Jun Nagata ◽  
Yusuke Sekikawa ◽  
Yoshimitsu Aoki

In this work, we propose a novel method of estimating optical flow from event-based cameras by matching the time surface of events. The proposed loss function measures the timestamp consistency between the time surface formed by the latest timestamp of each pixel and the one that is slightly shifted in time. This makes it possible to estimate dense optical flows with high accuracy without restoring luminance or additional sensor information. In the experiment, we show that the gradient was more correct and the loss landscape was more stable than the variance loss in the motion compensation approach. In addition, we show that the optical flow can be estimated with high accuracy by optimization with L1 smoothness regularization using publicly available datasets.
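As a rough illustration of the time surface described in this abstract (the latest timestamp at each pixel), the following NumPy sketch builds one from a list of events. The `(x, y, t)` event layout and the function name `time_surface` are assumptions for the example; the paper's loss function itself is not reproduced here.

```python
import numpy as np

def time_surface(events, height, width):
    """Build an (H, W) map of the latest event timestamp at each pixel.

    events: sequence of (x, y, t) triples (assumed layout for this sketch).
    Pixels that received no event are set to NaN.
    """
    ev = np.asarray(events, dtype=float).reshape(-1, 3)
    xs = ev[:, 0].astype(int)
    ys = ev[:, 1].astype(int)
    surface = np.full((height, width), -np.inf)
    # Unbuffered maximum so repeated pixels keep their largest timestamp.
    np.maximum.at(surface, (ys, xs), ev[:, 2])
    surface[surface == -np.inf] = np.nan
    return surface
```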


Author(s):  
Pengpeng Liu ◽  
Irwin King ◽  
Michael R. Lyu ◽  
Jia Xu

We present DDFlow, a data distillation approach to learning optical flow estimation from unlabeled data. The approach distills reliable predictions from a teacher network, and uses these predictions as annotations to guide a student network to learn optical flow. Unlike existing work relying on handcrafted energy terms to handle occlusion, our approach is data-driven, and learns optical flow for occluded pixels. This enables us to train our model with a much simpler loss function, and achieve a much higher accuracy. We conduct a rigorous evaluation on the challenging Flying Chairs, MPI Sintel, KITTI 2012 and 2015 benchmarks, and show that our approach significantly outperforms all existing unsupervised learning methods, while running at real time.
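The teacher-to-student idea can be sketched as a masked regression loss: the student is penalized only where the teacher's prediction is considered reliable. This is a simplified NumPy illustration under stated assumptions, not the DDFlow loss: the reliability mask is taken as given, and the name `distillation_loss` and the L1 penalty are choices made for the example.

```python
import numpy as np

def distillation_loss(student_flow, teacher_flow, reliable):
    """Mean L1 distance between student and teacher flow over reliable pixels.

    student_flow, teacher_flow: arrays of shape (H, W, 2).
    reliable: boolean array of shape (H, W), True where the teacher prediction
        is trusted (how this mask is obtained is outside this sketch).
    """
    err = np.abs(student_flow - teacher_flow).sum(axis=-1)  # per-pixel L1 error
    n = int(reliable.sum())
    if n == 0:
        return 0.0  # nothing to distill from
    return float((err * reliable).sum() / n)
```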


Author(s):  
Claudio S. Ravasio ◽  
Theodoros Pissas ◽  
Edward Bloch ◽  
Blanca Flores ◽  
Sepehr Jalali ◽  
...  

Purpose: Sustained delivery of regenerative retinal therapies by robotic systems requires intra-operative tracking of the retinal fundus. We propose a supervised deep convolutional neural network to densely predict semantic segmentation and optical flow of the retina as mutually supportive tasks, implicitly inpainting retinal flow information missing due to occlusion by surgical tools. Methods: As manual annotation of optical flow is infeasible, we propose a flexible algorithm for generation of large synthetic training datasets on the basis of given intra-operative retinal images. We evaluate optical flow estimation by tracking a grid and sparsely annotated ground truth points on a benchmark of challenging real intra-operative clips obtained from an extensive internally acquired dataset encompassing representative vitreoretinal surgical cases. Results: The U-Net-based network trained on the synthetic dataset is shown to generalise well to the benchmark of real surgical videos. When used to track retinal points of interest, our flow estimation outperforms variational baseline methods on clips containing tool motions which occlude the points of interest, as is routinely observed in intra-operatively recorded surgery videos. Conclusions: The results indicate that complex synthetic training datasets can be used to specifically guide optical flow estimation. Our proposed algorithm therefore lays the foundation for a robust system which can assist with intra-operative tracking of moving surgical targets even when occluded.

