The Design and Implementation of Postprocessing for Depth Map on Real-Time Extraction System

2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Zhiwei Tang ◽  
Bin Li ◽  
Huosheng Li ◽  
Zheng Xu

Depth estimation is a key technology in stereo vision. A real-time depth map can be obtained with hardware, but hardware such as an FPGA cannot implement algorithms as complex as software can because of structural restrictions, so some incorrect stereo matches inevitably occur during hardware-based depth estimation. To solve this problem, a postprocessing function is designed in this paper. After a matching-cost uniqueness test, both left-right and right-left consistency checks are implemented; the holes in the depth maps are then filled with correct depth values on the basis of the right-left consistency check. Experimental results show that depth map extraction and the postprocessing function can run in real time in the same system, and the quality of the resulting depth maps is satisfactory.
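A minimal NumPy sketch of the two steps named in this abstract, a left-right consistency check followed by hole filling, is given below. Array shapes, the tolerance, and the left-to-right scan order for filling are assumptions for illustration, not the paper's FPGA design.

```python
# Illustrative sketch: left-right consistency check and hole filling.
import numpy as np

def consistency_check(disp_left, disp_right, tol=1):
    """Mark pixels whose left->right and right->left disparities disagree."""
    h, w = disp_left.shape
    xs = np.arange(w)
    valid = np.zeros((h, w), dtype=bool)
    for y in range(h):
        # Position in the right image predicted by the left disparity.
        xr = xs - disp_left[y].astype(int)
        inside = (xr >= 0) & (xr < w)
        back = np.zeros(w)
        back[inside] = disp_right[y, xr[inside]]
        valid[y] = inside & (np.abs(disp_left[y] - back) <= tol)
    return valid

def fill_holes(disp, valid):
    """Fill invalid pixels with the nearest valid disparity to their left."""
    out = disp.astype(float).copy()
    for y in range(out.shape[0]):
        last = 0.0
        for x in range(out.shape[1]):
            if valid[y, x]:
                last = out[y, x]
            else:
                out[y, x] = last
    return out
```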

Sensors ◽  
2018 ◽  
Vol 19 (1) ◽  
pp. 81
Author(s):  
Inwook Shim ◽  
Tae-Hyun Oh ◽  
In Kweon

This paper presents a depth upsampling method that produces a high-fidelity dense depth map from a high-resolution RGB image and LiDAR sensor data. Our proposed method explicitly handles depth outliers and computes the upsampled depth together with confidence information. Our key idea is a self-learning framework that automatically learns to estimate the reliability of the upsampled depth map without human-labeled annotation. Thereby, our method produces a clear, high-fidelity dense depth map that preserves the shape of object structures well, which benefits subsequent algorithms in follow-up tasks. We evaluate our method qualitatively and quantitatively against competing methods on the well-known Middlebury 2014 and KITTI benchmark datasets. We demonstrate that our method generates accurate depth maps with smaller errors than other methods while preserving a larger number of valid points, and we show that our approach can be seamlessly applied to improve the quality of depth maps from other depth generation algorithms such as stereo matching; potential applications and limitations are also discussed. Compared to previous work, our proposed method has similar depth errors on average while retaining at least 3% more valid depth points.
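As a rough illustration of confidence-weighted, image-guided depth upsampling, the sketch below fills each pixel with a weighted average of nearby valid LiDAR depths. It is a joint-bilateral-style filter, not the paper's self-learning framework; the window radius, sigmas, and function names are assumptions.

```python
# Illustrative sketch: confidence- and color-weighted depth upsampling.
import numpy as np

def guided_upsample(sparse_depth, confidence, rgb, radius=4,
                    sigma_space=2.0, sigma_color=10.0):
    """Average nearby valid depths, weighted by distance, color, and confidence."""
    h, w, _ = rgb.shape
    out = np.zeros((h, w))
    gray = rgb.mean(axis=2)
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            d = sparse_depth[y0:y1, x0:x1]
            c = confidence[y0:y1, x0:x1]
            yy, xx = np.mgrid[y0:y1, x0:x1]
            w_space = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_space ** 2))
            w_color = np.exp(-(gray[y0:y1, x0:x1] - gray[y, x]) ** 2 / (2 * sigma_color ** 2))
            wgt = w_space * w_color * c * (d > 0)   # ignore empty LiDAR cells
            out[y, x] = (wgt * d).sum() / max(wgt.sum(), 1e-8)
    return out
```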


Entropy ◽  
2021 ◽  
Vol 23 (5) ◽  
pp. 546
Author(s):  
Zhenni Li ◽  
Haoyi Sun ◽  
Yuliang Gao ◽  
Jiao Wang

Depth maps obtained from sensors are often unsatisfactory because of their low resolution and noise interference. In this paper, we propose a real-time depth map enhancement system based on a residual network that uses dual channels to process depth maps and intensity maps respectively and eliminates the preprocessing step; the proposed algorithm achieves real-time processing at more than 30 fps. Furthermore, an FPGA design and implementation for depth sensing is also introduced. In this FPGA design, the intensity image and depth image are captured by a dual-camera synchronous acquisition system as the input to the neural network. Experiments on various depth map restoration tasks show that our algorithm outperforms the existing LRMC, DE-CNN, and DDTF algorithms on standard datasets and achieves better depth map super-resolution. System tests of the FPGA confirm that the data throughput of the acquisition system's USB 3.0 interface is stable at 226 Mbps and supports both cameras working at full speed, that is, 54 fps @ (1280 × 960 + 328 × 248 × 3).
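The following is a minimal PyTorch sketch of a dual-channel residual enhancement network in the spirit described above: one branch ingests the noisy depth map, the other the intensity map, and the trunk predicts a residual correction. Layer counts, channel widths, and class names are assumptions, not the authors' architecture.

```python
# Illustrative sketch: dual-channel residual depth enhancement network.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)

class DualChannelEnhancer(nn.Module):
    def __init__(self, ch=32, blocks=4):
        super().__init__()
        self.depth_head = nn.Conv2d(1, ch, 3, padding=1)      # noisy depth branch
        self.intensity_head = nn.Conv2d(1, ch, 3, padding=1)  # intensity guidance branch
        self.trunk = nn.Sequential(*[ResBlock(ch) for _ in range(blocks)])
        self.tail = nn.Conv2d(ch, 1, 3, padding=1)

    def forward(self, depth, intensity):
        x = self.depth_head(depth) + self.intensity_head(intensity)
        # Residual connection: the network predicts a correction to the input depth.
        return depth + self.tail(self.trunk(x))

# Usage: enhanced = DualChannelEnhancer()(noisy_depth, intensity)  # (N,1,H,W) tensors
```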


2021 ◽  
Vol 11 (6) ◽  
pp. 2666
Author(s):  
Hafiz Muhammad Usama Hassan Alvi ◽  
Muhammad Shahid Farid ◽  
Muhammad Hassan Khan ◽  
Marcin Grzegorzek

Emerging 3D-related technologies such as augmented reality, virtual reality, mixed reality, and stereoscopy have seen remarkable growth due to their numerous applications in the entertainment, gaming, and electromedical industries. In particular, 3D television (3DTV) and free-viewpoint television (FTV) enhance viewers' television experience by providing immersion. They need an infinite number of views to provide full parallax to the viewer, which is not practical due to various financial and technological constraints. Therefore, novel 3D views are generated from a set of available views and their depth maps using depth-image-based rendering (DIBR) techniques. The quality of a DIBR-synthesized image may be compromised for several reasons, e.g., inaccurate depth estimation. Since depth is important in this application, inaccuracies in depth maps lead to textural and structural distortions that degrade the quality of the generated image and result in a poor quality of experience (QoE). Therefore, quality assessment of DIBR-generated images is essential to guarantee a satisfactory QoE. This paper aims at estimating the quality of DIBR-synthesized images and proposes a novel 3D objective image quality metric. The proposed algorithm measures both textural and structural distortions in the DIBR image by exploiting contrast sensitivity and the Hausdorff distance, respectively. The two measures are combined to estimate an overall quality score. The experimental evaluations performed on the benchmark MCL-3D dataset show that the proposed metric is reliable and accurate, and performs better than existing 2D and 3D quality assessment metrics.
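To make the structural-distortion idea concrete, the sketch below compares edge maps of a reference view and a DIBR-synthesized view with a symmetric Hausdorff distance. The edge detector, threshold, and omission of the contrast-sensitivity term are assumptions for illustration; this is not the metric proposed in the paper.

```python
# Illustrative sketch: Hausdorff distance between edge point sets as a
# structural-distortion cue for a DIBR-synthesized image.
import numpy as np
from scipy import ndimage
from scipy.spatial.distance import directed_hausdorff

def edge_points(img, thresh=30.0):
    """Return (row, col) coordinates of strong gradient pixels."""
    gx = ndimage.sobel(img.astype(float), axis=1)
    gy = ndimage.sobel(img.astype(float), axis=0)
    mag = np.hypot(gx, gy)
    return np.argwhere(mag > thresh)

def structural_distortion(reference, synthesized):
    u, v = edge_points(reference), edge_points(synthesized)
    if len(u) == 0 or len(v) == 0:
        return 0.0
    # Symmetric Hausdorff distance between the two edge point sets:
    # larger values indicate stronger structural deformation.
    return max(directed_hausdorff(u, v)[0], directed_hausdorff(v, u)[0])
```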


Sensors ◽  
2019 ◽  
Vol 19 (3) ◽  
pp. 500 ◽  
Author(s):  
Luca Palmieri ◽  
Gabriele Scrofani ◽  
Nicolò Incardona ◽  
Genaro Saavedra ◽  
Manuel Martínez-Corral ◽  
...  

Light field technologies have seen a rise in recent years, and microscopy is a field where such technology has had a deep impact. The possibility to provide spatial and angular information at the same time and in a single shot brings several advantages and allows for new applications. A common goal in these applications is the calculation of a depth map to reconstruct the three-dimensional geometry of the scene. Many approaches are applicable, but most of them cannot achieve high accuracy because of the nature of such images: biological samples are usually poor in features and do not exhibit sharp colors like natural scenes. Under such conditions, standard approaches result in noisy depth maps. In this work, a robust approach is proposed that produces accurate depth maps by exploiting the information recorded in the light field, in particular images produced with a Fourier integral microscope. The proposed approach can be divided into three main parts. First, it creates two cost volumes using different focal cues, namely correspondences and defocus. Second, it applies filtering methods that exploit multi-scale and superpixel cost aggregation to reduce noise and enhance accuracy. Finally, it merges the two cost volumes and extracts a depth map through multi-label optimization.
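A bare-bones sketch of the final step, merging a correspondence cost volume with a defocus cost volume and picking a label per pixel, is shown below. A simple winner-take-all stands in for the paper's multi-label optimization, and the normalization and weighting factor are assumptions.

```python
# Illustrative sketch: merge two focal-cue cost volumes and extract depth labels.
import numpy as np

def merge_and_extract(cost_corr, cost_defocus, alpha=0.5):
    """cost_* have shape (num_labels, H, W); lower cost means a better label."""
    # Normalize each volume so the two cues are comparable before merging.
    norm = lambda c: (c - c.min()) / (c.max() - c.min() + 1e-8)
    merged = alpha * norm(cost_corr) + (1 - alpha) * norm(cost_defocus)
    return merged.argmin(axis=0)   # per-pixel depth label (winner-take-all)
```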


2019 ◽  
Vol 11 (10) ◽  
pp. 204 ◽  
Author(s):  
Dogan ◽  
Haddad ◽  
Ekmekcioglu ◽  
Kondoz

When it comes to evaluating the perceptual quality of digital media for overall quality of experience assessment in immersive video applications, two main approaches typically stand out: subjective and objective quality evaluation. On one hand, subjective quality evaluation offers the best representation of perceived video quality as assessed by real viewers. On the other hand, it consumes a significant amount of time and effort due to the involvement of real users with lengthy and laborious assessment procedures. Thus, it is essential that an objective quality evaluation model is developed. The speed-up offered by an objective quality evaluation model, which can predict the quality of rendered virtual views based on the depth maps used in the rendering process, allows faster quality assessments for immersive video applications. This is particularly important given the lack of a suitable reference or ground truth for comparing the available depth maps, especially when live content services are offered in those applications. This paper presents a no-reference depth map quality evaluation model based on a proposed depth map edge confidence measurement technique to assist with accurately estimating the quality of rendered (virtual) views in immersive multi-view video content. The model is applied to depth image-based rendering in multi-view video format, providing evaluation results comparable to those existing in the literature, and often exceeding their performance.
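As an illustrative (not the paper's) notion of depth-edge confidence, the sketch below scores a depth map by the fraction of its edges that coincide with an edge in the associated texture view; thresholds, dilation, and function names are assumptions.

```python
# Illustrative sketch: no-reference depth-edge confidence score.
import numpy as np
from scipy import ndimage

def edge_map(img, thresh):
    gx = ndimage.sobel(img.astype(float), axis=1)
    gy = ndimage.sobel(img.astype(float), axis=0)
    return np.hypot(gx, gy) > thresh

def depth_edge_confidence(depth, texture, depth_thresh=5.0, tex_thresh=30.0):
    d_edges = edge_map(depth, depth_thresh)
    # Tolerate small misalignments by dilating the texture edges.
    t_edges = ndimage.binary_dilation(edge_map(texture, tex_thresh), iterations=2)
    if d_edges.sum() == 0:
        return 1.0
    # Confident depth edges are those supported by a nearby texture edge.
    return (d_edges & t_edges).sum() / d_edges.sum()
```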


2020 ◽  
Author(s):  
Chih-Shuan Huang ◽  
Ya-Han Huang ◽  
Din-Yuen Chan ◽  
Jar-Ferr Yang

Abstract Stereo matching is one of the most important topics in computer vision and aims at generating precise depth maps for various applications. The main challenge of stereo matching is to suppress the inevitable errors occurring in smooth, occluded, and discontinuous regions. To solve these problems, this paper proposes a robust stereo matching system that uses segment-based superpixels and megapixels to design adaptive stereo matching computation and dual-path refinement. After determining edge and smooth regions and selecting the matching cost, we suggest segment-based adaptive support weights for cost aggregation instead of color similarity and spatial proximity only. The proposed dual-path depth refinement utilizes the cross-based support region and texture features to correct inaccurate disparities through iterative procedures, improving the depth maps while preserving shape. Especially for the left-most and right-most regions, the segment-based refinement can greatly reduce mismatched disparity holes. The experimental results demonstrate that the proposed system obtains more accurate depth maps than conventional methods.
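A simplified sketch of segment-based adaptive support weights for cost aggregation follows: neighbours lying in the same superpixel as the centre pixel receive full weight, others a reduced one. The window radius, the down-weighting factor, and the function names are assumptions rather than the authors' design.

```python
# Illustrative sketch: segment-based adaptive support weights for cost aggregation.
import numpy as np

def aggregate_cost(cost_volume, segments, radius=5, other_weight=0.2):
    """cost_volume: (D, H, W) matching costs; segments: (H, W) superpixel labels."""
    D, H, W = cost_volume.shape
    out = np.empty_like(cost_volume, dtype=float)
    for y in range(H):
        for x in range(W):
            y0, y1 = max(0, y - radius), min(H, y + radius + 1)
            x0, x1 = max(0, x - radius), min(W, x + radius + 1)
            # Full weight inside the centre pixel's segment, reduced weight outside.
            w = np.where(segments[y0:y1, x0:x1] == segments[y, x], 1.0, other_weight)
            out[:, y, x] = (cost_volume[:, y0:y1, x0:x1] * w).sum(axis=(1, 2)) / w.sum()
    return out
```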


2020 ◽  
Author(s):  
Chih-Shuan Huang ◽  
Ya-Han Huang ◽  
Din-Yuen Chan ◽  
Jar-Ferr Yang

Abstract Stereo matching is one of the most important topics in computer vision and aims at generating precise depth maps for various smart applications. The major challenge of stereo matching is to suppress the inevitable errors occurring in smooth, occluded, and discontinuous regions. In this paper, we propose a robust stereo matching system based on segment-based superpixels to design adaptive matching computation and dual-path refinement. After the selection of matching costs, we suggest segment-based adaptive support weights for cost aggregation, instead of color similarity and spatial proximity alone, to achieve precise depth estimation. Then, the proposed dual-path depth refinement, which refers to the texture features in a cross-based support region, corrects inaccurate disparities to successively refine the depth maps while preserving shape. Especially for the left-most and right-most regions, the segment-based refinement can greatly reduce mismatched disparity holes. The experimental results show that the proposed system achieves more accurate depth maps than conventional stereo matching methods.


Sensors ◽  
2019 ◽  
Vol 19 (20) ◽  
pp. 4434 ◽  
Author(s):  
Sangwon Kim ◽  
Jaeyeal Nam ◽  
Byoungchul Ko

Depth estimation is a crucial and fundamental problem in the computer vision field. Conventional methods reconstruct scenes using feature points extracted from multiple images; however, these approaches require multiple images and thus are not easily implemented in various real-time applications. Moreover, the special equipment required by hardware-based approaches using 3D sensors is expensive. Therefore, software-based methods for estimating depth from a single image using machine learning or deep learning are emerging as new alternatives. In this paper, we propose an algorithm that generates a depth map in real time using a single image and an optimized lightweight efficient neural network (L-ENet) instead of physical equipment such as an infrared sensor or multi-view camera. Because depth values have a continuous nature and can produce locally ambiguous results, pixel-wise prediction with ordinal depth range classification was applied in this study. In addition, in our method, various convolution techniques are applied to extract a dense feature map, and the number of parameters is greatly reduced by reducing the network layers. Using the proposed L-ENet algorithm, an accurate depth map can be generated from a single image quickly, and the predicted depth values are close to the ground truth with small errors. Experiments confirmed that the proposed L-ENet achieves a significantly improved estimation performance over state-of-the-art algorithms for depth estimation from a single image.
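The sketch below illustrates the ordinal depth-range idea mentioned above: continuous depth is discretized into ordered bins (here spaced in log space), and a per-pixel bin prediction is decoded back to a metric value. The number of bins and the depth range are assumptions, not the paper's settings.

```python
# Illustrative sketch: ordinal depth range discretization and decoding.
import numpy as np

def build_bins(d_min=1.0, d_max=80.0, num_bins=64):
    """Ordered depth bins spaced uniformly in log space."""
    return np.exp(np.linspace(np.log(d_min), np.log(d_max), num_bins))

def depth_to_ordinal(depth, bins):
    """Map each continuous depth value to the index of its ordinal bin."""
    return np.clip(np.searchsorted(bins, depth), 0, len(bins) - 1)

def ordinal_to_depth(labels, bins):
    """Decode predicted bin indices back to (approximate) metric depth."""
    return bins[labels]

bins = build_bins()
labels = depth_to_ordinal(np.array([[1.5, 10.0], [40.0, 79.0]]), bins)
recovered = ordinal_to_depth(labels, bins)   # approximately the original depths
```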


2021 ◽  
Author(s):  
Richard Rzeszutek

This dissertation proposes a novel framework for recovering relative depth maps from a video. The framework is composed of two parts: a depth estimator and a sparse label interpolator. These parts are completely separate from one another and can operate independently. Prior methods have tended to couple the interpolation stage tightly with the depth estimation, which can assist with automation at the expense of flexibility. The loss of this flexibility can in fact be worse than any advantage gained by coupling the two stages together. This dissertation shows how, by treating the two stages separately, it is very easy to change the quality of the results with little effort. It also leaves room for other adjustments. The depth estimator is based upon well-established computer vision principles and only has the restriction that the camera must be moving in order to obtain depth estimates. By starting from first principles, this dissertation develops a new approach for quickly estimating relative depth. That is, it is able to answer the question "is this feature closer than another?" with relatively little computational overhead. The estimator is designed using a pipeline-style approach so that it produces sparse depth estimates in an online fashion; i.e., a depth estimate is automatically available for each new frame presented to the estimator. Finally, the interpolator applies an existing method based upon edge-aware filtering to generate the final depth maps. When temporal filters are used, the interpolation stage can easily handle frames without any depth information, such as when the camera was stationary. However, unlike the prior work, this dissertation establishes the theoretical background for this type of interpolation and addresses some of the associated numerical problems. Strategies for dealing with these issues have also been provided.
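To illustrate the interpolation stage in its simplest form, the sketch below densifies sparse depth labels by filtering both the labelled values and an indicator mask and taking their ratio. A Gaussian filter stands in for the edge-aware filter purely for brevity; the dissertation's actual filter choice and normalization are not reproduced here.

```python
# Illustrative sketch: densifying sparse depth labels by normalized filtering.
import numpy as np
from scipy.ndimage import gaussian_filter

def interpolate_sparse_depth(sparse_depth, valid_mask, sigma=5.0):
    """sparse_depth: (H, W) with zeros where no estimate exists;
    valid_mask: (H, W) boolean mask of available estimates."""
    num = gaussian_filter(sparse_depth * valid_mask, sigma)
    den = gaussian_filter(valid_mask.astype(float), sigma)
    return num / np.maximum(den, 1e-8)   # dense relative depth map
```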

