MDEAN: Multi-View Disparity Estimation with an Asymmetric Network

Electronics ◽  
2020 ◽  
Vol 9 (6) ◽  
pp. 924 ◽  
Author(s):  
Zhao Pei ◽  
Deqiang Wen ◽  
Yanning Zhang ◽  
Miao Ma ◽  
Min Guo ◽  
...  

In recent years, deep-learning-based disparity estimation has been studied extensively and has made significant progress. In contrast, traditional image disparity estimation methods require considerable resources and much time for processes such as stereo matching and 3D reconstruction. At present, most deep-learning-based disparity estimation methods estimate disparity from monocular images. Traditional methods show that multi-view approaches are more accurate than monocular ones, especially for textureless scenes and scenes with thin structures. Motivated by this, we present MDEAN, a new deep convolutional neural network that estimates disparity from multi-view images using an asymmetric encoder–decoder structure. First, our method takes an arbitrary number of multi-view images as input. Next, it uses these images to produce a set of plane-sweep cost volumes, which an end-to-end asymmetric network combines to compute a high-quality disparity map. The results show that our method performs better than state-of-the-art methods, in particular for outdoor scenes with sky, flat surfaces, and buildings.
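The plane-sweep step can be illustrated with a much-simplified sketch. It assumes rectified views whose cameras are displaced only horizontally, so that sweeping fronto-parallel planes reduces to shifting each neighbouring view by a disparity scaled by its baseline. The function names, the absolute-difference cost, and the winner-takes-all step are illustrative assumptions; in MDEAN the cost volumes are combined by the learned network, not by the simple sum shown here.

```python
def sweep_cost_volume(reference, neighbor, baseline_ratio, max_disparity):
    """Build cost[d][x] = |reference[x] - neighbor[x - round(d * baseline_ratio)]|.

    `reference` and `neighbor` are single scanlines (lists of intensities).
    `baseline_ratio` is the neighbour's baseline relative to a unit baseline
    (an assumption of this sketch). Out-of-image samples get infinite cost.
    """
    width = len(reference)
    volume = []
    for d in range(max_disparity):
        shift = round(d * baseline_ratio)
        row = []
        for x in range(width):
            src = x - shift
            if 0 <= src < width:
                row.append(abs(reference[x] - neighbor[src]))
            else:
                row.append(float("inf"))
        volume.append(row)
    return volume


def winner_takes_all(volumes):
    """Sum the per-view volumes and pick the lowest-cost disparity per pixel.

    This stands in for the learned fusion network, purely for illustration.
    """
    depth = len(volumes[0])
    width = len(volumes[0][0])
    result = []
    for x in range(width):
        totals = [sum(v[d][x] for v in volumes) for d in range(depth)]
        result.append(min(range(depth), key=totals.__getitem__))
    return result
```

Because one volume is built per neighbouring view, the sketch accepts any number of views, which mirrors the "arbitrary number of multi-view images" property described in the abstract.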

Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6188
Author(s):  
Ségolène Rogge ◽  
Ionut Schiopu ◽  
Adrian Munteanu

The paper presents a novel depth-estimation method for light-field (LF) images based on innovative multi-stereo matching and machine-learning techniques. In the first stage, a novel block-based stereo matching algorithm is employed to compute the initial estimation. The proposed algorithm is specifically designed to operate on any pair of sub-aperture images (SAIs) in the LF image and to compute the pair’s corresponding disparity map. For the central SAI, a disparity fusion technique is proposed to compute the initial disparity map based on all available pairwise disparities. In the second stage, a novel pixel-wise deep-learning (DL)-based method for residual error prediction is employed to further refine the disparity estimation. A novel neural network architecture is proposed based on a new structure of layers. The proposed DL-based method is employed to predict the residual error of the initial estimation and to refine the final disparity map. The experimental results demonstrate the superiority of the proposed framework and reveal that the proposed method achieves an average improvement of 15.65% in root mean squared error (RMSE), 43.62% in mean absolute error (MAE), and 5.03% in structural similarity index (SSIM) over machine-learning-based state-of-the-art methods.
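For reference, two of the error metrics quoted above, RMSE and MAE, can be computed as in the following minimal sketch. The function names are illustrative, and `relative_improvement` is only an assumption about how a percentage gain over a baseline might be expressed; the abstract does not state how the authors averaged their figures.

```python
import math


def rmse(estimated, reference):
    """Root mean squared error between two equal-length disparity lists."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))


def mae(estimated, reference):
    """Mean absolute error between two equal-length disparity lists."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - r) for e, r in zip(estimated, reference)) / len(estimated)


def relative_improvement(baseline_error, new_error):
    """Percentage reduction of an error metric relative to a baseline (assumed definition)."""
    return 100.0 * (baseline_error - new_error) / baseline_error
```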


2021 ◽  
Vol 297 ◽  
pp. 01055
Author(s):  
Mohamed El Ansari ◽  
Ilyas El Jaafari ◽  
Lahcen Koutti

This paper proposes a new edge-based stereo matching approach for road applications. The approach matches the edge points extracted from the input stereo images using temporal constraints. At the current frame, we propose to estimate a disparity range for each image line based on the disparity map of the preceding frame. The stereo images are divided into multiple parts according to the estimated disparity ranges. The optimal solution of each part is approximated independently via the state-of-the-art energy minimization approach graph cuts. The disparity search space of each image part is very small compared to the global one, which improves the results and reduces the execution time. Furthermore, as a similarity criterion between corresponding edge points, we propose a new cost function based on the intensity, the gradient magnitude, and the gradient orientation. The proposed method has been tested on virtual stereo images and compared with a recently proposed method, with satisfactory results.


2021 ◽  
Vol 5 (4) ◽  
pp. 334-341
Author(s):  
D Venkata Ratnam ◽  
K Nageswara Rao ◽  

Advanced neural network methods solve significant signal estimation and channel characterization difficulties in next-generation 5G wireless communication systems. The multiple copies of the transmitted signal that reach the receiver through different paths lead to delay spread, which in turn causes interference in communication. These adverse effects of the interference can be mitigated with the orthogonal frequency division multiplexing (OFDM) technique. Furthermore, proper signal detection methods and optimal channel estimation enhance the performance of the multicarrier wireless communication system. In this paper, a bi-directional long short-term memory (Bi-LSTM) based deep learning method is implemented to estimate the channel in different multipath scenarios. The impact of the pilots and the cyclic prefix on the performance of the Bi-LSTM algorithm is analyzed. It is evident from the symbol-error rate (SER) results that the Bi-LSTM algorithm performs better than the state-of-the-art channel estimation method known as Minimum Mean Square Error (MMSE) estimation.


Author(s):  
Xinrui Yuan ◽  
Hairong Wang ◽  
Jun Wang

Given the significant impact of deep learning on graphics and image processing, research on human pose estimation using deep learning has attracted much attention, and many models have been proposed in succession. Building on a close study of research results in China and abroad, this paper concentrates on 3D single-person pose estimation methods, compares and analyzes three families of methods (end-to-end, staged, and hybrid network models), and summarizes their characteristics. To evaluate method performance, we set up an experimental environment and use the Human3.6M dataset to test several mainstream methods. The test results indicate that the hybrid network model performs better in human pose estimation.


2020 ◽  
Vol 34 (07) ◽  
pp. 12926-12934
Author(s):  
Youmin Zhang ◽  
Yimin Chen ◽  
Xiao Bai ◽  
Suihanjin Yu ◽  
Kun Yu ◽  
...  

State-of-the-art deep-learning-based stereo matching approaches treat disparity estimation as a regression problem, where the loss function is defined directly on the true disparities and their estimates. However, disparity is just a byproduct of a matching process modeled by the cost volume, and learning the cost volume indirectly, driven by disparity regression, is prone to overfitting because the cost volume is under-constrained. In this paper, we propose to add constraints directly to the cost volume by filtering it with a unimodal distribution peaked at the true disparity. In addition, the variances of the unimodal distributions are estimated for each pixel to explicitly model matching uncertainty under different contexts. The proposed architecture achieves state-of-the-art performance on Scene Flow and two KITTI stereo benchmarks. In particular, our method ranked 1st on the KITTI 2012 evaluation and 4th on the KITTI 2015 evaluation (recorded on 2019.8.20). The code of AcfNet is available at: https://github.com/youmi-zym/AcfNet.
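A unimodal target distribution peaked at the true disparity can be sketched as follows. The abstract does not give the exact formula, so this example assumes a softmax over the negative absolute distance to the true disparity, scaled by a per-pixel spread `sigma`; the function names and the cross-entropy comparison are likewise illustrative, not taken from the AcfNet code.

```python
import math


def unimodal_target(true_disparity, max_disparity, sigma=1.0):
    """Return softmax(-|d - true_disparity| / sigma) over d = 0 .. max_disparity - 1.

    A smaller `sigma` gives a sharper peak (a more confident match);
    a larger `sigma` spreads the probability mass (a more uncertain match).
    """
    logits = [-abs(d - true_disparity) / sigma for d in range(max_disparity)]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between the target and a predicted per-pixel distribution."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))
```

Supervising the per-pixel distribution derived from the cost volume against such a target constrains the whole cost profile, not only its expected value, which is the idea the abstract describes.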


Author(s):  
M. Cournet ◽  
A. Giros ◽  
L. Dumas ◽  
J. M. Delvit ◽  
D. Greslou ◽  
...  

In the frame of its Earth observation missions, CNES created a library called QPEC and one of its launchers, called Medicis. QPEC / Medicis is a sub-pixel two-dimensional stereo matching algorithm that works on an image pair. This tool is a block matching algorithm, which means that it is based on a local method. Moreover, it does not regularize the results found. It offers several matching costs, such as the Zero-mean Normalised Cross-Correlation or statistical measures (Mutual Information being one of them), and different match validation flags. QPEC / Medicis is able to compute a two-dimensional dense disparity map with sub-pixel precision. Hence, it is more versatile than the disparity estimation methods found in the computer vision literature, which often assume an epipolar geometry.

CNES uses Medicis, among other applications, during the in-orbit image quality commissioning of Earth observation satellites. For instance, the Pléiades-HR 1A & 1B and the Sentinel-2 geometric calibrations are based on this block matching algorithm. Over the years, it has become a common tool in ground segments for in-flight monitoring purposes. For these two kinds of applications, the two-dimensional search and the local sub-pixel measure without regularization can be essential. This tool is also used to generate automatic digital elevation models, a purpose for which it was not initially designed.

This paper deals with the QPEC / Medicis algorithm. It also presents some of its CNES applications (in-orbit commissioning, in-flight monitoring, and digital elevation model generation). The Medicis software is distributed outside CNES as well. Finally, this paper describes some of these external applications of Medicis, such as ground displacement measurement and intra-oral scanning in the dental domain.
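The Zero-mean Normalised Cross-Correlation cost and the unregularized two-dimensional block search mentioned above can be sketched as follows. This is a generic illustration, not the QPEC / Medicis implementation: the function names and parameters are hypothetical, the search covers integer offsets only, and the sub-pixel refinement and validation flags are omitted.

```python
import math


def zncc(patch_a, patch_b):
    """ZNCC of two equal-size patches given as flat lists of intensities."""
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    da = [a - mean_a for a in patch_a]
    db = [b - mean_b for b in patch_b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # flat patch: correlation undefined, treated as no match
    return sum(x * y for x, y in zip(da, db)) / denom


def extract_patch(image, row, col, half):
    """Flatten the (2 * half + 1)-square window centred on (row, col)."""
    return [image[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]


def best_offset(reference, target, row, col, half=1, search=2):
    """Search integer (dy, dx) offsets in both directions; return (offset, score)."""
    ref_patch = extract_patch(reference, row, col, half)
    best_score, best = -2.0, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = row + dy, col + dx
            inside = (half <= r < len(target) - half
                      and half <= c < len(target[0]) - half)
            if inside:
                score = zncc(ref_patch, extract_patch(target, r, c, half))
                if score > best_score:
                    best_score, best = score, (dy, dx)
    return best, best_score
```

Because each pixel is matched independently and the search runs along both axes, the sketch assumes neither an epipolar geometry nor any smoothness between neighbouring results, which corresponds to the local, unregularized behaviour described above.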


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0251657
Author(s):  
Zedong Huang ◽  
Jinan Gu ◽  
Jing Li ◽  
Xuefei Yu

Deep learning based on convolutional neural networks (CNNs) has been successfully applied to stereo matching, greatly improving speed and accuracy over traditional methods. However, existing CNN-based stereo matching frameworks often encounter two problems. First, the network has many parameters, which makes the matching run time too long. Second, the disparity estimation is inadequate in regions where reflections, repeated textures, and fine structures may lead to ill-posed problems. We address both problems through a lightweight improvement of the PSMNet (Pyramid Stereo Matching Network) model that improves matching in ill-conditioned areas such as repeated-texture and weak-texture regions. In the feature extraction part, ResNeXt is introduced to learn unary feature extraction, and the ASPP (Atrous Spatial Pyramid Pooling) module is trained to extract multiscale spatial feature information. The feature fusion module is designed to fuse the feature information of different scales effectively and to construct the matching cost volume. The improved 3D CNN uses a stacked encoding and decoding structure to further regularize the matching cost volume and to obtain the correspondence between feature points under different disparities. Finally, the disparity map is obtained by regression. We evaluate our method on the Scene Flow, KITTI 2012, and KITTI 2015 stereo datasets. The experiments show that the proposed stereo matching network achieves comparable prediction accuracy and a much faster running speed compared with PSMNet.
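The final regression step can be illustrated with the soft-argmin operation commonly used in cost-volume networks of this family. The abstract does not spell out the exact formulation, so this is a sketch under that assumption: the predicted disparity of a pixel is the expectation of the disparity index under a softmax of the negated costs. The function name is illustrative.

```python
import math


def soft_argmin(costs):
    """Expected disparity under softmax(-cost) for one pixel.

    `costs[d]` is the regularized matching cost at integer disparity d;
    lower cost means a better match and therefore a larger weight.
    """
    negated = [-c for c in costs]
    peak = max(negated)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in negated]
    total = sum(weights)
    return sum(d * w for d, w in enumerate(weights)) / total
```

Unlike a hard argmin, this weighted average is differentiable and can return sub-pixel values, for example 1.5 when disparities 1 and 2 have equally low cost.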


Sensors ◽  
2021 ◽  
Vol 21 (18) ◽  
pp. 6016
Author(s):  
Ming Wei ◽  
Ming Zhu ◽  
Yi Wu ◽  
Jiaqi Sun ◽  
Jiarong Wang ◽  
...  

Stereo matching networks based on deep learning are widely developed and can obtain excellent disparity estimation. We present a new end-to-end fast deep learning stereo matching network in this work that aims to determine the corresponding disparity from two stereo image pairs. We extract the characteristics of the low-resolution feature images using the stacked hourglass structure feature extractor and build a multi-level detailed cost volume. We also use the edge of the left image to guide disparity optimization and sub-sample with the low-resolution data, ensuring excellent accuracy and speed at the same time. Furthermore, we design a multi-cross attention model for binocular stereo matching to improve the matching accuracy and achieve end-to-end disparity regression effectively. We evaluate our network on Scene Flow, KITTI2012, and KITTI2015 datasets, and the experimental results show that the speed and accuracy of our method are excellent.


Author(s):  
J. Liu ◽  
S. Ji ◽  
C. Zhang ◽  
Z. Qin

Dense stereo matching has been extensively studied in photogrammetry and computer vision. In this paper, we evaluate deep-learning-based stereo methods, which emerged in 2016 and spread rapidly, on aerial stereo images rather than the ground-level images commonly used in the computer vision community. Two popular methods are evaluated. One learns the matching cost with a convolutional neural network (known as MC-CNN); the other produces a disparity map in an end-to-end manner by utilizing both geometry and context (known as GC-net). First, we evaluate the performance of the deep-learning-based methods on aerial stereo images by direct model reuse. Models pre-trained separately on the KITTI 2012, KITTI 2015, and Driving datasets are applied directly to three aerial datasets. We also give the results of direct training on the target aerial datasets. Second, the deep-learning-based methods are compared with the classic stereo matching method Semi-Global Matching (SGM) and with the photogrammetric software SURE on the same aerial datasets. Third, a transfer learning strategy is introduced to aerial image matching, assuming that a few target samples are available for model fine-tuning. The experiments show that the conventional methods and the deep-learning-based methods perform similarly, and that the latter have greater potential to be explored.

