Deep Learning Segmentation and 3D Reconstruction of Road Markings Using Multiview Aerial Imagery

2019 ◽  
Vol 8 (1) ◽  
pp. 47 ◽  
Author(s):  
Franz Kurz ◽  
Seyed Azimi ◽  
Chun-Yu Sheu ◽  
Pablo d’Angelo

The 3D information of road infrastructures is growing in importance with the development of autonomous driving. In this context, the exact 2D position of road markings as well as height information play an important role in, e.g., lane-accurate self-localization of autonomous vehicles. In this paper, the overall task is divided into an automatic segmentation followed by a refined 3D reconstruction. For the segmentation task, we applied a wavelet-enhanced fully convolutional network to multiview high-resolution aerial imagery. Based on the resulting 2D segments in the original images, we propose a successive workflow for the 3D reconstruction of road markings based on least-squares line fitting in multiview imagery. The 3D reconstruction exploits the line character of road markings, optimizing the 3D line location by minimizing the distance from its back projection to the detected 2D line in all the covering images. Results show an improved IoU of the automatic road marking segmentation when the multiview character of the aerial images is exploited, and a more accurate 3D reconstruction of the road surface compared to the semiglobal matching (SGM) algorithm. Further, the approach avoids the matching problem in non-textured image parts and is not limited to lines of finite length. The approach is presented and validated on several aerial image datasets covering different scenarios such as motorways and urban regions.
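The refinement described above is, at heart, a small nonlinear least-squares problem. The sketch below illustrates it under simplifying assumptions: pinhole cameras given as 3x4 projection matrices, each detected 2D line in normalized homogeneous form (a, b, c) with a² + b² = 1, and the 3D line represented by its two endpoints. All names are illustrative, not taken from the authors' implementation.

```python
import numpy as np
from scipy.optimize import least_squares

def project(P, X):
    """Project a 3D point X (shape (3,)) with a 3x4 pinhole matrix P; returns pixel (u, v)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def residuals(params, cameras, lines2d):
    """Signed distances from the back-projected 3D endpoints to each detected 2D line.
    params = flattened endpoints (X0, X1) of the 3D segment, shape (6,).
    lines2d[i] = (a, b, c) with a^2 + b^2 = 1, so a*u + b*v + c is the signed
    point-to-line distance in image i."""
    X0, X1 = params[:3], params[3:]
    res = []
    for P, (a, b, c) in zip(cameras, lines2d):
        for X in (X0, X1):
            u, v = project(P, X)
            res.append(a * u + b * v + c)
    return np.asarray(res)

def refine_line(X0_init, X1_init, cameras, lines2d):
    """Refine an approximate 3D line segment against the 2D detections in all views."""
    sol = least_squares(residuals, np.hstack([X0_init, X1_init]),
                        args=(cameras, lines2d))
    return sol.x[:3], sol.x[3:]
```

Each residual is the distance from a back-projected endpoint to the detected line in one image, so the optimum is the 3D segment whose projections agree best with all covering views.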

Author(s):  
C.-Y. Sheu ◽  
F. Kurz ◽  
P. d'Angelo

Abstract. The 3D information of road infrastructures is gaining importance with the development of autonomous driving. The exact absolute position and height of lane markings, for example, support lane-accurate localization. Several approaches have been proposed for the 3D reconstruction of line features from multi-view airborne optical imagery. However, standard appearance-based matching approaches for 3D reconstruction are hardly applicable to lane markings due to the similar color profile of all lane markings and the lack of texture in their neighboring areas. We present a workflow for 3D lane marking reconstruction from multi-view aerial imagery without an explicit feature-matching process. The aim is to optimize the 3D line location by minimizing the distance from its back projection to the detected 2D line in all the covering images. First, the lane markings are automatically extracted from aerial images using standard line detection algorithms. By projecting these extracted lines onto the Digital Surface Model (DSM) generated by Semi-Global Matching (SGM), approximate 3D line segments are obtained. Starting from these approximations, the 3D lines are iteratively refined based on the detected 2D lines in the original images and the viewing geometry. The proposed approach relies on precise detection of 2D lines in image space, on prior knowledge of the approximate 3D line segments, and heavily on accurate image orientations. Nevertheless, it avoids the problem of non-textured neighborhoods and is not limited to lines of finite length. The theoretical precision of 3D reconstruction with the proposed framework is evaluated.
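A minimal sketch of how the approximate 3D segments could be obtained by intersecting a viewing ray with the SGM-generated DSM. The ray-marching scheme, grid layout, and all names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def ray_dsm_intersection(origin, direction, dsm, gsd, dsm_origin,
                         step=0.5, max_range=2000.0):
    """March along a viewing ray until it passes below the DSM surface.
    dsm: 2D height grid in meters; gsd: ground sampling distance (m/pixel);
    dsm_origin: world (x, y) of dsm[0, 0]. Returns the 3D hit point or None."""
    direction = direction / np.linalg.norm(direction)
    for t in np.arange(0.0, max_range, step):
        p = origin + t * direction
        col = int((p[0] - dsm_origin[0]) / gsd)
        row = int((p[1] - dsm_origin[1]) / gsd)
        if not (0 <= row < dsm.shape[0] and 0 <= col < dsm.shape[1]):
            continue
        if p[2] <= dsm[row, col]:   # the ray has dropped below the surface
            return p
    return None
```

Applying this to the two endpoints of each detected 2D line yields the approximate 3D segment that seeds the iterative refinement.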


2019 ◽  
Vol 11 (10) ◽  
pp. 1157 ◽  
Author(s):  
Jorge Fuentes-Pacheco ◽  
Juan Torres-Olivares ◽  
Edgar Roman-Rangel ◽  
Salvador Cervantes ◽  
Porfirio Juarez-Lopez ◽  
...  

Crop segmentation is an important task in Precision Agriculture, where the use of aerial robots with an on-board camera has contributed to the development of new solution alternatives. We address the problem of fig plant segmentation in top-view RGB (Red-Green-Blue) images of a crop grown in difficult open-field conditions, with complex lighting and the non-ideal crop maintenance practices of local farmers. We present a Convolutional Neural Network (CNN) with an encoder-decoder architecture that classifies each pixel as crop or non-crop using only raw colour images as input. Our approach achieves a mean accuracy of 93.85% despite the complexity of the background and the highly variable visual appearance of the leaves. We make our CNN code available to the research community, along with the aerial image dataset and a hand-made, pixel-precise ground truth segmentation, to facilitate comparison among different algorithms.
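The paper's network is not reproduced here, but an encoder-decoder that maps raw RGB to per-pixel crop/non-crop logits follows this general pattern. A minimal PyTorch sketch with illustrative layer sizes:

```python
import torch
import torch.nn as nn

class TinyEncoderDecoder(nn.Module):
    """Minimal encoder-decoder for per-pixel crop / non-crop classification."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                           # 1/2 resolution
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                           # 1/4 resolution
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.Conv2d(16, 1, 1),                       # per-pixel logits
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Binary cross-entropy on raw RGB input, matching the crop/non-crop setup.
model = TinyEncoderDecoder()
logits = model(torch.randn(1, 3, 256, 256))
loss = nn.BCEWithLogitsLoss()(logits, torch.zeros(1, 1, 256, 256))
```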


2021 ◽  
Vol 13 (14) ◽  
pp. 2656
Author(s):  
Furong Shi ◽  
Tong Zhang

Deep-learning technologies, especially convolutional neural networks (CNNs), have achieved great success in building extraction from aerial images. However, shape details are often lost during the down-sampling process, which results in discontinuous segmentation or inaccurate segmentation boundaries. To compensate for the loss of shape information, two shape-related auxiliary tasks (boundary prediction and distance estimation) were jointly learned with the building segmentation task in our proposed network. Meanwhile, two consistency-constraint losses were designed on top of the multi-task network to exploit the duality between the mask prediction and the two shape-related predictions. Specifically, an atrous spatial pyramid pooling (ASPP) module was appended to the top of the encoder of a U-shaped network to obtain multi-scale features. Based on these multi-scale features, one regression loss and two classification losses were used for predicting the distance-transform map, the segmentation mask, and the boundary. Two inter-task consistency-loss functions were constructed to ensure consistency between distance maps and masks, and between masks and boundary maps. Experimental results on three public aerial image datasets showed that our method achieved superior performance over recent state-of-the-art models.
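The exact consistency-loss formulation is not given in the abstract; the sketch below shows one plausible form of the mask-boundary consistency term, deriving a soft boundary from the predicted mask with a morphological gradient and comparing it to the predicted boundary map. All details are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def mask_boundary_consistency(mask_prob, boundary_prob, kernel=3):
    """Penalize disagreement between predicted mask and boundary maps.
    A soft boundary is derived from the mask via a morphological gradient
    (dilation minus erosion, both implemented with max-pooling), then compared
    to the network's own boundary prediction.
    Shapes: (N, 1, H, W), values in [0, 1]. This is one plausible form of the
    inter-task constraint, not necessarily the paper's exact loss."""
    pad = kernel // 2
    dilated = F.max_pool2d(mask_prob, kernel, stride=1, padding=pad)
    eroded = -F.max_pool2d(-mask_prob, kernel, stride=1, padding=pad)
    derived_boundary = dilated - eroded
    return F.mse_loss(derived_boundary, boundary_prob)
```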


2017 ◽  
Vol 29 (4) ◽  
pp. 697-705 ◽  
Author(s):  
Satoshi Muramatsu ◽  
Tetsuo Tomizawa ◽  
Shunsuke Kudoh ◽  
Takashi Suehiro ◽  
...  

To realize tasks such as goods conveyance by robot, localization of the robot position is a fundamental technology component. Map matching is one such localization technique. In map matching methods, creating the map data for localization usually requires operating the robot while measuring the environment (a teaching run), which takes considerable time and work. In recent years, thanks to improved Internet services, aerial image data has become easily obtainable from Google Maps and similar sources. We therefore utilize aerial images as map data for mobile robot localization and navigation without a teaching run. In this paper, we propose a robot localization and navigation technique using aerial images and verify it through localization and autonomous driving experiments.
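The abstract does not spell out the matching step. As a rough illustration, if the robot's local sensor data is rendered to an image at the same scale as the aerial map, localization can be posed as normalized cross-correlation template matching; a hedged OpenCV sketch, with all names and the rendering assumption being ours:

```python
import cv2

def localize_on_aerial_map(aerial_map, local_grid):
    """Match a robot-built local occupancy image against an aerial map.
    Both inputs are 8-bit single-channel images at the same scale (m/pixel).
    Returns the (x, y) map pixel of the best match and its correlation score."""
    result = cv2.matchTemplate(aerial_map, local_grid, cv2.TM_CCOEFF_NORMED)
    _, score, _, top_left = cv2.minMaxLoc(result)
    cx = top_left[0] + local_grid.shape[1] // 2
    cy = top_left[1] + local_grid.shape[0] // 2
    return (cx, cy), score
```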


Author(s):  
D. Hein ◽  
R. Berger

Abstract. Many remote sensing applications demand a fast and efficient way of generating orthophoto maps from raw aerial images. One prerequisite is direct georeferencing, which allows geolocating aerial images to their geographic position on the earth's surface. But this is only half the story. When dealing with a large quantity of highly overlapping images, a major challenge is to select the most suitable image parts in order to generate seamless aerial maps of the captured area. This paper proposes a method that quickly determines such an optimal (rectangular) section for each single aerial image, which in turn can be used for generating seamless aerial maps. Its key approach is to clip aerial images depending on their geometric intersections with a terrain elevation model of the captured area, which is why we call it terrain-aware image clipping (TAC). The method has a modest computational footprint and is therefore applicable even on rather limited embedded vision systems. It can be applied both to real-time aerial mapping applications using data links and to rapid map generation right after landing without any postprocessing step. For real-time applications, the method also minimizes the transmission of redundant image data. The proposed method has already been demonstrated in several search-and-rescue scenarios and real-time mapping applications using a broadband data link and different kinds of camera and carrier systems. Moreover, a patent for this technology is pending.
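The full TAC method clips against the terrain elevation model itself; the sketch below shows only the underlying geometry in simplified form, intersecting the four corner viewing rays with a single horizontal plane at the local terrain height. This flat-terrain approximation and all names are illustrative assumptions.

```python
import numpy as np

def ground_footprint(cam_center, R, K, width, height, terrain_height):
    """Approximate ground footprint of an aerial image: intersect the four
    corner viewing rays with a horizontal plane at the local terrain height.
    R: world-from-camera rotation (3x3); K: 3x3 intrinsics.
    Returns the four footprint corners as world (x, y) coordinates."""
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    pts = []
    for u, v in corners:
        ray = R @ np.linalg.inv(K) @ np.array([u, v, 1.0])
        t = (terrain_height - cam_center[2]) / ray[2]   # plane intersection
        pts.append((cam_center + t * ray)[:2])
    return np.array(pts)
```

Clipping each image to the part of this footprint not yet covered by its neighbors is what yields seamless maps without redundant data transmission.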


Author(s):  
M. Madadikhaljan ◽  
R. Bahmanyar ◽  
S. M. Azimi ◽  
P. Reinartz ◽  
U. Sörgel

Abstract. Haze consists of floating particles in the air which can degrade image quality and reduce visibility in airborne data. Haze removal has several applications in image enhancement and can improve the performance of automatic image analysis systems such as object detection and segmentation. Unlike the rich haze removal literature for ground imagery, there is a lack of methods specifically designed for aerial imagery, despite the characteristic differences between the aerial and ground imagery domains. In this paper, we propose a method to dehaze aerial images using Convolutional Neural Networks (CNNs). Currently, there is no available dataset for dehazing methods in aerial imagery. To address this issue, we have created a synthetically-hazed aerial image dataset to train the neural network. We train the All-in-One Dehazing Network (AOD-Net) as the base approach on hazy aerial images and compare the performance of our proposed approach against the classical model. We have tested our model on natural as well as synthetically-hazed aerial images. Both qualitative and quantitative results of the adapted network show an improvement in dehazing results. We show that the adapted AOD-Net on our aerial image test set increases PSNR and SSIM by 2.2% and 9%, respectively.
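The haze synthesis procedure is not detailed in the abstract. A common choice, and the model underlying AOD-Net, is the atmospheric scattering model I = J * t + A * (1 - t) with transmission t = exp(-beta * d); the sketch below applies it per pixel, with beta and the airlight A as free parameters. Treating this as the authors' exact recipe would be an assumption.

```python
import numpy as np

def add_synthetic_haze(image, depth, beta=1.0, airlight=0.9):
    """Synthesize a hazy image via the atmospheric scattering model
    I = J * t + A * (1 - t), with transmission t = exp(-beta * depth).
    image: float RGB in [0, 1]; depth: per-pixel scene depth in consistent
    units (for nadir aerial views, e.g., flying height minus a DSM)."""
    t = np.exp(-beta * depth)[..., None]      # per-pixel transmission map
    return image * t + airlight * (1.0 - t)
```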


Author(s):  
X. Zhuo ◽  
F. Kurz ◽  
P. Reinartz

Manned aircraft have long been used for capturing large-scale aerial images, yet high costs and weather dependence restrict their availability in emergency situations. In recent years, the MAV (Micro Aerial Vehicle) has emerged as a novel modality for aerial image acquisition. Its maneuverability and flexibility enable rapid awareness of the scene of interest. Since these two platforms deliver scene information at different scales and from different views, it makes sense to fuse these two types of complementary imagery to achieve the quick, accurate, and detailed description of the scene that is the main concern of real-time situation awareness. This paper proposes a method to fuse multi-view and multi-scale aerial imagery by establishing a common reference frame. In particular, common features among MAV images and geo-referenced airplane images can be extracted by a scale-invariant feature detector such as SIFT. From the tie points of the geo-referenced images we derive the coordinates of the corresponding ground points, which are then utilized as ground control points in the global bundle adjustment of the MAV images. In this way, the MAV block is aligned to the reference frame. Experimental results show that this method achieves fully automatic geo-referencing of MAV images even if GPS/IMU acquisition has dropouts, and the orientation accuracy is improved compared to GPS/IMU-based geo-referencing. The concept for a subsequent 3D classification method is also described in this paper.
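The common-feature extraction step can be illustrated with standard OpenCV SIFT matching plus Lowe's ratio test; this is a generic sketch, not the authors' pipeline.

```python
import cv2

def match_sift(img_mav, img_ref, ratio=0.75):
    """SIFT correspondences between an MAV image and a geo-referenced airplane
    image, filtered with Lowe's ratio test. Inputs: grayscale uint8 images.
    Returns a list of ((u1, v1), (u2, v2)) matched pixel pairs."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img_mav, None)
    kp2, des2 = sift.detectAndCompute(img_ref, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
            if m.distance < ratio * n.distance]
    return [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in good]
```

The matches on the geo-referenced side can then be converted to ground coordinates and fed into the MAV bundle adjustment as ground control points, as the abstract describes.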


2019 ◽  
Vol 11 (3) ◽  
pp. 315
Author(s):  
Xiuchuan Xie ◽  
Tao Yang ◽  
DongDong Li ◽  
Zhi Li ◽  
Yanning Zhang

With the extensive application of Unmanned Aerial Vehicles (UAVs) in the field of remote sensing, 3D reconstruction using aerial images has been a vibrant area of research. However, fast large-scale 3D reconstruction is a challenging task. For aerial image datasets, large scale means that the number and resolution of images are enormous, which brings significant computational cost to the 3D reconstruction, especially in the process of Structure from Motion (SfM). In this paper, for fast large-scale SfM, we propose a clustering-aligning framework that hierarchically merges partial structures to reconstruct the full scene. Through image clustering, an overlapping relationship between image subsets is established. With this overlapping relationship, we propose a similarity transformation estimation method based on the joint camera poses of common images. Finally, we introduce a closed-loop constraint and propose a similarity transformation-based hybrid optimization method to make the merged complete scene seamless. The advantage of the proposed method is a significant efficiency improvement with only a marginal loss in accuracy. Experimental results on the Qinling dataset, captured over the Qinling mountains and covering 57 square kilometers, demonstrate the efficiency and robustness of the proposed method.
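For the similarity transformation between two partial reconstructions, a standard closed-form estimate from the camera centers of the common images is Umeyama's method. The sketch below is that generic algorithm, not necessarily the paper's exact variant.

```python
import numpy as np

def similarity_from_camera_centers(src, dst):
    """Least-squares similarity transform (scale s, rotation R, translation t)
    mapping src -> dst, estimated from the camera centers of images shared by
    two partial reconstructions (Umeyama's method). src, dst: (N, 3) arrays."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    src_c, dst_c = src - mu_s, dst - mu_d
    U, S, Vt = np.linalg.svd(dst_c.T @ src_c / len(src))
    D = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:          # guard against reflections
        D[2, 2] = -1.0
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / src_c.var(0).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t
```

Applying (s, R, t) to one partial model aligns it with the other; the closed-loop constraint and hybrid optimization mentioned above then remove the residual seams.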


Author(s):  
A. Moussa ◽  
N. El-Sheimy

The last few years have witnessed an increasing volume of aerial image data because of extensive improvements in Unmanned Aerial Vehicles (UAVs). These newly developed UAVs have led to a wide variety of applications. A fast assessment of the achieved coverage and overlap of the images acquired during a UAV flight mission is of great help in saving the time and cost of further steps. Fast automatic stitching of the acquired images can help to visually assess the achieved coverage and overlap during the flight mission. This paper proposes an automatic image stitching approach that creates a single overview stitched image from the images acquired during a UAV flight mission, along with a coverage image that represents the count of overlaps between the acquired images. The main challenge of such a task is the huge number of images typically involved: a short flight mission with an image acquisition frequency of one image per second can capture hundreds to thousands of images. The main focus of the proposed approach is to reduce the processing time of the image stitching procedure by exploiting the initial knowledge about the image positions provided by the navigation sensors. The proposed approach also avoids solving for all the transformation parameters of all the photos together, which would otherwise incur a long computation time. After extracting the points of interest of all the involved images using the Scale-Invariant Feature Transform (SIFT) algorithm, the proposed approach uses the initial image coordinates to build an incremental constrained Delaunay triangulation that represents the neighborhood of each image. This triangulation makes it possible to match only neighboring images and therefore reduces the time-consuming feature matching step. The estimated relative orientation between the matched images is used to find a candidate seed image for the stitching process. The pre-estimated transformation parameters of the images are employed successively, in a growing fashion, to create the stitched image and the coverage image. The proposed approach is implemented and tested on images acquired through a UAV flight mission, and the achieved results are presented and discussed.
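The neighborhood construction can be sketched with SciPy's Delaunay triangulation over the navigation-sensor image positions. Note that SciPy's version is unconstrained and built in batch, whereas the paper uses an incremental constrained variant, so this is an approximation for illustration.

```python
import numpy as np
from scipy.spatial import Delaunay

def neighbor_pairs(image_xy):
    """Candidate match pairs from a Delaunay triangulation of the approximate
    image positions reported by the navigation sensors. image_xy: (N, 2).
    Only the images joined by a triangulation edge are feature-matched,
    avoiding the quadratic all-pairs matching cost."""
    tri = Delaunay(np.asarray(image_xy))
    pairs = set()
    for simplex in tri.simplices:           # each triangle yields 3 edges
        for i in range(3):
            a, b = simplex[i], simplex[(i + 1) % 3]
            pairs.add((min(a, b), max(a, b)))
    return sorted(pairs)
```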


Author(s):  
R. Bahmanyar ◽  
S. M. Azimi ◽  
P. Reinartz

Abstract. Geo-referenced real-time vehicle and person tracking in aerial imagery has a variety of applications, such as traffic and large-scale event monitoring, disaster management, and input into predictive traffic and crowd models. However, object tracking in aerial imagery is still an unsolved, challenging problem due to the tiny size of the objects, their varying scales, and the limited temporal resolution of geo-referenced datasets. In this work, we propose a new approach based on Convolutional Neural Networks (CNNs) to track multiple vehicles and people in aerial image sequences. As the large number of objects in aerial images can exponentially increase the processing demands in multiple-object tracking scenarios, the proposed approach utilizes a stack of micro CNNs, where each micro CNN is responsible for a single-object tracking task. We call our approach Stack of Micro-Single-Object-Tracking CNNs (SMSOT-CNN). More precisely, using a two-stream CNN, we extract a set of features from two consecutive frames for each object, given the location of the object in the previous frame. Then, we assign the extracted features of each object to one MSOT-CNN, which predicts the object's location in the current frame. We train and validate the proposed approach on the vehicle and person sets of the KIT AIS dataset for object tracking in aerial image sequences. Results indicate accurate and time-efficient tracking of multiple vehicles and people by the proposed approach.
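As a rough illustration of the per-object input preparation, each micro tracker sees the same context window cropped around the object's previous location in two consecutive frames. The window scaling and output size below are assumptions, not the paper's settings.

```python
import cv2

def crop_search_regions(prev_frame, curr_frame, box, context=2.0, size=64):
    """Crop the same context window around an object's previous location from
    two consecutive frames, forming the two-stream input of one micro tracker.
    box = (cx, cy, w, h) in pixels; assumes the window lies inside the frame."""
    cx, cy, w, h = box
    half = int(max(w, h) * context / 2)
    x0, y0 = int(cx) - half, int(cy) - half

    def crop(frame):
        patch = frame[y0:y0 + 2 * half, x0:x0 + 2 * half]
        return cv2.resize(patch, (size, size))

    return crop(prev_frame), crop(curr_frame)
```

Running one such lightweight tracker per object keeps the per-frame cost linear in the number of objects, which is the point of stacking micro CNNs.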

