Intensity Thresholding and Deep Learning Based Lane Marking Extraction and Lane Width Estimation from Mobile Light Detection and Ranging (LiDAR) Point Clouds

2020, Vol. 12 (9), pp. 1379
Author(s): Yi-Ting Cheng, Ankit Patel, Chenglu Wen, Darcy Bullock, Ayman Habib

Lane markings are one of the essential elements of road information and are useful for a wide range of transportation applications. Several studies have extracted lane markings through intensity thresholding of Light Detection and Ranging (LiDAR) point clouds acquired by mobile mapping systems (MMS). This paper proposes an intensity thresholding strategy using unsupervised intensity normalization and a deep learning strategy using automatically labeled training data for lane marking extraction. For comparative evaluation, the original intensity thresholding strategy and a deep learning strategy using manually established labels are also implemented. A pavement surface-based assessment of lane marking extraction by the four strategies is conducted in asphalt and concrete pavement areas covered by an MMS equipped with multiple LiDAR scanners. Additionally, the extracted lane markings are used for lane width estimation and for reporting lane marking gaps along various highways. The normalized intensity thresholding leads to better lane marking extraction, with an F1-score of 78.9%, compared with the original intensity thresholding, with an F1-score of 72.3%. Similarly, the deep learning model trained with automatically generated labels achieves a higher F1-score of 85.9% than the one trained on manually established labels, with an F1-score of 75.1%. In concrete pavement areas, the normalized intensity thresholding and both deep learning strategies obtain better lane marking extraction (i.e., lane markings along longer segments of the highway are extracted) than the original intensity thresholding approach. For the lane width results, more estimates are obtained with the two deep learning models than with the intensity thresholding strategies, especially in areas with poor edge lane markings, owing to the former's higher recall. The outcome of the proposed strategies is used to develop a framework for reporting lane marking gap regions, which can subsequently be visualized in RGB imagery to identify their cause.
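
As a rough illustration of the normalized intensity thresholding idea, the following Python sketch rescales raw intensities before applying a single cut-off; the normalization and threshold shown here are illustrative stand-ins, not the paper's exact strategy:

```python
import numpy as np

def extract_lane_markings(points, intensities, threshold=0.8):
    """Label candidate lane-marking points by intensity thresholding.

    points      : (N, 3) array of x, y, z coordinates
    intensities : (N,)   raw LiDAR return intensities
    threshold   : cut-off on the normalized intensity (assumed value)
    """
    # Unsupervised normalization: rescale the scan's intensities to [0, 1]
    # so that a single threshold works across scanners and pavement types.
    # (Illustrative stand-in for the paper's normalization strategy.)
    lo, hi = np.percentile(intensities, [2, 98])
    normalized = np.clip((intensities - lo) / (hi - lo + 1e-9), 0.0, 1.0)

    # Lane paint is strongly retro-reflective, so high normalized intensity
    # on the pavement surface is taken as a lane-marking candidate.
    mask = normalized >= threshold
    return points[mask], mask
```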

Author(s): Y.-T. Cheng, A. Patel, D. Bullock, A. Habib

Abstract. With the rapid development of autonomous vehicles (AV) and high-definition (HD) maps, up-to-date lane marking information is necessary. Over the years, several lane marking extraction approaches have been proposed, many of them based on accurate and dense Light Detection and Ranging (LiDAR) point cloud data collected by mobile mapping systems (MMS). This study proposes a normalized intensity thresholding strategy and a deep learning strategy with automatically generated labels. The former extracts lane markings directly from LiDAR point clouds, while the latter utilizes 2D intensity images generated from the LiDAR point cloud. Additionally, the proposed approaches are compared with state-of-the-art strategies such as original intensity thresholding and a deep learning approach based on manually established labels. Finally, each strategy is evaluated in asphalt and concrete pavements separately to assess its sensitivity to the nature of the pavement surface. The results show that the deep learning model trained with automatically generated labels performs best in both asphalt and concrete pavement areas, with F1-scores of 84.9% and 85.1%, respectively. In asphalt pavement areas, the original intensity thresholding strategy shows lane marking extraction performance comparable to the other strategies, while in concrete pavement areas it performs significantly worse, with an F1-score of 65.1%. Between the proposed normalized intensity thresholding and the deep learning model trained with manually labeled data, the former performs better in asphalt pavement areas while the latter obtains better results in concrete pavement areas.
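
The 2D route mentioned above depends on rasterizing the point cloud into an intensity image. A minimal sketch of such a rasterization follows; the cell size and mean aggregation are assumptions, not the paper's settings:

```python
import numpy as np

def rasterize_intensity_image(points, intensities, cell=0.05):
    """Project a pavement point cloud onto an XY grid of intensity values.

    Each pixel stores the mean intensity of the points falling into it,
    producing the kind of 2D intensity image used as deep-learning input.
    """
    xy = points[:, :2]
    ij = ((xy - xy.min(axis=0)) / cell).astype(int)
    h, w = ij[:, 1].max() + 1, ij[:, 0].max() + 1

    acc = np.zeros((h, w))        # intensity sums per pixel
    cnt = np.zeros((h, w))        # point counts per pixel
    np.add.at(acc, (ij[:, 1], ij[:, 0]), intensities)
    np.add.at(cnt, (ij[:, 1], ij[:, 0]), 1)
    return np.divide(acc, cnt, out=np.zeros_like(acc), where=cnt > 0)
```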


2021, Vol. 10 (1)
Author(s): Xinyang Li, Guoxun Zhang, Hui Qiao, Feng Bao, Yue Deng, ...

Abstract. The development of deep learning and open access to a substantial collection of imaging data together provide a potential solution for computational image transformation, which is gradually changing the landscape of optical imaging and biomedical research. However, current implementations of deep learning usually operate in a supervised manner, and their reliance on laborious and error-prone data annotation procedures remains a barrier to more general applicability. Here, we propose an unsupervised image transformation to facilitate the utilization of deep learning for optical microscopy, even in some cases in which supervised models cannot be applied. Through the introduction of a saliency constraint, the unsupervised model, named Unsupervised content-preserving Transformation for Optical Microscopy (UTOM), can learn the mapping between two image domains without requiring paired training data while avoiding distortions of the image content. UTOM shows promising performance in a wide range of biomedical image transformation tasks, including in silico histological staining, fluorescence image restoration, and virtual fluorescence labeling. Quantitative evaluations reveal that UTOM achieves stable and high-fidelity image transformations across different imaging conditions and modalities. We anticipate that our framework will encourage a paradigm shift in training neural networks and enable more applications of artificial intelligence in biomedical imaging.
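
The saliency constraint is described only at a high level above. The following PyTorch sketch shows one plausible reading: a differentiable penalty that keeps a soft-thresholded saliency mask unchanged across the translation. The thresholding and loss form are assumptions; the actual UTOM formulation may differ.

```python
import torch

def saliency_constraint_loss(x, y, thresh=0.5, sharpness=10.0):
    """Hedged sketch of a saliency constraint for unpaired image translation.

    The idea, following the UTOM description, is that the salient content
    of the input image x must stay in place in the translated image y.
    Saliency is approximated here by a differentiable soft threshold on
    normalized intensity (an assumption of this sketch).
    """
    def soft_mask(img):
        img = (img - img.min()) / (img.max() - img.min() + 1e-9)
        return torch.sigmoid(sharpness * (img - thresh))

    # Penalize content appearing or vanishing between the two domains.
    return torch.mean(torch.abs(soft_mask(x) - soft_mask(y)))
```

In a CycleGAN-style setup, a term like this would simply be added to the adversarial and cycle-consistency losses during training.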


Sensors, 2021, Vol. 21 (3), pp. 884
Author(s): Chia-Ming Tsai, Yi-Horng Lai, Yung-Da Sun, Yu-Jen Chung, Jau-Woei Perng

Numerous sensors can obtain images or point cloud data on land; underwater, however, the rapid attenuation of electromagnetic signals and the lack of light restrict sensing functions. This study expands the utilization of two- and three-dimensional detection technologies to an underwater application: detecting abandoned tires. A three-dimensional acoustic sensor, the BV5000, is used to collect underwater point cloud data. Several pre-processing steps are proposed to remove noise and the seabed from the raw data. The point clouds are then processed to obtain two data types: a 2D image and a 3D point cloud. Deep learning methods of matching dimensionality are used to train the models. In the two-dimensional method, the point cloud is transformed into a bird's-eye-view image, and the Faster R-CNN and YOLOv3 network architectures are used to detect tires. In the three-dimensional method, the point cloud associated with a tire is cut out from the raw data and used as training data, and the PointNet and PointConv network architectures are used for tire classification. The results show that both approaches provide good accuracy.
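
For the 3D branch, a PointNet-style classifier reduces an unordered point set to a global descriptor via a shared per-point MLP and max pooling. A minimal sketch follows; layer sizes are illustrative and the input-alignment T-Nets of the full PointNet are omitted:

```python
import torch
import torch.nn as nn

class TinyPointNet(nn.Module):
    """Minimal PointNet-style classifier sketch for a binary
    tire / non-tire decision. Layer sizes are illustrative."""

    def __init__(self, num_classes=2):
        super().__init__()
        self.mlp = nn.Sequential(            # shared per-point MLP
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, pts):                  # pts: (B, 3, N)
        feat = self.mlp(pts)                 # (B, 1024, N) per-point features
        glob = feat.max(dim=2).values        # order-invariant max pooling
        return self.head(glob)               # (B, num_classes) logits

# Usage: logits = TinyPointNet()(torch.randn(4, 3, 1024))
```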


Sensors, 2021, Vol. 21 (6), pp. 2144
Author(s): Stefan Reitmann, Lorenzo Neumann, Bernhard Jung

Common Machine-Learning (ML) approaches for scene classification require a large amount of training data. However, for the classification of depth sensor data, in contrast to image data, relatively few databases are publicly available, and the manual generation of semantically labeled 3D point clouds is an even more time-consuming task. To simplify the training data generation process for a wide range of domains, we have developed the BLAINDER add-on package for the open-source 3D modeling software Blender, which enables largely automated generation of semantically annotated point-cloud data in virtual 3D environments. In this paper, we focus on the classical depth-sensing techniques Light Detection and Ranging (LiDAR) and Sound Navigation and Ranging (Sonar). Within the BLAINDER add-on, different depth sensors can be loaded from presets, customized sensors can be implemented, and different environmental conditions (e.g., the influence of rain or dust) can be simulated. The semantically labeled data can be exported to various 2D and 3D formats and are thus optimized for different ML applications and visualizations. In addition, semantically labeled images can be exported using the rendering functionalities of Blender.
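
The core of such a simulation is ray casting from a virtual sensor. A heavily simplified sketch using Blender's Python API follows; it assumes Blender >= 2.91, where scene.ray_cast takes a dependency graph, and the real BLAINDER add-on provides much richer sensor and noise models:

```python
import math
import bpy
from mathutils import Vector

def simulate_lidar_scan(origin=(0.0, 0.0, 1.5), h_steps=360, max_range=100.0):
    """Hedged sketch of a single horizontal LiDAR sweep via ray casting."""
    depsgraph = bpy.context.evaluated_depsgraph_get()
    scene = bpy.context.scene
    origin = Vector(origin)
    hits = []
    for i in range(h_steps):
        angle = 2 * math.pi * i / h_steps
        direction = Vector((math.cos(angle), math.sin(angle), 0.0))
        ok, loc, normal, idx, obj, _ = scene.ray_cast(
            depsgraph, origin, direction, distance=max_range)
        if ok:
            # Store the hit point with the object name as a stand-in
            # for a semantic label.
            hits.append((tuple(loc), obj.name))
    return hits
```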


2021, Vol. 11 (19), pp. 8996
Author(s): Yuwei Cao, Marco Scaioni

In current research, fully supervised Deep Learning (DL) techniques are employed to train segmentation networks applied to point clouds of buildings. However, training such networks requires large amounts of finely labeled building point-cloud data, which are difficult to obtain in practice. Consequently, the application of fully supervised DL to semantic segmentation of building point clouds at the LoD3 level is severely limited. To reduce the number of required annotated labels, we propose a novel label-efficient DL network, named 3DLEB-Net, that obtains per-point semantic labels of LoD3 building point clouds with limited supervision. It consists of two steps. The first step (Autoencoder, AE) is composed of a Dynamic Graph Convolutional Neural Network (DGCNN) encoder and a folding-based decoder. It is designed to extract discriminative global and local features from input point clouds by faithfully reconstructing them without any labels. The second step is the semantic segmentation network: by supplying a small amount of task-specific supervision, a segmentation network is trained to semantically segment the encoded features acquired from the pre-trained AE. We evaluated our approach on the Architectural Cultural Heritage (ArCH) dataset. Compared to fully supervised DL methods, our model achieved state-of-the-art results on unseen scenes using only 10% of the labeled training data required by the fully supervised methods. Moreover, we conducted a series of ablation studies to show the effectiveness of the design choices of our model.
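
The folding-based decoder referenced above deforms a fixed 2D grid into a 3D surface conditioned on the encoder's global codeword. A minimal PyTorch sketch in the spirit of FoldingNet follows (requires a recent PyTorch; dimensions are illustrative, not the paper's):

```python
import torch
import torch.nn as nn

class FoldingDecoder(nn.Module):
    """Sketch of a folding-based decoder: a fixed 2D grid is 'folded'
    twice into a 3D point set, conditioned on the codeword produced by
    the (DGCNN) encoder. Sizes are illustrative."""

    def __init__(self, code_dim=512, grid=45):
        super().__init__()
        u = torch.linspace(-1, 1, grid)
        g = torch.stack(torch.meshgrid(u, u, indexing="ij"), dim=-1)
        self.register_buffer("grid", g.reshape(-1, 2))     # (M, 2)
        self.fold1 = nn.Sequential(
            nn.Linear(code_dim + 2, 256), nn.ReLU(),
            nn.Linear(256, 3))
        self.fold2 = nn.Sequential(
            nn.Linear(code_dim + 3, 256), nn.ReLU(),
            nn.Linear(256, 3))

    def forward(self, code):                 # code: (B, code_dim)
        B, M = code.shape[0], self.grid.shape[0]
        c = code.unsqueeze(1).expand(B, M, -1)
        grid = self.grid.unsqueeze(0).expand(B, -1, -1)
        pts = self.fold1(torch.cat([c, grid], dim=-1))     # first fold
        pts = self.fold2(torch.cat([c, pts], dim=-1))      # second fold
        return pts                           # (B, M, 3) reconstruction
```

Training the autoencoder then amounts to minimizing a reconstruction loss (typically Chamfer distance) between the input cloud and these folded points, with no labels involved.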


2017, Vol. 14 (5), pp. 172988141773540
Author(s): Robert A. Hewitt, Alex Ellery, Anton de Ruiter

A classifier training methodology is presented for Kapvik, a micro-rover prototype. A simulated light detection and ranging scan is divided into a grid, with each cell having a variety of characteristics (such as number of points, point variance, and mean height) that act as inputs to classification algorithms. The training step avoids the need for time-consuming and error-prone manual classification through the use of a simulation that provides training inputs and target outputs. This simulation randomly generates various terrains that could be encountered by a planetary rover, including untraversable ones. A sensor model for a three-dimensional light detection and ranging sensor is used with ray tracing to generate realistic, noisy three-dimensional point clouds in which all points belonging to untraversable terrain are labelled explicitly. A neural network classifier and its training algorithm are presented; its outputs, as well as those of other popular classifiers, show high accuracy on test data sets after training. The network is then tested on outdoor data to confirm that it can accurately classify real-world light detection and ranging data. The results show the network is able to identify terrain correctly, falsely classifying just 4.74% of untraversable terrain.
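
A minimal sketch of the per-cell feature computation described above; the grid size is an assumption:

```python
import numpy as np

def grid_cell_features(points, cell=0.25):
    """Compute per-cell terrain features of the kind described above:
    point count, height variance, and mean height for each grid cell.
    Returns a dict keyed by (ix, iy) cell index."""
    xy = points[:, :2]
    z = points[:, 2]
    idx = np.floor((xy - xy.min(axis=0)) / cell).astype(int)
    features = {}
    for key in map(tuple, np.unique(idx, axis=0)):
        in_cell = np.all(idx == key, axis=1)
        heights = z[in_cell]
        features[key] = (heights.size,       # number of points
                         heights.var(),      # point variance
                         heights.mean())     # mean height
    return features
```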


2021
Author(s): Clemens Heistracher, Anahid Jalali, Indu Strobl, Axel Suendermann, Sebastian Meixner, ...

Abstract—An increasing number of industrial assets are equipped with IoT sensor platforms, and the industry now expects data-driven maintenance strategies with minimal deployment costs. However, gathering labeled training data for supervised tasks such as anomaly detection is costly and often difficult in operational environments. This work therefore aims to design and implement a solution that reduces the amount of data required for training anomaly classification models on time-series sensor data, and thereby brings down the overall deployment effort of IoT anomaly detection sensors. We set up several in-lab experiments using three peristaltic pumps and investigated approaches for transferring trained anomaly detection models across assets of the same type. Our experiments showed promising effectiveness and provide initial evidence that transfer learning could be a suitable strategy for reusing pretrained anomaly classification models across industrial assets of the same type with minimal prior labeling and training effort. This work could serve as a starting point for more general pretrained sensor-data embeddings applicable to a wide range of assets.
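
One common way to realize such a transfer is to freeze the pretrained feature extractor and retrain only a small classification head on the target pump. The sketch below assumes hypothetical `.encoder` and `.head` submodules and is not the authors' exact procedure:

```python
import torch
import torch.nn as nn

def transfer_anomaly_model(pretrained, num_classes=2, lr=1e-3):
    """Hedged sketch of cross-asset transfer: reuse the feature extractor
    trained on one pump and retrain only a small classification head on
    the target asset. Assumes the pretrained model exposes `.encoder`
    and `.head` submodules (illustrative names)."""
    for p in pretrained.encoder.parameters():
        p.requires_grad = False             # freeze learned sensor features

    feat_dim = pretrained.head[0].in_features   # assumed head layout
    pretrained.head = nn.Sequential(        # fresh head for the new asset
        nn.Linear(feat_dim, 64), nn.ReLU(),
        nn.Linear(64, num_classes))

    optimizer = torch.optim.Adam(pretrained.head.parameters(), lr=lr)
    return pretrained, optimizer
```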


Author(s): A. Nurunnabi, F. N. Teferle, J. Li, R. C. Lindenbergh, A. Hunegnaw

Abstract. Ground surface extraction is one of the classic tasks in airborne laser scanning (ALS) point cloud processing, used for three-dimensional (3D) city modelling, infrastructure health monitoring, and disaster management. Many methods have been developed over the last three decades. Recently, Deep Learning (DL) has become the dominant technique for 3D point cloud classification. DL methods used for classification can be categorized into end-to-end and non-end-to-end approaches. One of the main challenges of supervised DL approaches is obtaining a sufficient amount of training data; the main advantage of a supervised non-end-to-end approach is that it requires less training data. This paper introduces a novel local-feature-based, non-end-to-end DL algorithm that generates a binary classifier for ground point filtering. It studies feature relevance and investigates three models built from different combinations of features. The method is free from the limitations imposed by point clouds' irregular data structure and varying density, which pose the biggest challenge to applying convolutional neural networks, and it does not require transforming the data into regular 3D voxel grids or any rasterization. Its performance has been demonstrated on two ALS datasets covering urban environments. The method successfully labels ground and non-ground points in the presence of steep slopes and height discontinuities in the terrain. Experiments in this paper show that the algorithm achieves around 97% in both F1-score and model accuracy for ground point labelling.
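
A minimal sketch of a local-feature-based (non-end-to-end) pipeline: per-point PCA features over k nearest neighbours feed a small off-the-shelf classifier. The feature set here is illustrative, not the paper's:

```python
import numpy as np
from scipy.spatial import cKDTree
from sklearn.neural_network import MLPClassifier

def local_features(points, k=20):
    """Per-point local geometric features via PCA over k neighbours,
    in the spirit of the feature-based pipeline described above."""
    tree = cKDTree(points)
    _, nbr = tree.query(points, k=k)
    feats = []
    for idx in nbr:
        nb = points[idx]
        cov = np.cov(nb.T)
        w = np.sort(np.linalg.eigvalsh(cov))[::-1]   # l1 >= l2 >= l3
        w = w / (w.sum() + 1e-12)
        feats.append([w[2],                          # change of curvature
                      (w[1] - w[2]) / w[0],          # planarity
                      nb[:, 2].std()])               # local height spread
    return np.asarray(feats)

# Binary ground / non-ground classifier on the extracted features:
# clf = MLPClassifier(hidden_layer_sizes=(32, 16)).fit(local_features(pts), labels)
```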


Author(s): Y. Cao, M. Scaioni

Abstract. In recent research, fully supervised Deep Learning (DL) techniques and large amounts of pointwise labels are employed to train segmentation networks applied to building point clouds. However, finely labelled building point clouds are hard to find, and manually annotating pointwise labels is time-consuming and expensive. Consequently, the application of fully supervised DL to semantic segmentation of building point clouds at the LoD3 level is severely limited. To address this issue, we propose a novel label-efficient DL network that obtains per-point semantic labels of LoD3 building point clouds with limited supervision. It consists of two steps. The first step (Autoencoder, AE) is composed of a Dynamic Graph Convolutional Neural Network-based encoder and a folding-based decoder, designed to extract discriminative global and local features from input point clouds by reconstructing them without any labels. The second step is semantic segmentation: by supplying a small amount of task-specific supervision, a segmentation network is trained to semantically segment the encoded features acquired from the pre-trained AE. We evaluate our approach on the ArCH dataset. Compared to fully supervised DL methods, our model achieves state-of-the-art results on unseen scenes with only 10% of the labelled training data required by fully supervised methods.
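
Complementing the autoencoder sketch given earlier for this work, the label-efficient second step can be pictured as a small per-point head trained on the frozen encoder's features. A minimal sketch with illustrative dimensions:

```python
import torch
import torch.nn as nn

class SegmentationHead(nn.Module):
    """Sketch of the label-efficient second step: a small per-point
    segmentation head on top of the frozen pre-trained encoder, trained
    with only a small amount of labelled data. Sizes are illustrative."""

    def __init__(self, feat_dim=512, num_classes=10):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv1d(feat_dim, 128, 1), nn.ReLU(),
            nn.Conv1d(128, num_classes, 1))

    def forward(self, per_point_feats):       # (B, feat_dim, N)
        return self.mlp(per_point_feats)      # (B, num_classes, N) logits

# Typical use: freeze the AE's encoder, extract per-point features for
# the ~10% labelled scans, and train only this head with cross-entropy.
```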

