Deep Learning on Point Clouds and Its Application: A Survey

Weiping Liu; Jia Sun; Wanyi Li; Ting Hu; Peng Wang

doi:10.3390/s19194188

Deep Learning on Point Clouds and Its Application: A Survey

Sensors ◽

10.3390/s19194188 ◽

2019 ◽

Vol 19 (19) ◽

pp. 4188 ◽

Cited By ~ 21

Author(s):

Weiping Liu ◽

Jia Sun ◽

Wanyi Li ◽

Ting Hu ◽

Peng Wang

Keyword(s):

Deep Learning ◽

Point Cloud ◽

Feature Learning ◽

Point Clouds ◽

Research Trend ◽

Future Research ◽

Great Success ◽

3D Object ◽

Depth Sensors ◽

Advantages And Disadvantages

Point cloud is a widely used 3D data form, which can be produced by depth sensors, such as Light Detection and Ranging (LIDAR) and RGB-D cameras. Being unordered and irregular, many researchers focused on the feature engineering of the point cloud. Being able to learn complex hierarchical structures, deep learning has achieved great success with images from cameras. Recently, many researchers have adapted it into the applications of the point cloud. In this paper, the recent existing point cloud feature learning methods are classified as point-based and tree-based. The former directly takes the raw point cloud as the input for deep learning. The latter first employs a k-dimensional tree (Kd-tree) structure to represent the point cloud with a regular representation and then feeds these representations into deep learning models. Their advantages and disadvantages are analyzed. The applications related to point cloud feature learning, including 3D object classification, semantic segmentation, and 3D object detection, are introduced, and the datasets and evaluation metrics are also collected. Finally, the future research trend is predicted.

Download Full-text

A Survey on Deep Learning Based Methods and Datasets for Monocular 3D Object Detection

Electronics ◽

10.3390/electronics10040517 ◽

2021 ◽

Vol 10 (4) ◽

pp. 517

Author(s):

Seong-heum Kim ◽

Youngbae Hwang

Keyword(s):

Deep Learning ◽

Object Detection ◽

Low Cost ◽

Detection Methods ◽

Future Research ◽

3D Object ◽

Practical Applications ◽

Depth Sensors ◽

Significant Research ◽

3D Object Detection

Owing to recent advancements in deep learning methods and relevant databases, it is becoming increasingly easier to recognize 3D objects using only RGB images from single viewpoints. This study investigates the major breakthroughs and current progress in deep learning-based monocular 3D object detection. For relatively low-cost data acquisition systems without depth sensors or cameras at multiple viewpoints, we first consider existing databases with 2D RGB photos and their relevant attributes. Based on this simple sensor modality for practical applications, deep learning-based monocular 3D object detection methods that overcome significant research challenges are categorized and summarized. We present the key concepts and detailed descriptions of representative single-stage and multiple-stage detection solutions. In addition, we discuss the effectiveness of the detection models on their baseline benchmarks. Finally, we explore several directions for future research on monocular 3D object detection.

Download Full-text

Orientation-Encoding CNN for Point Cloud Classification and Segmentation

Machine Learning and Knowledge Extraction ◽

10.3390/make3030031 ◽

2021 ◽

Vol 3 (3) ◽

pp. 601-614

Author(s):

Hongbin Lin ◽

Wu Zheng ◽

Xiuping Peng

Keyword(s):

Deep Learning ◽

Point Cloud ◽

Feature Learning ◽

Point Clouds ◽

Point Sets ◽

Learning Network ◽

Rule Structure ◽

Visual Tasks ◽

Deep Learning Network ◽

Point Cloud Classification

With the introduction of effective and general deep learning network frameworks, deep learning based methods have achieved remarkable success in various visual tasks. However, there are still tough challenges in applying them to convolutional neural networks due to the lack of a potential rule structure of point clouds. Therefore, by taking the original point clouds as the input data, this paper proposes an orientation-encoding (OE) convolutional module and designs a convolutional neural network for effectively extracting local geometric features of point sets. By searching for the same number of points in 8 directions and arranging them in order in 8 directions, the OE convolution is then carried out according to the number of points in the direction, which realizes the effective feature learning of the local structure of the point sets. Further experiments on diverse datasets show that the proposed method has competitive performance on classification and segmentation tasks of point sets.

Download Full-text

Structure-Aware Convolution for 3D Point Cloud Classification and Segmentation

Remote Sensing ◽

10.3390/rs12040634 ◽

2020 ◽

Vol 12 (4) ◽

pp. 634 ◽

Cited By ~ 2

Author(s):

Lei Wang ◽

Yuxuan Liu ◽

Shenman Zhang ◽

Jixing Yan ◽

Pengjie Tao

Keyword(s):

Deep Learning ◽

Template Matching ◽

Point Cloud ◽

Structure Learning ◽

Feature Learning ◽

Point Clouds ◽

Learning Networks ◽

Geometric Structures ◽

Learning Capability ◽

3D Point Clouds

Semantic feature learning on 3D point clouds is quite challenging because of their irregular and unordered data structure. In this paper, we propose a novel structure-aware convolution (SAC) to generalize deep learning on regular grids to irregular 3D point clouds. Similar to the template-matching process of convolution on 2D images, the key of our SAC is to match the point clouds’ neighborhoods with a series of 3D kernels, where each kernel can be regarded as a “geometric template” formed by a set of learnable 3D points. Thus, the interested geometric structures of the input point clouds can be activated by the corresponding kernels. To verify the effectiveness of the proposed SAC, we embedded it into three recently developed point cloud deep learning networks (PointNet, PointNet++, and KCNet) as a lightweight module, and evaluated its performance on both classification and segmentation tasks. Experimental results show that, benefiting from the geometric structure learning capability of our SAC, all these back-end networks achieved better classification and segmentation performance (e.g., +2.77% mean accuracy for classification and +4.99% mean intersection over union (IoU) for segmentation) with few additional parameters. Furthermore, results also demonstrate that the proposed SAC is helpful in improving the robustness of networks with the constraints of geometric structures.

Download Full-text

Multi-Dimensional Underwater Point Cloud Detection Based on Deep Learning

Sensors ◽

10.3390/s21030884 ◽

2021 ◽

Vol 21 (3) ◽

pp. 884

Author(s):

Chia-Ming Tsai ◽

Yi-Horng Lai ◽

Yung-Da Sun ◽

Yu-Jen Chung ◽

Jau-Woei Perng

Keyword(s):

Deep Learning ◽

Point Cloud ◽

Three Dimensional ◽

Point Clouds ◽

Training Data ◽

Network Architectures ◽

Point Cloud Data ◽

Data Types ◽

Raw Data ◽

Cloud Data

Numerous sensors can obtain images or point cloud data on land, however, the rapid attenuation of electromagnetic signals and the lack of light in water have been observed to restrict sensing functions. This study expands the utilization of two- and three-dimensional detection technologies in underwater applications to detect abandoned tires. A three-dimensional acoustic sensor, the BV5000, is used in this study to collect underwater point cloud data. Some pre-processing steps are proposed to remove noise and the seabed from raw data. Point clouds are then processed to obtain two data types: a 2D image and a 3D point cloud. Deep learning methods with different dimensions are used to train the models. In the two-dimensional method, the point cloud is transferred into a bird’s eye view image. The Faster R-CNN and YOLOv3 network architectures are used to detect tires. Meanwhile, in the three-dimensional method, the point cloud associated with a tire is cut out from the raw data and is used as training data. The PointNet and PointConv network architectures are then used for tire classification. The results show that both approaches provide good accuracy.

Download Full-text

Automated Measurement of Heart Girth for Pigs Using Two Kinect Depth Sensors

Sensors ◽

10.3390/s20143848 ◽

2020 ◽

Vol 20 (14) ◽

pp. 3848

Author(s):

Xinyue Zhang ◽

Gang Liu ◽

Ling Jing ◽

Siyao Chen

Keyword(s):

Mirror Symmetry ◽

Point Cloud ◽

Point Clouds ◽

Depth Image ◽

View Point ◽

Important Indicator ◽

Depth Sensors ◽

Starting Point ◽

Feature Based ◽

Girth Measurement

The heart girth parameter is an important indicator reflecting the growth and development of pigs that provides critical guidance for the optimization of healthy pig breeding. To overcome the heavy workloads and poor adaptability of traditional measurement methods currently used in pig breeding, this paper proposes an automated pig heart girth measurement method using two Kinect depth sensors. First, a two-view pig depth image acquisition platform is established for data collection; the two-view point clouds after preprocessing are registered and fused by feature-based improved 4-Point Congruent Set (4PCS) method. Second, the fused point cloud is pose-normalized, and the axillary contour is used to automatically extract the heart girth measurement point. Finally, this point is taken as the starting point to intercept the circumferential perpendicular to the ground from the pig point cloud, and the complete heart girth point cloud is obtained by mirror symmetry. The heart girth is measured along this point cloud using the shortest path method. Using the proposed method, experiments were conducted on two-view data from 26 live pigs. The results showed that the heart girth measurement absolute errors were all less than 4.19 cm, and the average relative error was 2.14%, which indicating a high accuracy and efficiency of this method.

Download Full-text

Real-Time 3D object detection using improved convolutional neural network based on image-driven point cloud

Recent Advances in Electrical & Electronic Engineering (Formerly Recent Patents on Electrical & Electronic Engineering) ◽

10.2174/2352096514666211026142721 ◽

2021 ◽

Vol 14 ◽

Author(s):

Zhiyong Gao ◽

Jianhong Xiang

Keyword(s):

Neural Network ◽

Object Detection ◽

Convolutional Neural Network ◽

Real Time ◽

Point Cloud ◽

Point Clouds ◽

3D Point Cloud ◽

3D Object ◽

3D Object Detection ◽

Instance Segmentation

Background: While detecting the object directly from the 3D point cloud, the natural 3D patterns and invariance of 3D data are often obscure. Objective: In this work, we aimed at studying the 3D object detection from discrete, disordered and sparse 3D point clouds. Methods: The CNN is composed of the frustum sequence module, 3D instance segmentation module S-NET, 3D point cloud transformation module T-NET, and 3D boundary box estimation module E-NET. The search space of the object is determined by the frustum sequence module. The instance segmentation of the point cloud is performed by the 3D instance segmentation module. The 3D coordinates of the object are confirmed by the transformation module and the 3D bounding box estimation module. Results: Evaluated on KITTI benchmark dataset, our method outperforms the state of the art by remarkable margins while having real-time capability. Conclusion: We achieve real-time 3D object detection by proposing an improved convolutional neural network (CNN) based on image-driven point clouds.

Download Full-text

EXPLORING ALS AND DIM DATA FOR SEMANTIC SEGMENTATION USING CNNS

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-1-347-2018 ◽

2018 ◽

Vol XLII-1 ◽

pp. 347-354 ◽

Cited By ~ 5

Author(s):

F. Politz ◽

M. Sester

Keyword(s):

Point Cloud ◽

Laser Scanning ◽

Semantic Segmentation ◽

Point Clouds ◽

Good Alternative ◽

Aerial Images ◽

Learning Approaches ◽

Advantages And Disadvantages ◽

Sensing Applications ◽

High Level

Abstract. Over the past years, the algorithms for dense image matching (DIM) to obtain point clouds from aerial images improved significantly. Consequently, DIM point clouds are now a good alternative to the established Airborne Laser Scanning (ALS) point clouds for remote sensing applications. In order to derive high-level applications such as digital terrain models or city models, each point within a point cloud must be assigned a class label. Usually, ALS and DIM are labelled with different classifiers due to their varying characteristics. In this work, we explore both point cloud types in a fully convolutional encoder-decoder network, which learns to classify ALS as well as DIM point clouds. As input, we project the point clouds onto a 2D image raster plane and calculate the minimal, average and maximal height values for each raster cell. The network then differentiates between the classes ground, non-ground, building and no data. We test our network in six training setups using only one point cloud type, both point clouds as well as several transfer-learning approaches. We quantitatively and qualitatively compare all results and discuss the advantages and disadvantages of all setups. The best network achieves an overall accuracy of 96% in an ALS and 83% in a DIM test set.

Download Full-text

Multi-Angle Point Cloud-VAE: Unsupervised Feature Learning for 3D Point Clouds From Multiple Angles by Joint Self-Reconstruction and Half-to-Half Prediction

2019 IEEE/CVF International Conference on Computer Vision (ICCV) ◽

10.1109/iccv.2019.01054 ◽

2019 ◽

Cited By ~ 14

Author(s):

Zhizhong Han ◽

Xiyang Wang ◽

Yu-Shen Liu ◽

Matthias Zwicker

Keyword(s):

Point Cloud ◽

Feature Learning ◽

Point Clouds ◽

Unsupervised Feature Learning ◽

3D Point Clouds ◽

Angle Point

Download Full-text

Classification of Point Clouds for Indoor Components Using Few Labeled Samples

Remote Sensing ◽

10.3390/rs12142181 ◽

2020 ◽

Vol 12 (14) ◽

pp. 2181

Author(s):

Hangbin Wu ◽

Huimin Yang ◽

Shengyu Huang ◽

Doudou Zeng ◽

Chun Liu ◽

...

Keyword(s):

Deep Learning ◽

Point Cloud ◽

Point Clouds ◽

Neighborhood Search ◽

Learning Methods ◽

Semantic Classification ◽

Cloud Classification ◽

Mixed Features ◽

Indoor Scenarios ◽

Point Cloud Classification

The existing deep learning methods for point cloud classification are trained using abundant labeled samples and used to test only a few samples. However, classification tasks are diverse, and not all tasks have enough labeled samples for training. In this paper, a novel point cloud classification method for indoor components using few labeled samples is proposed to solve the problem of the requirement for abundant labeled samples for training with deep learning classification methods. This method is composed of four parts: mixing samples, feature extraction, dimensionality reduction, and semantic classification. First, the few labeled point clouds are mixed with unlabeled point clouds. Next, the mixed high-dimensional features are extracted using a deep learning framework. Subsequently, a nonlinear manifold learning method is used to embed the mixed features into a low-dimensional space. Finally, the few labeled point clouds in each cluster are identified, and semantic labels are provided for unlabeled point clouds in the same cluster by a neighborhood search strategy. The validity and versatility of the proposed method were validated by different experiments and compared with three state-of-the-art deep learning methods. Our method uses fewer than 30 labeled point clouds to achieve an accuracy that is 1.89–19.67% greater than existing methods. More importantly, the experimental results suggest that this method is not only suitable for single-attribute indoor scenarios but also for comprehensive complex indoor scenarios.

Download Full-text

Infrared and Visible Image Fusion Techniques Based on Deep Learning: A Review

Electronics ◽

10.3390/electronics9122162 ◽

2020 ◽

Vol 9 (12) ◽

pp. 2162

Author(s):

Changqi Sun ◽

Cong Zhang ◽

Naixue Xiong

Keyword(s):

Deep Learning ◽

Image Fusion ◽

Research Work ◽

Image Features ◽

Future Research ◽

Learning Methods ◽

Visible Image ◽

Advantages And Disadvantages ◽

Fusion Methods ◽

Future Work

Infrared and visible image fusion technologies make full use of different image features obtained by different sensors, retain complementary information of the source images during the fusion process, and use redundant information to improve the credibility of the fusion image. In recent years, many researchers have used deep learning methods (DL) to explore the field of image fusion and found that applying DL has improved the time-consuming efficiency of the model and the fusion effect. However, DL includes many branches, and there is currently no detailed investigation of deep learning methods in image fusion. In this work, this survey reports on the development of image fusion algorithms based on deep learning in recent years. Specifically, this paper first conducts a detailed investigation on the fusion method of infrared and visible images based on deep learning, compares the existing fusion algorithms qualitatively and quantitatively with the existing fusion quality indicators, and discusses various fusions. The main contribution, advantages, and disadvantages of the algorithm. Finally, the research status of infrared and visible image fusion is summarized, and future work has prospected. This research can help us realize many image fusion methods in recent years and lay the foundation for future research work.

Download Full-text