Real-Time Fruit Recognition and Grasping Estimation for Robotic Apple Harvesting

Hanwen Kang; Hongyu Zhou; Xing Wang; Chao Chen

doi:10.3390/s20195670

Sensors ◽

10.3390/s20195670 ◽

2020 ◽

Vol 20 (19) ◽

pp. 5670

Author(s):

Hanwen Kang ◽

Hongyu Zhou ◽

Xing Wang ◽

Chao Chen

Keyword(s):

Image Data ◽

Point Clouds ◽

Robotic Arm ◽

Depth Information ◽

Agricultural Industry ◽

Weight One ◽

RGB Images ◽

Robotic Harvesting ◽

Instance Segmentation ◽

Indoor And Outdoor

Robotic harvesting is a promising direction for the future development of the agricultural industry. However, many challenges remain in developing a fully functional robotic harvesting system, and vision is one of the most important. Traditional vision methods often fall short in accuracy, robustness, and efficiency in real deployment environments. In this work, a fully deep learning-based vision method for autonomous apple harvesting is developed and evaluated. The method comprises a lightweight one-stage detection and segmentation network for fruit recognition and a PointNet that processes point clouds and estimates a suitable approach pose for each fruit before grasping. The fruit recognition network takes raw input from an RGB-D camera and performs fruit detection and instance segmentation on the RGB images. The PointNet grasping network takes the depth information together with the fruit recognition results as input and outputs the approach pose of each fruit for execution by the robotic arm. The vision method is evaluated on RGB-D image data collected in both laboratory and orchard environments. Robotic harvesting experiments in indoor and outdoor conditions are also included to validate the performance of the harvesting system. Experimental results show that the vision method is efficient and accurate enough to guide robotic harvesting. Overall, the robotic harvesting system achieves a harvesting success rate of 0.8 and a cycle time of 6.5 s.
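The abstract does not say how the depth information and the segmentation results are combined before they reach the PointNet. One common way, shown here only as a minimal sketch, is to back-project the depth pixels inside each fruit mask into a camera-frame point cloud with a pinhole model. The function name and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative assumptions, not values or code from the paper.

```python
import numpy as np

def masked_depth_to_points(depth_m, mask, fx, fy, cx, cy):
    """Back-project masked depth pixels to 3D camera-frame points (pinhole model).

    depth_m: (H, W) depth in metres, 0 where the sensor returned no reading.
    mask:    (H, W) boolean instance mask for one fruit.
    fx, fy, cx, cy: assumed pinhole intrinsics in pixels.
    """
    v, u = np.nonzero(mask & (depth_m > 0))  # rows and columns of valid fruit pixels
    z = depth_m[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)       # shape (N, 3)

# Toy 4x4 frame: a flat surface 2 m away with one missing depth reading.
depth = np.full((4, 4), 2.0)
depth[0, 0] = 0.0
mask = np.zeros((4, 4), dtype=bool)
mask[0:2, 0:2] = True
points = masked_depth_to_points(depth, mask, fx=2.0, fy=2.0, cx=1.0, cy=1.0)
```

The resulting `(N, 3)` array is the kind of per-fruit point set a PointNet-style network could consume; any resampling or normalisation the authors apply is not described in the abstract.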

SqueezeFace: Integrative Face Recognition Methods with LiDAR Sensors

Journal of Sensors ◽

10.1155/2021/4312245 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Kyoungmin Ko ◽

Hyunmin Gwak ◽

Nalinh Thoummala ◽

Hyun Kwon ◽

SungHwan Kim

Keyword(s):

Face Recognition ◽

Image Data ◽

Spatial Location ◽

Point Clouds ◽

Depth Information ◽

Depth Maps ◽

Spatially Adaptive ◽

Proposed Model ◽

High Discriminatory Power ◽

Face Spoofing

In this paper, we propose a robust and reliable face recognition model that incorporates depth information, such as data from point clouds and depth maps, into RGB image data to avoid false facial verification caused by face spoofing attacks while improving the model's performance. The proposed model is driven by the spatially adaptive convolution (SAC) block of SqueezeSegV3, an attention block that enables the model to weight features according to the importance of their spatial location. We also use a large-margin loss instead of softmax loss as the supervision signal, to enforce high discriminatory power. In the experiments, the proposed model, which incorporates depth information, achieved 99.88% accuracy and an F1 score of 93.45%, outperforming the baseline models, which used RGB data alone.
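The abstract does not name the specific large-margin loss. As an illustration of the general idea only, the sketch below implements an additive cosine-margin softmax in NumPy: the target-class cosine is reduced by a margin before the softmax, so the embedding must clear the decision boundary by that margin. The choice of this variant and the `margin` and `scale` values are assumptions.

```python
import numpy as np

def additive_margin_softmax_loss(embeddings, class_weights, labels,
                                 margin=0.35, scale=30.0):
    """Mean cross-entropy over cosine logits with a margin on the target class."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)
    cos = e @ w.T                                   # (N, C) cosine similarities
    rows = np.arange(len(labels))
    logits = scale * cos
    logits[rows, labels] = scale * (cos[rows, labels] - margin)
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[rows, labels].mean()

rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))         # 8 face embeddings (random stand-ins)
weights = rng.normal(size=(4, 16))     # 4 identity prototypes
labels = rng.integers(0, 4, size=8)
loss_plain = additive_margin_softmax_loss(emb, weights, labels, margin=0.0)
loss_margin = additive_margin_softmax_loss(emb, weights, labels, margin=0.35)
```

With `margin=0.0` this reduces to ordinary softmax cross-entropy on scaled cosines; a positive margin always yields a larger loss for the same embeddings, which is what pushes classes further apart during training.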

InstantDL: an easy-to-use deep learning pipeline for image segmentation and classification

BMC Bioinformatics ◽

10.1186/s12859-021-04037-3 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Dominik Jens Elias Waibel ◽

Sayedali Shetab Boushehri ◽

Carsten Marr

Keyword(s):

Image Processing ◽

Deep Learning ◽

Specific Problem ◽

State Of The Art ◽

Image Data ◽

Semantic Segmentation ◽

Parameter Tuning ◽

Cellular Processes ◽

Minimal Effort ◽

Instance Segmentation

Background: Deep learning contributes to uncovering molecular and cellular processes with highly performant algorithms. Convolutional neural networks have become the state-of-the-art tool to provide accurate and fast image data processing. However, published algorithms mostly solve only one specific problem and they typically require a considerable coding effort and machine learning background for their application. Results: We have thus developed InstantDL, a deep learning pipeline for four common image processing tasks: semantic segmentation, instance segmentation, pixel-wise regression and classification. InstantDL enables researchers with a basic computational background to apply debugged and benchmarked state-of-the-art deep learning algorithms to their own data with minimal effort. To make the pipeline robust, we have automated and standardized workflows and extensively tested it in different scenarios. Moreover, it allows assessing the uncertainty of predictions. We have benchmarked InstantDL on seven publicly available datasets, achieving competitive performance without any parameter tuning. For customization of the pipeline to specific tasks, all code is easily accessible and well documented. Conclusions: With InstantDL, we hope to empower biomedical researchers to conduct reproducible image processing with a convenient and easy-to-use pipeline.

Real-Time 3D object detection using improved convolutional neural network based on image-driven point cloud

Recent Advances in Electrical & Electronic Engineering (Formerly Recent Patents on Electrical & Electronic Engineering) ◽

10.2174/2352096514666211026142721 ◽

2021 ◽

Vol 14 ◽

Author(s):

Zhiyong Gao ◽

Jianhong Xiang

Keyword(s):

Neural Network ◽

Object Detection ◽

Convolutional Neural Network ◽

Real Time ◽

Point Cloud ◽

Point Clouds ◽

3D Point Cloud ◽

3D Object ◽

3D Object Detection ◽

Instance Segmentation

Background: When objects are detected directly from a 3D point cloud, the natural 3D patterns and invariances of the data are often obscured. Objective: This work studies 3D object detection from discrete, unordered and sparse 3D point clouds. Methods: The CNN is composed of a frustum sequence module, a 3D instance segmentation module (S-NET), a 3D point cloud transformation module (T-NET), and a 3D bounding box estimation module (E-NET). The frustum sequence module determines the search space of the object. The 3D instance segmentation module segments the point cloud into instances. The transformation module and the 3D bounding box estimation module determine the 3D coordinates of the object. Results: Evaluated on the KITTI benchmark dataset, our method outperforms the state of the art by remarkable margins while retaining real-time capability. Conclusion: We achieve real-time 3D object detection by proposing an improved convolutional neural network (CNN) based on image-driven point clouds.
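The abstract gives no detail on how the frustum module restricts the search space. In frustum-based detectors generally, a 2D image detection is lifted to a viewing frustum and only the points that project inside the 2D box are kept. The sketch below shows that selection step under an assumed pinhole camera; the function name, box format and intrinsics are hypothetical and not taken from the paper.

```python
import numpy as np

def frustum_mask(points, box, fx, fy, cx, cy):
    """Return a boolean mask of camera-frame points that project into a 2D box.

    points: (N, 3) array of x, y, z in the camera frame (z forward).
    box:    (u_min, v_min, u_max, v_max) in pixels.
    """
    u_min, v_min, u_max, v_max = box
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    in_front = z > 0
    z_safe = np.where(in_front, z, 1.0)   # avoid dividing by non-positive depth
    u = fx * x / z_safe + cx
    v = fy * y / z_safe + cy
    return in_front & (u >= u_min) & (u <= u_max) & (v >= v_min) & (v <= v_max)

cloud = np.array([
    [0.0, 0.0, 5.0],    # on the optical axis, inside the box
    [10.0, 0.0, 5.0],   # far to the right, outside the box
    [0.0, 0.0, -5.0],   # behind the camera
])
keep = frustum_mask(cloud, box=(300, 220, 340, 260),
                    fx=500.0, fy=500.0, cx=320.0, cy=240.0)
frustum_points = cloud[keep]
```

Only `frustum_points` would then be passed to the later segmentation and box-estimation stages, which is what keeps the per-object search space small.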

Registration and Fusion of UAV LiDAR System Sequence Images and Laser Point Clouds

Journal of Imaging Science and Technology ◽

10.2352/j.imagingsci.technol.2021.65.1.010501 ◽

2021 ◽

Vol 65 (1) ◽

pp. 10501-1-10501-9

Author(s):

Jiayong Yu ◽

Longchen Ma ◽

Maoyi Tian ◽

Xiushan Lu

Keyword(s):

Point Cloud ◽

Image Data ◽

Point Clouds ◽

Optical Image ◽

Point Cloud Data ◽

Feature Points ◽

LiDAR System ◽

Registration Accuracy ◽

Cloud Data ◽

Intensity Image

The unmanned aerial vehicle (UAV)-mounted mobile LiDAR system (ULS) is widely used in geomatics owing to its efficient data acquisition and convenient operation. However, because of the limited carrying capacity of a UAV, the sensors integrated in the ULS must be small and lightweight, which reduces the density of the collected scanning points. This affects registration between image data and point cloud data. To address this issue, the authors propose a method for registering and fusing ULS sequence images and laser point clouds, in which the problem of registering point cloud data and image data is converted into a problem of matching feature points between two images. First, a point cloud is selected to produce an intensity image. Subsequently, the corresponding feature points of the intensity image and the optical image are matched, and the exterior orientation parameters are solved using the collinearity equations based on image position and orientation. Finally, the sequence images are fused with the laser point cloud, based on the Global Navigation Satellite System (GNSS) time index of the optical image, to generate a true-color point cloud. The experimental results show that the proposed method achieves higher registration accuracy and fusion speed, demonstrating its effectiveness.
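Once exterior orientation parameters are known, generating a true-color point cloud amounts to projecting each laser point into the image and sampling the pixel it lands on. The sketch below illustrates that step with a computer-vision style pinhole projection (camera looking along +z), which is an assumed convention; photogrammetric collinearity equations are often written with different sign conventions. The rotation `R`, camera centre `C`, focal length `f` and principal point are hypothetical inputs, not parameters from the paper.

```python
import numpy as np

def colorize_points(points_w, image, R, C, f, cx, cy):
    """Assign each world point the color of the pixel it projects to.

    points_w: (N, 3) world coordinates.
    image:    (H, W, 3) color image.
    R:        (3, 3) world-to-camera rotation; C: (3,) camera centre in world frame.
    f, cx, cy: focal length and principal point in pixels.
    Returns (colors, valid), where invalid points keep a zero color.
    """
    pc = (points_w - C) @ R.T                  # world frame -> camera frame
    z = pc[:, 2]
    valid = z > 0
    z_safe = np.where(valid, z, 1.0)
    u = np.round(f * pc[:, 0] / z_safe + cx).astype(int)
    v = np.round(f * pc[:, 1] / z_safe + cy).astype(int)
    h, w = image.shape[:2]
    valid &= (u >= 0) & (u < w) & (v >= 0) & (v < h)
    colors = np.zeros((len(points_w), 3), dtype=image.dtype)
    colors[valid] = image[v[valid], u[valid]]
    return colors, valid

image = np.arange(4 * 4 * 3, dtype=np.uint8).reshape(4, 4, 3)
pts = np.array([[0.0, 0.0, 2.0],      # projects to pixel (u=1, v=2)
                [100.0, 0.0, 1.0]])   # projects outside the image
colors, valid = colorize_points(pts, image, R=np.eye(3), C=np.zeros(3),
                                f=1.0, cx=1.0, cy=2.0)
```

In a sequence-image setting, the image used for each point would be chosen by timestamp, which the abstract describes as the GNSS time index; that selection is not shown here.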

Growing Neural Forest-Based Color Quantization Applied to RGB Images

International Journal of Computer Vision and Image Processing ◽

10.4018/ijcvip.2017070102 ◽

2017 ◽

Vol 7 (3) ◽

pp. 13-25

Author(s):

Jesús Benito-Picazo ◽

Ezequiel López-Rubio ◽

Enrique Domínguez

Keyword(s):

Data Streams ◽

Image Data ◽

Color Quantization ◽

Perceptual Distortion ◽

Self Organized ◽

Novel Approach ◽

Image Quantization ◽

RGB Images ◽

Storage Technologies ◽

Future Work

Although recent improvements in both physical storage technologies and image handling techniques have eased image management, the large amount of information handled nowadays constantly demands more efficient ways to store and transmit image data streams. One alternative for this purpose is color quantization, which compresses an image by indexing its colors into a reduced palette with minimal perceptual distortion. In this context, artificial intelligence-based algorithms, and more specifically artificial neural networks, have become established as a powerful tool for unsupervised tasks and therefore for color quantization. In this work, a novel approach to color quantization is presented based on the Growing Neural Forest (GNF), a variation of the Growing Neural Gas (GNG) in which a set of trees is learnt instead of a general graph. Experimental results support the use of GNF for image quantization tasks, where it outperforms other self-organized models including SOM, GHSOM and GNG. Future work will include more datasets and additional competing models for comparison.
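Whatever model learns the palette, the indexing step of color quantization is the same: each pixel is replaced by the index of its nearest palette color. The sketch below shows only that step with a fixed, hand-picked two-color palette; it does not implement GNF, and the palette values are purely illustrative.

```python
import numpy as np

def quantize(image, palette):
    """Map each RGB pixel to its nearest palette entry (squared Euclidean distance).

    Returns the (H, W) index map and the (H, W, 3) reconstructed image.
    """
    pixels = image.reshape(-1, 3).astype(float)          # avoid uint8 overflow
    diff = pixels[:, None, :] - palette[None, :, :].astype(float)
    index = (diff ** 2).sum(axis=2).argmin(axis=1)       # nearest palette color
    quantized = palette[index].reshape(image.shape)
    return index.reshape(image.shape[:2]), quantized

image = np.array([[[250, 5, 5], [10, 10, 240]]], dtype=np.uint8)   # 1x2 image
palette = np.array([[255, 0, 0], [0, 0, 255]], dtype=np.uint8)     # red, blue
index_map, quantized = quantize(image, palette)
```

Storing `index_map` plus the small palette instead of full RGB triples is where the compression comes from; a learned model such as the one in the paper would supply a palette adapted to the image rather than a fixed one.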

U-Net-Id, an Instance Segmentation Model for Building Extraction from Satellite Images—Case Study in the Joanópolis City, Brazil

Remote Sensing ◽

10.3390/rs12101544 ◽

2020 ◽

Vol 12 (10) ◽

pp. 1544 ◽

Cited By ~ 5

Author(s):

Fabien H. Wagner ◽

Ricardo Dalagnol ◽

Yuliya Tarabalka ◽

Tassiana Y. F. Segantine ◽

Rogério Thomé ◽

...

Keyword(s):

Network Architecture ◽

Satellite Images ◽

Semantic Segmentation ◽

Developed Countries ◽

Building Extraction ◽

Neural Network Architecture ◽

RGB Images ◽

The City ◽

Instance Segmentation

There is a growing demand for individual building mapping in regions of rapid urban growth in less-developed countries. Most existing methods can segment buildings but cannot discriminate adjacent buildings. Here, we present a new convolutional neural network (CNN) architecture called U-net-id that performs building instance segmentation. The proposed network is trained with WorldView-3 satellite RGB images (0.3 m) and three different labeled masks. The first is the building mask; the second is the border mask, which is the border of the building segment with 4 pixels added outside and 3 pixels inside; and the third is the inner segment mask, which is the segment of the building diminished by 2 pixels. The architecture consists of three parallel paths, one for each mask, all starting with a U-net model. To accurately capture the overlap between the masks, all activation layers of the U-nets are copied and concatenated on each path and sent to two additional convolutional layers before the output activation layers. The method was tested with a dataset of 7563 manually delineated individual buildings of the city of Joanópolis-SP, Brazil. On this dataset, the semantic segmentation showed an overall accuracy of 97.67% and an F1-Score of 0.937, and the individual building instance segmentation showed good performance with a mean intersection over union (IoU) of 0.582 (median IoU = 0.694).
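The instance-level results above are reported as mean and median intersection over union (IoU). As a reminder of what the metric measures, the sketch below computes IoU for pairs of binary masks and then aggregates the scores. How predicted and reference buildings are paired in the paper is not stated in the abstract, so the pairing here is a toy assumption.

```python
import numpy as np

def mask_iou(pred, target):
    """Intersection over union of two boolean masks of the same shape."""
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return intersection / union if union else 0.0

a = np.zeros((4, 4), dtype=bool)
a[0:2, 0:2] = True            # 4-pixel reference building
b = np.zeros((4, 4), dtype=bool)
b[0:2, 1:3] = True            # 4-pixel prediction shifted by one column

ious = np.array([mask_iou(b, a), mask_iou(a, a)])   # partial overlap, perfect match
mean_iou = ious.mean()
median_iou = np.median(ious)
```

A median noticeably above the mean, as in the reported figures, usually indicates that most instances are segmented well while a minority of poor matches pulls the mean down.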

Semantic Labeling and Instance Segmentation of 3D Point Clouds Using Patch Context Analysis and Multiscale Processing

IEEE Transactions on Visualization and Computer Graphics ◽

10.1109/tvcg.2018.2889944 ◽

2020 ◽

Vol 26 (7) ◽

pp. 2485-2498 ◽

Cited By ~ 7

Author(s):

Shi-Min Hu ◽

Jun-Xiong Cai ◽

Yu-Kun Lai

Keyword(s):

Point Clouds ◽

Context Analysis ◽

3D Point Clouds ◽

Semantic Labeling ◽

Patch Context ◽

Instance Segmentation

A Dense Feature Pyramid Network-Based Deep Learning Model for Road Marking Instance Segmentation Using MLS Point Clouds

IEEE Transactions on Geoscience and Remote Sensing ◽

10.1109/tgrs.2020.2996617 ◽

2020 ◽

pp. 1-17 ◽

Cited By ~ 1

Author(s):

Siyun Chen ◽

Zhenxin Zhang ◽

Ruofei Zhong ◽

Liqiang Zhang ◽

Hao Ma ◽

...

Keyword(s):

Deep Learning ◽

Point Clouds ◽

Learning Model ◽

Feature Pyramid ◽

Road Marking ◽

Deep Learning Model ◽

Instance Segmentation

Kinect Depth Data Segmentation Based on Gauss Mixture Model Clustering

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.760-762.1556 ◽

2013 ◽

Vol 760-762 ◽

pp. 1556-1561

Author(s):

Ting Wei Du ◽

Bo Liu

Keyword(s):

Mixture Model ◽

Point Cloud ◽

Three Dimensional ◽

Image Data ◽

Point Clouds ◽

Gaussian Mixture ◽

Depth Image ◽

Cloud Data ◽

Mixture Model Clustering ◽

Gauss Mixture

Indoor scene understanding based on depth image data is a cutting-edge problem in three-dimensional computer vision. Taking into account the layout characteristics of indoor scenes and the abundance of planar features in them, this paper presents a depth image segmentation method based on Gaussian mixture model clustering. First, the Kinect depth image data are transformed into a point cloud of discrete three-dimensional points, which is then denoised and down-sampled. Second, the normal of every point in the cloud is computed, and the normals are clustered using a Gaussian mixture model. Finally, the whole point cloud is segmented with the RANSAC algorithm. Experimental results show that the segmented regions have clear boundaries and that the segmentation quality is above average, laying a good foundation for object recognition.
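The final step of the pipeline above relies on RANSAC. As a minimal sketch of that step alone, the code below fits a single plane to a point cloud by repeatedly sampling three points and keeping the plane with the most inliers. It omits the normal estimation and Gaussian mixture clustering stages, and the iteration count and distance threshold are assumed values, not the paper's settings.

```python
import numpy as np

def ransac_plane(points, n_iters=200, threshold=0.01, seed=0):
    """Fit a plane n.x + d = 0 to (N, 3) points; return ((n, d), inlier_mask)."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    best_plane = None
    for _ in range(n_iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:                  # degenerate (collinear) sample
            continue
        normal = normal / norm
        d = -normal @ p0
        inliers = np.abs(points @ normal + d) < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_plane = inliers, (normal, d)
    return best_plane, best_inliers

# Synthetic scene: 50 points on the floor plane z = 0 plus 10 points above it.
rng = np.random.default_rng(1)
floor = np.column_stack([rng.uniform(-1, 1, 50), rng.uniform(-1, 1, 50), np.zeros(50)])
clutter = np.column_stack([rng.uniform(-1, 1, 10), rng.uniform(-1, 1, 10),
                           rng.uniform(0.5, 1.5, 10)])
cloud = np.vstack([floor, clutter])
(normal, d), inliers = ransac_plane(cloud)
```

In a pipeline like the one described, the normal clusters would first group points by orientation, and a fit of this kind would then be run within each group to separate parallel planes such as a floor and a tabletop.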
