RGB-D-Based Pose Estimation of Workpieces with Semantic Segmentation and Point Cloud Registration

Sensors ◽  
2019 ◽  
Vol 19 (8) ◽  
pp. 1873 ◽  
Author(s):  
Hui Xu ◽  
Guodong Chen ◽  
Zhenhua Wang ◽  
Lining Sun ◽  
Fan Su

As an important part of a factory’s automated production line, industrial robots can perform a variety of tasks by integrating external sensors. Among these tasks, grasping scattered workpieces on the industrial assembly line has always been a prominent and difficult point in robot manipulation research. Using RGB-D (color and depth) information, we propose an efficient and practical solution that fuses semantic segmentation and point cloud registration to perform object recognition and pose estimation. Unlike objects in an indoor environment, workpieces have relatively simple characteristics; thus, we create and label an RGB image dataset from a variety of industrial scenarios and train a modified FCN (Fully Convolutional Network) on this homemade dataset to infer the semantic segmentation of the input images. Then, we determine the point cloud of the workpieces by incorporating the depth information, and estimate the real-time pose of the workpieces. To evaluate the accuracy of the solution, we propose a novel pose error evaluation method based on the robot vision system. This method does not rely on expensive measuring equipment and still obtains accurate evaluation results. In an industrial scenario, our solution achieves a rotation error of less than two degrees and a translation error of less than 10 mm.
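The pose error metrics in the last sentence can be made concrete. As a minimal sketch (not the paper's vision-based evaluation protocol itself), the rotation error between an estimated and a ground-truth pose is the geodesic angle between their rotation matrices, and the translation error is a Euclidean distance:

```python
import numpy as np

def rotation_error_deg(R_est, R_gt):
    """Geodesic angle (degrees) between two 3x3 rotation matrices."""
    cos_theta = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    cos_theta = np.clip(cos_theta, -1.0, 1.0)  # guard against numerical drift
    return np.degrees(np.arccos(cos_theta))

def translation_error(t_est, t_gt):
    """Euclidean distance between translations (same units as input, e.g. mm)."""
    return float(np.linalg.norm(np.asarray(t_est) - np.asarray(t_gt)))
```

A rotation error under two degrees and a translation error under 10 mm, as reported above, would correspond to these two quantities staying below those thresholds.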

2021 ◽  
Vol 13 (5) ◽  
pp. 1003
Author(s):  
Nan Luo ◽  
Hongquan Yu ◽  
Zhenfeng Huo ◽  
Jinhui Liu ◽  
Quan Wang ◽  
...  

Semantic segmentation of sensed point cloud data plays a significant role in scene understanding and reconstruction, robot navigation, etc. This work presents a Graph Convolutional Network integrating K-Nearest Neighbor (KNN) searching and the Vector of Locally Aggregated Descriptors (VLAD). KNN searching is used to construct the topological graph of each point and its neighbors. Then, we perform convolution on the edges of the constructed graph to extract representative local features with multiple Multilayer Perceptrons (MLPs). Afterwards, a trainable VLAD layer, NetVLAD, is embedded in the feature encoder to aggregate local and global contextual features. The designed feature encoder is repeated multiple times, and the extracted features are concatenated in a skip-connection style to strengthen the distinctiveness of the features and thereby improve segmentation. Experimental results on two datasets show that the proposed work addresses the shortcoming of insufficient local feature extraction and improves the accuracy of semantic segmentation (mIoU 60.9% and oAcc 87.4% on S3DIS) compared to existing models.
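The graph-construction step described above can be sketched in a few lines. This is a generic illustration of building the kNN topological graph and EdgeConv-style edge features [x_i, x_j − x_i] over it, not the paper's exact network:

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbours of each point (excluding itself)."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)  # (N, N) squared distances
    np.fill_diagonal(d2, np.inf)          # a point is not its own neighbour
    return np.argsort(d2, axis=1)[:, :k]  # (N, k)

def edge_features(points, k):
    """EdgeConv-style features [x_i, x_j - x_i] on the kNN graph, shape (N, k, 6)."""
    idx = knn_indices(points, k)                     # (N, k)
    neighbours = points[idx]                         # (N, k, 3)
    centre = np.repeat(points[:, None, :], k, axis=1)
    return np.concatenate([centre, neighbours - centre], axis=-1)
```

In the full model, these per-edge features would be passed through shared MLPs and max-pooled over the k neighbours before NetVLAD aggregation.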


2019 ◽  
Vol 9 (16) ◽  
pp. 3273 ◽  
Author(s):  
Wen-Chung Chang ◽  
Van-Toan Pham

This paper develops a registration architecture for estimating the relative pose, including the rotation and the translation, of an object with respect to a model in 3-D space, based on 3-D point clouds captured by a 3-D camera. In particular, this paper addresses the time-consuming nature of 3-D point cloud registration, which is essential for closed-loop industrial automated assembly systems that demand accurate pose estimation in fixed time. Firstly, two different descriptors are developed to extract coarse and detailed features of the point cloud data sets, creating training data sets across diversified orientations. Secondly, to guarantee fast pose estimation in fixed time, a novel registration architecture employing two consecutive convolutional neural network (CNN) models is proposed. After training, the proposed CNN architecture estimates the rotation between the model point cloud and a data point cloud, followed by translation estimation based on computing average values. Because the second CNN model covers a smaller range of orientation uncertainty than the full range covered by the first, it can precisely estimate the orientation of the 3-D point cloud. Finally, the performance of the proposed algorithm has been validated by experiments in comparison with baseline methods. Based on these results, the proposed algorithm significantly reduces the estimation time while maintaining high precision.
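The "translation estimation based on computing average values" admits a simple reading: once the rotation is known, the translation is the difference between the data centroid and the rotated model centroid. A sketch of that step (one common interpretation, not necessarily the authors' exact computation):

```python
import numpy as np

def estimate_translation(model_pts, data_pts, R):
    """Given an estimated rotation R (3x3), recover the translation as the
    difference of point-cloud centroids: t = mean(data) - R @ mean(model).
    Exact when data = R @ model + t with full point correspondence."""
    return data_pts.mean(axis=0) - R @ model_pts.mean(axis=0)
```

This works because averaging commutes with the rigid transform: if every data point is R·m + t, the data centroid is R·(model centroid) + t.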


2021 ◽  
Author(s):  
Jing Li ◽  
Jialin Yin ◽  
Lin Deng

In the development of modern agriculture, the intelligent use of mechanical equipment is one of the main signs of agricultural modernization. Navigation technology is the key technology that allows agricultural machinery to operate autonomously in its environment, and it is a hotspot in research on intelligent agricultural machinery. To meet the accuracy requirements of autonomous navigation for intelligent agricultural robots, this paper proposes a visual navigation algorithm based on deep learning image understanding. The method first processes images collected by the vision system using a cascaded deep convolutional network and a hybrid dilated convolution fusion method. It then extracts the navigation route from the processed images using an improved Hough transform algorithm, while the posture of the agricultural robot is adjusted to realize autonomous navigation. Finally, the proposed method is verified in both interference-free and noisy experimental scenes. Experimental results show that the method can perform autonomous navigation in complex and noisy environments, with good practicability and applicability.


2019 ◽  
Vol 55 (20) ◽  
pp. 1088-1090
Author(s):  
Jian Lu ◽  
Tong Liu ◽  
Maoxin Luo ◽  
Haozhe Cheng ◽  
Kaibing Zhang

2021 ◽  
Vol 13 (18) ◽  
pp. 3651
Author(s):  
Weiqi Wang ◽  
Xiong You ◽  
Xin Zhang ◽  
Lingyu Chen ◽  
Lantian Zhang ◽  
...  

Facing the realistic demands of robot application environments, simultaneous localisation and mapping (SLAM) has gradually moved from static environments to complex dynamic ones. Traditional SLAM methods, however, usually suffer pose estimation deviations caused by data-association errors arising from dynamic elements in the environment. The present study effectively addresses this problem by proposing a SLAM approach based on light detection and ranging (LiDAR) under semantic constraints in dynamic environments. Four main modules handle the projection of point cloud data, semantic segmentation, dynamic element screening, and semantic map construction. A LiDAR point cloud semantic segmentation network, SANet, based on a spatial attention mechanism is proposed, which significantly improves the real-time performance and accuracy of point cloud semantic segmentation. A dynamic element selection algorithm is designed and combined with prior knowledge to significantly reduce the pose estimation deviations caused by dynamic elements. Experiments conducted on the public datasets SemanticKITTI, KITTI, and SemanticPOSS show that the accuracy and robustness of the proposed approach are significantly improved.
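The first module, projection of point cloud data, typically means a spherical (range-image) projection of the LiDAR scan so that a 2-D segmentation network can process it. A minimal sketch under assumed sensor parameters (64 rows, 1024 columns, a +3°/−25° vertical field of view — all illustrative values, not taken from the paper):

```python
import numpy as np

def spherical_projection(points, h=64, w=1024, fov_up=3.0, fov_down=-25.0):
    """Map 3-D LiDAR points (N, 3) to (row, col) pixels of an h x w range
    image: column from azimuth (yaw), row from elevation (pitch)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rng = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                         # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / rng, -1.0, 1.0)) # elevation
    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    u = 0.5 * (1.0 - yaw / np.pi) * w                      # column
    v = (fov_up_r - pitch) / (fov_up_r - fov_down_r) * h   # row
    u = np.clip(np.floor(u), 0, w - 1).astype(int)
    v = np.clip(np.floor(v), 0, h - 1).astype(int)
    return v, u, rng
```

Per-pixel labels predicted on this image can then be projected back to the 3-D points using the same (v, u) indices.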


2018 ◽  
Vol 3 (4) ◽  
pp. 2942-2949 ◽  
Author(s):  
Anestis Zaganidis ◽  
Li Sun ◽  
Tom Duckett ◽  
Grzegorz Cielniak

2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Ba-Phuc Huynh ◽  
Yong-Lin Kuo

One of the problems with industrial robots is accurately locating the pose of the end-effector. Over the years, many solutions have been studied, including static calibration and dynamic positioning. This paper presents a novel approach for pose estimation of a Hexa parallel robot. The vision system uses three simple color feature points fixed on the surface of the end-effector to measure the pose of the robot. The Intel RealSense D435i camera provides 3D measurement of the feature points, offering a cheap solution with high positioning accuracy. Based on the constraints of the three color feature points, the pose of the end-effector, including position and orientation, is determined. A dynamic hybrid filter is designed to correct the vision-based pose measurement: a complementary filter eliminates image-processing noise caused by environmental light-source interference, and an unscented Kalman filter smooths the vision system's pose estimate based on the robot's kinematic parameters. Combining the two filters in the same control scheme increases stability and improves the accuracy of the robot's positioning. Simulations, experiments, and comparisons demonstrate the effectiveness and feasibility of the proposed method.
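The complementary-filter half of the hybrid scheme can be illustrated with a scalar sketch: the kinematics-based prediction is trusted at high frequency, while the noisy vision measurement pulls the estimate back at low frequency. The weight `alpha`, the scalar state, and the update law are illustrative assumptions, not the paper's parameters:

```python
import numpy as np

def fuse_sequence(vision, kinematic_rate, dt=0.01, alpha=0.98):
    """Scalar complementary filter over a measurement sequence: integrate
    the kinematic rate for smooth short-term motion (weight alpha) and
    blend in the noisy vision measurement for long-term correction."""
    est = vision[0]
    out = []
    for z, rate in zip(vision, kinematic_rate):
        est = alpha * (est + rate * dt) + (1.0 - alpha) * z
        out.append(est)
    return np.array(out)
```

In the paper's scheme the smoothed measurement would then feed an unscented Kalman filter driven by the robot's kinematic model; the sketch covers only the complementary stage.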

