Estimation of 6D Object Pose Using a 2D Bounding Box

Sensors ◽

10.3390/s21092939 ◽

2021 ◽

Vol 21 (9) ◽

pp. 2939

Author(s):

Yong Hong ◽

Jin Liu ◽

Zahid Jahangir ◽

Sheng He ◽

Qing Zhang

Keyword(s):

Neural Network ◽

Loss Function ◽

Three Dimensional ◽

Unit Vector ◽

Prediction Algorithm ◽

Computational Time ◽

Bounding Box ◽

Dimensional Unit ◽

Bounding Boxes ◽

RGB Image

This paper presents an efficient method for estimating the 6-dimensional (6D) pose of objects from an RGB image. A quaternion is used to define an object's three-dimensional pose, but the poses represented by q and -q are equivalent even though the L2 loss between them is very large. We therefore define a new quaternion pose loss function to resolve this ambiguity. Based on this loss, we design a new convolutional neural network, named Q-Net, to estimate an object's pose. Because a quaternion that represents a rotation is a unit vector, a normalization layer is added to Q-Net to keep the pose output on the unit sphere in four-dimensional space. We also propose a new algorithm, called the Bounding Box Equation, to obtain the 3D translation quickly and effectively from 2D bounding boxes. The algorithm offers a new way of estimating the 3D rotation (R) and 3D translation (t) from only one RGB image. This method can upgrade any traditional 2D-box prediction algorithm to a 3D prediction model. We evaluated our model on the LineMod dataset, and the experiments show that our method is more effective and efficient, as measured by L2 loss and computational time.
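
The abstract does not give the form of the new quaternion loss, so the following is only a minimal sketch of one common sign-invariant choice, not the authors' Q-Net loss. It assumes quaternions are plain 4-tuples; the helper names `normalize`, `l2_loss`, and `sign_invariant_loss` are hypothetical. It shows why the plain L2 loss penalizes q against -q even though both describe the same rotation, and how taking the minimum over both signs removes that penalty.

```python
import math


def normalize(q):
    """Scale a 4-vector to unit length so it lies on the unit sphere."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero quaternion")
    return tuple(c / norm for c in q)


def l2_loss(q_pred, q_true):
    """Plain squared L2 distance between two quaternions."""
    return sum((a - b) ** 2 for a, b in zip(q_pred, q_true))


def sign_invariant_loss(q_pred, q_true):
    """Assumed loss: smaller of the L2 distances to q_true and to -q_true.

    Because q and -q encode the same rotation, the loss is zero for both.
    """
    q_pred = normalize(q_pred)
    q_true = normalize(q_true)
    q_true_neg = tuple(-c for c in q_true)
    return min(l2_loss(q_pred, q_true), l2_loss(q_pred, q_true_neg))


if __name__ == "__main__":
    q = (1.0, 0.0, 0.0, 0.0)
    minus_q = (-1.0, 0.0, 0.0, 0.0)
    print(l2_loss(q, minus_q))              # large, despite identical rotation
    print(sign_invariant_loss(q, minus_q))  # zero under the assumed loss
```

The `normalize` step plays the role the abstract assigns to the normalization layer: whatever the raw network output is, the value compared against the ground truth is a unit quaternion.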

Gauge freedom, anholonomy, and Hopf index of a three-dimensional unit vector field

Physical Review B ◽

10.1103/physrevb.47.5438 ◽

1993 ◽

Vol 47 (9) ◽

pp. 5438-5441 ◽

Cited By ~ 6

Author(s):

Radha Balakrishnan ◽

A. R. Bishop ◽

R. Dandoloff

Keyword(s):

Vector Field ◽

Three Dimensional ◽

Unit Vector ◽

Gauge Freedom ◽

Unit Vector Field ◽

Dimensional Unit

Determination of Vehicle Trajectory through Optimization of Vehicle Bounding Boxes Using a Convolutional Neural Network

Sensors ◽

10.3390/s19194263 ◽

2019 ◽

Vol 19 (19) ◽

pp. 4263 ◽

Cited By ~ 4

Author(s):

Seong ◽

Song ◽

Yoon ◽

Kim ◽

Choi

Keyword(s):

Neural Network ◽

Object Detection ◽

Convolutional Neural Network ◽

Moving Vehicle ◽

Bounding Box ◽

Vehicle Location ◽

Vehicle Trajectory ◽

Bounding Boxes ◽

Conventional Algorithm

In this manuscript, a new method for the determination of vehicle trajectories using an optimal bounding box for the vehicle is developed. The vehicle trajectory is extracted using images acquired from a camera installed at an intersection based on a convolutional neural network (CNN). First, real-time vehicle object detection is performed using the YOLOv2 model, which is one of the most representative object detection algorithms based on CNN. To overcome the inaccuracy of the vehicle location extracted by YOLOv2, the trajectory was calibrated using a vehicle tracking algorithm such as a Kalman filter and intersection-over-union (IOU) tracker. In particular, we attempted to correct the vehicle trajectory by extracting the center position based on the geometric characteristics of a moving vehicle according to the bounding box. The quantitative and qualitative evaluations indicate that the proposed algorithm can detect the trajectories of moving vehicles better than the conventional algorithm. Although the center points of the bounding boxes obtained using the existing conventional algorithm are often outside of the vehicle due to the geometric displacement of the camera, the proposed technique can minimize positional errors and extract the optimal bounding box to determine the vehicle location.
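
As a rough illustration of the intersection-over-union (IOU) matching that such a tracker relies on, the sketch below computes the IoU of two axis-aligned boxes and picks the detection that best overlaps an existing track. It is an assumption-laden example, not the paper's algorithm: boxes are taken to be `(x1, y1, x2, y2)` tuples, and `iou`, `match_track`, and the `threshold` parameter are hypothetical names.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_track(track_box, detections, threshold=0.5):
    """Return the index of the detection with the highest IoU against
    track_box, or None if no detection reaches the threshold."""
    best_index, best_score = None, threshold
    for index, det in enumerate(detections):
        score = iou(track_box, det)
        if score >= best_score:
            best_index, best_score = index, score
    return best_index


if __name__ == "__main__":
    previous = (0.0, 0.0, 2.0, 2.0)
    current = [(5.0, 5.0, 6.0, 6.0), (0.0, 0.0, 2.0, 3.0)]
    print(match_track(previous, current))
```

A tracker of this kind links a vehicle across frames purely by box overlap; the center-position correction described in the abstract would be applied on top of the matched boxes.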

Semantic Segmentation of 3D Medical Images with 3D Convolutional Neural Networks

CLEI electronic journal ◽

10.19153/cleiej.23.1.4 ◽

2020 ◽

Vol 23 (1) ◽

Author(s):

Alejandra Márquez Herrera ◽

Alex J. Cuadros-Vargas ◽

Helio Pedrini

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Loss Function ◽

Medical Images ◽

Three Dimensional ◽

Ground Truth ◽

Semantic Segmentation ◽

Volumetric Medical Images ◽

Magnetic Resonance Imaging (MRI) ◽

3D CNN

A neural network is a mathematical model that can perform a task automatically or semi-automatically after learning from the human knowledge provided to it. A Convolutional Neural Network (CNN) is a type of neural network that has been shown to learn image analysis tasks efficiently, such as image segmentation, whose main purpose is to find regions or separable objects within an image. A more specific type of segmentation, called semantic segmentation, guarantees that each region has a semantic meaning by giving it a label or class. Because CNNs can automate semantic image segmentation, they have been very useful in medicine, where they are applied to the segmentation of organs or abnormalities (tumors). This work aims to improve binary semantic segmentation of volumetric medical images acquired by Magnetic Resonance Imaging (MRI) using a pre-existing Three-Dimensional Convolutional Neural Network (3D CNN) architecture. We propose a loss function for training this 3D CNN in order to improve pixel-wise segmentation results. The loss function adapts a similarity coefficient, normally used to measure the spatial overlap between the prediction and the ground truth, so that it can be used to train the network. As a contribution, the developed approach achieved good performance in a context where the pixel classes are imbalanced. We show how the choice of the loss function for training can affect the final quality of the segmentation. We validate our proposal on two medical image semantic segmentation datasets and compare the performance of the proposed loss function with that of other pre-existing loss functions used for binary semantic segmentation.
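
The abstract does not name the similarity coefficient it adapts, so the sketch below uses the Dice coefficient purely as an assumed example of an overlap-based loss; it is not the authors' formulation. The function names `soft_dice_loss` and `pixel_accuracy` and the smoothing term `eps` are hypothetical. Predictions and targets are flattened lists of per-voxel foreground probabilities and 0/1 labels.

```python
def soft_dice_loss(pred, target, eps=1e-6):
    """1 - Dice overlap between predicted probabilities and binary labels.

    eps keeps the ratio defined when both prediction and target are empty.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def pixel_accuracy(pred, target, cutoff=0.5):
    """Fraction of voxels whose thresholded prediction matches the label."""
    hits = sum(1 for p, t in zip(pred, target) if (p >= cutoff) == (t == 1))
    return hits / len(target)


if __name__ == "__main__":
    # Imbalanced volume: 1 foreground voxel out of 100.
    target = [1] + [0] * 99
    all_background = [0.0] * 100
    print(pixel_accuracy(all_background, target))  # high despite missing the object
    print(soft_dice_loss(all_background, target))  # close to 1, the worst case
```

The example illustrates the imbalance problem mentioned in the abstract: a prediction that ignores the foreground entirely still scores high pixel accuracy, while an overlap-based loss rates it as nearly the worst possible output.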

Three-Dimensional Cognitive Mapping With a Neural Network

PsycEXTRA Dataset ◽

10.1037/e501882009-547 ◽

2000 ◽

Author(s):

Horatiu Voicu ◽

Nestor Schmajuk

Keyword(s):

Neural Network ◽

Cognitive Mapping ◽

Three Dimensional

Digitalization system of ancient architecture decoration art based on neural network and image features

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189251 ◽

2020 ◽

pp. 1-12

Author(s):

Wu Xin ◽

Qiu Daping

Keyword(s):

Neural Network ◽

Construction Industry ◽

Three Dimensional ◽

Performance Testing ◽

Image Features ◽

Three Dimensional Model ◽

Performance Effect ◽

Data Process ◽

And Performance ◽

Construction Mode

The inheritance and innovation of ancient architecture decoration art is an important path for the development of the construction industry. Traditional data processing for ancient architecture decoration art is relatively outdated, which leads to obvious distortion when the art is digitized. To improve the digitalization of ancient architecture decoration art, this paper combines image features with a neural network to construct a data system model for ancient architecture decoration art, and graphically expresses the static construction mode and dynamic construction process of the architecture group. On this basis, three-dimensional model reconstruction and scene simulation experiments of architecture groups are carried out. The performance of the proposed system is verified through simulation and performance testing, and the data are visualized using statistical methods. The results of the study show that the proposed digitalization of ancient architecture decoration art performs well.

Automatic accounting of Baikal diatomic algae: approaches and prospects

Issues of modern algology (Вопросы современной альгологии) ◽

10.33624/2311-0147-2019-2(20)-295-299 ◽

2019 ◽

pp. 295-299

Author(s):

Konstantin A. Elshin ◽

Elena I. Molchanova ◽

Marina V. Usoltseva ◽

Yelena V. Likhoshway

Keyword(s):

Object Detection ◽

Loss Function ◽

Classification Accuracy ◽

Diatom Species ◽

Bounding Box ◽

Synedra Acus ◽

And Training

Using the TensorFlow Object Detection API, an approach to identifying and registering the Baikal diatom species Synedra acus subsp. radians was tested. A set of images was assembled and training was conducted. After 15,000 training iterations, the total value of the loss function was 0.04. The classification accuracy was 95%, and the accuracy of bounding box construction was also 95%.

IMPROVEMENT OF COMPUTATIONAL TIME IN RADIATIVE HEAT TRANSFER OF THREE-DIMENSIONAL PARTICIPATING MEDIA USING THE RADIATION ELEMENT METHOD

ICHMT Third International Symposium on Radiative Transfer ◽

10.1615/ichmt.2001.radiationsymp.150 ◽

2001 ◽

Author(s):

Yuhei Takeuchi ◽

Shigenao Maruyama ◽

Seigo Sakai ◽

Zhixiong Guo

Keyword(s):

Heat Transfer ◽

Radiative Heat Transfer ◽

Radiative Heat ◽

Three Dimensional ◽

Computational Time ◽

Participating Media ◽

Element Method

A Proof-of-Concept Algorithm for the Retrieval of Total Column Amount of Trace Gases in a Multi-Dimensional Atmosphere

Remote Sensing ◽

10.3390/rs13020270 ◽

2021 ◽

Vol 13 (2) ◽

pp. 270

Author(s):

Adrian Doicu ◽

Dmitry S. Efremenko ◽

Thomas Trautmann

Keyword(s):

Trace Gases ◽

Three Dimensional ◽

Principal Component ◽

Radiative Transfer Model ◽

Computational Time ◽

Transfer Model ◽

Proof Of Concept ◽

Discrete Ordinate ◽

Distribution Method ◽

Total Column

An algorithm for the retrieval of the total column amount of trace gases in a multi-dimensional atmosphere is designed. The algorithm uses (i) certain differential radiance models with internal and external closures as inversion models, (ii) the iteratively regularized Gauss–Newton method as a regularization tool, and (iii) the spherical harmonics discrete ordinate method (SHDOM) as the linearized radiative transfer model. For efficiency, SHDOM is equipped with a spectral acceleration approach that combines the correlated k-distribution method with principal component analysis. The algorithm is used to retrieve the total column amount of nitrogen for two- and three-dimensional cloudy scenes. Although the computational time is high for three-dimensional geometries, the main concepts of the algorithm are correct and the retrieval results are accurate.

Automatic method for glaucoma diagnosis using a three-dimensional convoluted neural network

Neurocomputing ◽

10.1016/j.neucom.2020.07.146 ◽

2021 ◽

Vol 438 ◽

pp. 72-83

Author(s):

Nonato Rodrigues de Sales Carvalho ◽

Maria da Conceição Leal Carvalho Rodrigues ◽

Antonio Oseas de Carvalho Filho ◽

Mano Joseph Mathew

Keyword(s):

Neural Network ◽

Three Dimensional ◽

Automatic Method ◽

Glaucoma Diagnosis

Simulation of Thermal and Electric Field Distribution in Packaged Sausages Heated in a Stationary Versus a Rotating Microwave Oven

Foods ◽

10.3390/foods10071622 ◽

2021 ◽

Vol 10 (7) ◽

pp. 1622

Author(s):

Wipawee Tepnatim ◽

Witchuda Daud ◽

Pitiya Kamonpatana

Keyword(s):

Temperature Distribution ◽

Electric Field ◽

Three Dimensional ◽

Development Stage ◽

Microwave Oven ◽

Computational Time ◽

Plastic Package ◽

Heating Uniformity ◽

The Cost ◽

Good Agreement

The microwave oven has become a standard appliance to reheat or cook meals in households and convenience stores. However, the main problem of microwave heating is the non-uniform temperature distribution, which may affect food quality and health safety. A three-dimensional mathematical model was developed to simulate the temperature distribution of four ready-to-eat sausages in a plastic package in a stationary versus a rotating microwave oven, and the model was validated experimentally. COMSOL software was applied to predict sausage temperatures at different orientations for the stationary microwave model, whereas COMSOL and COMSOL in combination with MATLAB software were used for a rotating microwave model. A sausage orientation at 135° with the waveguide was similar to that using the rotating microwave model regarding uniform thermal and electric field distributions. Both rotating models provided good agreement between the predicted and actual values and had greater precision than the stationary model. In addition, the computational time using COMSOL in combination with MATLAB was reduced by 60% compared to COMSOL alone. Consequently, the models could assist food producers and associations in designing packaging materials to prevent leakage of the packaging compound, developing new products and applications to improve product heating uniformity, and reducing the cost and time of the research and development stage.
