Latent 3D Volume for Joint Depth Estimation and Semantic Segmentation from a Single Image

Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5765 ◽  
Author(s):  
Seiya Ito ◽  
Naoshi Kaneko ◽  
Kazuhiko Sumi

This paper proposes a novel 3D representation, namely, a latent 3D volume, for joint depth estimation and semantic segmentation. Most previous studies encoded an input scene (typically given as a 2D image) into a set of feature vectors arranged over a 2D plane. However, since the real world is three-dimensional, this 2D arrangement discards one dimension and may limit the capacity of the feature representation. In contrast, we examine the idea of arranging the feature vectors in 3D space rather than in a 2D plane. We refer to this 3D volumetric arrangement as a latent 3D volume. We show that the latent 3D volume is beneficial to the tasks of depth estimation and semantic segmentation because these tasks require an understanding of the 3D structure of the scene. Our network first constructs an initial 3D volume using image features and then generates the latent 3D volume by passing the initial 3D volume through several 3D convolutional layers. We perform depth regression and semantic segmentation by projecting the latent 3D volume onto a 2D plane. The evaluation results show that our method outperforms previous approaches on the NYU Depth v2 dataset.
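As a concrete illustration of the idea, the following PyTorch sketch lifts 2D image features into a voxel grid, refines them with 3D convolutions, and projects the volume back onto a 2D plane for the two heads. The layer widths, the toy encoder, and the 40-class output are our assumptions, not the authors' architecture.

    import torch
    import torch.nn as nn

    class LatentVolumeNet(nn.Module):
        """Sketch: lift 2D image features to a 3D volume, refine with 3D
        convolutions, then flatten the depth axis and predict per-pixel
        depth and semantics."""
        def __init__(self, feat_ch=32, depth_bins=16, num_classes=40):
            super().__init__()
            self.encoder = nn.Sequential(            # toy 2D feature extractor
                nn.Conv2d(3, feat_ch * depth_bins, 3, padding=1), nn.ReLU())
            self.refine3d = nn.Sequential(           # produces the latent 3D volume
                nn.Conv3d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
                nn.Conv3d(feat_ch, feat_ch, 3, padding=1), nn.ReLU())
            self.depth_head = nn.Conv2d(feat_ch * depth_bins, 1, 1)
            self.seg_head = nn.Conv2d(feat_ch * depth_bins, num_classes, 1)
            self.feat_ch, self.depth_bins = feat_ch, depth_bins

        def forward(self, img):                      # img: (B, 3, H, W)
            b, _, h, w = img.shape
            feats = self.encoder(img)                # (B, C*D, H, W)
            vol = feats.view(b, self.feat_ch, self.depth_bins, h, w)
            vol = self.refine3d(vol)                 # latent 3D volume
            flat = vol.view(b, -1, h, w)             # project depth axis to 2D
            return self.depth_head(flat), self.seg_head(flat)

A full model would use a deeper encoder and an upsampling decoder, but the lifting, 3D refinement, and projection steps are the essence of the volumetric arrangement.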

Sensors ◽  
2019 ◽  
Vol 19 (3) ◽  
pp. 563 ◽  
Author(s):  
J. Osuna-Coutiño ◽  
Jose Martinez-Carranza

High-Level Structure (HLS) extraction in a set of images consists of recognizing 3D elements that carry useful information for the user or application. There are several approaches to HLS extraction. However, most of them process two or more images captured from different camera views, or process 3D data in the form of point clouds extracted from the camera images. In contrast, motivated by the extensive work on depth estimation from a single image, where parallax constraints are not required, we propose a novel methodology for HLS extraction from a single image, with promising results. Our method has four steps. First, we use a CNN to predict the depth for a single image. Second, we propose a region-wise analysis to refine the depth estimates. Third, we introduce a graph analysis to segment the depth into semantic orientations, aiming at identifying potential HLS. Finally, the depth sections are provided to a new CNN architecture that predicts HLS in the shape of cubes and rectangular parallelepipeds.
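As one hedged illustration of the second step, region-wise refinement can be as simple as snapping each region's depth values to the region median; the sketch below uses superpixels as a stand-in for the paper's region analysis.

    import numpy as np
    from skimage.segmentation import slic

    def refine_depth_regionwise(image, depth, n_segments=200):
        """Illustrative region-wise refinement: replace each region's depths
        with the region median. Superpixels and the median rule are our
        stand-ins, not the paper's exact analysis."""
        regions = slic(image, n_segments=n_segments, start_label=0)
        refined = depth.copy()
        for r in np.unique(regions):
            mask = regions == r
            refined[mask] = np.median(depth[mask])
        return refined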


Author(s):  
N. Zeller ◽  
C. A. Noury ◽  
F. Quint ◽  
C. Teulière ◽  
U. Stilla ◽  
...  

In this paper we present a new calibration approach for focused plenoptic cameras. We derive a new mathematical projection model of a focused plenoptic camera which considers lateral as well as depth distortion. To this end, we derive a new depth distortion model directly from the theory of depth estimation in a focused plenoptic camera. In total, the model comprises five intrinsic parameters, the parameters for radial and tangential distortion in the image plane, and two new depth distortion parameters. In the proposed calibration we perform a complete bundle adjustment based on a 3D calibration target. The residual of our optimization approach is three-dimensional; the depth residual is defined as a scaled version of the inverse virtual depth difference and thus conforms well to the measured data. Our method is evaluated on different camera setups and shows good accuracy. To better characterize our approach, we also evaluate the accuracy of virtual image points projected back to 3D space.
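The three-dimensional residual can be sketched as follows, with our own variable names: v denotes virtual depth and s the scale factor mentioned in the abstract.

    import numpy as np

    def residual_3d(xy_meas, xy_proj, v_meas, v_pred, s):
        """Sketch of the 3D residual described above: two lateral image-plane
        components plus a depth component defined as a scaled difference of
        inverse virtual depths (names and layout are our assumptions)."""
        lateral = np.asarray(xy_meas) - np.asarray(xy_proj)
        depth = s * (1.0 / v_meas - 1.0 / v_pred)
        return np.array([lateral[0], lateral[1], depth])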


2020 ◽  
Vol 12 (5) ◽  
Author(s):  
Zilong Li ◽  
Songming Hou ◽  
Thomas C. Bishop

The Magic Snake (Rubik's Snake) is a toy that was invented decades ago. It draws much less attention than Rubik's Cube, which was invented by the same professor, Erno Rubik. The number of configurations of a Magic Snake, determined by the number of discrete rotations about the elementary wedges in a typical snake, is far smaller than the number of possible configurations of a typical cube. However, a cube has only a single three-dimensional (3D) structure, while the number of sterically allowed 3D conformations of the snake is unknown. Here, we demonstrate how to represent a Magic Snake as a one-dimensional (1D) sequence that can be converted into a 3D structure. We then provide two strategies for designing Magic Snakes to have specified 3D structures. The first enables the folding of a Magic Snake onto any 3D space curve. The second introduces the idea of "embedding" to expand an existing Magic Snake into a longer, more complex, self-similar Magic Snake. Collectively, these ideas allow us to rapidly list and then compute all possible 3D conformations of a Magic Snake. They also form the basis for multidimensional, multi-scale representations of chain-like structures and other slender bodies, including certain types of robots, polymers, proteins, and DNA.
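The 1D-to-3D conversion can be pictured as walking a kinematic chain: each symbol in the sequence applies a joint twist before a fixed wedge bend. The following is a deliberately simplified toy model of that idea, not the authors' exact encoding or wedge geometry.

    import numpy as np

    def rot_x(k):
        """Rotation by k * 90 degrees about the x-axis (joint states 0-3)."""
        a = np.pi / 2 * k
        c, s = np.cos(a), np.sin(a)
        return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

    # Fixed 90-degree bend about z standing in for the wedge shape (illustrative).
    WEDGE = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]])

    def snake_to_points(sequence):
        """Toy kinematic chain: walk a 1D sequence of joint states into 3D by
        alternating a unit step, the joint twist, and the fixed wedge bend."""
        R = np.eye(3)
        p = np.zeros(3)
        points = [p.copy()]
        for k in sequence:
            p = p + R @ np.array([1.0, 0.0, 0.0])  # advance one wedge length
            R = R @ rot_x(k) @ WEDGE               # twist, then bend
            points.append(p.copy())
        return np.array(points)

    # e.g. snake_to_points([0] * 23) gives vertex positions for the all-zero state.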


2021 ◽  
Vol 38 (6) ◽  
pp. 1719-1726
Author(s):  
Tanbo Zhu ◽  
Die Wang ◽  
Yuhua Li ◽  
Wenjie Dong

In real training, the training conditions are often undesirable, and the use of equipment is severely limited. These problems can be solved by virtual practical training, which breaks the limits of space and lowers the training cost while ensuring training quality. However, the existing methods work poorly in image reconstruction, because they fail to consider the fact that the environmental perception of an actual scene is strongly regular by nature. Therefore, this paper investigates three-dimensional (3D) image reconstruction for a virtual talent training scene. Specifically, a fusion network model was designed, and the deep-seated correlation between target detection and semantic segmentation was explored for images shot in two-dimensional (2D) scenes, in order to enhance the extraction of image features. Next, the vertical and horizontal parallaxes of the scene were solved, and the depth-based virtual talent training scene was reconstructed three-dimensionally, based on the continuity of scene depth. Finally, the proposed algorithm was proved effective through experiments.
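The final step, turning a solved per-pixel depth map into 3D structure, is standard pinhole back-projection; a minimal sketch follows, with hypothetical camera intrinsics passed in (the paper's fusion network and parallax computation sit upstream of this).

    import numpy as np

    def depth_to_points(depth, fx, fy, cx, cy):
        """Standard pinhole back-projection: turn a per-pixel depth map into
        a 3D point cloud, given focal lengths (fx, fy) and principal point
        (cx, cy)."""
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        z = depth
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        return np.stack([x, y, z], axis=-1).reshape(-1, 3)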


2016 ◽  
Vol 28 (4) ◽  
pp. 523-532 ◽  
Author(s):  
Akihiro Obara ◽  
Xu Yang ◽  
Hiromasa Oku

[Figure: Concept of the SLF generated by two projectors]
Triangulation is commonly used to restore 3D scenes, but its frame rate of less than 30 fps, due to time-consuming stereo matching, is an obstacle for applications requiring that results be fed back in real time. The structured light field (SLF) our group proposed previously reduces the amount of calculation in 3D restoration, realizing high-speed measurement. Specifically, the SLF estimates depth information by projecting information on distance directly onto a target. The SLF synthesized as previously reported, however, makes it difficult to extract image features for depth estimation. In this paper, we propose synthesizing the SLF using two projectors in a specific layout. We derive the proposed SLF's basic properties from an optical model. We evaluated the SLF's performance using a prototype we developed, applying it to depth estimation of a randomly moving target at 1000 Hz. We also demonstrate high-speed tracking of the target based on the high-speed depth feedback.
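One simple way to picture how projected light can encode distance directly is two complementary intensity ramps, whose per-pixel ratio maps to depth without any stereo matching. The ramp pattern below is our assumption for illustration, not necessarily the authors' SLF synthesis.

    import numpy as np

    def decode_depth(i1, i2, z_near, z_far):
        """Illustrative structured-light-field decoding: if two projectors
        cast complementary intensity ramps along depth, the per-pixel
        intensity ratio maps linearly to distance. The linear ramp is an
        assumption made for this sketch."""
        ratio = i1 / np.maximum(i1 + i2, 1e-9)   # in [0, 1] per pixel
        return z_near + (z_far - z_near) * ratio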


2006 ◽  
Vol 09 (01n02) ◽  
pp. 99-120 ◽  
Author(s):  
PASCAL BRUNIAUX ◽  
CYRIL NGO NGOC

This study aims to develop a realistic mathematical model of fabric. In contrast to other studies on fabric modeling as a deformable surface, the model described in this article takes into account the geometry of the object. Moreover, it integrates the nonlinear phenomena of the dynamic behavior of the material. As input parameters, the weaving data that define the 3D structure of the object and the mechanical properties of the yarn that express its dynamics are used. Thus, the fabric model is composed of a geometrical model of the fabric (structure) on which a model of yarn (material characterization) is added. This hypothesis is reasonable since a fabric is the result of a three-dimensional assembly of judiciously arranged yarns. Since these yarns interact dynamically, the main difficulty consists of defining the yarn model. In our case, it is composed of various nonlinear functions representing the dynamic behavior of yarn. In order to characterize the flexibility of the material, the weight, the elasticity, and any other mechanical characteristics defining the relation between strain and the stretching of the shape should be taken into account. Firstly, several works dealing with realistic mathematical models of fabric are described, and a taxonomic classification is given in order to position our study within the literature. Secondly, the model of the fabric is described. A geometrical model of the object is presented; it allows one to dimension the object in 3D space and then to position it in its initial state. Subsequently, a nodal model of yarns is described, step by step, in order to demonstrate the separability of the various dynamic behaviors. These nodal links make it simple to integrate the proposed model into the global geometrical model. The methods of numerical resolution used to simulate the complete model of the fabric are then presented; one method is selected and used in order to improve the performance of the fabric simulator and to obtain better stability. Several simulations illustrate the quality of the results obtained.
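A minimal numerical sketch of a nodal yarn model is given below: nodes along a yarn coupled by nonlinear springs (a linear-plus-cubic force law standing in for the paper's nonlinear behavior), integrated with semi-implicit Euler. All coefficients and the force law are illustrative assumptions.

    import numpy as np

    def step_yarn(x, v, masses, rest_len, k_lin, k_cub, damping, dt, g=9.81):
        """One time step of a nodal yarn: x, v are (N, 3) positions and
        velocities, masses is (N,). Adjacent nodes are coupled by a
        nonlinear spring; gravity acts on every node."""
        f = np.zeros_like(x)
        f[:, 1] -= masses * g                        # gravity on each node
        d = x[1:] - x[:-1]                           # segment vectors
        length = np.linalg.norm(d, axis=1, keepdims=True)
        stretch = length - rest_len
        force = (k_lin * stretch + k_cub * stretch**3) * d / np.maximum(length, 1e-9)
        f[:-1] += force                              # pull nodes toward each other
        f[1:] -= force
        v = (v + dt * f / masses[:, None]) * (1.0 - damping)
        return x + dt * v, v                         # semi-implicit Euler update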


2021 ◽  
Vol 10 (11) ◽  
pp. 739
Author(s):  
Fan Yang ◽  
Mingliang Che ◽  
Xinkai Zuo ◽  
Lin Li ◽  
Jiyi Zhang ◽  
...  

Room segmentation is a basic task for the semantic enrichment of point clouds. Recent studies have mainly projected single-floor point clouds to binary images to realize two-dimensional room segmentation. However, these methods have difficulty solving semantic segmentation problems in complex 3D indoor environments, including cross-floor spaces and rooms inside rooms; this is the bottleneck of indoor 3D modeling for non-Manhattan worlds. To make full use of the abundant geometric and spatial structure information in 3D space, a novel 3D room segmentation method that performs room segmentation directly in 3D space is proposed in this study. The method uses a volumetric representation based on a VDB data structure and packs an indoor space with a set of compact spheres, so that rooms emerge as separate connected components. Experimental results on different types of indoor point cloud datasets demonstrate the efficiency of the proposed method.
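One way to picture the idea, stripped of the VDB machinery and sphere packing, is to label the connected components of free space directly in 3D, so cross-floor rooms fall out as separate components. The sketch below is a greatly simplified stand-in for the paper's method.

    import numpy as np
    from scipy import ndimage

    def rooms_from_voxels(occupied, structure=None):
        """Label connected components of free (non-occupied) voxels in a 3D
        boolean grid; each component is a candidate room. A simplification
        of the paper's VDB-based sphere-packing approach."""
        free = ~occupied
        labels, n_rooms = ndimage.label(free, structure=structure)
        return labels, n_rooms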


Acta Numerica ◽  
2017 ◽  
Vol 26 ◽  
pp. 305-364 ◽  
Author(s):  
Onur Özyeşil ◽  
Vladislav Voroninski ◽  
Ronen Basri ◽  
Amit Singer

The structure from motion (SfM) problem in computer vision is to recover the three-dimensional (3D) structure of a stationary scene from a set of projective measurements, represented as a collection of two-dimensional (2D) images, via estimation of motion of the cameras corresponding to these images. In essence, SfM involves the three main stages of (i) extracting features in images (e.g. points of interest, lines, etc.) and matching these features between images, (ii) camera motion estimation (e.g. using relative pairwise camera positions estimated from the extracted features), and (iii) recovery of the 3D structure using the estimated motion and features (e.g. by minimizing the so-called reprojection error). This survey mainly focuses on relatively recent developments in the literature pertaining to stages (ii) and (iii). More specifically, after touching upon the early factorization-based techniques for motion and structure estimation, we provide a detailed account of some of the recent camera location estimation methods in the literature, followed by discussion of notable techniques for 3D structure recovery. We also cover the basics of the simultaneous localization and mapping (SLAM) problem, which can be viewed as a specific case of the SfM problem. Further, our survey includes a review of the fundamentals of feature extraction and matching (i.e. stage (i) above), various recent methods for handling ambiguities in 3D scenes, SfM techniques involving relatively uncommon camera models and image features, and popular sources of data and SfM software.
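A minimal two-view instance of the three stages, using standard OpenCV calls (the survey covers far more general multi-view settings):

    import cv2
    import numpy as np

    def two_view_sfm(img1, img2, K):
        """Two-view SfM: (i) feature extraction and matching, (ii) relative
        motion from the essential matrix, (iii) structure by triangulation.
        K is the 3x3 camera intrinsic matrix; the result is up to scale."""
        sift = cv2.SIFT_create()
        k1, d1 = sift.detectAndCompute(img1, None)          # stage (i)
        k2, d2 = sift.detectAndCompute(img2, None)
        matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(d1, d2)
        p1 = np.float32([k1[m.queryIdx].pt for m in matches])
        p2 = np.float32([k2[m.trainIdx].pt for m in matches])
        E, inliers = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC)  # stage (ii)
        _, R, t, _ = cv2.recoverPose(E, p1, p2, K, mask=inliers)
        P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])   # stage (iii)
        P2 = K @ np.hstack([R, t])
        pts4 = cv2.triangulatePoints(P1, P2, p1.T, p2.T)
        return R, t, (pts4[:3] / pts4[3]).T                 # 3D points up to scale

A full pipeline would then minimize the reprojection error over all cameras and points (bundle adjustment), which is where most of the surveyed methods differ.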


Author(s):  
J. Frank ◽  
B. F. McEwen ◽  
M. Radermacher ◽  
C. L. Rieder

The tomographic reconstruction from multiple projections of cellular components within a thick section offers a way of visualizing and quantifying their three-dimensional (3D) structure. However, asymmetric objects require as many views as possible, from the widest possible tilt range; otherwise the reconstruction may be uninterpretable. Even in the absence of geometric obstructions, the increasing path length of the electrons through the section as the tilt angle increases imposes the ultimate upper limit on the projection range. With the maximum tilt angle fixed, the only way to improve the faithfulness of the reconstruction is to change the tilting mode from single-axis to conical; a point within the object projected with a tilt angle of 60° over a full 360° azimuthal range is then reconstructed as a slightly elongated (axis ratio 1.2:1) ellipsoid.
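For comparison with the 1.2:1 ratio quoted for conical tilting, a standard estimate of the elongation caused by a limited single-axis tilt range can be computed directly; a short sketch:

    import numpy as np

    def single_axis_elongation(max_tilt_deg):
        """Standard elongation estimate for single-axis tilting with maximum
        tilt angle alpha: e = sqrt((a + sin a cos a) / (a - sin a cos a)),
        with a in radians."""
        a = np.radians(max_tilt_deg)
        return np.sqrt((a + np.sin(a) * np.cos(a)) / (a - np.sin(a) * np.cos(a)))

    # single_axis_elongation(60) ~ 1.55, noticeably worse than the ~1.2
    # quoted above for conical tilting at the same maximum angle.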

