Multi-Stage Variational Auto-Encoders for Coarse-to-Fine Image Generation

Detailed 3D human body reconstruction from multi-view images combining voxel super-resolution and learned implicit representation

Applied Intelligence ◽

10.1007/s10489-021-02783-8 ◽

2021 ◽

Author(s):

Zhongguo Li ◽

Magnus Oskarsson ◽

Anders Heyden

Keyword(s):

Human Body ◽

Super Resolution ◽

3D Models ◽

Implicit Function ◽

Implicit Representation ◽

Multi Scale ◽

Multi Stage ◽

Synthetic Datasets ◽

Human Body Models ◽

Coarse To Fine

Abstract: Reconstructing detailed 3D human body models from images is an interesting but challenging computer vision task because the human body has many degrees of freedom. This work proposes a coarse-to-fine method that reconstructs a detailed 3D human body from multi-view images by combining Voxel Super-Resolution (VSR) with a learned implicit representation. First, coarse 3D models are estimated by learning a Pixel-aligned Implicit Function based on Multi-scale Features (MF-PIFu), where the features are extracted from the multi-view images by multi-stage hourglass networks. Then, taking as input the low-resolution voxel grids generated from the coarse 3D models, VSR is implemented by learning an implicit function through a multi-stage 3D convolutional neural network. Finally, VSR produces the refined, detailed 3D human body models, preserving detail and reducing the false reconstructions of the coarse models. Because the representation is implicit, training is memory efficient, and the detailed 3D human body reconstructed from multi-view images is a continuous decision boundary with high-resolution geometry. In addition, the coarse-to-fine method based on MF-PIFu and VSR removes false reconstructions and preserves appearance details in the final reconstruction at the same time. In experiments on both real and synthetic datasets, the method produces quantitatively and qualitatively competitive 3D human body models from images with various poses and shapes.
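The idea that an implicit representation is a continuous decision boundary can be illustrated with a small sketch. This is not the authors' MF-PIFu or VSR network: the analytic sphere in `occupancy` is a stand-in for a learned implicit function, and the names `occupancy`, `sample_grid`, and `occupied_fraction` are hypothetical. The point is only that the same function can be queried at any grid resolution, coarse or fine, without storing a dense volume in the model itself.

```python
def occupancy(x, y, z, radius=0.4):
    """Stand-in for a learned implicit function (assumption: a sphere).

    Returns 1.0 for points inside the shape and 0.0 for points outside.
    """
    return 1.0 if x * x + y * y + z * z <= radius * radius else 0.0


def sample_grid(resolution, threshold=0.5):
    """Query the implicit function at the cell centres of a
    resolution^3 grid covering the cube [-0.5, 0.5]^3."""
    step = 1.0 / resolution
    centres = [-0.5 + (i + 0.5) * step for i in range(resolution)]
    return [
        [[occupancy(x, y, z) > threshold for z in centres] for y in centres]
        for x in centres
    ]


def occupied_fraction(grid):
    """Fraction of grid cells classified as inside the shape."""
    cells = [v for plane in grid for row in plane for v in row]
    return sum(cells) / len(cells)


coarse = sample_grid(4)    # low-resolution voxel grid
fine = sample_grid(32)     # same function, queried more densely
```

The fine grid's occupied fraction lies much closer to the true volume of the sphere than the coarse grid's, which mirrors the coarse-to-fine refinement described in the abstract.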


Coarse-to-fine Nasopharyngeal Carcinoma Segmentation in MRI via Multi-stage Rendering

2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm49941.2020.9313574 ◽

2020 ◽

Author(s):

Yang Li ◽

Hong Peng ◽

Tingting Dan ◽

Yu Hu ◽

Guihua Tao ◽

...

Keyword(s):

Nasopharyngeal Carcinoma ◽

Multi Stage ◽

Coarse To Fine


Machine Reading Comprehension Based On Multi-headed attention Model

MATEC Web of Conferences ◽

10.1051/matecconf/201823202047 ◽

2018 ◽

Vol 232 ◽

pp. 02047

Author(s):

Hui Xu ◽

Shichang Zhang ◽

Jie Jiang

Keyword(s):

Reading Comprehension ◽

Flow Model ◽

Original Text ◽

Focus Attention ◽

Fixed Size ◽

Attention Model ◽

Multi Stage ◽

L Value ◽

Machine Reading ◽

Coarse To Fine

Machine Reading Comprehension (MRC) is the task of having a machine read a context and answer questions about the original text, which requires modeling the interaction between the context and the question. Recently, attention mechanisms in deep learning have been successfully extended to MRC tasks. In general, an attention-based approach focuses attention on a small part of the context and summarizes it in a fixed-size vector. This paper introduces a coarse-to-fine attention network, which is a multi-stage hierarchical process. First, the context and questions are encoded by a bi-directional LSTM RNN. Then, more accurate interaction information is obtained through multiple iterations of the attention mechanism. Finally, a cursor-based approach is used to predict where the answer begins and ends in the original text. Experimental evaluation shows that the BiDMF (Bi-Directional Multi-Attention Flow) model designed in this paper achieves a BLEU-4 score of 34.1% and a Rouge-L score of 39.5% on the test set.
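The general attention step mentioned in the abstract, focusing on part of the context and summarizing it in a fixed-size vector, can be sketched as follows. This is a minimal illustration and not the BiDMF model: dot-product scoring is an assumption (the abstract does not state the scoring function), and the name `attend` is hypothetical.

```python
import math


def attend(query, context):
    """Summarize a variable-length context as one fixed-size vector.

    query:   list of floats (question representation)
    context: list of vectors, one per context token
    Scoring by dot product is an assumption made for this sketch.
    """
    # Similarity between the query and each context vector.
    scores = [sum(q * c for q, c in zip(query, vec)) for vec in context]

    # Numerically stable softmax turns scores into attention weights.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum: the output size depends only on the vector
    # dimension, not on how many context tokens there are.
    dim = len(query)
    summary = [
        sum(w * vec[d] for w, vec in zip(weights, context)) for d in range(dim)
    ]
    return weights, summary


weights, summary = attend([5.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
```

Here the first and third context vectors match the query and receive nearly all of the weight, so the summary vector stays close to them regardless of context length.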


Coarse-to-Fine Clothing Image Generation with Progressively Constructed Conditional GAN

Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications ◽

10.5220/0007306900830090 ◽

2019 ◽

Author(s):

Youngki Kwon ◽

Soomin Kim ◽

Donggeun Yoo ◽

Sung-Eui Yoon

Keyword(s):

Image Generation ◽

Coarse To Fine


Coarse-to-Fine UAV Image Geo-Localization Using Multi-stage Lucas-Kanade Networks

2021 2nd Information Communication Technologies Conference (ICTC) ◽

10.1109/ictc51749.2021.9441503 ◽

2021 ◽

Author(s):

Songbing Wu ◽

Chun Du ◽

Hao Chen ◽

Ning Jing

Keyword(s):

Multi Stage ◽

Uav Image ◽

Coarse To Fine


A Multi-Stage Classifier for Face Recognition Undertaken by Coarse-to-fine Strategy

State of the Art in Face Recognition ◽

10.5772/6639 ◽

2009 ◽

Author(s):

Jiann-Der Lee ◽

Chen-Hui Kuo

Keyword(s):

Face Recognition ◽

Multi Stage ◽

Coarse To Fine


SSR-Net: A Compact Soft Stagewise Regression Network for Age Estimation

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/150 ◽

2018 ◽

Cited By ~ 19

Author(s):

Tsun-Yi Yang ◽

Yi-Hsuan Huang ◽

Yen-Yu Lin ◽

Pi-Cheng Hsiu ◽

Yung-Yu Chuang

Keyword(s):

Age Estimation ◽

Network Architecture ◽

Dynamic Range ◽

Single Image ◽

Input Face ◽

Multi Stage ◽

Compact Size ◽

Model Size ◽

Multi Class Classification ◽

Coarse To Fine

This paper presents a novel CNN model called Soft Stagewise Regression Network (SSR-Net) for age estimation from a single image with a compact model size. Inspired by DEX, we address age estimation by performing multi-class classification and then turning the classification results into regression by calculating the expected value. SSR-Net takes a coarse-to-fine strategy and performs multi-class classification in multiple stages. Each stage is responsible only for refining the decision of the previous stage to give a more accurate age estimate. Thus, each stage performs a task with few classes and requires few neurons, greatly reducing the model size. To address the quantization issue introduced by grouping ages into classes, SSR-Net assigns a dynamic range to each age class by allowing it to be shifted and scaled according to the input face image. Both the multi-stage strategy and the dynamic range are incorporated into the formulation of soft stagewise regression. A novel network architecture is proposed for carrying out soft stagewise regression. The resulting SSR-Net model is very compact, taking only 0.32 MB. Despite its compact size, SSR-Net’s performance approaches that of state-of-the-art methods whose model sizes are often more than 1500× larger.
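The two ideas in this abstract, turning class probabilities into a regression value through the expected value and refining that value stage by stage, can be sketched as below. This is a simplified illustration under stated assumptions, not the SSR-Net formulation: it assumes each stage splits the bin chosen by the previous stage into equal sub-bins, it omits the dynamic range (per-class shift and scale), and the name `stagewise_expected_value` is hypothetical.

```python
def stagewise_expected_value(stage_probs, value_range):
    """Coarse-to-fine regression from per-stage class probabilities.

    stage_probs: list of probability lists, one per stage; each list
                 sums to 1 and has one entry per bin at that stage.
    value_range: total width of the target interval, e.g. 90 for
                 ages in [0, 90) (an assumed example range).
    """
    estimate = 0.0
    bin_width = float(value_range)
    for probs in stage_probs:
        # Each stage divides the previous stage's bin into finer bins.
        bin_width /= len(probs)
        # Expected bin index at this stage, scaled to the bin width.
        estimate += bin_width * sum(i * p for i, p in enumerate(probs))
    return estimate


# Three stages with three bins each over an assumed range of 90.
hard = stagewise_expected_value([[0, 1, 0], [0, 0, 1], [1, 0, 0]], 90)
soft = stagewise_expected_value([[0.5, 0.5, 0.0]], 90)
```

With one-hot probabilities the stages contribute 30, 20, and 0, so `hard` is 50; with the soft single-stage distribution the expected bin index is 0.5, so `soft` is 15. Each stage only needs a handful of classes, which is the property the abstract credits for the small model size.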
