Whole Heart Segmentation Using 3D FM-Pre-ResNet Encoder–Decoder Based Architecture with Variational Autoencoder Regularization

2021, Vol 11 (9), pp. 3912
Author(s): Marija Habijan, Irena Galić, Hrvoje Leventić, Krešimir Romić

An accurate whole heart segmentation (WHS) on medical images, including computed tomography (CT) and magnetic resonance (MR) images, plays a crucial role in many clinical applications, such as cardiovascular disease diagnosis, pre-surgical planning, and intraoperative treatment. Manual whole heart segmentation is a time-consuming process, prone to subjectivity and error. Therefore, there is a need for fast, automatic, and accurate whole heart segmentation systems. In recent years, convolutional neural networks (CNNs) have emerged as a robust approach to medical image segmentation. In this paper, we first introduce a novel residual-unit connectivity structure that we refer to as the feature merge residual unit (FM-Pre-ResNet). The proposed connectivity allows the creation of distinctly deeper models without an increase in the number of parameters compared to pre-activation residual units. Second, we propose a three-dimensional (3D) encoder–decoder based architecture that incorporates FM-Pre-ResNet units and a variational autoencoder (VAE). In the encoding stage, FM-Pre-ResNet units learn a low-dimensional representation of the input. The VAE then reconstructs the input image from the low-dimensional latent space, providing strong regularization of all model weights and preventing overfitting on the training data. Finally, the decoding stage produces the final whole heart segmentation. We evaluate our method on the 40 test subjects of the MICCAI Multi-Modality Whole Heart Segmentation (MM-WHS) Challenge. The average Dice scores for whole heart segmentation are 90.39% (CT images) and 89.50% (MRI images), both highly comparable to the state-of-the-art.
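The pre-activation residual connectivity that FM-Pre-ResNet builds on can be sketched in a few lines. The sketch below is a toy stand-in, not the paper's architecture: batch normalization is replaced by simple standardization and 3D convolutions by dense matrices, keeping only the characteristic norm-activation-weight ordering and the identity skip connection.

```python
import numpy as np

def norm(x):
    # stand-in for batch normalization: zero mean, unit variance
    return (x - x.mean()) / (x.std() + 1e-8)

def relu(x):
    return np.maximum(x, 0.0)

def pre_act_residual_unit(x, w1, w2):
    """Pre-activation residual unit: y = x + W2 f(W1 f(x)),
    where f is norm followed by ReLU. Dense matrices stand in
    for convolutions purely for illustration."""
    h = w1 @ relu(norm(x))
    h = w2 @ relu(norm(h))
    return x + h  # identity skip connection

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
y = pre_act_residual_unit(x, w1, w2)
```

Because the skip path is the identity, zeroing the residual branch returns the input unchanged, which is what makes very deep stacks of such units trainable.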

2021, Vol 11 (3), pp. 1013
Author(s): Zvezdan Lončarević, Rok Pahič, Aleš Ude, Andrej Gams

Autonomous robot learning in unstructured environments often faces the problem that the dimensionality of the search space is too large for practical applications. Dimensionality reduction techniques have been developed to address this problem and describe motor skills in low-dimensional latent spaces. Most of these techniques require a sufficiently large database of example task executions to compute the latent space. However, generating many example task executions on a real robot is tedious and prone to errors and equipment failures. The main result of this paper is a new approach for efficient database gathering: a small number of task executions are performed with a real robot, and statistical generalization, e.g., Gaussian process regression, is applied to generate more data. Our experiments show that the data generated this way can be used for dimensionality reduction with autoencoder neural networks. The resulting latent spaces can be exploited to implement robot learning more efficiently. The proposed approach has been evaluated on the problem of robotic throwing at a target. Simulation and real-world results with the humanoid robot TALOS are provided. They confirm the effectiveness of generalization-based database acquisition and the efficiency of learning in a low-dimensional latent space.
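The database-densification step the abstract describes amounts to standard Gaussian process regression: a handful of real executions are interpolated to many synthetic ones. A minimal numpy sketch follows; the task parameters, length scale, and the sine placeholder target are assumptions for illustration, not values from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length):
    # squared-exponential kernel between 1-D input arrays
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_generalize(q_train, y_train, q_query, length=0.3, noise=1e-6):
    """GP regression mean: predict task parameters y at new query
    points q from a small set of real executions (q_train, y_train)."""
    K = rbf_kernel(q_train, q_train, length) + noise * np.eye(len(q_train))
    Ks = rbf_kernel(q_query, q_train, length)
    return Ks @ np.linalg.solve(K, y_train)

# five real "throws" (query -> task parameter), densified to fifty
q_train = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
y_train = np.sin(q_train)            # placeholder task parameter
q_query = np.linspace(0.5, 2.5, 50)
y_query = gp_generalize(q_train, y_train, q_query)
```

The fifty generalized samples can then serve as autoencoder training data, which is the role the paper assigns to the statistically generated database.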


2020, Vol 10 (1)
Author(s): Yoshihiro Nagano, Ryo Karakida, Masato Okada

Abstract Deep neural networks are good at extracting low-dimensional subspaces (latent spaces) that represent the essential features inside a high-dimensional dataset. Deep generative models represented by variational autoencoders (VAEs) can generate and infer high-quality datasets, such as images. In particular, VAEs can eliminate the noise contained in an image by repeating the mapping between latent and data space. To clarify the mechanism of such denoising, we numerically analyzed how the activity pattern of trained networks changes in the latent space during inference. We considered the time development of the activity pattern for specific data as one trajectory in the latent space and investigated the collective behavior of these inference trajectories for many data points. Our study revealed that when a cluster structure exists in the dataset, the trajectory rapidly approaches the center of the cluster. This behavior was qualitatively consistent with the concept retrieval reported in associative memory models. Additionally, the larger the noise contained in the data, the closer the trajectory was to a more global cluster. We also demonstrated that increasing the number of latent variables enhances this tendency to approach a cluster center and improves the generalization ability of the VAE.
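The trajectory behavior described above can be illustrated with a toy contraction map: if each encode-decode pass pulls the reconstruction a fixed fraction of the way toward the nearest cluster center, iterating the map traces a trajectory that converges on that center. This is a deliberately simplified stand-in for a trained VAE, with a hypothetical center and contraction factor, not the paper's model.

```python
import numpy as np

def autoencode_step(x, center, contraction=0.7):
    """Toy stand-in for one encode-decode pass of a trained VAE:
    the reconstruction moves a fixed fraction toward the cluster center."""
    return center + contraction * (x - center)

center = np.array([2.0, -1.0])       # hypothetical cluster center
x = np.array([5.0, 3.0])             # noisy input far from the center
trajectory = [x]
for _ in range(20):
    x = autoencode_step(x, center)
    trajectory.append(x)
```

Each iteration shrinks the distance to the center by the contraction factor, so the trajectory approaches the cluster center geometrically, which mirrors the rapid approach the numerical analysis reports.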


2020, Vol 20 (1)
Author(s): Lu Wang, Dongxue Liang, Xiaolei Yin, Jing Qiu, Zhiyun Yang, ...

Abstract Background Coronary artery angiography is an indispensable assistive technique for cardiac interventional surgery. Segmentation and extraction of blood vessels from coronary angiographic images or videos are essential prerequisites for physicians to locate, assess and diagnose plaques and stenosis in blood vessels. Methods This article proposes a novel coronary artery segmentation framework that combines a three-dimensional (3D) convolutional input layer and a two-dimensional (2D) convolutional network. Instead of the single input image used in previous medical image segmentation applications, our framework accepts a sequence of coronary angiographic images as input and outputs the clearest segmentation mask. The 3D input layer leverages the temporal information in the image sequence and fuses the multiple images into more comprehensive 2D feature maps. The 2D convolutional network implements down-sampling encoders, up-sampling decoders, bottleneck modules, and skip connections to accomplish the segmentation task. Results The spatial-temporal model of this article obtains good segmentation results despite the poor quality of coronary angiographic video sequences, and outperforms the state-of-the-art techniques. Conclusions The results show that making full use of the spatial and temporal information in image sequences promotes the analysis and understanding of the images in videos.
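The core of the 3D input layer is a convolution whose kernel spans the full temporal depth, collapsing a stack of T frames into a single 2D map. A minimal sketch, assuming a uniform temporal kernel (frame averaging) and omitting the spatial convolution and learned weights of the actual framework:

```python
import numpy as np

def temporal_fusion(frames, weights):
    """Stand-in for the 3D input layer: a temporal kernel of depth T
    collapses a (T, H, W) stack of frames into one (H, W) feature map."""
    # weighted sum over the temporal axis
    return np.tensordot(weights, frames, axes=(0, 0))

rng = np.random.default_rng(1)
frames = rng.random((5, 4, 4))       # five consecutive frames, 4x4 pixels
weights = np.full(5, 1.0 / 5)        # uniform kernel = frame average
fused = temporal_fusion(frames, weights)
```

In the real network the temporal weights are learned, so the fusion can emphasize the frames in which the contrast agent renders the vessels most clearly, rather than averaging uniformly.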


2014, Vol 3 (2), pp. 14-32
Author(s): Mithun Kumar PK, Mohammad Motiur Rahman

Calcification plaque is a kind of artifact that appears in computed tomography (CT) images as regions of very high attenuation coefficient. CT images are more helpful than other modalities (e.g., ultrasonic imaging, magnetic resonance imaging (MRI)) for disease diagnosis, but unfortunately CT images are sometimes affected by calcification plaque. Medical image segmentation cannot be optimal when calcification is present in the CT images, which is highly undesirable; calcification plaque is a major obstacle to optimal organ segmentation and detection. This paper proposes an effective method for calcification alleviation in CT images. First, we apply Fisher's discriminant analysis (FDA) to estimate an optimal threshold value. Second, the optimal threshold value is used to extract the optimal threshold image. Next, a morphological operation erodes heavy calcification, and an XOR operation adjusts the optimal threshold image against the input image. Finally, we apply an extra-energy reduction (EER) function to smooth the resulting image. The proposed method thus attenuates calcification plaque in CT images.
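The FDA thresholding step is closely related to Otsu's criterion: both pick the threshold maximizing between-class separation. The sketch below uses that between-class-variance form on a synthetic slice and applies the XOR masking idea; the morphological erosion and EER smoothing steps are omitted, and the synthetic image values are illustrative only.

```python
import numpy as np

def fisher_threshold(img, levels=256):
    """Pick the gray level maximizing the between-class variance
    (the Fisher/Otsu separation criterion) of the two classes."""
    hist, edges = np.histogram(img, bins=levels, range=(0.0, 1.0))
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    best_t, best_var = 0.0, -1.0
    for k in range(1, levels):
        w0, w1 = p[:k].sum(), p[k:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (p[:k] * centers[:k]).sum() / w0
        mu1 = (p[k:] * centers[k:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, edges[k]
    return best_t

# synthetic "CT slice": dark tissue plus a small bright calcification patch
img = np.full((8, 8), 0.2)
img[2:4, 2:4] = 0.9
t = fisher_threshold(img)
calcification_mask = img > t
# XOR against the (all-foreground) input keeps only non-calcified pixels
cleaned = np.logical_xor(img > 0, calcification_mask)
```

On this toy slice the estimated threshold lands between the tissue and plaque intensities, so the mask isolates exactly the bright calcified patch.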


Author(s): Andrew Brock, Theodore Lim, J. M. Ritchie, Nick Weston

Large scale scene generation is a computationally intensive operation, and added complexities arise when dynamic content generation is required. We propose a system capable of generating virtual content from non-expert input. The proposed system uses a 3-dimensional variational autoencoder to interactively generate new virtual objects by interpolating between extant objects in a learned low-dimensional space, as well as by randomly sampling in that space. We present an interface that allows a user to intuitively explore the latent manifold, taking advantage of the network’s ability to perform algebra in the latent space to help infer context and generalize to previously unseen inputs.
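The latent-space operations the interface exposes, interpolation between objects and simple vector algebra, reduce to arithmetic on latent codes before decoding. A sketch with hypothetical 3-dimensional codes (real VAE latents would be far higher-dimensional, and the decoder is omitted):

```python
import numpy as np

def interpolate(z_a, z_b, alphas):
    """Linear interpolation between two latent codes; decoding each
    interpolant would yield a smooth morph between the two objects."""
    return np.array([(1 - a) * z_a + a * z_b for a in alphas])

# hypothetical latent codes of two learned 3D objects
z_chair = np.array([1.0, 0.0, 2.0])
z_table = np.array([-1.0, 2.0, 0.0])
path = interpolate(z_chair, z_table, np.linspace(0, 1, 5))

# latent "algebra": shift a third code by the chair-table difference
z_stool = np.array([0.5, 0.5, 0.5])
z_new = z_stool + (z_chair - z_table)
```

Random sampling in the same space, the other generation mode the system supports, is just drawing a code from the latent prior and decoding it.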


Sensors, 2022, Vol 22 (2), pp. 523
Author(s): Kh Tohidul Islam, Sudanthi Wijewickrema, Stephen O’Leary

Multi-modal three-dimensional (3-D) image segmentation is used in many medical applications, such as disease diagnosis, treatment planning, and image-guided surgery. Although multi-modal images provide information that no single image modality alone can provide, integrating such information to be used in segmentation is a challenging task. Numerous methods have been introduced to solve the problem of multi-modal medical image segmentation in recent years. In this paper, we propose a solution for the task of brain tumor segmentation. To this end, we first introduce a method of enhancing an existing magnetic resonance imaging (MRI) dataset by generating synthetic computed tomography (CT) images. Then, we discuss a process of systematic optimization of a convolutional neural network (CNN) architecture that uses this enhanced dataset, in order to customize it for our task. Using publicly available datasets, we show that the proposed method outperforms similar existing methods.


2017, pp. 1258-1280
Author(s): Mithun Kumar PK, Mohammad Motiur Rahman

Calcification plaque is a kind of artifact that appears in computed tomography (CT) images as regions of very high attenuation coefficient. CT images are more helpful than other modalities (e.g., ultrasonic imaging, magnetic resonance imaging (MRI)) for disease diagnosis, but unfortunately CT images are sometimes affected by calcification plaque. Medical image segmentation cannot be optimal when calcification is present in the CT images, which is highly undesirable; calcification plaque is a major obstacle to optimal organ segmentation and detection. This paper proposes an effective method for calcification alleviation in CT images. First, we apply Fisher's discriminant analysis (FDA) to estimate an optimal threshold value. Second, the optimal threshold value is used to extract the optimal threshold image. Next, a morphological operation erodes heavy calcification, and an XOR operation adjusts the optimal threshold image against the input image. Finally, we apply an extra-energy reduction (EER) function to smooth the resulting image. The proposed method thus attenuates calcification plaque in CT images.


The proposed system generates new images from existing images using variational autoencoders. The autoencoder maps the input image to a multivariate normal distribution in the latent space. The variational autoencoder transforms the input image into a new output by minimizing the reconstruction and KL divergence losses. The primary advantage of a variational autoencoder over other autoencoders is that its latent space follows a specific probability distribution, a Gaussian, which enables the generation of high-quality images.
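The two losses named above have a standard closed form when the encoder outputs a diagonal Gaussian and the prior is the standard normal. A minimal numpy sketch of that objective (squared-error reconstruction is an assumption; implementations also commonly use cross-entropy):

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL divergence between the encoder's diagonal
    Gaussian N(mu, sigma^2) and the standard normal prior N(0, I)."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def vae_loss(x, x_recon, mu, log_var):
    """Reconstruction term plus KL term, as minimized by a VAE."""
    recon = np.sum((x - x_recon) ** 2)   # squared-error reconstruction
    return recon + kl_to_standard_normal(mu, log_var)

x = np.array([0.5, 0.1])
# perfect reconstruction with the posterior equal to the prior
loss_perfect = vae_loss(x, x, np.zeros(3), np.zeros(3))
```

The KL term is what pulls the latent distribution toward the Gaussian prior; without it the model degenerates into an ordinary autoencoder and loses its generative sampling ability.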


2021, Vol 18 (4), pp. 378-381
Author(s): Luis A. Bolaños, Dongsheng Xiao, Nancy L. Ford, Jeff M. LeDue, Pankaj K. Gupta, ...

2021, Vol 13 (2), pp. 51
Author(s): Lili Sun, Xueyan Liu, Min Zhao, Bo Yang

Variational graph autoencoders, which encode the structural and attribute information of a graph into low-dimensional representations, have become a powerful method for studying graph-structured data. However, most existing methods based on variational (graph) autoencoders assume that the prior of the latent variables is the standard normal distribution, which encourages all nodes to gather around 0 and prevents the latent space from being fully utilized. Choosing a suitable prior without incorporating additional expert knowledge therefore becomes a challenge. Given this, we propose a novel noninformative prior-based interpretable variational graph autoencoder (NPIVGAE). Specifically, we use a noninformative prior as the prior distribution of the latent variables, which allows the posterior distribution parameters to be learned almost entirely from the sample data. Furthermore, we regard each dimension of a latent variable as the probability that the node belongs to each block, thereby improving the interpretability of the model. The correlation within and between blocks is described by a block–block correlation matrix. We compare our model with state-of-the-art methods on three real datasets, verifying its effectiveness and superiority.
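The interpretability claim, each latent dimension read as a block-membership probability, implies the per-node latent vector is normalized into a distribution over blocks. A sketch of that reading with hypothetical latent codes (softmax normalization is an assumption; the paper's exact parameterization may differ):

```python
import numpy as np

def block_memberships(z):
    """Interpret each latent dimension as the probability that a node
    belongs to the corresponding block (softmax-normalized per node)."""
    e = np.exp(z - z.max(axis=1, keepdims=True))  # stable softmax
    return e / e.sum(axis=1, keepdims=True)

# hypothetical latent codes for 4 nodes in a 3-block graph
z = np.array([[2.0, 0.1, 0.0],
              [0.0, 3.0, 0.2],
              [1.9, 0.0, 0.1],
              [0.0, 0.1, 2.5]])
pi = block_memberships(z)
assignments = pi.argmax(axis=1)      # hard block assignment per node
```

Given such membership vectors, a block–block correlation matrix summarizes how strongly nodes assigned to one block tend to connect to nodes in another.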

