Whole Heart Segmentation Using 3D FM-Pre-ResNet Encoder–Decoder Based Architecture with Variational Autoencoder Regularization

2021, Vol 11 (9), pp. 3912
Author(s): Marija Habijan, Irena Galić, Hrvoje Leventić, Krešimir Romić

An accurate whole heart segmentation (WHS) on medical images, including computed tomography (CT) and magnetic resonance (MR) images, plays a crucial role in many clinical applications, such as cardiovascular disease diagnosis, pre-surgical planning, and intraoperative treatment. Manual whole heart segmentation is a time-consuming process, prone to subjectivity and error. Therefore, there is a need for fast, automatic, and accurate whole heart segmentation systems. In recent years, convolutional neural networks (CNNs) have emerged as a robust approach to medical image segmentation. In this paper, we first introduce a novel residual-unit connectivity structure that we refer to as the feature merge residual unit (FM-Pre-ResNet). The proposed connectivity allows the creation of distinctly deeper models without an increase in the number of parameters compared to pre-activation residual units. Second, we propose a three-dimensional (3D) encoder–decoder based architecture that incorporates FM-Pre-ResNet units and a variational autoencoder (VAE). In the encoding stage, FM-Pre-ResNet units learn a low-dimensional representation of the input. The VAE then reconstructs the input image from the low-dimensional latent space, providing strong regularization of all model weights and preventing overfitting on the training data. Finally, the decoding stage produces the final whole heart segmentation. We evaluate our method on the 40 test subjects of the MICCAI Multi-Modality Whole Heart Segmentation (MM-WHS) Challenge. The average Dice scores for whole heart segmentation are 90.39% (CT images) and 89.50% (MRI images), both highly comparable to the state-of-the-art.
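The pre-activation residual connectivity that FM-Pre-ResNet builds on can be sketched in a few lines. The sketch below is a toy stand-in, not the paper's architecture: batch normalization is replaced by simple standardization and 3D convolutions by dense matrices, keeping only the characteristic norm-activation-weight ordering and the identity skip connection.

```python
import numpy as np

def norm(x):
    # stand-in for batch normalization: zero mean, unit variance
    return (x - x.mean()) / (x.std() + 1e-8)

def relu(x):
    return np.maximum(x, 0.0)

def pre_act_residual_unit(x, w1, w2):
    """Pre-activation residual unit: y = x + W2 f(W1 f(x)),
    where f is norm followed by ReLU. Dense matrices stand in
    for convolutions purely for illustration."""
    h = w1 @ relu(norm(x))
    h = w2 @ relu(norm(h))
    return x + h  # identity skip connection

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
y = pre_act_residual_unit(x, w1, w2)
```

Because the skip path is the identity, zeroing the residual branch returns the input unchanged, which is what makes very deep stacks of such units trainable.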

2021, Vol 11 (3), pp. 1013
Author(s): Zvezdan Lončarević, Rok Pahič, Aleš Ude, Andrej Gams

Autonomous robot learning in unstructured environments often faces the problem that the dimensionality of the search space is too large for practical applications. Dimensionality reduction techniques have been developed to address this problem and describe motor skills in low-dimensional latent spaces. Most of these techniques require a sufficiently large database of example task executions to compute the latent space. However, generating many example task executions on a real robot is tedious and prone to errors and equipment failures. The main result of this paper is a new approach for efficient database gathering: a small number of task executions are performed with a real robot, and statistical generalization, e.g., Gaussian process regression, is applied to generate more data. Our experiments show that the data generated this way can be used for dimensionality reduction with autoencoder neural networks. The resulting latent spaces can be exploited to implement robot learning more efficiently. The proposed approach has been evaluated on the problem of robotic throwing at a target. Simulation and real-world results with the humanoid robot TALOS are provided. They confirm the effectiveness of generalization-based database acquisition and the efficiency of learning in a low-dimensional latent space.
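The database-densification step the abstract describes amounts to standard Gaussian process regression: a handful of real executions are interpolated to many synthetic ones. A minimal numpy sketch follows; the task parameters, length scale, and the sine placeholder target are assumptions for illustration, not values from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length):
    # squared-exponential kernel between 1-D input arrays
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_generalize(q_train, y_train, q_query, length=0.3, noise=1e-6):
    """GP regression mean: predict task parameters y at new query
    points q from a small set of real executions (q_train, y_train)."""
    K = rbf_kernel(q_train, q_train, length) + noise * np.eye(len(q_train))
    Ks = rbf_kernel(q_query, q_train, length)
    return Ks @ np.linalg.solve(K, y_train)

# five real "throws" (query -> task parameter), densified to fifty
q_train = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
y_train = np.sin(q_train)            # placeholder task parameter
q_query = np.linspace(0.5, 2.5, 50)
y_query = gp_generalize(q_train, y_train, q_query)
```

The fifty generalized samples can then serve as autoencoder training data, which is the role the paper assigns to the statistically generated database.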


2020, Vol 10 (1)
Author(s): Yoshihiro Nagano, Ryo Karakida, Masato Okada

Abstract Deep neural networks are good at extracting low-dimensional subspaces (latent spaces) that represent the essential features inside a high-dimensional dataset. Deep generative models represented by variational autoencoders (VAEs) can generate and infer high-quality datasets, such as images. In particular, VAEs can eliminate the noise contained in an image by repeating the mapping between latent and data space. To clarify the mechanism of such denoising, we numerically analyzed how the activity pattern of trained networks changes in the latent space during inference. We considered the time development of the activity pattern for specific data as one trajectory in the latent space and investigated the collective behavior of these inference trajectories for many data points. Our study revealed that when a cluster structure exists in the dataset, the trajectory rapidly approaches the center of the cluster. This behavior was qualitatively consistent with the concept retrieval reported in associative memory models. Additionally, the larger the noise contained in the data, the closer the trajectory was to a more global cluster. We also demonstrated that increasing the number of latent variables enhances this tendency to approach a cluster center and improves the generalization ability of the VAE.
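The trajectory behavior described above can be illustrated with a toy contraction map: if each encode-decode pass pulls the reconstruction a fixed fraction of the way toward the nearest cluster center, iterating the map traces a trajectory that converges on that center. This is a deliberately simplified stand-in for a trained VAE, with a hypothetical center and contraction factor, not the paper's model.

```python
import numpy as np

def autoencode_step(x, center, contraction=0.7):
    """Toy stand-in for one encode-decode pass of a trained VAE:
    the reconstruction moves a fixed fraction toward the cluster center."""
    return center + contraction * (x - center)

center = np.array([2.0, -1.0])       # hypothetical cluster center
x = np.array([5.0, 3.0])             # noisy input far from the center
trajectory = [x]
for _ in range(20):
    x = autoencode_step(x, center)
    trajectory.append(x)
```

Each iteration shrinks the distance to the center by the contraction factor, so the trajectory approaches the cluster center geometrically, which mirrors the rapid approach the numerical analysis reports.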


2020, Vol 20 (1)
Author(s): Lu Wang, Dongxue Liang, Xiaolei Yin, Jing Qiu, Zhiyun Yang, ...

Abstract Background Coronary artery angiography is an indispensable assistive technique for cardiac interventional surgery. Segmentation and extraction of blood vessels from coronary angiographic images or videos are essential prerequisites for physicians to locate, assess and diagnose plaques and stenosis in blood vessels. Methods This article proposes a novel coronary artery segmentation framework that combines a three-dimensional (3D) convolutional input layer and a two-dimensional (2D) convolutional network. Instead of the single input image used in previous medical image segmentation applications, our framework accepts a sequence of coronary angiographic images as input and outputs the clearest segmentation mask. The 3D input layer leverages the temporal information in the image sequence and fuses the multiple images into more comprehensive 2D feature maps. The 2D convolutional network implements down-sampling encoders, up-sampling decoders, bottleneck modules, and skip connections to accomplish the segmentation task. Results The spatial-temporal model of this article obtains good segmentation results despite the poor quality of coronary angiographic video sequences, and outperforms the state-of-the-art techniques. Conclusions The results show that making full use of the spatial and temporal information in image sequences promotes the analysis and understanding of the images in videos.
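The core of the 3D input layer is a convolution whose kernel spans the full temporal depth, collapsing a stack of T frames into a single 2D map. A minimal sketch, assuming a uniform temporal kernel (frame averaging) and omitting the spatial convolution and learned weights of the actual framework:

```python
import numpy as np

def temporal_fusion(frames, weights):
    """Stand-in for the 3D input layer: a temporal kernel of depth T
    collapses a (T, H, W) stack of frames into one (H, W) feature map."""
    # weighted sum over the temporal axis
    return np.tensordot(weights, frames, axes=(0, 0))

rng = np.random.default_rng(1)
frames = rng.random((5, 4, 4))       # five consecutive frames, 4x4 pixels
weights = np.full(5, 1.0 / 5)        # uniform kernel = frame average
fused = temporal_fusion(frames, weights)
```

In the real network the temporal weights are learned, so the fusion can emphasize the frames in which the contrast agent renders the vessels most clearly, rather than averaging uniformly.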


2014, Vol 3 (2), pp. 14-32
Author(s): Mithun Kumar PK, Mohammad Motiur Rahman

Calcification plaque is a kind of artifact that appears in computed tomography (CT) images as regions of very high attenuation coefficient. CT images are more helpful than other modalities (e.g., ultrasonic imaging, magnetic resonance imaging (MRI)) for disease diagnosis, but unfortunately CT images are sometimes affected by calcification plaque. Medical image segmentation cannot be optimal when calcification is present in the CT images, which is highly undesirable; calcification plaque is a major obstacle to optimal organ segmentation and detection. This paper proposes an effective method for calcification alleviation in CT images. First, we apply Fisher's discriminant analysis (FDA) to estimate an optimal threshold value. Second, the optimal threshold value is used to extract the optimal threshold image. Next, a morphological operation erodes heavy calcification, and an XOR operation adjusts the optimal threshold image against the input image. Finally, we apply an extra-energy reduction (EER) function to smooth the resulting image. The proposed method thus attenuates calcification plaque in CT images.
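The FDA thresholding step is closely related to Otsu's criterion: both pick the threshold maximizing between-class separation. The sketch below uses that between-class-variance form on a synthetic slice and applies the XOR masking idea; the morphological erosion and EER smoothing steps are omitted, and the synthetic image values are illustrative only.

```python
import numpy as np

def fisher_threshold(img, levels=256):
    """Pick the gray level maximizing the between-class variance
    (the Fisher/Otsu separation criterion) of the two classes."""
    hist, edges = np.histogram(img, bins=levels, range=(0.0, 1.0))
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    best_t, best_var = 0.0, -1.0
    for k in range(1, levels):
        w0, w1 = p[:k].sum(), p[k:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (p[:k] * centers[:k]).sum() / w0
        mu1 = (p[k:] * centers[k:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, edges[k]
    return best_t

# synthetic "CT slice": dark tissue plus a small bright calcification patch
img = np.full((8, 8), 0.2)
img[2:4, 2:4] = 0.9
t = fisher_threshold(img)
calcification_mask = img > t
# XOR against the (all-foreground) input keeps only non-calcified pixels
cleaned = np.logical_xor(img > 0, calcification_mask)
```

On this toy slice the estimated threshold lands between the tissue and plaque intensities, so the mask isolates exactly the bright calcified patch.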


Author(s): Andrew Brock, Theodore Lim, J. M. Ritchie, Nick Weston

Large scale scene generation is a computationally intensive operation, and added complexities arise when dynamic content generation is required. We propose a system capable of generating virtual content from non-expert input. The proposed system uses a 3-dimensional variational autoencoder to interactively generate new virtual objects by interpolating between extant objects in a learned low-dimensional space, as well as by randomly sampling in that space. We present an interface that allows a user to intuitively explore the latent manifold, taking advantage of the network’s ability to perform algebra in the latent space to help infer context and generalize to previously unseen inputs.
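The latent-space operations the interface exposes, interpolation between objects and simple vector algebra, reduce to arithmetic on latent codes before decoding. A sketch with hypothetical 3-dimensional codes (real VAE latents would be far higher-dimensional, and the decoder is omitted):

```python
import numpy as np

def interpolate(z_a, z_b, alphas):
    """Linear interpolation between two latent codes; decoding each
    interpolant would yield a smooth morph between the two objects."""
    return np.array([(1 - a) * z_a + a * z_b for a in alphas])

# hypothetical latent codes of two learned 3D objects
z_chair = np.array([1.0, 0.0, 2.0])
z_table = np.array([-1.0, 2.0, 0.0])
path = interpolate(z_chair, z_table, np.linspace(0, 1, 5))

# latent "algebra": shift a third code by the chair-table difference
z_stool = np.array([0.5, 0.5, 0.5])
z_new = z_stool + (z_chair - z_table)
```

Random sampling in the same space, the other generation mode the system supports, is just drawing a code from the latent prior and decoding it.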


Sensors, 2022, Vol 22 (2), pp. 523
Author(s): Kh Tohidul Islam, Sudanthi Wijewickrema, Stephen O’Leary

Multi-modal three-dimensional (3-D) image segmentation is used in many medical applications, such as disease diagnosis, treatment planning, and image-guided surgery. Although multi-modal images provide information that no single image modality alone can provide, integrating such information to be used in segmentation is a challenging task. Numerous methods have been introduced to solve the problem of multi-modal medical image segmentation in recent years. In this paper, we propose a solution for the task of brain tumor segmentation. To this end, we first introduce a method of enhancing an existing magnetic resonance imaging (MRI) dataset by generating synthetic computed tomography (CT) images. Then, we discuss a process of systematic optimization of a convolutional neural network (CNN) architecture that uses this enhanced dataset, in order to customize it for our task. Using publicly available datasets, we show that the proposed method outperforms similar existing methods.


2017, pp. 1258-1280
Author(s): Mithun Kumar PK, Mohammad Motiur Rahman

Calcification plaque is a kind of artifact that appears in computed tomography (CT) images as regions of very high attenuation coefficient. CT images are more helpful than other modalities (e.g., ultrasonic imaging, magnetic resonance imaging (MRI)) for disease diagnosis, but unfortunately CT images are sometimes affected by calcification plaque. Medical image segmentation cannot be optimal when calcification is present in the CT images, which is highly undesirable; calcification plaque is a major obstacle to optimal organ segmentation and detection. This paper proposes an effective method for calcification alleviation in CT images. First, we apply Fisher's discriminant analysis (FDA) to estimate an optimal threshold value. Second, the optimal threshold value is used to extract the optimal threshold image. Next, a morphological operation erodes heavy calcification, and an XOR operation adjusts the optimal threshold image against the input image. Finally, we apply an extra-energy reduction (EER) function to smooth the resulting image. The proposed method thus attenuates calcification plaque in CT images.


The proposed system generates new images from existing images using variational autoencoders. The autoencoder maps the input image to a multivariate normal distribution in the latent space. The variational autoencoder transforms the input image into a new output by minimizing the reconstruction and KL divergence losses. The primary advantage of a variational autoencoder over other autoencoders is that its latent space follows a specific probability distribution, a Gaussian, which enables the generation of high-quality images.
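The two losses named above have a standard closed form when the encoder outputs a diagonal Gaussian and the prior is the standard normal. A minimal numpy sketch of that objective (squared-error reconstruction is an assumption; implementations also commonly use cross-entropy):

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL divergence between the encoder's diagonal
    Gaussian N(mu, sigma^2) and the standard normal prior N(0, I)."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def vae_loss(x, x_recon, mu, log_var):
    """Reconstruction term plus KL term, as minimized by a VAE."""
    recon = np.sum((x - x_recon) ** 2)   # squared-error reconstruction
    return recon + kl_to_standard_normal(mu, log_var)

x = np.array([0.5, 0.1])
# perfect reconstruction with the posterior equal to the prior
loss_perfect = vae_loss(x, x, np.zeros(3), np.zeros(3))
```

The KL term is what pulls the latent distribution toward the Gaussian prior; without it the model degenerates into an ordinary autoencoder and loses its generative sampling ability.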


2021, Vol 18 (4), pp. 378-381
Author(s): Luis A. Bolaños, Dongsheng Xiao, Nancy L. Ford, Jeff M. LeDue, Pankaj K. Gupta, ...

2021, Vol 13 (2), pp. 51
Author(s): Lili Sun, Xueyan Liu, Min Zhao, Bo Yang

Variational graph autoencoders, which encode the structural and attribute information of a graph into low-dimensional representations, have become a powerful method for studying graph-structured data. However, most existing methods based on variational (graph) autoencoders assume that the prior of the latent variables is the standard normal distribution, which encourages all nodes to gather around 0 and prevents the latent space from being fully utilized. Choosing a suitable prior without incorporating additional expert knowledge therefore becomes a challenge. Given this, we propose a novel noninformative prior-based interpretable variational graph autoencoder (NPIVGAE). Specifically, we use a noninformative prior as the prior distribution of the latent variables, which allows the posterior distribution parameters to be learned almost entirely from the sample data. Furthermore, we regard each dimension of a latent variable as the probability that the node belongs to each block, thereby improving the interpretability of the model. The correlation within and between blocks is described by a block–block correlation matrix. We compare our model with state-of-the-art methods on three real datasets, verifying its effectiveness and superiority.
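The interpretability claim, each latent dimension read as a block-membership probability, implies the per-node latent vector is normalized into a distribution over blocks. A sketch of that reading with hypothetical latent codes (softmax normalization is an assumption; the paper's exact parameterization may differ):

```python
import numpy as np

def block_memberships(z):
    """Interpret each latent dimension as the probability that a node
    belongs to the corresponding block (softmax-normalized per node)."""
    e = np.exp(z - z.max(axis=1, keepdims=True))  # stable softmax
    return e / e.sum(axis=1, keepdims=True)

# hypothetical latent codes for 4 nodes in a 3-block graph
z = np.array([[2.0, 0.1, 0.0],
              [0.0, 3.0, 0.2],
              [1.9, 0.0, 0.1],
              [0.0, 0.1, 2.5]])
pi = block_memberships(z)
assignments = pi.argmax(axis=1)      # hard block assignment per node
```

Given such membership vectors, a block–block correlation matrix summarizes how strongly nodes assigned to one block tend to connect to nodes in another.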

