An Encoder-decoder Deep Learning Model Combining Mixed Attention Mechanism and Asymmetric Convolution for Automation of Retinal Vessels Segmentation

Author(s):  
Jiajia Cao ◽  
Qin Zhou ◽  
Yi Chen ◽  
Lin Yin ◽  
Fei Zhang

The segmentation of the retinal vascular tree is a fundamental step in diagnosing ophthalmological and cardiovascular diseases. Most existing deep learning-based vessel segmentation methods give all learned features equal importance. If the highly imbalanced ratio between background and vessels is ignored (the vast majority of pixels belong to the background), the learned features are dominated by the background, with relatively little influence from the vessels, often leading to low model sensitivity and prediction accuracy. Reducing model size is a further challenge. To address these problems, we propose a mixed attention and asymmetric convolution encoder-decoder structure (MAAC) for retinal vessel segmentation. In MAAC, the mixed attention is designed to emphasize valid features and suppress invalid ones: it not only identifies information that helps recognize retinal vessels but also locates the position of the vessel. All square convolutions are replaced by asymmetric convolutions, because they are more robust to rotational distortions and because small convolutions are better suited to extracting features of thin vessels. The use of asymmetric convolution reduces the number of model parameters and improves the recognition of thin vessels. Experiments on the public datasets DRIVE, STARE, and CHASE_DB1 demonstrated that the proposed MAAC segments vessels more accurately, with global AUCs of 98.17%, 98.67%, and 98.53%, respectively. The mixed attention proposed in this study can be applied to other deep learning models for performance improvement without changing their network architectures.
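The parameter saving claimed for asymmetric convolutions rests on a simple identity: a 3 × 1 convolution followed by a 1 × 3 convolution realizes any rank-1 3 × 3 kernel with 6 weights instead of 9. A minimal NumPy sketch (not the authors' code; `conv2d` is an illustrative valid-mode correlation) verifies the equivalence:

```python
import numpy as np

def conv2d(img, k):
    # plain valid-mode 2D correlation, enough to show the identity
    kh, kw = k.shape
    out_h, out_w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
v = rng.standard_normal((3, 1))    # 3x1 vertical kernel   (3 weights)
h = rng.standard_normal((1, 3))    # 1x3 horizontal kernel (3 weights)

# Applying v then h (6 weights total) equals a single 3x3 convolution
# with the rank-1 kernel v @ h (9 weights).
seq = conv2d(conv2d(img, v), h)
full = conv2d(img, v @ h)
```

More generally, a k × k square kernel costs k² weights per channel pair, while the asymmetric k × 1 plus 1 × k pair costs only 2k.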

2021 ◽  


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 8072
Author(s):  
Yu-Bang Chang ◽  
Chieh Tsai ◽  
Chang-Hong Lin ◽  
Poki Chen

As autonomous driving techniques become increasingly valued and widespread, real-time semantic segmentation has become a popular and challenging topic in deep learning and computer vision in recent years. However, to deploy deep learning models on the edge devices that accompany vehicle sensors, we need to design a structure with the best trade-off between accuracy and inference time. Previous works either sacrificed accuracy to obtain faster inference or sought the best accuracy under a real-time constraint; nevertheless, the accuracy of real-time semantic segmentation methods still lags far behind that of general semantic segmentation methods. We therefore propose a network architecture based on a dual encoder and a self-attention mechanism. Compared with preceding works, we achieve 78.6% mIoU at 39.4 FPS at 1024 × 2048 resolution on a Cityscapes test submission.
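The paper's exact attention block is not reproduced in the abstract; as a hedged illustration, the generic mechanism such architectures build on, single-head scaled dot-product self-attention (shown here without learned projections, for brevity), can be sketched in NumPy:

```python
import numpy as np

def self_attention(x):
    """Single-head scaled dot-product self-attention over x of shape (seq, d)."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                   # pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ x                              # context-reweighted features

rng = np.random.default_rng(0)
tokens = rng.standard_normal((5, 8))   # 5 spatial positions, 8-dim features
out = self_attention(tokens)
```

Each output position is a convex combination of all input positions, which is what lets the mechanism fuse the two encoder streams' long-range context.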


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 2902
Author(s):  
Wenting Qiao ◽  
Qiangwei Liu ◽  
Xiaoguang Wu ◽  
Biao Ma ◽  
Gang Li

Pavement crack detection is essential for safe driving. Traditional manual crack detection is highly subjective and time-consuming, so an automatic pavement crack detection system is needed. This remains a challenging task due to the complex topology of cracks and the heavy noise in crack images. Although deep learning-based technologies have recently achieved breakthrough progress in crack detection, challenges remain, such as large parameter counts and low detection efficiency; moreover, most deep learning-based crack detection algorithms struggle to balance detection accuracy against detection speed. Inspired by the latest deep learning techniques in image processing, this paper proposes a novel crack detection algorithm based on a deep feature aggregation network with a spatial-channel squeeze & excitation (scSE) attention module, which we call CrackDFANet. First, we cut the collected crack images into 512 × 512 pixel blocks to establish a crack dataset. Then, through iterative optimization on the training and validation sets, we obtained a crack detection model with good robustness. Finally, the CrackDFANet model was verified on a total of 3516 images from five datasets of different sizes containing different noise interferences. Experimental results show that the trained CrackDFANet has strong anti-interference ability, with better robustness and generalization under light interference, parking lines, water stains, plant disturbance, oil stains, and shadows. Furthermore, CrackDFANet outperforms other state-of-the-art algorithms, with more accurate detection and faster speed, while our model's parameter count and error rates are significantly reduced.
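The scSE module referenced above combines a channel gate (cSE: pool over space, excite channels) and a spatial gate (sSE: project channels to one map, excite positions), summing the two recalibrated feature maps. A NumPy sketch with randomly initialized, hypothetical weight shapes (not CrackDFANet's trained parameters) illustrates the idea:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def scse(x, w1, w2, w_s):
    """Spatial-channel squeeze & excitation for a feature map x of shape (C, H, W)."""
    # cSE: squeeze space (global average pool), excite channels via a small MLP
    c = sigmoid(w2 @ np.maximum(w1 @ x.mean(axis=(1, 2)), 0))
    cse = x * c[:, None, None]
    # sSE: squeeze channels (a 1x1 conv, here a weight vector), excite positions
    s = sigmoid(np.tensordot(w_s, x, axes=1))
    sse = x * s[None, :, :]
    return cse + sse  # scSE sums the two recalibrated maps

rng = np.random.default_rng(1)
x = np.abs(rng.standard_normal((4, 6, 6)))   # non-negative post-ReLU activations
y = scse(x,
         rng.standard_normal((2, 4)),        # w1: channel reduction 4 -> 2
         rng.standard_normal((4, 2)),        # w2: channel expansion 2 -> 4
         rng.standard_normal(4))             # w_s: 1x1 spatial-gate weights
```

Because both gates lie in (0, 1), the module can only rescale (never invent) activations, which is why it drops into an existing network without changing its output shape.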


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Meihong Sheng ◽  
Weixia Tang ◽  
Jiahuan Tang ◽  
Ming Zhang ◽  
Shenchu Gong ◽  
...  

This study aimed to determine the feasibility of using a deep learning (DL) approach to distinguish benign from malignant BI-RADS 4 lesions on preoperative breast DCE-MRI images and to compare two 3D segmentation methods. Patients admitted from January 2014 to October 2020 were retrospectively analyzed. Breast MRI was performed before surgical resection or biopsy, and the masses were classified as BI-RADS 4. The first postcontrast images of the DCE-MRI T1WI sequence were selected. The lesions were segmented in 3D in two ways: manual segmentation along the lesion edge slice by slice, and the minimum bounding cube of the lesion. DL feature extraction was then carried out, with the pixel values of the image data normalized to the range 0-1. The model was based on the classic residual network ResNet50, retaining its residual modules and upgrading its 2D convolution modules to 3D. At the same time, the attention module, which originally fit only 2D image convolutions, was transformed into a 3D Convolutional Block Attention Module (CBAM) to suit 3D MRI. After the last CBAM, the algorithm flattens the output high-dimensional features into a one-dimensional vector and connects two fully connected layers, before finally producing two outputs (P1, P2), which respectively represent the probabilities of benign and malignant lesions. Accuracy, sensitivity, specificity, negative predictive value, positive predictive value, recall rate, and area under the ROC curve (AUC) were used as evaluation indicators. A total of 203 patients were enrolled, with 207 mass lesions comprising 101 benign and 106 malignant lesions. The dataset was divided into a training set (n = 145), a validation set (n = 22), and a test set (n = 40) at a ratio of 7 : 1 : 2, and fivefold cross-validation was performed.
The mean AUCs based on the minimum bounding cube of the lesion and on the 3D-ROI of the lesion itself were 0.827 and 0.799, accuracy was 78.54% and 74.63%, sensitivity was 78.85% and 83.65%, specificity was 78.22% and 65.35%, NPV was 78.85% and 71.31%, PPV was 78.22% and 79.52%, and the recall rate was 78.85% and 83.65%, respectively. There was no statistically significant difference in AUC between the lesion-itself model and the minimum bounding cube model (Z = 0.771, p = 0.4408). The minimum bounding cube based on the lesion edge showed higher accuracy and specificity and a lower recall rate in identifying benign and malignant lesions. 3D-ROI segmentation based on a minimum bounding cube can more effectively capture information on the lesion itself and the surrounding tissues, and its DL model performs better than that based on the lesion alone. Using a DL approach with a 3D attention mechanism based on ResNet50 to identify benign and malignant BI-RADS 4 lesions is feasible.
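The 2D-to-3D CBAM adaptation described above can be illustrated with its channel-attention half: the average and max pooling are simply extended over all three spatial axes before the shared MLP. The following NumPy sketch uses hypothetical weight shapes, not the study's trained model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention_3d(x, w1, w2):
    """CBAM channel gate extended to a 3D feature map x of shape (C, D, H, W)."""
    # pool over depth, height, and width instead of just height and width
    avg = x.mean(axis=(1, 2, 3))
    mx = x.max(axis=(1, 2, 3))
    # one shared bottleneck MLP scores both pooled descriptors; sum, then sigmoid
    gate = sigmoid(w2 @ np.maximum(w1 @ avg, 0) + w2 @ np.maximum(w1 @ mx, 0))
    return x * gate[:, None, None, None]

rng = np.random.default_rng(2)
x = rng.standard_normal((4, 2, 3, 3))        # (channels, depth, height, width)
y = channel_attention_3d(x,
                         rng.standard_normal((2, 4)),   # reduce 4 -> 2 channels
                         rng.standard_normal((4, 2)))   # expand 2 -> 4 channels
```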


Author(s):  
Xiujuan Ji ◽  
Lei Liu ◽  
Jingwen Zhu

Code cloning is a common programming practice that reuses existing code to solve similar problems; it greatly facilitates software development but also propagates program bugs and increases maintenance costs. Recently, deep learning-based detection approaches have demonstrated their effectiveness in feature representation and detection performance. Among them, approaches based on the abstract syntax tree (AST) construct models that rely on node embedding. In an AST, the semantics of nodes are clearly hierarchical, and the importance of individual nodes differs greatly when deciding whether two code fragments are clones. However, some approaches do not fully consider the hierarchical structure of source code; some ignore the differing importance of nodes when generating source code features; and, when the tree is very large and deep, many approaches are vulnerable to the vanishing gradient problem during training. To address these challenges, we propose a hierarchical attentive graph neural network embedding model, HAG, for code clone detection. First, an attention mechanism is applied to AST nodes to distinguish their importance during model training. In addition, HAG adopts a graph convolutional network (GCN) to propagate the code message over the AST graph and then exploits a hierarchical differential pooling GCN to capture code semantics at different structural levels. To evaluate the effectiveness of HAG, we conducted extensive experiments on a public clone dataset and compared it with seven state-of-the-art clone detection models. The experimental results demonstrate that HAG achieves superior detection performance compared with the baseline models. In particular, HAG outperforms the baselines on Moderately Type-3 and Type-4 clones, indicating its strong capability for detecting semantic clones. Apart from that, the impacts of hierarchical pooling, the attention mechanism, and critical model parameters are systematically discussed.
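HAG's full hierarchical differential pooling is beyond the scope of an abstract, but its two basic ingredients, message propagation with a GCN layer and attention-weighted pooling of node embeddings, can be sketched in NumPy (hypothetical shapes and random weights, for illustration only):

```python
import numpy as np

def gcn_layer(a_hat, h, w):
    # one message-passing step: aggregate neighbours, project, ReLU
    return np.maximum(a_hat @ h @ w, 0)

def attentive_pool(nodes, q):
    # score each node embedding against a query, softmax, then weighted sum,
    # so important AST nodes contribute more to the graph-level embedding
    scores = nodes @ q / np.sqrt(nodes.shape[1])
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    return alpha @ nodes

rng = np.random.default_rng(4)
adj = np.array([[0, 1, 1, 0],   # a 4-node tree with edges 0-1, 0-2, 1-3
                [1, 0, 0, 1],
                [1, 0, 0, 0],
                [0, 1, 0, 0]], dtype=float)
a_hat = adj + np.eye(4)                     # add self-loops
a_hat /= a_hat.sum(axis=1, keepdims=True)   # row-normalise propagation
h = rng.standard_normal((4, 8))             # initial node embeddings
w = rng.standard_normal((8, 8))             # layer projection weights
g = attentive_pool(gcn_layer(a_hat, h, w), rng.standard_normal(8))
```

Soft attention pooling like this also sidesteps part of the gradient problem on deep trees, since every node has a direct (weighted) path into the pooled vector.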


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Bingjing Jia ◽  
Zhongli Wu ◽  
Pengpeng Zhou ◽  
Bin Wu

Entity linking maps ambiguous mentions in documents to the correct entities in a given knowledge base. Most existing methods fail when a mention appears multiple times in a document, since conflicting contexts at different locations make linking difficult. Sentence representation, which has recently been studied using deep learning approaches, can resolve this issue. In this paper, an effective entity linking model is proposed to capture the semantic meaning of sentences and reduce the noise introduced by the different contexts of the same mention in a document. The model first uses the symmetry of a Siamese network to learn sentence similarity, and then adds an attention mechanism to improve the interaction between input sentences. To show the effectiveness of our sentence representation model combined with the attention mechanism, named ELSR, extensive experiments are conducted on two public datasets. The results illustrate that our model outperforms the baselines and achieves superior performance.
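The Siamese idea above, one encoder with shared weights applied to both sentences, can be sketched in NumPy (the `encode` function is a hypothetical stand-in mean-pool-plus-projection, not ELSR's actual encoder):

```python
import numpy as np

def encode(sentence, w):
    # shared-weight encoder: mean-pool word vectors, then a tanh projection
    return np.tanh(w @ sentence.mean(axis=0))

def similarity(s1, s2, w):
    # the SAME w encodes both sides -- this weight sharing is the Siamese symmetry
    e1, e2 = encode(s1, w), encode(s2, w)
    return float(e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2)))

rng = np.random.default_rng(3)
w = rng.standard_normal((3, 6))
s1 = rng.standard_normal((4, 6))   # 4 word vectors of dimension 6
s2 = rng.standard_normal((7, 6))   # a second, longer sentence
```

Weight sharing guarantees the similarity is symmetric in its two arguments, so a mention's score does not depend on which side of the network its context enters.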


2019 ◽  
Vol 2019 (1) ◽  
pp. 360-368
Author(s):  
Mekides Assefa Abebe ◽  
Jon Yngve Hardeberg

Various whiteboard image degradations greatly reduce the legibility of pen-stroke content as well as the overall quality of the images. Consequently, researchers have addressed the problem with various image enhancement techniques. Most state-of-the-art approaches apply common image processing techniques such as background-foreground segmentation, text extraction, contrast and color enhancement, and white balancing. However, such conventional enhancement methods cannot recover severely degraded pen-stroke content and produce artifacts in the presence of complex pen-stroke illustrations. To surmount these problems, the authors propose a deep learning-based solution. They contribute a new whiteboard image dataset and adapt two deep convolutional neural network architectures for whiteboard image quality enhancement. Their evaluations of the trained models demonstrate superior performance over the conventional methods.


2021 ◽  
Author(s):  
Zhaoyang Niu ◽  
Guoqiang Zhong ◽  
Hui Yu

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mu Sook Lee ◽  
Yong Soo Kim ◽  
Minki Kim ◽  
Muhammad Usman ◽  
Shi Sub Byon ◽  
...  

We examined the feasibility of explainable computer-aided detection of cardiomegaly in routine clinical practice using segmentation-based methods. Overall, 793 retrospectively acquired posterior-anterior (PA) chest X-ray images (CXRs) of 793 patients were used to train deep learning (DL) models for lung and heart segmentation. The training dataset included PA CXRs from two public datasets as well as in-house PA CXRs. Two fully automated segmentation-based methods using state-of-the-art DL models for lung and heart segmentation were developed. Diagnostic performance was assessed, and the reliability of the automatic cardiothoracic ratio (CTR) calculation was determined using the mean absolute error and a paired t-test. The effects of thoracic pathological conditions on performance were assessed in subgroup analyses. One thousand PA CXRs of 1000 patients (480 men, 520 women; mean age 63 ± 23 years) were included. The CTR values derived from the DL models and the diagnostic performance showed excellent agreement with reference standards for the whole test dataset. The performance of the segmentation-based methods differed by thoracic condition: when tested on CXRs with lesions obscuring the heart borders, performance was lower than for other thoracic pathological findings. Thus, segmentation-based methods using DL could detect cardiomegaly; however, the feasibility of computer-aided detection of cardiomegaly without human intervention remained limited.
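The CTR the study computes automatically is, by definition, the maximal transverse cardiac diameter divided by the maximal internal thoracic diameter, with values above roughly 0.5 conventionally suggesting cardiomegaly on a PA film. Given binary heart and lung masks from the segmentation models, the calculation reduces to a few lines; this is a sketch of the definition, not the study's pipeline:

```python
import numpy as np

def cardiothoracic_ratio(heart_mask, lung_mask):
    """CTR from binary masks: widest heart extent / widest thoracic extent."""
    def width(mask):
        cols = np.where(mask.any(axis=0))[0]   # image columns containing the structure
        return cols.max() - cols.min() + 1
    return width(heart_mask) / width(lung_mask)

# toy masks: thorax spans columns 1-10 (width 10), heart spans 3-7 (width 5)
lungs = np.zeros((10, 12), dtype=bool); lungs[2:9, 1:11] = True
heart = np.zeros((10, 12), dtype=bool); heart[4:8, 3:8] = True
ctr = cardiothoracic_ratio(heart, lungs)   # 5 / 10 = 0.5, the borderline value
```

This also shows why lesions obscuring the heart border hurt performance: an over- or under-segmented heart mask directly shifts the measured cardiac width.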

