An Encoder-decoder Deep Learning Model Combining Mixed Attention Mechanism and Asymmetric Convolution for Automation of Retinal Vessels Segmentation

Author(s):  
Jiajia Cao ◽  
Qin Zhou ◽  
Yi Chen ◽  
Lin Yin ◽  
Fei Zhang

The segmentation of the retinal vascular tree is a fundamental step in diagnosing ophthalmological and cardiovascular diseases. Most existing deep learning-based vessel segmentation methods give all learned features equal importance. If the highly imbalanced ratio between background and vessels is ignored (the vast majority of pixels belong to the background), the learned features are dominated by the background, with relatively little influence from the vessels, often leading to low model sensitivity and prediction accuracy. Reducing model size is a further challenge. To address these problems, we propose a mixed attention and asymmetric convolution encoder-decoder structure (MAAC) for retinal vessel segmentation. In MAAC, the mixed attention is designed to emphasize valid features and suppress invalid ones: it not only identifies information that helps recognize retinal vessels but also locates the position of the vessel. All square convolutions are replaced by asymmetric convolutions, because they are more robust to rotational distortions and because small convolutions are better suited to extracting features of thin vessels. The use of asymmetric convolution reduces the number of model parameters and improves the recognition of thin vessels. Experiments on the public datasets DRIVE, STARE, and CHASE_DB1 demonstrated that the proposed MAAC segments vessels more accurately, with global AUCs of 98.17%, 98.67%, and 98.53%, respectively. The mixed attention proposed in this study can be applied to other deep learning models for performance improvement without changing their network architectures.
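The parameter saving claimed for asymmetric convolutions rests on a simple identity: a 3 × 1 convolution followed by a 1 × 3 convolution realizes any rank-1 3 × 3 kernel with 6 weights instead of 9. A minimal NumPy sketch (not the authors' code; `conv2d` is an illustrative valid-mode correlation) verifies the equivalence:

```python
import numpy as np

def conv2d(img, k):
    # plain valid-mode 2D correlation, enough to show the identity
    kh, kw = k.shape
    out_h, out_w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
v = rng.standard_normal((3, 1))    # 3x1 vertical kernel   (3 weights)
h = rng.standard_normal((1, 3))    # 1x3 horizontal kernel (3 weights)

# Applying v then h (6 weights total) equals a single 3x3 convolution
# with the rank-1 kernel v @ h (9 weights).
seq = conv2d(conv2d(img, v), h)
full = conv2d(img, v @ h)
```

More generally, a k × k square kernel costs k² weights per channel pair, while the asymmetric k × 1 plus 1 × k pair costs only 2k.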

2021 ◽  


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 8072
Author(s):  
Yu-Bang Chang ◽  
Chieh Tsai ◽  
Chang-Hong Lin ◽  
Poki Chen

As autonomous driving techniques become increasingly valued and widespread, real-time semantic segmentation has become a popular and challenging topic in deep learning and computer vision in recent years. However, to deploy deep learning models on the edge devices that accompany vehicle sensors, we need to design a structure with the best trade-off between accuracy and inference time. Previous works either sacrificed accuracy to obtain faster inference or sought the best accuracy under a real-time constraint; nevertheless, the accuracy of real-time semantic segmentation methods still lags far behind that of general semantic segmentation methods. We therefore propose a network architecture based on a dual encoder and a self-attention mechanism. Compared with preceding works, we achieve 78.6% mIoU at 39.4 FPS at 1024 × 2048 resolution on a Cityscapes test submission.
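The paper's exact attention block is not reproduced in the abstract; as a hedged illustration, the generic mechanism such architectures build on, single-head scaled dot-product self-attention (shown here without learned projections, for brevity), can be sketched in NumPy:

```python
import numpy as np

def self_attention(x):
    """Single-head scaled dot-product self-attention over x of shape (seq, d)."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                   # pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ x                              # context-reweighted features

rng = np.random.default_rng(0)
tokens = rng.standard_normal((5, 8))   # 5 spatial positions, 8-dim features
out = self_attention(tokens)
```

Each output position is a convex combination of all input positions, which is what lets the mechanism fuse the two encoder streams' long-range context.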


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 2902
Author(s):  
Wenting Qiao ◽  
Qiangwei Liu ◽  
Xiaoguang Wu ◽  
Biao Ma ◽  
Gang Li

Pavement crack detection is essential for safe driving. Traditional manual crack detection is highly subjective and time-consuming, so an automatic pavement crack detection system is needed. This remains a challenging task due to the complex topology of cracks and the heavy noise in crack images. Although deep learning-based technologies have recently achieved breakthrough progress in crack detection, challenges remain, such as large parameter counts and low detection efficiency; moreover, most deep learning-based crack detection algorithms struggle to balance detection accuracy against detection speed. Inspired by the latest deep learning techniques in image processing, this paper proposes a novel crack detection algorithm based on a deep feature aggregation network with a spatial-channel squeeze & excitation (scSE) attention module, which we call CrackDFANet. First, we cut the collected crack images into 512 × 512 pixel blocks to establish a crack dataset. Then, through iterative optimization on the training and validation sets, we obtained a crack detection model with good robustness. Finally, the CrackDFANet model was verified on a total of 3516 images from five datasets of different sizes containing different noise interferences. Experimental results show that the trained CrackDFANet has strong anti-interference ability, with better robustness and generalization under light interference, parking lines, water stains, plant disturbance, oil stains, and shadows. Furthermore, CrackDFANet outperforms other state-of-the-art algorithms, with more accurate detection and faster speed, while our model's parameter count and error rates are significantly reduced.
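The scSE module referenced above combines a channel gate (cSE: pool over space, excite channels) and a spatial gate (sSE: project channels to one map, excite positions), summing the two recalibrated feature maps. A NumPy sketch with randomly initialized, hypothetical weight shapes (not CrackDFANet's trained parameters) illustrates the idea:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def scse(x, w1, w2, w_s):
    """Spatial-channel squeeze & excitation for a feature map x of shape (C, H, W)."""
    # cSE: squeeze space (global average pool), excite channels via a small MLP
    c = sigmoid(w2 @ np.maximum(w1 @ x.mean(axis=(1, 2)), 0))
    cse = x * c[:, None, None]
    # sSE: squeeze channels (a 1x1 conv, here a weight vector), excite positions
    s = sigmoid(np.tensordot(w_s, x, axes=1))
    sse = x * s[None, :, :]
    return cse + sse  # scSE sums the two recalibrated maps

rng = np.random.default_rng(1)
x = np.abs(rng.standard_normal((4, 6, 6)))   # non-negative post-ReLU activations
y = scse(x,
         rng.standard_normal((2, 4)),        # w1: channel reduction 4 -> 2
         rng.standard_normal((4, 2)),        # w2: channel expansion 2 -> 4
         rng.standard_normal(4))             # w_s: 1x1 spatial-gate weights
```

Because both gates lie in (0, 1), the module can only rescale (never invent) activations, which is why it drops into an existing network without changing its output shape.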


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Meihong Sheng ◽  
Weixia Tang ◽  
Jiahuan Tang ◽  
Ming Zhang ◽  
Shenchu Gong ◽  
...  

This study aimed to determine the feasibility of using a deep learning (DL) approach to distinguish benign from malignant BI-RADS 4 lesions on preoperative breast DCE-MRI images and to compare two 3D segmentation methods. Patients admitted from January 2014 to October 2020 were retrospectively analyzed. Breast MRI was performed before surgical resection or biopsy, and the masses were classified as BI-RADS 4. The first postcontrast images of the DCE-MRI T1WI sequence were selected. The lesions were segmented in 3D in two ways: manual segmentation along the lesion edge slice by slice, and the minimum bounding cube of the lesion. DL feature extraction was then carried out, with the pixel values of the image data normalized to the range 0-1. The model was based on the classic residual network ResNet50, retaining its residual modules and upgrading its 2D convolution modules to 3D. At the same time, the attention module, which originally fit only 2D image convolutions, was transformed into a 3D Convolutional Block Attention Module (CBAM) to suit 3D MRI. After the last CBAM, the algorithm flattens the output high-dimensional features into a one-dimensional vector and connects two fully connected layers, before finally producing two outputs (P1, P2), which respectively represent the probabilities of benign and malignant lesions. Accuracy, sensitivity, specificity, negative predictive value, positive predictive value, recall rate, and area under the ROC curve (AUC) were used as evaluation indicators. A total of 203 patients were enrolled, with 207 mass lesions comprising 101 benign and 106 malignant lesions. The dataset was divided into a training set (n = 145), a validation set (n = 22), and a test set (n = 40) at a ratio of 7 : 1 : 2, and fivefold cross-validation was performed.
The mean AUCs based on the minimum bounding cube of the lesion and on the 3D-ROI of the lesion itself were 0.827 and 0.799, accuracy was 78.54% and 74.63%, sensitivity was 78.85% and 83.65%, specificity was 78.22% and 65.35%, NPV was 78.85% and 71.31%, PPV was 78.22% and 79.52%, and the recall rate was 78.85% and 83.65%, respectively. There was no statistically significant difference in AUC between the lesion-itself model and the minimum bounding cube model (Z = 0.771, p = 0.4408). The minimum bounding cube based on the lesion edge showed higher accuracy and specificity and a lower recall rate in identifying benign and malignant lesions. 3D-ROI segmentation based on a minimum bounding cube can more effectively capture information on the lesion itself and the surrounding tissues, and its DL model performs better than that based on the lesion alone. Using a DL approach with a 3D attention mechanism based on ResNet50 to identify benign and malignant BI-RADS 4 lesions is feasible.
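The 2D-to-3D CBAM adaptation described above can be illustrated with its channel-attention half: the average and max pooling are simply extended over all three spatial axes before the shared MLP. The following NumPy sketch uses hypothetical weight shapes, not the study's trained model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention_3d(x, w1, w2):
    """CBAM channel gate extended to a 3D feature map x of shape (C, D, H, W)."""
    # pool over depth, height, and width instead of just height and width
    avg = x.mean(axis=(1, 2, 3))
    mx = x.max(axis=(1, 2, 3))
    # one shared bottleneck MLP scores both pooled descriptors; sum, then sigmoid
    gate = sigmoid(w2 @ np.maximum(w1 @ avg, 0) + w2 @ np.maximum(w1 @ mx, 0))
    return x * gate[:, None, None, None]

rng = np.random.default_rng(2)
x = rng.standard_normal((4, 2, 3, 3))        # (channels, depth, height, width)
y = channel_attention_3d(x,
                         rng.standard_normal((2, 4)),   # reduce 4 -> 2 channels
                         rng.standard_normal((4, 2)))   # expand 2 -> 4 channels
```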


Author(s):  
Xiujuan Ji ◽  
Lei Liu ◽  
Jingwen Zhu

Code cloning is a common programming practice that reuses existing code to solve similar problems; it greatly facilitates software development but also propagates program bugs and increases maintenance costs. Recently, deep learning-based detection approaches have demonstrated their effectiveness in feature representation and detection performance. Among them, approaches based on the abstract syntax tree (AST) construct models that rely on node embedding. In an AST, the semantics of nodes are clearly hierarchical, and the importance of individual nodes differs greatly when deciding whether two code fragments are clones. However, some approaches do not fully consider the hierarchical structure of source code; some ignore the differing importance of nodes when generating source code features; and, when the tree is very large and deep, many approaches are vulnerable to the vanishing gradient problem during training. To address these challenges, we propose a hierarchical attentive graph neural network embedding model, HAG, for code clone detection. First, an attention mechanism is applied to AST nodes to distinguish their importance during model training. In addition, HAG adopts a graph convolutional network (GCN) to propagate the code message over the AST graph and then exploits a hierarchical differential pooling GCN to capture code semantics at different structural levels. To evaluate the effectiveness of HAG, we conducted extensive experiments on a public clone dataset and compared it with seven state-of-the-art clone detection models. The experimental results demonstrate that HAG achieves superior detection performance compared with the baseline models. In particular, HAG outperforms the baselines on Moderately Type-3 and Type-4 clones, indicating its strong capability for detecting semantic clones. Apart from that, the impacts of hierarchical pooling, the attention mechanism, and critical model parameters are systematically discussed.
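HAG's full hierarchical differential pooling is beyond the scope of an abstract, but its two basic ingredients, message propagation with a GCN layer and attention-weighted pooling of node embeddings, can be sketched in NumPy (hypothetical shapes and random weights, for illustration only):

```python
import numpy as np

def gcn_layer(a_hat, h, w):
    # one message-passing step: aggregate neighbours, project, ReLU
    return np.maximum(a_hat @ h @ w, 0)

def attentive_pool(nodes, q):
    # score each node embedding against a query, softmax, then weighted sum,
    # so important AST nodes contribute more to the graph-level embedding
    scores = nodes @ q / np.sqrt(nodes.shape[1])
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    return alpha @ nodes

rng = np.random.default_rng(4)
adj = np.array([[0, 1, 1, 0],   # a 4-node tree with edges 0-1, 0-2, 1-3
                [1, 0, 0, 1],
                [1, 0, 0, 0],
                [0, 1, 0, 0]], dtype=float)
a_hat = adj + np.eye(4)                     # add self-loops
a_hat /= a_hat.sum(axis=1, keepdims=True)   # row-normalise propagation
h = rng.standard_normal((4, 8))             # initial node embeddings
w = rng.standard_normal((8, 8))             # layer projection weights
g = attentive_pool(gcn_layer(a_hat, h, w), rng.standard_normal(8))
```

Soft attention pooling like this also sidesteps part of the gradient problem on deep trees, since every node has a direct (weighted) path into the pooled vector.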


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Bingjing Jia ◽  
Zhongli Wu ◽  
Pengpeng Zhou ◽  
Bin Wu

Entity linking maps ambiguous mentions in documents to the correct entities in a given knowledge base. Most existing methods fail when a mention appears multiple times in a document, since conflicting contexts at different locations make linking difficult. Sentence representation, which has recently been studied using deep learning approaches, can resolve this issue. In this paper, an effective entity linking model is proposed to capture the semantic meaning of sentences and reduce the noise introduced by the different contexts of the same mention in a document. The model first uses the symmetry of a Siamese network to learn sentence similarity, and then adds an attention mechanism to improve the interaction between input sentences. To show the effectiveness of our sentence representation model combined with the attention mechanism, named ELSR, extensive experiments are conducted on two public datasets. The results illustrate that our model outperforms the baselines and achieves superior performance.
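The Siamese idea above, one encoder with shared weights applied to both sentences, can be sketched in NumPy (the `encode` function is a hypothetical stand-in mean-pool-plus-projection, not ELSR's actual encoder):

```python
import numpy as np

def encode(sentence, w):
    # shared-weight encoder: mean-pool word vectors, then a tanh projection
    return np.tanh(w @ sentence.mean(axis=0))

def similarity(s1, s2, w):
    # the SAME w encodes both sides -- this weight sharing is the Siamese symmetry
    e1, e2 = encode(s1, w), encode(s2, w)
    return float(e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2)))

rng = np.random.default_rng(3)
w = rng.standard_normal((3, 6))
s1 = rng.standard_normal((4, 6))   # 4 word vectors of dimension 6
s2 = rng.standard_normal((7, 6))   # a second, longer sentence
```

Weight sharing guarantees the similarity is symmetric in its two arguments, so a mention's score does not depend on which side of the network its context enters.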


2019 ◽  
Vol 2019 (1) ◽  
pp. 360-368
Author(s):  
Mekides Assefa Abebe ◽  
Jon Yngve Hardeberg

Various whiteboard image degradations greatly reduce the legibility of pen-stroke content as well as the overall quality of the images. Consequently, researchers have addressed the problem with various image enhancement techniques. Most state-of-the-art approaches apply common image processing techniques such as background-foreground segmentation, text extraction, contrast and color enhancement, and white balancing. However, such conventional enhancement methods cannot recover severely degraded pen-stroke content and produce artifacts in the presence of complex pen-stroke illustrations. To surmount these problems, the authors propose a deep learning-based solution. They contribute a new whiteboard image dataset and adapt two deep convolutional neural network architectures for whiteboard image quality enhancement. Their evaluations of the trained models demonstrate superior performance over the conventional methods.


2021 ◽  
Author(s):  
Zhaoyang Niu ◽  
Guoqiang Zhong ◽  
Hui Yu

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mu Sook Lee ◽  
Yong Soo Kim ◽  
Minki Kim ◽  
Muhammad Usman ◽  
Shi Sub Byon ◽  
...  

We examined the feasibility of explainable computer-aided detection of cardiomegaly in routine clinical practice using segmentation-based methods. Overall, 793 retrospectively acquired posterior-anterior (PA) chest X-ray images (CXRs) of 793 patients were used to train deep learning (DL) models for lung and heart segmentation. The training dataset included PA CXRs from two public datasets as well as in-house PA CXRs. Two fully automated segmentation-based methods using state-of-the-art DL models for lung and heart segmentation were developed. Diagnostic performance was assessed, and the reliability of the automatic cardiothoracic ratio (CTR) calculation was determined using the mean absolute error and a paired t-test. The effects of thoracic pathological conditions on performance were assessed in subgroup analyses. One thousand PA CXRs of 1000 patients (480 men, 520 women; mean age 63 ± 23 years) were included. The CTR values derived from the DL models and the diagnostic performance showed excellent agreement with reference standards for the whole test dataset. The performance of the segmentation-based methods differed by thoracic condition: when tested on CXRs with lesions obscuring the heart borders, performance was lower than for other thoracic pathological findings. Thus, segmentation-based methods using DL could detect cardiomegaly; however, the feasibility of computer-aided detection of cardiomegaly without human intervention remained limited.
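The CTR the study computes automatically is, by definition, the maximal transverse cardiac diameter divided by the maximal internal thoracic diameter, with values above roughly 0.5 conventionally suggesting cardiomegaly on a PA film. Given binary heart and lung masks from the segmentation models, the calculation reduces to a few lines; this is a sketch of the definition, not the study's pipeline:

```python
import numpy as np

def cardiothoracic_ratio(heart_mask, lung_mask):
    """CTR from binary masks: widest heart extent / widest thoracic extent."""
    def width(mask):
        cols = np.where(mask.any(axis=0))[0]   # image columns containing the structure
        return cols.max() - cols.min() + 1
    return width(heart_mask) / width(lung_mask)

# toy masks: thorax spans columns 1-10 (width 10), heart spans 3-7 (width 5)
lungs = np.zeros((10, 12), dtype=bool); lungs[2:9, 1:11] = True
heart = np.zeros((10, 12), dtype=bool); heart[4:8, 3:8] = True
ctr = cardiothoracic_ratio(heart, lungs)   # 5 / 10 = 0.5, the borderline value
```

This also shows why lesions obscuring the heart border hurt performance: an over- or under-segmented heart mask directly shifts the measured cardiac width.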

