LFM: A Lightweight LCD Algorithm Based on Feature Matching between Similar Key Frames

Zuojun Zhu; Xiangrong Xu; Xuefei Liu; Yanglin Jiang

doi:10.3390/s21134499

LFM: A Lightweight LCD Algorithm Based on Feature Matching between Similar Key Frames

Sensors ◽

10.3390/s21134499 ◽

2021 ◽

Vol 21 (13) ◽

pp. 4499

Author(s):

Zuojun Zhu ◽

Xiangrong Xu ◽

Xuefei Liu ◽

Yanglin Jiang

Keyword(s):

Deep Learning ◽

Feature Matching ◽

Binary Classification ◽

Recall Rate ◽

Current Position ◽

Loop Closure ◽

Localization And Mapping ◽

Similar Images ◽

Key Frames ◽

Target Detection Task

Loop Closure Detection (LCD) is an important technique for improving the accuracy of Simultaneous Localization and Mapping (SLAM). In this paper, we propose a deep learning LCD algorithm that applies binary classification and then feature matching between similar images, which greatly improves LCD accuracy. A novel lightweight convolutional neural network (CNN) is proposed and applied to the target detection task on key frames. On this basis, the key frames are binary classified according to their labels. Finally, similar frames are input into an improved lightweight Transformer-based feature matching network to judge whether the current position is a loop closure. The experimental results show that, compared with the traditional method, LFM-LCD achieves higher accuracy and recall in the LCD task of indoor SLAM while keeping the number of parameters and the amount of computation low. The research in this paper provides a new direction for LCD in robotic SLAM, which will be further improved with the development of deep learning.
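
One way to picture the label-based screening step is the sketch below. It assumes each key frame carries the set of object labels returned by the detector, and it treats two frames as "similar" when the Jaccard overlap of their label sets reaches a threshold. The overlap measure, the threshold value, and all names (`label_overlap`, `similar_candidates`) are illustrative assumptions, not details taken from the paper. Only frames that pass this screen would go on to the feature matching network.

```python
from typing import Dict, List, Set


def label_overlap(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two label sets (an assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_candidates(current: Set[str],
                       key_frames: Dict[int, Set[str]],
                       threshold: float = 0.5) -> List[int]:
    """Return ids of stored key frames classified as 'similar' to the current frame."""
    return [frame_id for frame_id, labels in key_frames.items()
            if label_overlap(current, labels) >= threshold]


# Hypothetical detector output for three stored key frames.
key_frames = {
    0: {"chair", "desk", "monitor"},
    1: {"door", "plant"},
    2: {"chair", "desk", "lamp"},
}
candidates = similar_candidates({"chair", "desk", "monitor", "lamp"}, key_frames)
print(candidates)  # frames 0 and 2 share 3 of 4 labels with the current frame
```

Screening on labels first keeps the more expensive matching step limited to a few candidate frames, which fits the lightweight goal stated in the abstract.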

Role of Deep Learning in Loop Closure Detection for Visual and Lidar SLAM: A Survey

Sensors ◽

10.3390/s21041243 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1243

Author(s):

Saba Arshad ◽

Gon-Woo Kim

Keyword(s):

Deep Learning ◽

Detailed Comparison ◽

Loop Closure ◽

Loop Closure Detection ◽

Detection Algorithms ◽

Mobile Objects ◽

Localization And Mapping ◽

Loop Detection ◽

Scene Representation ◽

Matching Strategy

Loop closure detection is of vital importance in the process of simultaneous localization and mapping (SLAM), as it helps to reduce the cumulative error of the robot’s estimated pose and generate a consistent global map. Many variations of this problem have been considered in the past and the existing methods differ in the acquisition approach of query and reference views, the choice of scene representation, and associated matching strategy. Contributions of this survey are many-fold. It provides a thorough study of existing literature on loop closure detection algorithms for visual and Lidar SLAM and discusses their insight along with their limitations. It presents a taxonomy of state-of-the-art deep learning-based loop detection algorithms with detailed comparison metrics. Also, the major challenges of conventional approaches are identified. Based on those challenges, deep learning-based methods were reviewed where the identified challenges are tackled focusing on the methods providing long-term autonomy in various conditions such as changing weather, light, seasons, viewpoint, and occlusion due to the presence of mobile objects. Furthermore, open challenges and future directions were also discussed.

An ensemble deep learning framework to refine large deletions in linked-reads

10.1101/2021.09.27.462057 ◽

2021 ◽

Author(s):

Yunfei Hu ◽

Sanidhya V Mangal ◽

Lu Zhang ◽

Xin Zhou

Keyword(s):

Deep Learning ◽

Genome Assembly ◽

Binary Classification ◽

Image Data ◽

Read Depth ◽

Recall Rate ◽

Large Deletion ◽

Reference Sequence ◽

Diploid Genome ◽

Large Deletions

The detection of structural variants (SVs) remains challenging due to inconsistencies in detected breakpoints and the biological complexity of some rearrangements. Linked-reads have demonstrated their superiority in diploid genome assembly and SV detection. The recently developed tools Aquila and Aquila_stLFR use a reference sequence and linked-reads to generate a high-quality diploid genome assembly, from which they then detect and phase personal genetic variations. However, both produce a substantial proportion of false positive deletion SV calls. To take full advantage of linked-reads, an effective downstream filtering and refinement framework is urgently needed. In this work, we propose AquilaDeepFilter to filter large deletion SVs from Aquila and Aquila_stLFR. AquilaDeepFilter relies on a deep learning ensemble approach that integrates six state-of-the-art CNN backbones. The filtering of deletion SVs is formulated as a binary classification task on image data generated by extracting multiple alignment signals, including read depth, split reads and discordant read pairs. Three linked-reads libraries sequenced from the well-studied sample NA24385 and the gold standard of the GiaB benchmark were used to perform thorough experiments on our proposed method. The results demonstrated that AquilaDeepFilter could increase the precision rate of Aquila while the recall rate of Aquila decreased only slightly, and the overall F1 improved by 20%. Furthermore, AquilaDeepFilter outperformed another deep learning based method for SV filtering, DeepSVFilter. Even though we designed AquilaDeepFilter for linked-reads, the framework could also be used to improve SV detection on short reads.
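
The abstract does not say how the outputs of the six backbones are combined, so the sketch below assumes simple soft voting: average the per-model probabilities that a candidate deletion is real, and keep calls whose mean reaches a threshold. It then shows how precision, recall, and F1 follow from the filtered calls. The scores, the threshold, and the function names are made up for illustration.

```python
from statistics import mean
from typing import List, Tuple


def soft_vote(model_probs: List[List[float]], threshold: float = 0.5) -> List[bool]:
    """Average per-model probabilities for each candidate and threshold the mean."""
    per_candidate = zip(*model_probs)  # transpose: one tuple of scores per candidate
    return [mean(scores) >= threshold for scores in per_candidate]


def precision_recall_f1(kept: List[bool], truth: List[bool]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 of the kept calls against ground truth."""
    tp = sum(k and t for k, t in zip(kept, truth))
    fp = sum(k and not t for k, t in zip(kept, truth))
    fn = sum((not k) and t for k, t in zip(kept, truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up scores: 3 models (rows) x 4 candidate deletions (columns).
probs = [
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.1, 0.7, 0.3],
    [0.7, 0.3, 0.5, 0.6],
]
truth = [True, False, True, True]
kept = soft_vote(probs)
precision, recall, f1 = precision_recall_f1(kept, truth)
print(kept, precision, recall, f1)
```

In this toy case the filter discards one true deletion, so recall drops while precision stays high, the same kind of trade-off the abstract reports.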

Deep convolutional neural networks for cardiovascular vulnerable plaque detection

MATEC Web of Conferences ◽

10.1051/matecconf/201927702024 ◽

2019 ◽

Vol 277 ◽

pp. 02024 ◽

Cited By ~ 1

Author(s):

Lincan Li ◽

Tong Jia ◽

Tianqi Meng ◽

Yizhe Liu

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

Vulnerable Plaque ◽

Recall Rate ◽

Superior Performance ◽

Learning Approaches ◽

Deep Convolutional Neural Networks ◽

Vulnerable Plaques ◽

Plaque Detection

In this paper, an accurate two-stage deep learning method is proposed to detect vulnerable plaques in ultrasonic cardiovascular images. First, a Fully Convolutional Neural Network (FCN) named U-Net is used to segment the original Intravascular Optical Coherence Tomography (IVOCT) cardiovascular images. We experiment with different threshold values to find the best threshold for removing noise and background from the original images. Second, a modified Faster R-CNN is adopted for precise detection. The modified Faster R-CNN uses six-scale anchors (12², 16², 32², 64², 128², 256²) instead of the conventional one-scale or three-scale approaches. We first present three problems in cardiovascular vulnerable plaque diagnosis, then demonstrate how our method solves these problems. The proposed method applies deep convolutional neural networks to the whole diagnostic procedure. Test results show that the recall rate, precision rate, IoU (Intersection-over-Union) rate and total score are 0.94, 0.885, 0.913 and 0.913, respectively, higher than those of the first-place team in the CCCV2017 Cardiovascular OCT Vulnerable Plaque Detection Challenge. The AP of the designed Faster R-CNN is 83.4%, higher than that of conventional approaches that use one-scale or three-scale anchors. These results demonstrate the superior performance of our proposed method and the power of deep learning approaches in diagnosing cardiovascular vulnerable plaques.
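
To make the six-scale anchor idea concrete, the sketch below generates anchor shapes, reading the six values in the abstract as squared side lengths from 12 to 256 pixels (an interpretation of the listed numbers). The three height/width ratios are the usual Faster R-CNN defaults and are an assumption here, since the abstract does not state them; `make_anchors` is a hypothetical helper name.

```python
import math
from typing import List, Tuple

SCALES = (12, 16, 32, 64, 128, 256)  # anchor side lengths; each area is scale**2
RATIOS = (0.5, 1.0, 2.0)             # height/width ratios (assumed, not from the abstract)


def make_anchors(scales=SCALES, ratios=RATIOS) -> List[Tuple[float, float]]:
    """Return one (width, height) pair per scale/ratio combination."""
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)  # keeps the area at s**2 ...
            h = s * math.sqrt(r)  # ... while h / w equals r
            anchors.append((w, h))
    return anchors


anchors = make_anchors()
print(len(anchors))  # 6 scales x 3 ratios
```

Adding the small 12 and 16 pixel scales gives the region proposal stage boxes that can fit small targets, which a one-scale or three-scale setup may cover poorly.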

Advancing Stress Detection Methodology with Deep Learning Techniques Targeting UX Evaluation in AAL Scenarios: Applying Embeddings for Categorical Variables

Electronics ◽

10.3390/electronics10131550 ◽

2021 ◽

Vol 10 (13) ◽

pp. 1550

Author(s):

Alexandros Liapis ◽

Evanthia Faliagka ◽

Christos P. Antonopoulos ◽

Georgios Keramidas ◽

Nikolaos Voros

Keyword(s):

Machine Learning ◽

Deep Learning ◽

User Experience ◽

Electrodermal Activity ◽

Binary Classification ◽

Research Question ◽

Classification Problem ◽

Categorical Variables ◽

Stress Detection ◽

Software Failures

Physiological measurements have been widely used by researchers and practitioners to address the stress detection challenge. So far, various datasets for stress detection have been recorded and are available to the research community for testing and benchmarking. The majority of the available stress-related datasets were recorded while users were exposed to intense stressors, such as songs, movie clips, major hardware/software failures, image datasets, and gaming scenarios. However, it remains an open research question whether such datasets can be used to create models that will effectively detect stress in different contexts. This paper investigates the performance of the publicly available physiological dataset named WESAD (wearable stress and affect detection) in the context of user experience (UX) evaluation. More specifically, electrodermal activity (EDA) and skin temperature (ST) signals from WESAD were used to train three traditional machine learning classifiers and a simple feed-forward deep learning artificial neural network combining continuous variables and entity embeddings. For the binary classification problem (stress vs. no stress), high accuracy (up to 97.4%) was achieved with both training approaches (deep learning and machine learning). Regarding the stress detection effectiveness of the created models in another context, such as user experience (UX) evaluation, the results were quite impressive. More specifically, the deep-learning model achieved rather high agreement when a user-annotated dataset was used for validation.
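
The combination of entity embeddings with continuous variables can be illustrated with a forward pass like the one below. This is a minimal sketch with random, untrained weights: the layer sizes, the single categorical input, and the name `predict_stress` are assumptions for illustration, and the abstract does not describe the actual architecture in this detail.

```python
import math
import random
from typing import List

random.seed(0)

N_CATEGORIES, EMB_DIM = 5, 3  # assumed sizes for one categorical variable
N_CONTINUOUS, HIDDEN = 2, 4   # e.g. one EDA and one ST feature (assumed layout)


def rand_matrix(rows: int, cols: int) -> List[List[float]]:
    """Build a rows x cols matrix of Gaussian random weights."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Randomly initialised parameters; training is omitted from this sketch.
embedding = rand_matrix(N_CATEGORIES, EMB_DIM)
w_hidden = rand_matrix(EMB_DIM + N_CONTINUOUS, HIDDEN)
w_out = [random.gauss(0.0, 1.0) for _ in range(HIDDEN)]


def predict_stress(category_id: int, continuous: List[float]) -> float:
    """Return P(stress) for one sample from a category id and continuous features."""
    x = embedding[category_id] + continuous  # concatenate embedding and continuous inputs
    hidden = [max(0.0, sum(xi * w_hidden[i][j] for i, xi in enumerate(x)))
              for j in range(HIDDEN)]        # ReLU hidden layer
    logit = sum(h * w for h, w in zip(hidden, w_out))
    return 1.0 / (1.0 + math.exp(-logit))    # sigmoid output for the binary decision


p = predict_stress(2, [0.4, 0.7])
```

The embedding table replaces a one-hot encoding of the categorical variable with a small dense vector that is learned together with the rest of the network.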

Deep-Learning-Based Pupil Center Detection and Tracking Technology for Visible-Light Wearable Gaze Tracking Devices

Applied Sciences ◽

10.3390/app11020851 ◽

2021 ◽

Vol 11 (2) ◽

pp. 851

Author(s):

Wei-Liang Ou ◽

Tzu-Ling Kuo ◽

Chin-Chieh Chang ◽

Chih-Peng Fan

Keyword(s):

Deep Learning ◽

Visible Light ◽

Tracking System ◽

Recall Rate ◽

Detection Accuracy ◽

Learning Technology ◽

Tracking Errors ◽

Gaze Tracking ◽

Detection Technology ◽

Pupil Tracking

In this study, a pupil tracking methodology based on deep-learning technology is developed for visible-light wearable eye trackers. By applying deep-learning object detection technology based on the You Only Look Once (YOLO) model, the proposed pupil tracking method can effectively estimate and predict the center of the pupil in the visible-light mode. When the developed YOLOv3-tiny-based model is used to test pupil tracking performance, the detection accuracy is as high as 80%, and the recall rate is close to 83%. In addition, the average visible-light pupil tracking errors of the proposed YOLO-based deep-learning design are smaller than 2 pixels for the training mode and 5 pixels for the cross-person test, which are much smaller than those of the previous ellipse fitting design without deep-learning technology under the same visible-light conditions. After combination with the calibration process, the average gaze tracking errors of the proposed YOLOv3-tiny-based pupil tracking models are smaller than 2.9 and 3.5 degrees in the training and testing modes, respectively, and the proposed visible-light wearable gaze tracking system performs at up to 20 frames per second (FPS) on the GPU-based software embedded platform.
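
The pixel-level tracking error quoted above can be computed along the lines of the sketch below. It assumes the pupil center is taken as the center of the detected bounding box and that the error is the Euclidean distance to an annotated center; both choices, the box format, and the sample numbers are illustrative assumptions rather than details from the paper.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
Point = Tuple[float, float]


def box_center(box: Box) -> Point:
    """Return the center of a detection box as the pupil-center estimate."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def mean_pixel_error(boxes: List[Box], truth: List[Point]) -> float:
    """Average Euclidean distance between estimated and annotated centers."""
    errors = [math.dist(box_center(b), t) for b, t in zip(boxes, truth)]
    return sum(errors) / len(errors)


# Made-up detections and annotations for two frames.
boxes = [(100.0, 50.0, 120.0, 70.0), (200.0, 80.0, 224.0, 100.0)]
truth = [(110.0, 60.0), (215.0, 94.0)]
error = mean_pixel_error(boxes, truth)
print(error)  # 2.5
```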

A Multi-Feature Fusion Slam System Attaching Semantic Invariant to Points and Lines

Sensors ◽

10.3390/s21041196 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1196

Author(s):

Gang Li ◽

Yawen Zeng ◽

Huilan Huang ◽

Shaojian Song ◽

Bin Liu ◽

...

Keyword(s):

Feature Matching ◽

Feature Fusion ◽

Error Function ◽

Line Segments ◽

Cumulative Error ◽

Localization And Mapping ◽

Tracking Process ◽

Line Features ◽

Point Line ◽

Segment Data

The traditional simultaneous localization and mapping (SLAM) system uses static points of the environment as features for real-time localization and mapping. When there are few available point features, the system is difficult to implement. A feasible solution is to introduce line features. In complex scenarios containing rich line segments, the description of line segments is not strongly differentiated, which can lead to incorrect association of line segment data, thus introducing errors into the system and aggravating the cumulative error of the system. To address this problem, a point-line stereo visual SLAM system incorporating semantic invariants is proposed in this paper. This system improves the accuracy of line feature matching by fusing line features with image semantic invariant information. When defining the error function, the semantic invariant is fused with the reprojection error function, and the semantic constraint is applied to reduce the cumulative error of the poses in the long-term tracking process. Experiments on the Office sequence of the TartanAir dataset and the KITTI dataset show that this system improves the matching accuracy of line features and suppresses the cumulative error of the SLAM system to some extent, and the mean relative pose error (RPE) is 1.38 and 0.0593 m, respectively.

An Optimized Tightly-Coupled VIO Design on the Basis of the Fused Point and Line Features for Patrol Robot Navigation

Sensors ◽

10.3390/s19092004 ◽

2019 ◽

Vol 19 (9) ◽

pp. 2004 ◽

Cited By ~ 3

Author(s):

Linlin Xia ◽

Qingyu Meng ◽

Deru Chi ◽

Bo Meng ◽

Hanrui Yang

Keyword(s):

Global Positioning System ◽

Inertial Measurement Unit ◽

Feature Matching ◽

Robot Navigation ◽

Measurement Unit ◽

State Estimator ◽

Localization And Mapping ◽

Global Positioning ◽

Tightly Coupled ◽

Association State

The development and maturation of simultaneous localization and mapping (SLAM) in robotics open the door to the application of visual inertial odometry (VIO) to the robot navigation system. For a patrol robot with no available Global Positioning System (GPS) support, the embedded VIO components, which are generally composed of an Inertial Measurement Unit (IMU) and a camera, fuse the inertial recursion with SLAM calculation tasks, and enable the robot to estimate its location within a map. The highlights of the optimized VIO design lie in the simplified VIO initialization strategy as well as the fused point and line feature-matching based method for efficient pose estimates in the front-end. With a tightly-coupled VIO anatomy, the system state is explicitly expressed in a vector and further estimated by the state estimator. The consequent problems associated with the data association, state optimization, sliding window and timestamp alignment in the back-end are discussed in detail. Dataset tests and real substation scene tests are conducted, and the experimental results indicate that the proposed VIO can achieve accurate pose estimation with favorable initialization efficiency and the expected map representations in the environments concerned. The proposed VIO design can therefore be recognized as a preferred tool reference for a class of visual and inertial SLAM application domains preceded by no external location reference support hypothesis.

Ensemble Deep Learning for Cervix Image Selection toward Improving Reliability in Automated Cervical Precancer Screening

Diagnostics ◽

10.3390/diagnostics10070451 ◽

2020 ◽

Vol 10 (7) ◽

pp. 451 ◽

Cited By ~ 2

Author(s):

Peng Guo ◽

Zhiyun Xue ◽

Zac Mtema ◽

Karen Yeates ◽

Ophira Ginsburg ◽

...

Keyword(s):

Deep Learning ◽

Learning Algorithm ◽

Binary Classification ◽

Digital Camera ◽

Deep Learning Algorithm ◽

Average Accuracy ◽

Acceptable Quality ◽

Cervical Precancer ◽

One Class Classification ◽

Learning Architectures

Automated Visual Examination (AVE) is a deep learning algorithm that aims to improve the effectiveness of cervical precancer screening, particularly in low- and medium-resource regions. It was trained on data from a large longitudinal study conducted by the National Cancer Institute (NCI) and has been shown to accurately identify cervices with early stages of cervical neoplasia for clinical evaluation and treatment. The algorithm processes images of the uterine cervix taken with a digital camera and alerts the user if the woman is a candidate for further evaluation. This requires that the algorithm be presented with images of the cervix, which is the object of interest, of acceptable quality, i.e., in sharp focus, with good illumination, without shadows or other occlusions, and showing the entire squamo-columnar transformation zone. Our prior work has addressed some of these constraints to help discard images that do not meet these criteria. In this work, we present a novel algorithm that determines that the image contains the cervix to a sufficient extent. Non-cervix or other inadequate images could lead to suboptimal or wrong results. Manual removal of such images is labor intensive and time-consuming, particularly in working with large retrospective collections acquired with inadequate quality control. In this work, we present a novel ensemble deep learning method to identify cervix images and non-cervix images in a smartphone-acquired cervical image dataset. The ensemble method combined the assessment of three deep learning architectures, RetinaNet, Deep SVDD, and a customized CNN (Convolutional Neural Network), each using a different strategy to arrive at its decision, i.e., object detection, one-class classification, and binary classification. We examined the performance of each individual architecture and an ensemble of all three architectures. An average accuracy and F-1 score of 91.6% and 0.890, respectively, were achieved on a separate test dataset consisting of more than 30,000 smartphone-captured images.

Deep Learning for Laryngopharyngeal Reflux Diagnosis

Applied Sciences ◽

10.3390/app11114753 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4753

Author(s):

Gen Ye ◽

Chen Du ◽

Tong Lin ◽

Yan Yan ◽

Jack Jiang

Keyword(s):

Deep Learning ◽

Speech Processing ◽

Data Augmentation ◽

Laryngopharyngeal Reflux ◽

pH Monitoring ◽

Binary Classification ◽

Classification Problem ◽

Learning Approaches ◽

Learning Techniques ◽

AUC Value

(1) Background: Deep learning has become ubiquitous due to its impressive performance in domains as varied as computer vision, natural language and speech processing, and game-playing. In this work, we investigated the performance of recent deep learning approaches on the laryngopharyngeal reflux (LPR) diagnosis task. (2) Methods: Our dataset is composed of 114 subjects with 37 pH-positive cases and 77 control cases. In contrast to prior work based on either reflux finding score (RFS) or pH monitoring, we directly take laryngoscope images as inputs to neural networks, as laryngoscopy is the most common and simple diagnostic method. The diagnosis task is formulated as a binary classification problem. We first tested a powerful backbone network that incorporates residual modules, an attention mechanism and data augmentation. Furthermore, recent methods in transfer learning and few-shot learning were investigated. (3) Results: On our dataset, the best test classification accuracy is 73.4%, while the best AUC value is 76.2%. (4) Conclusions: This study demonstrates that deep learning techniques can be applied to classify LPR images automatically. Although the number of pH-positive images used for training is limited, deep networks are still capable of learning discriminant features by taking advantage of these techniques.
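
For readers less familiar with the two reported metrics, the sketch below computes classification accuracy at a fixed threshold and the AUC as the fraction of positive/negative pairs that the classifier ranks correctly (ties count as half). The labels and scores are made-up toy values, not data from the study, and the function names are illustrative.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded score matches the label."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive scores higher than a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = pH-positive, 0 = control.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(accuracy(labels, scores), auc(labels, scores))  # 0.75 0.75
```

Unlike accuracy, AUC does not depend on a chosen threshold, which makes it a useful companion metric for an imbalanced dataset such as 37 positive versus 77 control cases.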

Deep Learning-Based Binary Classification of ADHD Using Resting State MR Images

Augmented Human Research ◽

10.1007/s41133-020-00042-y ◽

2021 ◽

Vol 6 (1) ◽

Author(s):

Vikas Khullar ◽

Karuna Salgotra ◽

Harjit Pal Singh ◽

Davinder Pal Sharma

Keyword(s):

Deep Learning ◽

Resting State ◽

Binary Classification ◽

MR Images
