Development of a Deep Learning-Based Algorithm to Detect the Distal End of a Surgical Instrument

2020
Vol 10 (12)
pp. 4245
Author(s):  
Hiroyuki Sugimori ◽  
Taku Sugiyama ◽  
Naoki Nakayama ◽  
Akemi Yamashita ◽  
Katsuhiko Ogasawara

This work aims to develop an algorithm that detects the distal end of a surgical instrument using deep-learning-based object detection. We employed nine video recordings of carotid endarterectomies for training and testing. As supervised data, we obtained regions of interest (ROIs; 32 × 32 pixels) at the distal end of the surgical instrument on the video images and applied data augmentation to these ROIs. We employed a You Only Look Once Version 2 (YOLOv2)-based convolutional neural network as the network model for training. The detectors were validated to evaluate average detection precision. The proposed algorithm used the central coordinates of the bounding boxes predicted by YOLOv2, and the detection rate was calculated on the test data. The average precision (AP) for the ROIs without data augmentation was 0.4272 ± 0.108; with data augmentation, the AP was 0.7718 ± 0.0824, significantly higher than without. The detection rates, counting a detection as correct when the calculated center coordinates fell within central 8 × 8 pixel and 16 × 16 pixel regions, were 0.6100 ± 0.1014 and 0.9653 ± 0.0177, respectively. We expect the proposed algorithm to be efficient for the analysis of surgical records.
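The detection-rate criterion described above (a hit when the predicted box center falls within a small pixel window around the ground-truth instrument tip) could be computed as in the following sketch; the box format and function names are illustrative assumptions, not the authors' code:

```python
def box_center(box):
    """Center (x, y) of a bounding box given as (x_min, y_min, width, height)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def within_window(pred_box, true_center, window):
    """True if the predicted box center lies inside a window-by-window pixel
    region centered on the ground-truth instrument-tip coordinate."""
    cx, cy = box_center(pred_box)
    tx, ty = true_center
    half = window / 2.0
    return abs(cx - tx) <= half and abs(cy - ty) <= half

def detection_rate(pred_boxes, true_centers, window):
    """Fraction of frames whose predicted center falls inside the window."""
    hits = sum(within_window(p, t, window)
               for p, t in zip(pred_boxes, true_centers))
    return hits / len(true_centers)
```

Widening the window from 8 to 16 pixels admits more predictions as hits, which is consistent with the jump from 0.61 to 0.97 reported above.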

2020
Author(s):  
Quoc-Viet Tran ◽  
Yen-Po Chin ◽  
Phung-Anh Nguyen ◽  
Ming-Yang Lee ◽  
Hsuan-Chia Yang ◽  
...  

BACKGROUND Automatic segmentation of skin lesions has been reported using dermoscopic image data, but such approaches are not applicable to real-time detection on a smartphone. OBJECTIVE This study aims to examine a deep learning model for detecting and localizing the position of a mole in captured images so that tight crops can be extracted without any other objects. METHODS The data were collected through public health events in Taiwan between December 2017 and February 2019. Participants concerned about the risk of their moles were asked to take images of them; three dermatologists then assessed the images and determined the risks. We labeled the mole positions with bounding boxes using the 'LabelImg' tool. Two architectures, SSD and Faster-RCNN, were used to build eight different mole-detection models. The confidence score, intersection over union (IoU), and mean average precision (mAP) with the COCO metrics were used to measure the accuracy of those models. RESULTS A total of 2,790 mole images were used for the development and validation of the models. The Faster-RCNN Inception Resnet model had the highest overall mAP of 0.245, followed by 0.234 for the Faster-RCNN Resnet 101 and 0.227 for the Faster-RCNN Resnet 50 model. The SSD Mobilenet v1 model had the lowest mAP of 0.142. The Faster-RCNN Inception Resnet model had a dominant AP of 0.377, 0.236, and 0.129 for large, medium, and small moles, respectively. We observed that the Faster-RCNN Inception Resnet showed the best performance, with high confidence scores (over 97%) for all kinds of moles. CONCLUSIONS We successfully developed detection models based on the SSD and Faster-RCNN techniques. These models may help researchers accurately localize the position of moles and their risks in a feasible detection app on a smartphone. We provide the pre-trained models for further studies via the GitHub link https://github.com/vietdaica/Mole_Detection.
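The abstract evaluates boxes with intersection over union (IoU). As a reference for how that metric is computed, here is a minimal sketch (the corner-coordinate box format is an assumption):

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; width/height clamp to zero when the boxes do not intersect.
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

The COCO mAP values quoted above average precision over a sweep of IoU thresholds (0.50 to 0.95), which is why they look low compared with single-threshold PASCAL-style scores.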


2020
Vol 2020
pp. 1-15
Author(s):  
Jingzhe Ma ◽  
Shaobo Duan ◽  
Ye Zhang ◽  
Jing Wang ◽  
Zongmin Wang ◽  
...  

Ultrasonography is widely used in the clinical diagnosis of thyroid nodules. Ultrasound images of thyroid nodules vary in appearance and interior features and have blurred borders, making it difficult for a physician to classify them as malignant or benign through visual recognition alone. The development of artificial intelligence, especially deep learning, has led to great advances in medical image diagnosis, but achieving both precision and efficiency in the recognition of thyroid nodules remains challenging. In this work, we propose a deep learning architecture based on YOLOv3: the you only look once v3 dense multireceptive fields convolutional neural network (YOLOv3-DMRF). It comprises a DMRF-CNN and multiscale detection layers. In the DMRF-CNN, we integrate dilated convolutions with different dilation rates to carry edge and texture features into deeper layers. Two detection layers at different scales are deployed to recognize thyroid nodules of different sizes. We used two datasets to train and evaluate YOLOv3-DMRF. One includes 699 original ultrasound images of thyroid nodules collected from a local health physical center, from which we obtained 10,485 images after data augmentation. The other is an open-access dataset that includes ultrasound images of 111 malignant and 41 benign thyroid nodules. Average precision (AP) and mean average precision (mAP) are used as the metrics for quantitative and qualitative evaluation. We compared the proposed YOLOv3-DMRF with several state-of-the-art deep learning networks. The experimental results show that YOLOv3-DMRF outperforms the others in mAP and detection time on both datasets: mAP was 90.05% and 95.23%, and detection time was 3.7 s and 2.2 s, on the two test datasets, respectively. These results demonstrate that the proposed YOLOv3-DMRF is efficient for the detection and recognition of thyroid nodules in ultrasound images.
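DMRF-CNN relies on dilated convolutions to enlarge the receptive field without adding parameters. A minimal NumPy sketch of a same-padded 2D dilated convolution shows the mechanism (illustrative only; real implementations use a deep learning framework's convolution layers):

```python
import numpy as np

def dilated_conv2d(image, kernel, rate):
    """'Same'-padded 2D convolution with a dilated square kernel.

    A dilation rate r inserts r-1 zeros between kernel taps, enlarging the
    receptive field without adding parameters, which is how DMRF-CNN keeps
    edge and texture detail while reaching deeper layers.
    """
    k = kernel.shape[0]
    span = (k - 1) * rate          # extent of the dilated kernel minus one
    pad = span // 2
    padded = np.pad(image, pad, mode="constant")
    out = np.zeros(image.shape, dtype=float)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Sample the padded image at strides of `rate` to apply the dilated taps.
            patch = padded[i:i + span + 1:rate, j:j + span + 1:rate]
            out[i, j] = np.sum(patch * kernel)
    return out
```

With rate 1 this reduces to an ordinary 3 × 3 convolution; with rate 2 the same nine weights cover a 5 × 5 neighbourhood.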


Author(s):  
Jen-Yung Tsai ◽  
Isabella Yu-Ju Hung ◽  
Yue Leon Guo ◽  
Yih-Kuen Jan ◽  
Chih-Yang Lin ◽  
...  

Background: Lumbar disc herniation (LDH) is among the most common causes of lower back pain and sciatica. The causes of LDH have not been fully elucidated but most likely involve a complex combination of mechanical and biological processes. Magnetic resonance imaging (MRI) is the tool most frequently used for LDH because it can show abnormal soft-tissue areas around the spine. Deep learning models can be trained to recognize images with high speed and accuracy to diagnose LDH. Although deep learning models usually require huge image datasets for training, this study enhanced the medical image features so that a small-scale dataset could be used for training. Methods: We propose automatic detection to assist the initial LDH exam for lower back pain. The subjects were between 20 and 65 years old with at least 6 months of work experience. The deep learning method employed the YOLOv3 model to train on and detect small object changes such as LDH on MRI. The dataset images were processed and combined with labeling and annotation from the radiologist's diagnosis records. Results: Our method demonstrates the feasibility of deep learning with a small-scale dataset of limited medical images. The highest mean average precision (mAP) was 92.4%, achieved with 550 images plus data augmentation (550-aug), and YOLOv3 LDH training reached 100%, with the best average precision at 550-aug among all datasets. Data augmentation was used to prevent under- or overfitting of an object detection model trained on the small-scale dataset. Conclusions: The data augmentation technique plays a crucial role in YOLOv3 training and detection results. This method shows strong potential for rapid initial tests and auto-detection on a limited clinical dataset.
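Data augmentation for a small labelled detection dataset must transform the bounding boxes together with the images. A minimal sketch of one such transform, a horizontal flip, follows; it is illustrative only, as the study's actual augmentation pipeline is not specified here:

```python
import numpy as np

def hflip_with_box(image, box):
    """Horizontally flip an image and remap its bounding box.

    box is (x_min, y_min, x_max, y_max) in pixel coordinates.
    """
    h, w = image.shape[:2]
    x1, y1, x2, y2 = box
    flipped = image[:, ::-1]
    # Mirror the x-extent of the box around the image width.
    new_box = (w - x2, y1, w - x1, y2)
    return flipped, new_box

def augment(images, boxes):
    """Double a small labelled dataset by adding horizontally flipped copies."""
    flipped = [hflip_with_box(im, b) for im, b in zip(images, boxes)]
    out_imgs = list(images) + [f[0] for f in flipped]
    out_boxes = list(boxes) + [f[1] for f in flipped]
    return out_imgs, out_boxes
```

Rotations, crops, and intensity shifts extend the same idea, each paired with the matching box transform.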


Author(s):  
Ye Wang ◽  
Yueru Chen ◽  
Jongmoo Choi ◽  
C.-C. Jay Kuo

This paper reports a visible and thermal drone monitoring system that integrates deep-learning-based detection and tracking modules. The biggest challenge in adopting deep learning methods for drone detection is the paucity of training drone images, especially thermal ones. To address this issue, we develop two data augmentation techniques. One is a model-based drone augmentation technique that automatically generates visible drone images with a bounding-box label on the drone's location. The other exploits an adversarial data augmentation methodology to create thermal drone images. To track a small flying drone, we utilize the residual information between consecutive image frames. Finally, we present an integrated detection and tracking system that outperforms each individual detection-only or tracking-only module. The experiments show that, even when trained on synthetic data, the proposed system performs well on real-world drone images with complex backgrounds. The USC drone detection and tracking dataset, with user-labeled bounding boxes, is available to the public.
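The tracking module exploits residual information between consecutive frames. A toy sketch of that idea, thresholding the inter-frame difference and taking the centroid of the changed pixels, follows (a simplified reconstruction, not the authors' tracker):

```python
import numpy as np

def residual_mask(prev_frame, curr_frame, threshold):
    """Binary mask of pixels whose intensity changed more than `threshold`
    between consecutive frames; a motion cue for a small flying target."""
    diff = np.abs(curr_frame.astype(float) - prev_frame.astype(float))
    return diff > threshold

def motion_centroid(mask):
    """Centroid (row, col) of the moving region, or None if nothing moved."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return (float(ys.mean()), float(xs.mean()))
```

For a drone occupying only a few pixels, this residual cue can localize the target even when an appearance-based detector alone would miss it.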


2022
Vol 12 (1)
pp. 489
Author(s):  
Mizuki Yoshida ◽  
Atsushi Teramoto ◽  
Kohei Kudo ◽  
Shoji Matsumoto ◽  
Kuniaki Saito ◽  
...  

Since recognizing the location and extent of infarction is essential for diagnosis and treatment, many methods using deep learning have been reported. Deep learning generally requires a large amount of training data. To overcome this problem, we generated pseudo patient images using CycleGAN, which performs image transformation without paired images, and aimed to improve extraction accuracy by using the generated images for the extraction of cerebral infarction regions. First, CycleGAN was used for data augmentation: pseudo cerebral infarction images were generated from healthy images. Then, U-Net was trained to segment the cerebral infarction region using the CycleGAN-generated images. Regarding extraction accuracy, the Dice index was 0.553 for U-Net with CycleGAN, an improvement over U-Net without CycleGAN. Furthermore, the number of false positives per case was 3.75 for U-Net without CycleGAN and 1.23 for U-Net with CycleGAN; introducing the CycleGAN-generated images into the training cases reduced false positives by approximately 67%. These results indicate that utilizing CycleGAN-generated images was effective and facilitated the accurate extraction of infarcted regions while maintaining the detection rate.
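The reported Dice index of 0.553 measures overlap between predicted and ground-truth infarction masks. For reference, a minimal implementation:

```python
import numpy as np

def dice_index(pred_mask, true_mask):
    """Dice coefficient between two binary segmentation masks:
    2 * |P ∩ T| / (|P| + |T|), in [0, 1]."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    inter = np.logical_and(pred, true).sum()
    total = pred.sum() + true.sum()
    if total == 0:
        return 1.0  # both masks empty: perfect agreement by convention
    return 2.0 * inter / total
```

Because Dice weights the intersection twice, it is more forgiving of small masks than IoU, which matters for compact infarct regions.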


Author(s):  
W. Lin ◽  
Y. Chen ◽  
C. Wang ◽  
J. Li

Abstract. In this paper, we propose a novel 3D deep learning model for object localization and bounding-box estimation. To increase the detection efficiency for small objects in large-scale scenes, the local neighbourhood geometric structure of objects is incorporated via the EdgeConv model, which operates directly on the original point clouds. We evaluated the 3D bounding boxes at high resolution on RGB-D data and obtained stable results even under sparse points and strong occlusion. The experimental results indicate that our method achieves higher mean average precision and better IoU of bounding boxes on the SUN RGB-D dataset and the KITTI benchmark.
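EdgeConv builds features from each point together with the offsets to its k nearest neighbours, which is how local neighbourhood geometry enters the model. A NumPy sketch of that edge-feature construction (illustrative; the paper's network adds learned layers on top of these features):

```python
import numpy as np

def knn_edge_features(points, k):
    """For each 3D point x_i, build EdgeConv-style edge features
    (x_i, x_j - x_i) over its k nearest neighbours x_j.

    points: (N, 3) array. Returns an (N, k, 6) array.
    """
    # Pairwise squared distances between all points.
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)             # exclude self-loops
    nbr_idx = np.argsort(d2, axis=1)[:, :k]  # (N, k) neighbour indices
    centers = np.repeat(points[:, None, :], k, axis=1)   # (N, k, 3)
    offsets = points[nbr_idx] - centers                  # (N, k, 3)
    return np.concatenate([centers, offsets], axis=-1)   # (N, k, 6)
```

The offsets capture local shape independently of absolute position, which helps the detector stay stable under sparse sampling.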


Author(s):  
Felix Erne ◽  
Daniel Dehncke ◽  
Steven C. Herath ◽  
Fabian Springer ◽  
Nico Pfeifer ◽  
...  

Abstract Background Fracture detection by artificial intelligence, and especially by Deep Convolutional Neural Networks (DCNN), is a topic of growing interest in current orthopaedic and radiological research. As training a DCNN usually needs a large amount of data, mostly frequent fractures and conventional X-rays are used; less common fractures such as acetabular fractures (AF) are therefore underrepresented in the literature. The aim of this pilot study was to establish a DCNN for the detection of AF using computed tomography (CT) scans. Methods Patients with an acetabular fracture were identified from the monocentric consecutive pelvic injury registry at the BG Trauma Center XXX from 01/2003 – 12/2019. All patients with unilateral AF and CT scans available in DICOM format were included for further processing. All datasets were automatically anonymised and digitally post-processed. The relevant regions of interest were extracted, and data augmentation (DA) was implemented to artificially increase the number of training samples. A DCNN based on Med3D was used for autonomous fracture detection, with global average pooling (GAP) to reduce overfitting. Results From a total of 2,340 patients with a pelvic fracture, 654 patients suffered from an AF. After screening and post-processing of the datasets, a total of 159 datasets were enrolled for training of the algorithm. A random split into training datasets (80%) and test datasets (20%) was performed. The techniques of bone area extraction, DA, and GAP increased the accuracy of fracture detection from 58.8% (native DCNN) up to 82.8% despite the low number of datasets. Conclusion The accuracy of fracture detection of our trained DCNN is comparable to published values despite the low number of training datasets. The techniques of bone extraction, DA, and GAP are useful for increasing the detection rates of rare fractures by a DCNN. Based on the DCNN used in combination with the techniques described in this pilot study, the possibility of automatic fracture classification of AF is under investigation in a multicentre study.
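Global average pooling (GAP), used here to curb overfitting, replaces a large flattening layer with one mean per channel. A minimal sketch for a (C, D, H, W) feature volume, where the volumetric layout matching the Med3D setting is an assumption:

```python
import numpy as np

def global_average_pool(feature_maps):
    """Collapse each channel of a (C, D, H, W) 3D feature volume to its mean,
    yielding a C-dimensional vector. Compared with flattening, this removes
    all spatial parameters from the classifier head, which is why GAP helps
    against overfitting on small datasets."""
    c = feature_maps.shape[0]
    return feature_maps.reshape(c, -1).mean(axis=1)
```

With only 159 datasets, shrinking the classifier head from C·D·H·W inputs to C inputs removes most of the parameters that would otherwise memorize the training scans.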


2020
Vol 71 (7)
pp. 868-880
Author(s):  
Nguyen Hong-Quan ◽  
Nguyen Thuy-Binh ◽  
Tran Duc-Long ◽  
Le Thi-Lan

Along with the strong development of camera networks, video analysis systems have become increasingly popular and have been applied in various practical applications. In this paper, we focus on person re-identification (person ReID), a crucial step of video analysis systems. The purpose of person ReID is to associate multiple images of a given person moving through a non-overlapping camera network. Many efforts have been devoted to person ReID; however, most studies deal only with well-aligned bounding boxes that are detected manually and considered perfect inputs for person ReID. In fact, when building a fully automated person ReID system, the quality of the two preceding steps, person detection and tracking, may strongly affect person ReID performance. The contributions of this paper are twofold. First, a unified framework for person ReID based on deep learning models is proposed, coupling a deep neural network for person detection with a deep-learning-based tracking method. In addition, features extracted from an improved ResNet architecture are proposed for person representation to achieve higher ReID accuracy. Second, our self-built dataset is introduced and employed to evaluate all three steps of the fully automated person ReID framework.
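At query time, person ReID typically ranks gallery images by the similarity of their feature vectors to the query. A minimal cosine-similarity ranking sketch (illustrative; the paper's improved-ResNet features are assumed, not reproduced):

```python
import numpy as np

def rank_gallery(query_feat, gallery_feats):
    """Rank gallery entries by cosine similarity to a query feature vector.

    query_feat: (D,) array; gallery_feats: (N, D) array.
    Returns gallery indices sorted from most to least similar.
    """
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ q                     # cosine similarity per gallery entry
    return np.argsort(-sims)         # descending similarity
```

Metrics such as rank-1 accuracy then ask whether the top-ranked gallery entry shares the query's identity; misaligned detections degrade the features and hence this ranking, which is the coupling effect the paper studies.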


2020
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call "Levenshtein augmentation", which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: transformer and sequence-to-sequence based recurrent neural networks with attention. Levenshtein augmentation demonstrated an increase in performance over both non-augmented data and data augmented by conventional SMILES randomization when used for training the baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as attentional gain: an enhancement in the pattern-recognition capabilities of the underlying network with respect to molecular motifs.
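Levenshtein augmentation is named for the edit distance between reactant and product SMILES sub-sequences. The underlying distance itself can be computed with the classic dynamic program (a reference sketch, not the authors' pairing algorithm):

```python
def levenshtein(a, b):
    """Edit distance between two strings, e.g. reactant vs. product SMILES:
    the minimum number of single-character insertions, deletions, and
    substitutions transforming a into b."""
    prev = list(range(len(b) + 1))       # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                       # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # substitution (or match)
        prev = curr
    return prev[-1]
```

Reactant–product pairs with small local edit distance share sub-sequences that the model's attention can latch onto, which is consistent with the attentional-gain observation above.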

