Developing an Image-Based Deep Learning Framework for Automatic Scoring of The Pentagon Drawing Test

Journal of Alzheimer's Disease ◽

10.3233/jad-210714 ◽

2021 ◽

pp. 1-11

Author(s):

Yike Li ◽

Jiajie Guo ◽

Peikai Yang

Keyword(s):

Deep Learning ◽

Object Detection ◽

Transfer Learning ◽

High Efficiency ◽

Characteristic Curve ◽

Data Partitioning ◽

Training Data ◽

Drawing Test ◽

Automatic Scoring ◽

Efficiency And Reliability

Background: The Pentagon Drawing Test (PDT) is a common assessment for visuospatial function. Evaluating the PDT by artificial intelligence can improve efficiency and reliability in the big data era. This study aimed to develop a deep learning (DL) framework for automatic scoring of the PDT based on image data. Methods: A total of 823 PDT photos were retrospectively collected and preprocessed into black-and-white, square-shaped images. Stratified fivefold cross-validation was applied for training and testing. Two strategies based on convolutional neural networks were compared. The first strategy was to perform an image classification task using supervised transfer learning. The second strategy used an object detection model to recognize the geometric shapes in the figure, followed by a predetermined algorithm that scored the drawing based on their classes and positions. Results: On average, the first framework demonstrated 62% accuracy, 62% recall, 65% precision, 63% specificity, and 0.72 area under the receiver operating characteristic curve. The second framework substantially outperformed it, with averages of 94%, 95%, 93%, 93%, and 0.95, respectively. Conclusion: An image-based DL framework based on the object detection approach may be clinically applicable for automatic scoring of the PDT with high efficiency and reliability. With a limited sample size, transfer learning should be used with caution if the new images are distinct from the previous training data. Partitioning the problem-solving workflow into multiple simple tasks should facilitate model selection, improve performance, and allow comprehensible logic of the DL framework.
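The evaluation protocol described in this abstract (stratified fivefold cross-validation, then accuracy, recall, precision, and specificity) can be illustrated with a minimal sketch. It is not the authors' code: the function names `stratified_folds` and `binary_metrics`, the binary label encoding (1 = positive, 0 = negative), and the round-robin fold assignment are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class ratios (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def binary_metrics(y_true, y_pred):
    """Accuracy, recall, precision, and specificity for labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "recall": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "specificity": ratio(tn, tn + fp),
    }
```

In each round, one fold serves as the test set and the other four as training data; the per-fold metrics are then averaged, which matches the "on average" phrasing of the reported results.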

Automatic detection of 39 fundus diseases and conditions in retinal photographs using deep neural networks

Nature Communications ◽

10.1038/s41467-021-25138-w ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Ling-Ping Cen ◽

Jie Ji ◽

Jian-Wei Lin ◽

Si-Tong Ju ◽

Hong-Jie Lin ◽

...

Keyword(s):

Deep Learning ◽

High Efficiency ◽

Characteristic Curve ◽

Weighted Average ◽

Age Related Macular Degeneration ◽

Learning Platform ◽

Age Related ◽

Public Data ◽

Primary Test ◽

Retinal Fundus

Retinal fundus diseases can lead to irreversible visual impairment without timely diagnoses and appropriate treatments. Single disease-based deep learning algorithms have been developed for the detection of diabetic retinopathy, age-related macular degeneration, and glaucoma. Here, we developed a deep learning platform (DLP) capable of detecting multiple common referable fundus diseases and conditions (39 classes) by using 249,620 fundus images marked with 275,543 labels from heterogeneous sources. Our DLP achieved a frequency-weighted average F1 score of 0.923, sensitivity of 0.978, specificity of 0.996 and area under the receiver operating characteristic curve (AUC) of 0.9984 for multi-label classification in the primary test dataset and reached the average level of retina specialists. External multihospital tests, a public data test, and a tele-reading application also showed high efficiency in detecting multiple retinal diseases and conditions. These results indicate that our DLP can be applied for retinal fundus disease triage, especially in remote areas around the world.
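A frequency-weighted average F1 score, as reported in this abstract, can be sketched as follows. The abstract does not define the weighting, so this example assumes each class is weighted by its number of true labels (its support); the function names and the `(tp, fp, fn)` input layout are hypothetical.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def frequency_weighted_f1(per_class_counts):
    """Average per-class F1, weighting each class by its support (tp + fn).

    per_class_counts maps a class name to a (tp, fp, fn) tuple.
    The support-based weighting is an assumption, not taken from the paper.
    """
    total_support = sum(tp + fn for tp, _, fn in per_class_counts.values())
    if total_support == 0:
        return 0.0
    weighted_sum = sum(
        (tp + fn) * f1_score(tp, fp, fn)
        for tp, fp, fn in per_class_counts.values()
    )
    return weighted_sum / total_support
```

Weighting by support means common conditions dominate the average, so a high score does not by itself show good performance on rare classes.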

U-Infuse: Democratization of Customizable Deep Learning for Object Detection

Sensors ◽

10.3390/s21082611 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2611

Author(s):

Andrew Shepley ◽

Greg Falzon ◽

Christopher Lawson ◽

Paul Meek ◽

Paul Kwan

Keyword(s):

Deep Learning ◽

Intellectual Property ◽

Object Detection ◽

Image Data ◽

Learning Technologies ◽

Training Data ◽

Learning Models ◽

Ecological Data ◽

Single Class ◽

Large Numbers

Image data is one of the primary sources of ecological data used in biodiversity conservation and management worldwide. However, classifying and interpreting large numbers of images is time and resource expensive, particularly in the context of camera trapping. Deep learning models have been used to achieve this task but are often not suited to specific applications due to their inability to generalise to new environments and inconsistent performance. Models need to be developed for specific species cohorts and environments, but the technical skills required to achieve this are a key barrier to the accessibility of this technology to ecologists. Thus, there is a strong need to democratize access to deep learning technologies by providing an easy-to-use software application allowing non-technical users to train custom object detectors. U-Infuse addresses this issue by providing ecologists with the ability to train customised models using publicly available images and/or their own images without specific technical expertise. Auto-annotation and annotation editing functionalities minimize the constraints of manually annotating and pre-processing large numbers of images. U-Infuse is a free and open-source software solution that supports both multiclass and single class training and object detection, allowing ecologists to access deep learning technologies usually only available to computer scientists, on their own device, customised for their application, without sharing intellectual property or sensitive data. It provides ecological practitioners with the ability to (i) easily achieve object detection within a user-friendly GUI, generating a species distribution report, and other useful statistics, (ii) custom train deep learning models using publicly available and custom training data, (iii) achieve supervised auto-annotation of images for further training, with the benefit of editing annotations to ensure quality datasets. 
Broad adoption of U-Infuse by ecological practitioners will improve ecological image analysis and processing by allowing significantly more image data to be processed with minimal expenditure of time and resources, particularly for camera trap images. Ease of training and use of transfer learning means domain-specific models can be trained rapidly, and frequently updated without the need for computer science expertise, or data sharing, protecting intellectual property and privacy.

Augmenting Crop Detection for Precision Agriculture with Deep Visual Transfer Learning—A Case Study of Bale Detection

Remote Sensing ◽

10.3390/rs13010023 ◽

2020 ◽

Vol 13 (1) ◽

pp. 23

Author(s):

Wei Zhao ◽

William Yamada ◽

Tianxin Li ◽

Matthew Digman ◽

Troy Runge

Keyword(s):

Object Detection ◽

Transfer Learning ◽

Precision Agriculture ◽

Crop Production ◽

Domain Adaptation ◽

Training Data ◽

Detection Accuracy ◽

Detection Model ◽

Agriculture Products

In recent years, precision agriculture has been researched as a promising means to increase crop production with fewer inputs and meet the growing demand for agricultural products. Computer vision-based crop detection with unmanned aerial vehicle (UAV)-acquired images is a critical tool for precision agriculture. However, object detection using deep learning algorithms relies on a significant amount of manually prelabeled training data as ground truth. Field object detection, such as bale detection, is especially difficult because of (1) long-period image acquisitions under different illumination conditions and seasons; (2) limited existing prelabeled data; and (3) few pretrained models and research as references. This work increases the bale detection accuracy based on limited data collection and labeling by building an innovative algorithm pipeline. First, an object detection model is trained using 243 images captured under good illumination conditions in fall from the crop lands. In addition, domain adaptation (DA), a kind of transfer learning, is applied for synthesizing the training data under diverse environmental conditions with automatic labels. Finally, the object detection model is optimized with the synthesized datasets. The case study shows the proposed method improves the bale detection performance, including the recall, mean average precision (mAP), and F measure (F1 score), from averages of 0.59, 0.7, and 0.7 (the object detection) to averages of 0.93, 0.94, and 0.89 (the object detection + DA), respectively. This approach could be easily scaled to many other crop field objects and will significantly contribute to precision agriculture.

Copy-Move Forgery Detection (CMFD) Using Deep Learning for Image and Video Forensics

Journal of Imaging ◽

10.3390/jimaging7030059 ◽

2021 ◽

Vol 7 (3) ◽

pp. 59

Author(s):

Yohanna Rodriguez-Ortega ◽

Dora M. Ballesteros ◽

Diego Renza

Keyword(s):

Deep Learning ◽

Transfer Learning ◽

Training Data ◽

Video Editing ◽

Forgery Detection ◽

Copy Move Forgery Detection ◽

Traditional Image

With the exponential growth of high-quality fake images in social networks and media, it is necessary to develop recognition algorithms for this type of content. One of the most common types of image and video editing consists of duplicating areas of the image, known as the copy-move technique. Traditional image processing approaches manually look for patterns related to the duplicated content, limiting their use in mass data classification. In contrast, approaches based on deep learning have shown better performance and promising results, but they present generalization problems with a high dependence on training data and the need for appropriate selection of hyperparameters. To overcome this, we propose two deep learning approaches: a model with a custom architecture and a model built by transfer learning. In each case, the impact of the depth of the network is analyzed in terms of precision (P), recall (R) and F1 score. Additionally, the problem of generalization is addressed with images from eight different open access datasets. Finally, the models are compared in terms of evaluation metrics, and training and inference times. The model built by transfer learning from VGG-16 achieves metrics about 10% higher than the custom-architecture model; however, it requires approximately twice as much inference time.

Deep Learning Using Multiple Degrees of Maximum-Intensity Projection for PET/CT Image Classification in Breast Cancer

Tomography ◽

10.3390/tomography8010011 ◽

2022 ◽

Vol 8 (1) ◽

pp. 131-141

Author(s):

Kanae Takahashi ◽

Tomoyuki Fujioka ◽

Jun Oyama ◽

Mio Mori ◽

Emi Yamaga ◽

...

Keyword(s):

Breast Cancer ◽

Deep Learning ◽

Image Classification ◽

Test Data ◽

Characteristic Curve ◽

Maximum Intensity Projection ◽

Maximum Intensity ◽

Training Data ◽

Ct Image ◽

Pet Ct

Deep learning (DL) has recently become a remarkably powerful tool for image processing. However, the usefulness of DL in positron emission tomography (PET)/computed tomography (CT) for breast cancer (BC) has been insufficiently studied. This study investigated whether a DL model using PET maximum-intensity projection (MIP) images at multiple degrees helps increase diagnostic accuracy for PET/CT image classification in BC. We retrospectively gathered 400 images of 200 BC and 200 non-BC patients for training data. For each image, we obtained PET MIP images with four different degrees (0°, 30°, 60°, 90°) and made two DL models using Xception. One DL model diagnosed BC with only 0-degree MIP and the other used four different degrees. After training phases, our DL models analyzed test data including 50 BC and 50 non-BC patients. Five radiologists interpreted these test data. Sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were calculated. Our 4-degree model, 0-degree model, and radiologists had a sensitivity of 96%, 82%, and 80–98% and a specificity of 80%, 88%, and 76–92%, respectively. Our 4-degree model had equal or better diagnostic performance compared with that of the radiologists (AUC = 0.936 and 0.872–0.967, p = 0.036–0.405). A DL model similar to our 4-degree model may help radiologists in their diagnostic work in the future.

Automatic Object Detection from Digital Images by Deep Learning with Transfer Learning

Advanced Computing Strategies for Engineering - Lecture Notes in Computer Science ◽

10.1007/978-3-319-91635-4_1 ◽

2018 ◽

pp. 3-15 ◽

Cited By ~ 3

Author(s):

Nobuyoshi Yabuki ◽

Naoto Nishimura ◽

Tomohiro Fukuda

Keyword(s):

Deep Learning ◽

Object Detection ◽

Transfer Learning ◽

Digital Images

Diagnostic assessment of a deep learning system for detecting atrial fibrillation in pulse waveforms

Heart ◽

10.1136/heartjnl-2018-313147 ◽

2018 ◽

Vol 104 (23) ◽

pp. 1921-1928 ◽

Cited By ~ 36

Author(s):

Ming-Zher Poh ◽

Yukkee Cheung Poh ◽

Pak-Hei Chan ◽

Chun-Ka Wong ◽

Louise Pun ◽

...

Keyword(s):

Atrial Fibrillation ◽

Deep Learning ◽

Test Data ◽

Predictive Value ◽

Characteristic Curve ◽

Performance Comparison ◽

Learning System ◽

Training Data ◽

Validation Data ◽

Data Set

Objective: To evaluate the diagnostic performance of a deep learning system for automated detection of atrial fibrillation (AF) in photoplethysmographic (PPG) pulse waveforms. Methods: We trained a deep convolutional neural network (DCNN) to detect AF in 17 s PPG waveforms using a training data set of 149 048 PPG waveforms constructed from several publicly available PPG databases. The DCNN was validated using an independent test data set of 3039 smartphone-acquired PPG waveforms from adults at high risk of AF at a general outpatient clinic against ECG tracings reviewed by two cardiologists. Six established AF detectors based on handcrafted features were evaluated on the same test data set for performance comparison. Results: In the validation data set (3039 PPG waveforms) consisting of three sequential PPG waveforms from 1013 participants (mean (SD) age, 68.4 (12.2) years; 46.8% men), the prevalence of AF was 2.8%. The area under the receiver operating characteristic curve (AUC) of the DCNN for AF detection was 0.997 (95% CI 0.996 to 0.999) and was significantly higher than all the other AF detectors (AUC range: 0.924–0.985). The sensitivity of the DCNN was 95.2% (95% CI 88.3% to 98.7%), specificity was 99.0% (95% CI 98.6% to 99.3%), positive predictive value (PPV) was 72.7% (95% CI 65.1% to 79.3%) and negative predictive value (NPV) was 99.9% (95% CI 99.7% to 100%) using a single 17 s PPG waveform. Using the three sequential PPG waveforms in combination (<1 min in total), the sensitivity was 100.0% (95% CI 87.7% to 100%), specificity was 99.6% (95% CI 99.0% to 99.9%), PPV was 87.5% (95% CI 72.5% to 94.9%) and NPV was 100% (95% CI 99.4% to 100%). Conclusions: In this evaluation of PPG waveforms from adults screened for AF in a real-world primary care setting, the DCNN had high sensitivity, specificity, PPV and NPV for detecting AF, outperforming other state-of-the-art methods based on handcrafted features.
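The gap between the high specificity and the more modest PPV reported here follows from the low prevalence of AF. A minimal sketch of the standard relationship between sensitivity, specificity, prevalence, and the predictive values is given below; the function name `predictive_values` is hypothetical, and the calculation uses the rounded figures from the abstract rather than the study's raw counts, so it only approximates the reported values.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence.

    All arguments are fractions in [0, 1]. This is Bayes' rule applied to
    the expected proportions of each outcome in the screened population.
    """
    tp = sensitivity * prevalence                # true positives
    fn = (1 - sensitivity) * prevalence          # false negatives
    tn = specificity * (1 - prevalence)          # true negatives
    fp = (1 - specificity) * (1 - prevalence)    # false positives
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    npv = tn / (tn + fn) if (tn + fn) else 0.0
    return ppv, npv


# Rounded single-waveform figures from the abstract: 95.2%, 99.0%, 2.8%.
ppv, npv = predictive_values(0.952, 0.990, 0.028)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With these inputs the PPV comes out at about 0.73, close to the reported 72.7%: at 2.8% prevalence, even a 1% false-positive rate yields a noticeable share of false alarms among positive results.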

Improved Method to Detect the Tailings Ponds from Multispectral Remote Sensing Images Based on Faster R-CNN and Transfer Learning

Remote Sensing ◽

10.3390/rs14010103 ◽

2021 ◽

Vol 14 (1) ◽

pp. 103

Author(s):

Dongchuan Yan ◽

Hao Zhang ◽

Guoqing Li ◽

Xiangqiang Li ◽

Hua Lei ◽

...

Keyword(s):

Remote Sensing ◽

Deep Learning ◽

Object Detection ◽

Transfer Learning ◽

Near Infrared ◽

Image Data ◽

Remote Sensing Images ◽

Tailings Pond ◽

True Color ◽

Tailings Ponds

The breaching of tailings pond dams may lead to casualties and environmental pollution; therefore, timely and accurate monitoring is an essential aspect of managing such structures and preventing accidents. Remote sensing technology is suitable for the regular extraction and monitoring of tailings pond information. However, traditional remote sensing is inefficient and unsuitable for the frequent extraction of large volumes of highly precise information. Object detection, based on deep learning, provides a solution to this problem. Most remote sensing imagery applications for tailings pond object detection using deep learning are based on computer vision, utilizing the true-color triple-band data of high spatial resolution imagery for information extraction. The advantage of remote sensing image data is their greater number of spectral bands (more than three), providing more abundant spectral information. There is a lack of research on fully harnessing multispectral band information to improve the detection precision of tailings ponds. Accordingly, using a sample dataset of tailings pond satellite images from the Gaofen-1 high-resolution Earth observation satellite, we improved the Faster R-CNN deep learning object detection model by increasing the inputs from three true-color bands to four multispectral bands. Moreover, we used the attention mechanism to recalibrate the input contributions. Subsequently, we used a step-by-step transfer learning method to improve and gradually train our model. The improved model could fully utilize the near-infrared (NIR) band information of the images to improve the precision of tailings pond detection. Compared with that of the three true-color band input models, the tailings pond detection average precision (AP) and recall notably improved in our model, with the AP increasing from 82.3% to 85.9% and recall increasing from 65.4% to 71.9%. 
This research could serve as a reference for using multispectral band information from remote sensing images in the construction and application of deep learning models.

A DEEP LEARNING BASED SURROGATE MODEL FOR ESTIMATING THE FLUX AND POWER DISTRIBUTION SOLVED BY DIFFUSION EQUATION

EPJ Web of Conferences ◽

10.1051/epjconf/202124703013 ◽

2021 ◽

Vol 247 ◽

pp. 03013

Author(s):

Qian Zhang ◽

Jinchao Zhang ◽

Liang Liang ◽

Zhuo Li ◽

Tengfei Zhang

Keyword(s):

Deep Learning ◽

Diffusion Equation ◽

Power Distribution ◽

Surrogate Model ◽

High Efficiency ◽

Reactor Core ◽

Training Data ◽

Diffusion Method ◽

Convolutional Network ◽

Learning Platform

A deep learning based surrogate model is proposed for replacing the conventional diffusion equation solver and predicting the flux and power distribution of the reactor core. Using the training data generated by the conventional diffusion equation solver, a specially designed convolutional neural network inspired by the FCN (Fully Convolutional Network) is trained using the deep learning platform TensorFlow. Numerical results show that the deep learning based surrogate model is effective for estimating the flux and power distribution calculated by the diffusion method, which means it can replace the conventional diffusion equation solver with a substantial gain in efficiency.

Caries Detection with Near-Infrared Transillumination Using Deep Learning

Journal of Dental Research ◽

10.1177/0022034519871884 ◽

2019 ◽

Vol 98 (11) ◽

pp. 1227-1233 ◽

Cited By ~ 17

Author(s):

F. Casalegno ◽

T. Newton ◽

R. Daher ◽

M. Abdelaziz ◽

A. Lodi-Rizzini ◽

...

Keyword(s):

Deep Learning ◽

Near Infrared ◽

Early Stage ◽

Characteristic Curve ◽

Class Imbalance ◽

Training Data ◽

Caries Detection ◽

Carious Lesions ◽

Dental Practitioners ◽

Segmentation Task

Dental caries is the most prevalent chronic condition worldwide. Early detection can significantly improve treatment outcomes and reduce the need for invasive procedures. Recently, near-infrared transillumination (TI) imaging has been shown to be effective for the detection of early stage lesions. In this work, we present a deep learning model for the automated detection and localization of dental lesions in TI images. Our method is based on a convolutional neural network (CNN) trained on a semantic segmentation task. We use various strategies to mitigate issues related to training data scarcity, class imbalance, and overfitting. With only 185 training samples, our model achieved an overall mean intersection-over-union (IOU) score of 72.7% on a 5-class segmentation task and specifically an IOU score of 49.5% and 49.0% for proximal and occlusal carious lesions, respectively. In addition, we constructed a simplified task, in which regions of interest were evaluated for the binary presence or absence of carious lesions. For this task, our model achieved an area under the receiver operating characteristic curve of 83.6% and 85.6% for occlusal and proximal lesions, respectively. Our work demonstrates that a deep learning approach for the analysis of dental images holds promise for increasing the speed and accuracy of caries detection, supporting the diagnoses of dental practitioners, and improving patient outcomes.
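The intersection-over-union (IOU) metric used in this abstract can be sketched for label masks as follows. This is an illustrative example rather than the authors' evaluation code: the function names, the nested-list mask representation, and the choice to skip classes absent from both masks when averaging are all assumptions.

```python
def class_iou(pred, target, cls):
    """IoU for one class over two equally sized 2D label masks (lists of rows).

    Returns None when the class appears in neither mask.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            in_pred = p == cls
            in_target = t == cls
            intersection += in_pred and in_target
            union += in_pred or in_target
    return intersection / union if union else None


def mean_iou(pred, target, num_classes):
    """Average the per-class IoU over classes present in at least one mask."""
    scores = [class_iou(pred, target, c) for c in range(num_classes)]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else 0.0
```

Because the mean is taken over classes rather than pixels, small classes such as carious lesions weigh as much as large background regions, which is one reason per-class scores can sit well below the overall mean.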
