Factors determining generalization in deep learning models for scoring COVID-CT images

2021 ◽  
Vol 18 (6) ◽  
pp. 9264-9293
Author(s):  
Michael James Horry ◽  
Subrata Chakraborty ◽  
Biswajeet Pradhan ◽  
Maryam Fallahpoor ◽  
...  

The COVID-19 pandemic has inspired unprecedented data collection and computer vision modelling efforts worldwide, focused on the diagnosis of COVID-19 from medical images. However, these models have found limited, if any, clinical application due in part to unproven generalization to data sets beyond their source training corpus. This study investigates the generalizability of deep learning models using publicly available COVID-19 Computed Tomography data through cross-dataset validation. The predictive ability of these models for COVID-19 severity is assessed using an independent dataset that is stratified for COVID-19 lung involvement. Each inter-dataset study is performed using histogram equalization and contrast limited adaptive histogram equalization, with and without a learning Gabor filter. We show that under certain conditions, deep learning models can generalize well to an external dataset, with F1 scores up to 86%. The best performing model shows predictive accuracy of between 75% and 96% for lung involvement scoring against an external expertly stratified dataset. From these results we identify key factors promoting deep learning generalization: primarily the uniform acquisition of training images, and secondly diversity in CT slice position.
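Histogram equalization, one of the preprocessing steps named above, remaps pixel intensities so the cumulative histogram becomes roughly uniform. A minimal stdlib sketch on a flattened 8-bit image (production pipelines would typically use a library such as OpenCV; CLAHE applies the same idea per tile with contrast limiting):

```python
def equalize_histogram(pixels, levels=256):
    """Remap 8-bit intensities so the cumulative histogram is ~uniform."""
    n = len(pixels)
    # histogram of intensities
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # cumulative distribution function
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)

    def remap(p):
        if n == cdf_min:  # constant image: nothing to spread out
            return p
        # standard equalization mapping: scale the CDF to the full range
        return round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1))

    return [remap(p) for p in pixels]

# a dark, low-contrast strip of pixels spreads across the full range
print(equalize_histogram([50, 50, 51, 52, 52, 53]))  # → [0, 0, 64, 191, 191, 255]
```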

2021 ◽  
Author(s):  
Michael J Horry ◽  
Subrata Chakraborty ◽  
Biswajeet Pradhan ◽  
Maryam Fallahpoor ◽  
Chegeni Hossein ◽  
...  

The COVID-19 pandemic has inspired unprecedented data collection and computer vision modelling efforts worldwide, focusing on the diagnosis and stratification of COVID-19 from medical images. Despite this large-scale research effort, these models have found limited practical application due in part to their unproven generalization beyond their source studies. This study investigates the generalizability of key published models using publicly available COVID-19 Computed Tomography data through cross-dataset validation. We then assess the predictive ability of these models for COVID-19 severity using an independent new dataset that is stratified for COVID-19 lung involvement. Each inter-dataset study is performed using histogram equalization and contrast limited adaptive histogram equalization, with and without a learning Gabor filter. The study shows high variability in the generalization of models trained on these datasets, due to varied sample image provenances and acquisition processes amongst other factors. We show that under certain conditions, an internally consistent dataset can generalize well to an external dataset despite structural differences between the datasets, with F1 scores up to 86%. Our best performing model shows high predictive accuracy for lung involvement score on an independent dataset for which expertly labelled lung involvement stratification is available. Creating an ensemble of our best model for disease-positive prediction with our best model for disease-negative prediction using a min-max function resulted in a superior model for lung involvement prediction, with average predictive accuracy of 75% for zero lung involvement and 96% for 75-100% lung involvement, and an almost linear relationship between these stratifications.
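The abstract does not spell out the min-max ensemble function, so the following is one plausible, hypothetical reading of it: a conservative combination that only calls a case positive when both the disease-positive model and the disease-negative model agree (function name and combination rule are illustrative, not the authors' actual implementation):

```python
def min_max_ensemble(p_positive, p_negative_free):
    """Hypothetical min-max combination of two complementary models.

    p_positive:      disease probability from the positive-tuned model
    p_negative_free: probability of NO disease from the negative-tuned model
    """
    p_pos_from_neg = 1.0 - p_negative_free   # recast as a disease probability
    lo = min(p_positive, p_pos_from_neg)
    hi = max(p_positive, p_pos_from_neg)
    # assertive when both models lean positive, conservative otherwise
    return hi if lo >= 0.5 else lo
```

Used per lung-involvement stratification bin, such a rule would let the negative-tuned model veto false positives while the positive-tuned model sharpens confident calls.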


2021 ◽  
Vol 11 (3) ◽  
pp. 999
Author(s):  
Najeeb Moharram Jebreel ◽  
Josep Domingo-Ferrer ◽  
David Sánchez ◽  
Alberto Blanco-Justicia

Many organizations devote significant resources to building high-fidelity deep learning (DL) models. Therefore, they have a great interest in making sure the models they have trained are not appropriated by others. Embedding watermarks (WMs) in DL models is a useful means to protect the intellectual property (IP) of their owners. In this paper, we propose KeyNet, a novel watermarking framework that satisfies the main requirements for effective and robust watermarking. In KeyNet, any sample in a WM carrier set can take more than one label based on where the owner signs it. The signature is the hashed value of the owner’s information and her model. We leverage multi-task learning (MTL) to learn the original classification task and the watermarking task together. Another model (called the private model) is added to the original one, so that it acts as a private key. The two models are trained together to embed the WM while preserving the accuracy of the original task. To extract a WM from a marked model, we pass the predictions of the marked model on a signed sample to the private model. Then, the private model can provide the position of the signature. We perform an extensive evaluation of KeyNet’s performance on the CIFAR10 and FMNIST5 data sets and prove its effectiveness and robustness. Empirical results show that KeyNet preserves the utility of the original task and embeds a robust WM.
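A signature defined as a hash of the owner's information and model can deterministically select a label position for each carrier sample. The sketch below illustrates that idea only; the function and field names are hypothetical, not KeyNet's actual API:

```python
import hashlib

def signature_position(owner_info: str, model_fingerprint: str,
                       sample_id: str, num_labels: int) -> int:
    """Deterministically map (owner, model, carrier sample) to one of
    num_labels label positions via a cryptographic hash."""
    payload = f"{owner_info}|{model_fingerprint}|{sample_id}".encode()
    digest = hashlib.sha256(payload).digest()
    return int.from_bytes(digest, "big") % num_labels

# the same owner/model/sample always yields the same position,
# so the private model can later verify where the owner signed
pos = signature_position("Alice", "model-v1", "wm-sample-007", 10)
```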


2021 ◽  
Vol 13 (19) ◽  
pp. 10690
Author(s):  
Heelak Choi ◽  
Sang-Ik Suh ◽  
Su-Hee Kim ◽  
Eun Jin Han ◽  
Seo Jin Ki

This study aimed to investigate the applicability of deep learning algorithms to (monthly) surface water quality forecasting. A comparison was made between the performance of an autoregressive integrated moving average (ARIMA) model and four deep learning models. All prediction algorithms, except for the ARIMA model working on a single variable, were tested with univariate inputs consisting of one of two dependent variables as well as multivariate inputs containing both dependent and independent variables. We found that deep learning models (6.31–18.78%, in terms of the mean absolute percentage error) showed better performance than the ARIMA model (27.32–404.54%) on univariate data sets, regardless of the dependent variable. However, the accuracy of prediction was not improved for all dependent variables in the presence of other associated water quality variables. In addition, changes in the number of input variables, sliding window size (i.e., input and output time steps), and relevant variables (e.g., meteorological and discharge parameters) resulted in wide variation in the predictive accuracy of deep learning models, with errors reaching as high as 377.97%. Therefore, a refined search identifying the optimal values of such influencing factors is recommended to achieve the best performance of any deep learning model on given multivariate data sets.
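The mean absolute percentage error quoted throughout this abstract can be computed directly from actual and predicted series:

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    assert len(actual) == len(predicted) and all(a != 0 for a in actual)
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)

# e.g. monthly water-quality readings vs. model forecasts
print(mape([10.0, 20.0, 40.0], [11.0, 18.0, 40.0]))  # ≈ 6.67
```

Note that MAPE explodes when actual values approach zero, which is one reason the ranges above span several orders of magnitude.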


2021 ◽  
Author(s):  
Nanditha Mallesh ◽  
Max Zhao ◽  
Lisa Meintker ◽  
Alexander Höllein ◽  
Franz Elsner ◽  
...  

Multi-parameter flow cytometry (MFC) is a cornerstone in clinical decision making for hematological disorders such as leukemia or lymphoma. MFC data analysis requires trained experts to manually gate cell populations of interest, which is time-consuming and subjective. Manual gating is often limited to a two-dimensional space. In recent years, deep learning models have been developed to analyze the data in high-dimensional space and are highly accurate. Such models have been used successfully in histology, cytopathology, image flow cytometry, and conventional MFC analysis. However, current AI models used for subtype classification based on MFC data are limited to the antibody (flow cytometry) panel they were trained on. Thus, a key challenge in deploying AI models into routine diagnostics is the robustness and adaptability of such models. In this study, we present a workflow to extend our previous model to four additional MFC panels. We employ knowledge transfer to adapt the model to smaller data sets. We trained models for each of the data sets by transferring the features learned from our base model. With our workflow, we could increase the models’ overall performance and, more prominently, increase the learning rate for very small training sizes.
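The knowledge-transfer step described above can be sketched as reusing the base model's learned feature weights while attaching a freshly initialised classification head for each new antibody panel. This is a toy, stdlib-only illustration (the model structure and function names are hypothetical, not the authors' code):

```python
import random

def build_model(feature_weights, num_classes, num_features=4, seed=0):
    """Toy model: a shared feature extractor plus a task-specific head."""
    rng = random.Random(seed)
    return {
        "features": list(feature_weights),              # transferred as-is
        "head": [[rng.gauss(0, 0.1) for _ in range(num_features)]
                 for _ in range(num_classes)],          # freshly initialised
    }

def transfer_to_panel(base_model, num_classes_new_panel):
    """Reuse the base model's learned features; only the new head
    (and optionally the features, via fine-tuning) is trained on the
    smaller panel-specific data set."""
    return build_model(base_model["features"], num_classes_new_panel)

base = build_model([0.5, -0.2, 0.8, 0.1], num_classes=3)
panel_model = transfer_to_panel(base, num_classes_new_panel=5)
```

Because the transferred features already encode useful structure, the new head needs far fewer labelled samples, which matches the reported gains at very small training sizes.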


Author(s):  
Rasha M. Al-Eidan ◽  
Hend Al-Khalifa ◽  
AbdulMalik Alsalman

The traditional standards employed for pain assessment have many limitations. One such limitation is reliability, because of inter-observer variability. Therefore, there have been many approaches to automate the task of pain recognition. Recently, deep-learning methods have emerged to address many challenges, such as feature selection and cases with a small number of data sets. This study provides a systematic review of pain-recognition systems based on deep-learning models, covering only the last two years. Furthermore, it presents the major deep-learning methods used in the reviewed papers. Finally, it provides a discussion of the challenges and open issues.


2020 ◽  
Vol 10 (11) ◽  
pp. 4177-4190
Author(s):  
Osval Antonio Montesinos-López ◽  
José Cricelio Montesinos-López ◽  
Pawan Singh ◽  
Nerida Lozano-Ramirez ◽  
Alberto Barrón-López ◽  
...  

The paradigm called genomic selection (GS) is a revolutionary way of developing new plants and animals. It is a predictive methodology, since it uses learning methods to perform its task. Unfortunately, there is no universal model that can be used for all types of predictions; for this reason, specific methodologies are required for each type of output (response variable). Since there is a lack of efficient methodologies for multivariate count data outcomes, in this paper, a multivariate Poisson deep neural network (MPDN) model is proposed for the genomic prediction of various count outcomes simultaneously. The MPDN model uses the minus log-likelihood of a Poisson distribution as its loss function, with the rectified linear unit (ReLU) activation function in the hidden layers for capturing nonlinear patterns, and the exponential activation function in the output layer for producing outputs on the same scale as the counts. The proposed MPDN model was compared to conventional generalized Poisson regression models and univariate Poisson deep learning models on two experimental count data sets. We found that the proposed MPDN outperformed univariate Poisson deep neural network models, but did not outperform, in terms of prediction, the univariate generalized Poisson regression models. All deep learning models were implemented with TensorFlow as the back-end and Keras as the front-end, which allows implementing these models on moderate and large data sets, a significant advantage over previous GS models for multivariate count data.
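The loss described above is the Poisson negative log-likelihood: for one observation it is λ − y·log(λ) + log(y!), and the exponential output activation keeps λ strictly positive. A minimal sketch of both pieces:

```python
import math

def poisson_nll(y_counts, lambdas):
    """Minus log-likelihood of independent Poisson observations:
    sum over i of (lambda_i - y_i * log(lambda_i) + log(y_i!))."""
    return sum(lam - y * math.log(lam) + math.lgamma(y + 1)
               for y, lam in zip(y_counts, lambdas))

def output_activation(z):
    """Exponential activation: maps any real pre-activation to lambda > 0,
    so predictions stay on the (positive) scale of counts."""
    return math.exp(z)
```

As expected for a proper likelihood, the loss for a single count y is minimised when the predicted rate λ equals y.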


Mathematics ◽  
2020 ◽  
Vol 8 (12) ◽  
pp. 2178
Author(s):  
Yi-Chung Chen ◽  
Tsu-Chiang Lei ◽  
Shun Yao ◽  
Hsin-Ping Wang

Airborne particulate matter 2.5 (PM2.5) can have a profound effect on the health of the population. Many researchers have been reporting highly accurate numerical predictions based on raw PM2.5 data imported directly into deep learning models; however, there is still considerable room for improvement in terms of implementation costs due to heavy computational overhead. From the perspective of environmental science, PM2.5 values in a given location can be attributed to local sources as well as external sources. Local sources tend to have a dramatic short-term impact on PM2.5 values, whereas external sources tend to have more subtle but longer-lasting effects. In the presence of PM2.5 from both sources at the same time, this combination of effects can undermine the predictive accuracy of the model. This paper presents a novel combinational Hammerstein recurrent neural network (CHRNN) to enhance predictive accuracy and overcome the heavy computational and monetary burden imposed by deep learning models. The CHRNN comprises a base neural network tasked with learning gradual (long-term) fluctuations, in conjunction with add-on neural networks to deal with dramatic (short-term) fluctuations. The CHRNN can be coupled with a random forest model to determine the degree to which short-term effects influence long-term outcomes. We also developed novel feature selection and normalization methods to enhance prediction accuracy. Using real-world air quality measurements and PM2.5 datasets from Taiwan, the precision of the proposed system in the numerical prediction of PM2.5 levels was comparable to that of state-of-the-art deep learning models, such as deep recurrent neural networks and long short-term memory, despite far lower implementation costs and computational overhead.
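The division of labour in the CHRNN, a base network for gradual fluctuations plus add-on networks for dramatic ones, can be illustrated by splitting a series into a slow moving-average component and a fast residual. This is a structural sketch of the idea only, not the CHRNN itself:

```python
def decompose_series(values, window=3):
    """Split a series into a slow component (trailing moving average,
    the kind of signal the base network would model) and a fast residual
    (short-term spikes, the kind the add-on networks would model)."""
    slow = []
    for i in range(len(values)):
        lo = max(0, i - window + 1)
        chunk = values[lo:i + 1]
        slow.append(sum(chunk) / len(chunk))
    fast = [v - s for v, s in zip(values, slow)]
    return slow, fast

# index 3 carries a local-source spike; it lands in the fast component
slow, fast = decompose_series([10, 12, 11, 40, 13, 12])
```

Each component can then be predicted by a model suited to its dynamics, and the two forecasts summed.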


2020 ◽  
Vol 12 (6) ◽  
pp. 923
Author(s):  
Kuiliang Gao ◽  
Bing Liu ◽  
Xuchu Yu ◽  
Jinchun Qin ◽  
Pengqiang Zhang ◽  
...  

Deep learning has achieved great success in hyperspectral image classification. However, when processing new hyperspectral images, the existing deep learning models must be retrained from scratch with sufficient samples, which is inefficient and undesirable in practical tasks. This paper aims to explore how to accurately classify new hyperspectral images with only a few labeled samples, i.e., few-shot classification of hyperspectral images. Specifically, we design a new deep classification model based on a relational network and train it with the idea of meta-learning. Firstly, the feature learning module and the relation learning module of the model can make full use of the spatial–spectral information in hyperspectral images and carry out relation learning by comparing the similarity between samples. Secondly, the task-based learning strategy can enable the model to continuously enhance its ability to learn how to learn with a large number of tasks randomly generated from different data sets. Benefiting from the above two points, the proposed method has excellent generalization ability and can obtain satisfactory classification results with only a few labeled samples. In order to verify the performance of the proposed method, experiments were carried out on three public data sets. The results indicate that the proposed method can achieve better classification results than the traditional semisupervised support vector machine and semisupervised deep learning models.
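The relation-learning idea, classifying a query by comparing its features against labelled support samples, can be sketched with a fixed similarity measure standing in for the paper's learned relation module (the feature vectors and class names here are made up for illustration):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def classify_by_relation(query, support):
    """Few-shot classification: score the query against every labelled
    support sample and return the label of the most similar one."""
    best_label, best_score = None, -2.0
    for label, examples in support.items():
        for feat in examples:
            score = cosine_similarity(query, feat)
            if score > best_score:
                best_label, best_score = label, score
    return best_label

# one labelled spectral-feature example per class ("one-shot")
support = {"vegetation": [[0.9, 0.1, 0.0]], "water": [[0.0, 0.2, 0.9]]}
print(classify_by_relation([0.8, 0.2, 0.1], support))  # → vegetation
```

In the paper the comparison itself is a trained network, which is what meta-learning over many randomly generated tasks improves.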


2020 ◽  
Vol 38 (15_suppl) ◽  
pp. e16097-e16097
Author(s):  
Andrew J. Kruger ◽  
Lingdao Sha ◽  
Madhavi Kannan ◽  
Rohan P. Joshi ◽  
Benjamin D. Leibowitz ◽  
...  

e16097 Background: Using gene expression, consensus molecular subtypes (CMS) divide colorectal cancers (CRC) into four categories with prognostic and therapy-predictive clinical utilities. These subtypes also manifest as different morphological phenotypes in whole-slide images (WSIs). Here, we implemented and trained a novel deep multiple instance learning (MIL) framework that requires only a single label per WSI to identify morphological biomarkers and accelerate CMS classification. Methods: Deep learning models can be trained by MIL frameworks to classify tissue in localized tiles from large (>1 GB) WSIs using only weakly supervised, slide-level classification labels. Here we demonstrate a novel framework that advances on instance-based MIL by using a multi-phase approach to training deep learning models. The framework allows us to train on WSIs that contain multiple CMS classes while further identifying previously undiscovered tissue features that have low or no correlation with any subtype. Identification of these uncorrelated features results in improved insights into the specific tissue features that are most associated with the four CMS classes and a more accurate classification of CMS status. Results: We trained and validated (n = 735 WSIs and 184 withheld WSIs, respectively) a ResNet34 convolutional neural network to classify 224x224 pixel tiles distributed across tumor, lymphocyte, and stroma tissue regions. The slide-level CMS classification probability was calculated by an aggregation of the tiles correlated with each one of the four subtypes. The receiver operating characteristic curves had the following one-vs-all AUCs: CMS1 = 0.854, CMS2 = 0.921, CMS3 = 0.850, and CMS4 = 0.866, resulting in an average AUC of 0.873. Initial tests to generalize to other data sets, such as TCGA, are promising and constitute one of the future directions of this work.
Conclusions: The MIL framework robustly identified tissue features correlated with CMS groups, allowing for a more efficient classification of CRC samples. We also demonstrated that the morphological features indicative of different molecular subtypes can be identified from the deep neural network.
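The slide-level aggregation step can be sketched as pooling per-tile class probabilities into one slide-level CMS call. Mean pooling is used below as a simple stand-in; the abstract describes the aggregation only at a high level, so the exact rule is an assumption:

```python
def slide_level_probs(tile_probs):
    """Aggregate per-tile probabilities over the four CMS classes into a
    slide-level probability vector by averaging across tiles (a simple
    mean aggregation, used here for illustration)."""
    n = len(tile_probs)
    classes = len(tile_probs[0])
    return [sum(tile[c] for tile in tile_probs) / n for c in range(classes)]

# three tiles, each with probabilities over [CMS1, CMS2, CMS3, CMS4]
tiles = [[0.7, 0.1, 0.1, 0.1],
         [0.5, 0.3, 0.1, 0.1],
         [0.6, 0.2, 0.1, 0.1]]
probs = slide_level_probs(tiles)  # CMS1 dominates at the slide level
```

This is what lets a weakly supervised model trained with one label per WSI still localise which tiles drive the call.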

