Automatic Feature Selection for Improved Interpretability on Whole Slide Imaging

2021 ◽  
Vol 3 (1) ◽  
pp. 243-262
Author(s):  
Antoine Pirovano ◽  
Hippolyte Heuberger ◽  
Sylvain Berlemont ◽  
Saïd Ladjal ◽  
Isabelle Bloch

Deep learning methods are widely used in medical applications to assist medical doctors in their daily routine. While performance reaches expert level, interpretability (highlighting how and what a trained model learned, and why it makes a specific decision) is the next important challenge that deep learning methods must address to be fully integrated in the medical field. In this paper, we address the question of interpretability in the context of whole slide image (WSI) classification by formalizing the design of WSI classification architectures, and we propose a piece-wise interpretability approach relying on gradient-based methods, feature visualization, and the multiple instance learning context. After training two WSI classification architectures on the Camelyon-16 WSI dataset, highlighting the discriminative features learned, and validating our approach with pathologists, we propose a novel way of computing interpretability slide-level heat-maps, based on the extracted features, that improves tile-level classification performance. We measure the improvement using a tile-level AUC that we call Localization AUC, and show an improvement of more than 0.2. We also validate our results with a RemOve And Retrain (ROAR) measure. Then, after studying the impact of the number of features used for heat-map computation, we propose a corrective approach, relying on the activation colocalization of selected features, that improves the performance and stability of our proposed method.
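As a rough illustration of how gradient-based heat-maps can be derived from features, consider a linear tile scorer: the gradient of the slide-level score with respect to a tile's features is just the weight vector, so per-feature contributions fall out directly. This is a minimal sketch with made-up weights and features, not the architectures or feature selection used in the paper.

```python
# Minimal gradient-based saliency sketch. For a linear tile scorer
# score(x) = w . x, the gradient w.r.t. feature i is w_i, so
# |w_i * x_i| gives that feature's contribution to the tile score.
def tile_saliency(weights, features):
    """Per-feature contribution of one tile to the slide-level score."""
    return [abs(w * x) for w, x in zip(weights, features)]

def slide_heatmap(weights, tiles):
    """One saliency value per tile: the sum of feature contributions."""
    return [sum(tile_saliency(weights, t)) for t in tiles]

weights = [0.8, -0.5, 0.1]            # illustrative trained weights
tiles = [[1.0, 0.2, 0.0],             # illustrative tile feature vectors
         [0.1, 1.0, 0.3]]
print(slide_heatmap(weights, tiles))  # higher value = more salient tile
```

For a deep model the same recipe applies, with the gradient obtained by back-propagation instead of being constant.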

Cancers ◽  
2021 ◽  
Vol 13 (11) ◽  
pp. 2764
Author(s):  
Xin Yu Liew ◽  
Nazia Hameed ◽  
Jeremie Clos

A computer-aided diagnosis (CAD) expert system is a powerful tool to efficiently assist a pathologist in achieving an early diagnosis of breast cancer. This process identifies the presence of cancer in breast tissue samples and the distinct stages of the cancer. In a standard CAD system, the main pipeline involves image pre-processing, segmentation, feature extraction, feature selection, classification, and performance evaluation. In this review paper, we survey the existing state-of-the-art machine learning approaches applied at each stage, covering both conventional and deep learning methods, compare the methods, and provide technical details with their advantages and disadvantages. The aims are to investigate the impact of CAD systems using histopathology images, to investigate deep learning methods that outperform conventional methods, and to provide a summary for future researchers to analyse and improve the existing techniques. Lastly, we discuss the research gaps of existing machine learning approaches and propose direction guidelines for upcoming researchers.


Diagnostics ◽  
2021 ◽  
Vol 11 (9) ◽  
pp. 1672
Author(s):  
Luya Lian ◽  
Tianer Zhu ◽  
Fudong Zhu ◽  
Haihua Zhu

Objectives: Deep learning methods have achieved impressive diagnostic performance in the field of radiology. The current study aimed to use deep learning methods to detect caries lesions, classify their different radiographic extensions on panoramic films, and compare the classification results with those of expert dentists. Methods: A total of 1160 dental panoramic films were evaluated by three expert dentists. All caries lesions in the films were marked with circles, whose combination was defined as the reference dataset. A training and validation dataset (1071 films) and a test dataset (89 films) were then established from the reference dataset. A convolutional neural network, nnU-Net, was applied to detect caries lesions, and DenseNet121 was applied to classify the lesions according to their depths (lesions in the outer, middle, or inner third of dentin, denoted D1/D2/D3). The performance of the trained nnU-Net and DenseNet121 models on the test dataset was compared with the results of six expert dentists in terms of the intersection over union (IoU), Dice coefficient, accuracy, precision, recall, negative predictive value (NPV), and F1-score metrics. Results: nnU-Net yielded caries lesion segmentation IoU and Dice coefficient values of 0.785 and 0.663, respectively, and the accuracy and recall rate of nnU-Net were 0.986 and 0.821, respectively. The results of the expert dentists and the neural network showed no difference in terms of accuracy, precision, recall, NPV, and F1-score. For caries depth classification, DenseNet121 showed an overall accuracy of 0.957 for D1 lesions, 0.832 for D2 lesions, and 0.863 for D3 lesions. The recall results for D1/D2/D3 lesions were 0.765, 0.652, and 0.918, respectively. All metric values, including accuracy, precision, recall, NPV, and F1-score, showed no difference from those of the experienced dentists.
Conclusion: In detecting and classifying caries lesions on dental panoramic radiographs, the performance of deep learning methods was similar to that of expert dentists. The impact of applying these well-trained neural networks for disease diagnosis and treatment decision making should be explored.
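The overlap metrics used above (IoU and the Dice coefficient) are straightforward to state for binary masks. A minimal sketch over flat 0/1 lists with toy masks, not the study's evaluation code:

```python
def iou(pred, truth):
    """Intersection over Union between two binary masks (flat 0/1 lists)."""
    inter = sum(p & t for p, t in zip(pred, truth))
    union = sum(p | t for p, t in zip(pred, truth))
    return inter / union if union else 1.0

def dice(pred, truth):
    """Dice coefficient: 2|A n B| / (|A| + |B|)."""
    inter = sum(p & t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2 * inter / total if total else 1.0

pred  = [1, 1, 0, 0, 1]   # toy predicted mask
truth = [1, 0, 0, 1, 1]   # toy reference mask
print(iou(pred, truth), dice(pred, truth))  # 0.5 and 2/3
```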


2021 ◽  
Author(s):  
Lama Alsudias ◽  
Paul Rayson

BACKGROUND: Twitter is a real-time messaging platform widely used by people and organisations to share information on many topics. It could potentially be useful to analyse tweets for infectious disease monitoring purposes, in order to reduce reporting lag time and to provide an independent, complementary source of data compared to traditional approaches. However, such analysis is currently not possible in the Arabic-speaking world due to a lack of basic building blocks for research.
OBJECTIVE: We collect around 4,000 Arabic tweets related to COVID-19 and influenza. We clean and label the tweets relative to the Arabic Infectious Diseases Ontology, which includes non-standard terminology, 11 core concepts, and 21 relations. The aim of this study is to analyse Arabic tweets to estimate their usefulness for health surveillance, understand the impact of informal terms on the analysis, show the effect of deep learning methods in the classification process, and identify the locations where infection is spreading.
METHODS: We apply several multi-label classification techniques (Binary Relevance, Classifier Chains, Label Powerset, Adapted Algorithm (MLkNN), NBSVM, BERT, and AraBERT) to identify infected people. We also use named entity recognition to predict the locations affected.
RESULTS: We achieve an F1-score of up to 88% in the influenza case study and 94% in the COVID-19 one. Adapting for non-standard terminology and informal language improves accuracy by as much as 15%, with an average improvement of 8%. Deep learning methods achieve a Hamming loss of around 5% in the classification process. Our geo-location detection algorithm achieves on average 54% accuracy in predicting user locations from tweet content.
CONCLUSIONS: This study contributes two Arabic social media datasets for monitoring tweets related to influenza and COVID-19. It demonstrates the importance of including informal terms, which are regularly used by social media users, in the analysis. It also shows that BERT achieves good results when used with new terms in COVID-19 tweets. Finally, tweet content may contain useful information for determining the location of disease spread.
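Hamming loss, one of the figures reported above, is simply the fraction of label slots predicted incorrectly across all samples. A minimal sketch with illustrative label vectors, not the study's data or classifiers:

```python
def hamming_loss(y_true, y_pred):
    """Fraction of per-label mistakes over all samples and labels."""
    errors = sum(t != p
                 for true_row, pred_row in zip(y_true, y_pred)
                 for t, p in zip(true_row, pred_row))
    slots = len(y_true) * len(y_true[0])
    return errors / slots

y_true = [[1, 0, 1], [0, 1, 0]]   # illustrative per-tweet label vectors
y_pred = [[1, 0, 0], [0, 1, 0]]   # one label slot wrong out of six
print(hamming_loss(y_true, y_pred))  # 1/6
```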


Author(s):  
Marcel Bengs ◽  
Finn Behrendt ◽  
Julia Krüger ◽  
Roland Opfer ◽  
Alexander Schlaefer

Purpose: Brain magnetic resonance images (MRIs) are essential for the diagnosis of neurological diseases. Recently, deep learning methods for unsupervised anomaly detection (UAD) have been proposed for the analysis of brain MRI. These methods rely on healthy brain MRIs only and, compared to supervised deep learning, eliminate the requirement for pixel-wise annotated data. While a wide range of UAD methods have been proposed, most are 2D and learn only from MRI slices, disregarding the fact that brain lesions are inherently 3D, so the spatial context of MRI volumes remains unexploited.
Methods: We investigate whether increased spatial context, obtained by using MRI volumes combined with spatial erasing, leads to improved unsupervised anomaly segmentation performance compared to learning from slices. We evaluate and compare 2D variational autoencoders (VAEs) with their 3D counterparts, propose 3D input erasing, and systematically study the impact of dataset size on performance.
Results: Using two publicly available segmentation datasets for evaluation, 3D VAEs outperform their 2D counterparts, highlighting the advantage of volumetric context. Our 3D erasing methods allow for further performance improvements. Our best-performing 3D VAE with input erasing achieves an average Dice score of 31.40%, compared with 25.76% for the 2D VAE.
Conclusions: We propose 3D deep learning methods for UAD in brain MRI combined with 3D erasing and demonstrate that 3D methods clearly outperform their 2D counterparts for anomaly segmentation. Our spatial erasing method allows for further performance improvements and reduces the requirement for large datasets.
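The 3D input erasing idea can be sketched as zeroing out a randomly placed cuboid of the input volume before it is fed to the network. The toy, dependency-free version below illustrates only the erasing step, under the assumption that erasing means zeroing a cuboid; it is not the paper's exact augmentation or training code.

```python
import random

def erase_3d(volume, size, rng=random):
    """Return a copy of a 3D volume (nested lists) with a randomly
    placed cuboid of the given (depth, height, width) zeroed out."""
    d, h, w = len(volume), len(volume[0]), len(volume[0][0])
    sd, sh, sw = size
    z0 = rng.randrange(d - sd + 1)   # random cuboid origin
    y0 = rng.randrange(h - sh + 1)
    x0 = rng.randrange(w - sw + 1)
    return [[[0.0 if (z0 <= z < z0 + sd and y0 <= y < y0 + sh
                      and x0 <= x < x0 + sw) else volume[z][y][x]
              for x in range(w)]
             for y in range(h)]
            for z in range(d)]

vol = [[[1.0] * 8 for _ in range(8)] for _ in range(8)]  # toy 8x8x8 volume
erased = erase_3d(vol, (2, 4, 4))
kept = sum(v for plane in erased for row in plane for v in row)
print(kept)  # 512 voxels minus the 2*4*4 = 32 erased ones -> 480.0
```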


Electronics ◽  
2022 ◽  
Vol 11 (1) ◽  
pp. 154
Author(s):  
Yuxin Ding ◽  
Miaomiao Shao ◽  
Cai Nie ◽  
Kunyang Fu

Deep learning methods have been applied to malware detection. However, deep learning models are not safe: they can easily be fooled by adversarial samples. In this paper, we study how to generate malware adversarial samples using deep learning models. Gradient-based methods are usually used to generate adversarial samples; these methods generate samples case by case, which makes producing a large number of adversarial samples very time-consuming. To address this issue, we propose a novel method to generate adversarial malware samples. Unlike gradient-based methods, we extract feature byte sequences from benign samples. Feature byte sequences represent the characteristics of benign samples and can affect the classification decision. We directly inject feature byte sequences into malware samples to generate adversarial samples. Feature byte sequences can be shared to produce different adversarial samples, which makes it possible to generate a large number of adversarial samples efficiently. We compare the proposed method with random injection and gradient-based methods. The experimental results show that the adversarial samples generated using our proposed method have a high success rate.
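One simple way to realize the injection described above is to append benign feature byte sequences to a malware sample, which leaves the original payload intact. The sketch below assumes appending as the injection site and uses made-up byte strings; the paper's actual feature extraction and injection points may differ.

```python
def inject_features(malware_bytes, feature_sequences):
    """Build an adversarial variant by appending benign feature byte
    sequences to a malware sample (appending preserves the original
    bytes, so the sample's functionality is unchanged)."""
    return malware_bytes + b"".join(feature_sequences)

benign_features = [b"\x4d\x5a\x90\x00", b"\x50\x45\x00\x00"]  # made-up sequences
sample = b"\xde\xad\xbe\xef"                                  # made-up sample
adv = inject_features(sample, benign_features)
print(len(sample), len(adv))  # the same feature sequences can be reused
                              # across many samples
```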


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Joy He-Yueya ◽  
Benjamin Buck ◽  
Andrew Campbell ◽  
Tanzeem Choudhury ◽  
John M. Kane ◽  
...  

Increased stability in one's daily routine is associated with well-being in the general population and is often a goal of behavioral interventions for people with serious mental illnesses like schizophrenia. Assessing behavioral stability has been limited in clinical research by the use of retrospective scales, which are susceptible to reporting biases and memory inaccuracies. Mobile passive sensors, which are less susceptible to these sources of error, have emerged as tools to assess behavioral patterns in a range of populations. The present study developed and examined a metric of behavioral stability from data generated by a passive sensing system carried by 61 individuals with schizophrenia for one year. This metric, the Stability Index, appeared orthogonal to existing measures drawn from passive sensors and matched the predictive performance of state-of-the-art features. Specifically, greater stability in social activity (e.g., calls and messages) was associated with lower symptoms, and greater stability in physical activity (e.g., being still) appeared associated with elevated symptoms. This study provides additional support for the predictive value of individualized over population-level data in psychiatric populations. The Stability Index also offers a promising tool for generating insights about the impact of behavioral stability in schizophrenia-spectrum disorders.


2016 ◽  
Author(s):  
Yifeng Li ◽  
Wenqiang Shi ◽  
Wyeth W Wasserman

Identifying active cis-regulatory regions in the human genome is critical for understanding gene regulation and assessing the impact of genetic variation on phenotype. Based on rich data resources such as the Encyclopedia of DNA Elements (ENCODE) and the Functional Annotation of the Mammalian Genome (FANTOM) projects, we introduce DECRES, the first supervised deep learning approach for the identification of enhancer and promoter regions in the human genome. Due to their ability to discover patterns in large and complex data, deep learning methods enable a significant advance in our knowledge of the genomic locations of cis-regulatory regions. Using models for well-characterized cell lines, we identify key experimental features that contribute to the predictive performance. Applying DECRES, we delineate locations of 300,000 candidate enhancers genome-wide (6.8% of the genome, of which 40,000 are supported by bidirectional transcription data) and 26,000 candidate promoters (0.6% of the genome).


In the past decade, deep learning has achieved significant breakthroughs. In addition to the emergence of convolution, the most important development is the self-learning of deep neural networks. With self-learning methods, the adaptive weights of kernels and built-in parameters or interconnections are automatically modified such that the error rate is reduced along the learning process and the recognition rate is improved. Emulating the mechanism of the brain, such a network can achieve accurate recognition after learning. One of the most important self-learning methods is back-propagation (BP). The current BP method is a systematic way of calculating the gradient of the loss with respect to the adaptive interconnections. The core of the gradient descent method is to modify the weights negatively proportional to the computed gradient of the loss function, thereby reducing the error of the network response compared with the standard answer. The basic assumption for this type of gradient-based self-learning is that the loss function is first-order differentiable.
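The update rule described above, moving each weight against the gradient of the loss, can be written as w ← w − η·∂L/∂w. A minimal one-parameter sketch (the quadratic loss is illustrative):

```python
def gradient_descent(loss_grad, w, lr=0.1, steps=100):
    """Repeatedly move w negatively proportional to the loss gradient."""
    for _ in range(steps):
        w -= lr * loss_grad(w)
    return w

# Loss L(w) = (w - 3)^2 has gradient 2*(w - 3) and its minimum at w = 3.
w_final = gradient_descent(lambda w: 2 * (w - 3), w=0.0)
print(w_final)  # converges toward 3
```

Back-propagation generalizes this: it computes ∂L/∂w for every interconnection of a deep network via the chain rule, which is exactly why the loss must be first-order differentiable.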


Author(s):  
D. Sudheer

Sound plays a significant part in every aspect of daily routine. From discrete security features to basic surveillance, sound is a vital component for building automated systems in these fields. A few such systems are already on the market, but their efficiency under real-time conditions remains a concern for deployment. The learning capabilities of deep learning architectures can be utilized to build sound classification systems and increase the impact of sound classification. Our main aim in this paper is to apply deep learning networks to filter the noise in, and classify, sounds created by natural phenomena according to the spectrograms generated from them. The spectrograms of these natural sounds are used to train a convolutional neural network (CNN) and a Tensor Deep Stacking Network (TDSN). The datasets used for analysis and training of the networks are ESC-10 and ESC-50. The systems built from these datasets were effective at filtering the audio and recognizing natural sounds. The accuracy obtained from the developed system is 80% for the CNN and 70% for the TDSN. From the implemented framework, it is concluded that the proposed approach of sound filtering and recognition through the spectrograms of the corresponding sounds can be productively used to build efficient neural-network-based systems for audio classification and recognition.
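A spectrogram of the kind fed to such networks is a time-frequency magnitude map. The toy sketch below frames the signal and applies a naive DFT per frame; real pipelines for ESC-10/ESC-50 typically use windowed FFTs and log-mel scaling, which are omitted here.

```python
import cmath
import math

def spectrogram(signal, frame_len=8, hop=4):
    """Magnitude spectrogram via a naive DFT over overlapping frames.
    Rows are time frames; columns are non-negative frequency bins."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    spec = []
    for frame in frames:
        mags = []
        for k in range(frame_len // 2 + 1):      # bins 0 .. N/2
            coeff = sum(x * cmath.exp(-2j * cmath.pi * k * n / frame_len)
                        for n, x in enumerate(frame))
            mags.append(abs(coeff))
        spec.append(mags)
    return spec

# A pure tone at 2 cycles per 8 samples: energy concentrates in bin 2.
tone = [math.cos(2 * math.pi * 2 * n / 8) for n in range(16)]
spec = spectrogram(tone)
print(len(spec), len(spec[0]))  # 3 frames x 5 frequency bins
```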


2021 ◽  
Vol 13 (22) ◽  
pp. 4545
Author(s):  
Xingdi Chen ◽  
Peng Kong ◽  
Peng Jiang ◽  
Yanlan Wu

Directly establishing the relationship between satellite data and PM2.5 concentration through deep learning is an important means of estimating regional PM2.5 concentration. However, because deep learning methods do not account for uncertainty, they suffer from overfitting in the process of PM2.5 estimation. In response to this problem, this paper designs a deep Bayesian PM2.5 estimation model that takes multiple scales into account. The model uses a Bayesian neural network to describe key parameters with priors, providing a regularization effect for the network, performs posterior inference over the parameters, and accounts for uncertainty in the data, which alleviates model overfitting and improves the generalization ability of the model. In addition, Moderate Resolution Imaging Spectroradiometer (MODIS) satellite data at different scales and ERA5 reanalysis data were used as input to the model to strengthen the model's perception of atmospheric features at different scales, as well as to further enhance the model's PM2.5 estimation accuracy and generalization ability. Experiments with Anhui Province as the research area showed that the R2 of this method on the independent test set was 0.78, higher than that of the DNN, random forest, and BNN models that do not consider the impact of the surrounding environment; moreover, the RMSE was 19.45 μg·m−3, lower than that of the three compared models. In experiments on the different seasons of 2019, estimation accuracy was significantly reduced for all models; however, the R2 of the model in this paper could still reach 0.66 or more. Thus, the model in this paper has higher accuracy and better generalization ability.
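The headline figures above (R2 and RMSE) can be computed directly from observed and estimated concentrations. A minimal sketch with made-up values, not the study's data:

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
                     / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

obs = [35.0, 50.0, 20.0, 45.0]   # made-up PM2.5 observations (ug/m3)
est = [33.0, 47.0, 24.0, 44.0]   # made-up model estimates
print(r2(obs, est), rmse(obs, est))
```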

