Variational Characterizations of Local Entropy and Heat Regularization in Deep Learning

Entropy ◽  
2019 ◽  
Vol 21 (5) ◽  
pp. 511
Author(s):  
Nicolas García Trillos ◽  
Zachary Kaplan ◽  
Daniel Sanz-Alonso

The aim of this paper is to provide new theoretical and computational understanding of two loss regularizations employed in deep learning, known as local entropy and heat regularization. For both regularized losses, we introduce variational characterizations that naturally suggest a two-step scheme for their optimization, based on the iterative shift of a probability density and the calculation of a best Gaussian approximation in Kullback–Leibler divergence. Disregarding approximation error in these two steps, the variational characterizations allow us to show a simple monotonicity result for the training error along optimization iterates. The two-step optimization schemes for the local entropy and heat regularized losses differ only in which argument of the Kullback–Leibler divergence is used to find the best Gaussian approximation. Local entropy corresponds to minimizing over the second argument, and the solution is given by moment matching. This allows replacing the traditional backpropagation calculation of gradients with sampling algorithms, opening an avenue for gradient-free, parallelizable training of neural networks. However, our presentation also acknowledges the potential increase in the computational cost of naive optimization of regularized costs, thus giving a less optimistic view than existing works of the gains facilitated by loss regularization.
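The moment-matching step can be made concrete: minimizing the Kullback–Leibler divergence over its second (Gaussian) argument against an empirical sample reduces to matching the sample's mean and covariance. A minimal numpy sketch, with the sample distribution and dimensions chosen purely for illustration:

```python
import numpy as np

def moment_match(samples):
    """Best Gaussian approximation (KL minimized over the second
    argument) to an empirical sample: match mean and covariance."""
    mu = samples.mean(axis=0)
    sigma = np.atleast_2d(np.cov(samples, rowvar=False))
    return mu, sigma

# draw from a known Gaussian; moment matching should recover it
rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=0.5, size=(10000, 1))
mu, sigma = moment_match(samples)
```

In the two-step scheme described above, this matching would alternate with a shift of the underlying density; here only the second step is sketched.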

2020 ◽  
Vol 13 (4) ◽  
pp. 627-640 ◽  
Author(s):  
Avinash Chandra Pandey ◽  
Dharmveer Singh Rajpoot

Background: Sentiment analysis is the contextual mining of text to determine users' viewpoints on topics commonly discussed on social networking websites. Twitter is one such site, where people express their opinions about any topic in the form of tweets. These tweets can be examined using various sentiment classification methods to find the opinions of users. Traditional sentiment analysis methods use manually extracted features for opinion classification. The manual feature extraction process is a complicated task since it requires predefined sentiment lexicons. Deep learning methods, on the other hand, automatically extract relevant features from data; hence, they provide better performance and richer representational capacity than traditional methods. Objective: The main aim of this paper is to enhance sentiment classification accuracy and to reduce the computational cost. Method: To achieve this objective, a hybrid deep learning model based on a convolutional neural network and a bidirectional long short-term memory network has been introduced. Results: The proposed sentiment classification method achieves the highest accuracy on most of the datasets. Furthermore, the efficacy of the proposed method has been validated by statistical analysis. Conclusion: Sentiment classification accuracy can be improved by creating veracious hybrid models. Moreover, performance can also be enhanced by tuning the hyperparameters of deep learning models.
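To illustrate how such a hybrid might be wired together, the following numpy-only sketch stands in for the two stages: a 1-D convolution extracts local n-gram features, and a simplified bidirectional recurrence (a plain tanh RNN standing in for a BiLSTM) summarizes the sequence. All dimensions and weights are hypothetical toy values, not the authors' model:

```python
import numpy as np

def conv1d_relu(x, kernels):
    # x: (seq_len, emb_dim); kernels: (n_filters, width, emb_dim)
    n_f, w, _ = kernels.shape
    out = np.empty((x.shape[0] - w + 1, n_f))
    for t in range(out.shape[0]):
        window = x[t:t + w]  # local n-gram window
        out[t] = np.maximum(0.0, np.tensordot(kernels, window, axes=([1, 2], [0, 1])))
    return out

def bi_rnn_last(seq, Wf, Wb):
    """Simplified stand-in for a BiLSTM: plain tanh recurrences run
    forward and backward, concatenating the two final states."""
    def run(s, W):
        h = np.zeros(W.shape[0])
        for x in s:
            h = np.tanh(W @ np.concatenate([x, h]))
        return h
    return np.concatenate([run(seq, Wf), run(seq[::-1], Wb)])

rng = np.random.default_rng(0)
seq_len, emb, n_filters, width, hidden = 12, 8, 4, 3, 5  # toy sizes
x = rng.normal(size=(seq_len, emb))                       # embedded tweet
kernels = rng.normal(size=(n_filters, width, emb)) * 0.1
Wf = rng.normal(size=(hidden, n_filters + hidden)) * 0.1
Wb = rng.normal(size=(hidden, n_filters + hidden)) * 0.1

features = conv1d_relu(x, kernels)             # CNN stage: local features
sentence_vec = bi_rnn_last(features, Wf, Wb)   # recurrent stage: context
logits = rng.normal(size=(2, 2 * hidden)) @ sentence_vec  # dense classifier
```

The design point is the division of labour: the convolutional stage captures short local patterns cheaply, while the recurrent stage models longer-range dependencies over the resulting feature sequence.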


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Gaoyang Li ◽  
Haoran Wang ◽  
Mingzi Zhang ◽  
Simon Tupin ◽  
Aike Qiao ◽  
...  

The clinical treatment planning of coronary heart disease requires hemodynamic parameters to provide proper guidance. Computational fluid dynamics (CFD) is increasingly used in the simulation of cardiovascular hemodynamics. However, for patient-specific models, the complex operation and high computational cost of CFD hinder its clinical application. To deal with these problems, we develop cardiovascular hemodynamic point datasets and a dual-sampling-channel deep learning network, which can analyze and reproduce the relationship between cardiovascular geometry and internal hemodynamics. The statistical analysis shows that the hemodynamic predictions of deep learning agree with the conventional CFD method, while the calculation time is reduced 600-fold. With support for over 2 million nodes, prediction accuracy of around 90%, the computational efficiency to predict cardiovascular hemodynamics within 1 second, and the universality to evaluate complex arterial systems, our deep learning method can meet the needs of most situations.


2021 ◽  
Vol 7 (6) ◽  
pp. 99
Author(s):  
Daniela di Serafino ◽  
Germana Landi ◽  
Marco Viola

We are interested in the restoration of noisy and blurry images whose texture mainly follows a single direction (i.e., directional images). Problems of this type arise, for example, in microscopy or computed tomography of carbon or glass fibres. To deal with these problems, the Directional Total Generalized Variation (DTGV) was developed by Kongskov et al. in 2017 and 2019, for the cases of impulse and Gaussian noise. In this article we focus on images corrupted by Poisson noise, extending DTGV regularization to image restoration models where the data-fitting term is the generalized Kullback–Leibler divergence. We also propose a technique for identifying the main texture direction, which improves upon the techniques used in the aforementioned work on DTGV. We solve the problem with an ADMM algorithm with proven convergence, whose subproblems can be solved exactly at a low computational cost. Numerical results on both phantom and real images demonstrate the effectiveness of our approach.
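For reference, the generalized Kullback–Leibler data-fitting term used for Poisson noise has the closed form sum_i [ b_i log(b_i / (Ax)_i) - b_i + (Ax)_i ], with the convention 0 log 0 := 0. A small numpy sketch, purely illustrative:

```python
import numpy as np

def generalized_kl(b, ax, eps=1e-12):
    """Generalized KL divergence between observed data b and the
    blurred estimate Ax; nonnegative, zero iff b == Ax."""
    b = np.asarray(b, dtype=float)
    ax = np.asarray(ax, dtype=float)
    # 0*log(0) is taken as 0 via the where-mask
    log_term = np.where(b > 0, b * np.log((b + eps) / (ax + eps)), 0.0)
    return float(np.sum(log_term - b + ax))
```

This is the term that replaces the least-squares fit appropriate for Gaussian noise; it is the negative log-likelihood of the Poisson model up to constants.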


Author(s):  
Yanteng Zhang ◽  
Qizhi Teng ◽  
Linbo Qing ◽  
Yan Liu ◽  
Xiaohai He

Alzheimer’s disease (AD) is a degenerative brain disease and the most common cause of dementia. In recent years, with the widespread application of artificial intelligence in the medical field, various deep learning-based methods have been applied to AD detection using sMRI images. Many of these networks achieve AD vs. HC (Healthy Control) classification accuracy of up to 90%, but at the cost of a large number of parameters and floating-point operations (FLOPs). In this paper, we adopt a novel ghost module, which uses a series of cheap linear transformations to generate more feature maps, embedded into our designed ResNet architecture for the task of AD vs. HC classification. In experiments on the OASIS dataset, our lightweight network achieves an optimistic accuracy of 97.92%, and its total parameter count is dozens of times smaller than those of state-of-the-art deep learning networks. Our proposed AD classification network achieves better performance while significantly reducing the computational cost.
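The ghost-module idea can be sketched in a few lines: a handful of "primary" maps from an ordinary convolution are augmented with "ghost" maps obtained by cheap linear transforms of the primaries, roughly halving the parameters needed for a given channel count. A hypothetical numpy illustration (the actual module uses learned depthwise convolutions as the cheap operation; a per-map scaling stands in for it here):

```python
import numpy as np

def ghost_module(x, primary_w, cheap_w):
    # x: (C_in, H, W); primary_w: (C_p, C_in); cheap_w: (C_p,)
    primary = np.einsum('pc,chw->phw', primary_w, x)   # ordinary 1x1 conv
    ghost = cheap_w[:, None, None] * primary           # cheap linear transform
    return np.concatenate([primary, ghost], axis=0)    # (2*C_p, H, W)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16, 16))                       # 8 input channels
out = ghost_module(x, rng.normal(size=(4, 8)), rng.normal(size=4))
```

Only the 4 primary maps carry full convolution weights, yet the module emits 8 channels, which is where the parameter savings come from.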


2020 ◽  
Vol 10 (12) ◽  
pp. 4282
Author(s):  
Ghada Zamzmi ◽  
Sivaramakrishnan Rajaraman ◽  
Sameer Antani

Medical images are acquired at different resolutions based on clinical goals or available technology. In general, however, high-resolution images with fine structural details are preferred for visual task analysis. Recognizing this significance, several deep learning networks have been proposed to enhance medical images for reliable automated interpretation. These deep networks are often computationally complex and require a massive number of parameters, which restrict them to highly capable computing platforms with large memory banks. In this paper, we propose an efficient deep learning approach, called Hydra, which simultaneously reduces computational complexity and improves performance. The Hydra consists of a trunk and several computing heads. The trunk is a super-resolution model that learns the mapping from low-resolution to high-resolution images. It has a simple architecture that is trained using multiple scales at once to minimize a proposed learning-loss function. We also propose to append multiple task-specific heads to the trained Hydra trunk for simultaneous learning of multiple visual tasks in medical images. The Hydra is evaluated on publicly available chest X-ray image collections to perform image enhancement, lung segmentation, and abnormality classification. Our experimental results support our claims and demonstrate that the proposed approach can improve the performance of super-resolution and visual task analysis in medical images at a remarkably reduced computational cost.
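The trunk-and-heads pattern itself is simple to express: the trunk runs once per image and every task head reuses its features. A schematic, dependency-free sketch with placeholder computations (not the actual Hydra networks):

```python
class Trunk:
    """Stand-in for the shared super-resolution feature extractor."""
    def forward(self, image):
        return [p * 2 for p in image]   # placeholder "enhanced" features

class SegmentationHead:
    def forward(self, features):
        return [f > 0 for f in features]   # toy per-pixel mask

class ClassificationHead:
    def forward(self, features):
        return sum(features) > 0           # toy abnormality flag

class Hydra:
    """One trunk, several task heads sharing its output."""
    def __init__(self, trunk, heads):
        self.trunk, self.heads = trunk, heads
    def forward(self, image):
        features = self.trunk.forward(image)   # computed exactly once
        return {name: head.forward(features) for name, head in self.heads.items()}

model = Hydra(Trunk(), {"seg": SegmentationHead(), "cls": ClassificationHead()})
out = model.forward([1.0, -2.0, 3.0])
```

The efficiency claim rests on this sharing: the expensive enhancement pass is amortized across all downstream tasks instead of being repeated per task.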


Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7716
Author(s):  
Krzysztof K. Cwalina ◽  
Piotr Rajchowski ◽  
Alicja Olejniczak ◽  
Olga Błaszkiewicz ◽  
Robert Burczyk

Following the continuous development of information technology, the concept of dense urban networks has evolved as well. Powerful tools like machine learning break new ground in smart network and interface design. In this paper, the concept of using deep learning to estimate the radio channel parameters of the LTE (Long Term Evolution) radio interface is presented. It was shown that the deep learning approach achieves an RMSE (Root Mean Squared Error) of 10.7%, a significant gain (almost 40%) over the linear model, whose lowest RMSE is 17.01%. The solution can be adopted as part of a data allocation algorithm implemented in telemetry devices equipped with a 4G radio interface or, after adjustment, NB-IoT (Narrowband Internet of Things), to maximize the reliability of services in harsh indoor or urban environments. The presented results also demonstrate an inversely proportional dependence between the number of hidden layers and the number of historical samples in terms of the obtained RMSE: increasing the historical data memory allows using models with fewer hidden layers while maintaining a comparable RMSE value in each scenario, which reduces the total computational cost.


2022 ◽  
Vol 40 (2) ◽  
pp. 1-24
Author(s):  
Franco Maria Nardini ◽  
Roberto Trani ◽  
Rossano Venturini

Modern search services often provide multiple options to rank the search results, e.g., sort “by relevance”, “by price” or “by discount” in e-commerce. While the traditional rank by relevance effectively places the relevant results in the top positions of the results list, the rank by attribute could place many marginally relevant results in the head of the results list leading to poor user experience. In the past, this issue has been addressed by investigating the relevance-aware filtering problem, which asks to select the subset of results maximizing the relevance of the attribute-sorted list. Recently, an exact algorithm has been proposed to solve this problem optimally. However, the high computational cost of the algorithm makes it impractical for the Web search scenario, which is characterized by huge lists of results and strict time constraints. For this reason, the problem is often solved using efficient yet inaccurate heuristic algorithms. In this article, we first prove the performance bounds of the existing heuristics. We then propose two efficient and effective algorithms to solve the relevance-aware filtering problem. First, we propose OPT-Filtering, a novel exact algorithm that is faster than the existing state-of-the-art optimal algorithm. Second, we propose an approximate and even more efficient algorithm, ϵ-Filtering, which, given an allowed approximation error ϵ, finds a (1-ϵ)–optimal filtering, i.e., the relevance of its solution is at least (1-ϵ) times the optimum. We conduct a comprehensive evaluation of the two proposed algorithms against state-of-the-art competitors on two real-world public datasets. Experimental results show that OPT-Filtering achieves a significant speedup of up to two orders of magnitude with respect to the existing optimal solution, while ϵ-Filtering further improves this result by trading effectiveness for efficiency. 
In particular, experiments show that ϵ-Filtering can achieve quasi-optimal solutions while being faster than all state-of-the-art competitors in most of the tested configurations.


Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5736
Author(s):  
Houqiang Yu ◽  
Xuming Zhang

Prostate cancer remains a major health concern among elderly men. Deep learning is a state-of-the-art technique for MR image-based prostate cancer diagnosis, but one of its major bottlenecks is the severe lack of annotated MR images. Traditional and Generative Adversarial Network (GAN)-based data augmentation methods cannot ensure the quality and diversity of the generated training samples. In this paper, we propose a novel GAN model for the synthesis of MR images that exploits the GAN's powerful ability to model complex data distributions. The proposed model is based on the architecture of the deep convolutional GAN. To learn a more equivariant representation of images that is robust to changes in the pose and spatial relationships of objects, a capsule network replaces the CNN used in the discriminator of a regular GAN. Meanwhile, the least squares loss is adopted for both the generator and the discriminator to address the vanishing gradient problem of the sigmoid cross-entropy loss function in a regular GAN. Extensive experiments are conducted on simulated and real MR images. The results demonstrate that the proposed capsule network-based GAN can generate more realistic and higher-quality MR images than the compared GANs. Quantitative comparisons show that, among all evaluated models, the proposed GAN generally achieves the smallest Kullback–Leibler divergence values for the image generation task and provides the best classification performance when introduced into a deep learning method for the image classification task.
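The least-squares substitution replaces the sigmoid cross-entropy objectives with quadratic ones, whose gradients do not vanish for samples the discriminator already classifies confidently. A minimal numpy sketch of the two LSGAN losses, using the common choice of target labels 1 for real and 0 for fake:

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    """Discriminator loss: 0.5*E[(D(x)-1)^2] + 0.5*E[D(G(z))^2]."""
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    """Generator loss: 0.5*E[(D(G(z))-1)^2]."""
    return 0.5 * np.mean((d_fake - 1.0) ** 2)
```

Because the penalty grows quadratically with the distance of `d_fake` from the target, fake samples far on the correct side of the decision boundary still receive a useful gradient, unlike with the saturating sigmoid cross-entropy.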


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Mehdi Khoshboresh-Masouleh ◽  
Reza Shah-Hosseini

In this study, an essential application of remote sensing using deep learning functionality is presented. The Gaofen-1 satellite mission, developed by the China National Space Administration (CNSA) for the civilian high-definition Earth observation satellite program, provides near-real-time observations for geographical mapping, environment surveying, and climate change monitoring. Cloud and cloud shadow segmentation is a crucial element in enabling automatic near-real-time processing of Gaofen-1 images, and its performance must therefore be accurately validated. In this paper, a robust multiscale segmentation method based on deep learning is proposed to improve the efficiency and effectiveness of cloud and cloud shadow segmentation from Gaofen-1 images. The proposed method first builds feature maps from the spectral-spatial features of residual convolutional layers, and then extracts cloud/cloud shadow footprints using a novel loss function to generate the final footprints. The experimental results on Gaofen-1 images demonstrate that the proposed method achieves more reasonable accuracy at a lower computational cost than two existing state-of-the-art cloud and cloud shadow segmentation methods.

