A Comparison of Variational Bounds for the Information Bottleneck Functional

Entropy ◽  
2020 ◽  
Vol 22 (11) ◽  
pp. 1229 ◽  
Author(s):  
Bernhard C. Geiger ◽  
Ian S. Fischer

In this short note, we relate the variational bounds proposed in Alemi et al. (2017) and Fischer (2020) for the information bottleneck (IB) and the conditional entropy bottleneck (CEB) functional, respectively. Although the two functionals were shown to be equivalent, it was empirically observed that optimizing bounds on the CEB functional achieves better generalization performance and adversarial robustness than optimizing those on the IB functional. This work tries to shed light on this issue by showing that, in the most general setting, no ordering can be established between these variational bounds, while such an ordering can be enforced by restricting the feasible sets over which the optimizations take place. The absence of such an ordering in the general setup suggests that the variational bound on the CEB functional is either more amenable to optimization or a relevant cost function for optimization in its own right, i.e., without justification from the IB or CEB functionals.
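For reference, the two functionals being bounded can be written in one common parameterization (the exact conventions of Alemi et al. (2017) and Fischer (2020) may differ):

```latex
% Both optimizations are over encoders p(z|x). Under the Markov chain
% Y -- X -- Z we have I(X;Z|Y) = I(X;Z) - I(Y;Z), which yields the
% equivalence of the two functionals mentioned in the abstract.
\mathrm{IB}:\quad  \min_{p(z\mid x)}\; I(X;Z) - \beta\, I(Y;Z)
\mathrm{CEB}:\quad \min_{p(z\mid x)}\; I(X;Z\mid Y) - \gamma\, I(Y;Z)
% Substituting the identity above, CEB with parameter gamma coincides
% with IB at beta = 1 + gamma (up to reparameterization).
```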

2021 ◽  
pp. 1-3
Author(s):  
Lawrence J. Bliquez

This short note attempts to shed light on some of the surgical procedures referred to in Martial's epigram 10.56 by consulting pertinent Graeco-Roman medical texts. A fuller understanding of one such intervention (treatment of an infected/inflamed uvula) supports Martial's text as transmitted.


Entropy ◽  
2020 ◽  
Vol 22 (4) ◽  
pp. 492
Author(s):  
Gustavo Estrela ◽  
Marco Dimas Gubitoso ◽  
Carlos Eduardo Ferreira ◽  
Junior Barrera ◽  
Marcelo S. Reis

In Machine Learning, feature selection is an important step in classifier design. It consists of finding a subset of features that is optimal for a given cost function. One way to solve feature selection is to organize all possible feature subsets into a Boolean lattice and to exploit the fact that the costs along chains of that lattice describe U-shaped curves. Minimizing such a cost function is known as the U-curve problem. Recently, a study proposed U-Curve Search (UCS), an optimal algorithm for that problem, which was successfully used for feature selection. However, despite the algorithm's optimality, the time UCS required in computational assays grew exponentially with the number of features. Here, we report that this scalability issue arises because the U-curve problem is NP-hard. We then introduce Parallel U-Curve Search (PUCS), a new algorithm for the U-curve problem. In PUCS, we present a novel way to partition the search space into smaller Boolean lattices, rendering the algorithm highly parallelizable. We also provide computational assays with both synthetic data and Machine Learning datasets, in which the performance of PUCS was assessed against UCS and other gold-standard algorithms for feature selection.
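The U-curve property that UCS and PUCS exploit can be illustrated with a toy example; the cost function below is a hypothetical one chosen only to make chain costs U-shaped, and the snippet shows neither algorithm, just why a chain walk may stop at the first cost increase:

```python
# Toy illustration of the U-curve property (not the UCS/PUCS algorithms).
# Assumption: a hypothetical cost penalizing both too few features
# (estimation error) and too many (overfitting), so chain costs are U-shaped.
from itertools import combinations

FEATURES = frozenset(range(5))

def cost(subset):
    """Hypothetical U-shaped cost: an error term decreasing in |subset|
    plus a complexity term increasing in |subset|."""
    k = len(subset)
    return (len(FEATURES) - k) ** 2 + 2 * k  # convex in k -> U-shaped chains

def exhaustive_minimum():
    """Baseline: scan the whole Boolean lattice (2^n subsets)."""
    return min(
        (frozenset(c) for k in range(len(FEATURES) + 1)
         for c in combinations(FEATURES, k)),
        key=cost,
    )

def chain_minimum(chain):
    """Walk one chain bottom-up and stop at the first cost increase;
    the U-curve property guarantees this local stop is the chain minimum."""
    best = chain[0]
    for node in chain[1:]:
        if cost(node) > cost(best):
            break  # along a U-shaped chain, costs can only rise from here
        best = node
    return best

if __name__ == "__main__":
    # One maximal chain: {} < {0} < {0,1} < ... < {0,...,4}
    chain = [frozenset(range(k)) for k in range(len(FEATURES) + 1)]
    print("chain minimum: ", sorted(chain_minimum(chain)))
    print("global minimum:", sorted(exhaustive_minimum()))
```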


2020 ◽  
Vol 39 (3) ◽  
pp. 4183-4196
Author(s):  
Fu-Ning Lin ◽  
Guang-Ji Yu ◽  
Guang-Ming Xue ◽  
Jiang-Feng Han

A crisp antimatroid is a combinatorial abstraction of convexity. It can also be incorporated into greedy algorithms in order to seek optimal solutions. Nevertheless, this significant classical structure has inherent limitations in addressing fuzzy optimization problems and in abstracting fuzzy convexities. This paper introduces the concept of an L-fuzzifying antimatroid associated with an L-fuzzifying family of feasible sets. Several relevant fundamental properties are obtained. We also propose the concept of L-fuzzifying rank functions for L-fuzzifying antimatroids and then investigate their axiomatic characterizations. Finally, we shed light upon the bijective correspondence between an L-fuzzifying antimatroid and its L-fuzzifying rank function.
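For orientation, the crisp structure being fuzzified can be stated as follows (the standard feasible-set definition of an antimatroid; the paper's L-fuzzifying axioms generalize these conditions):

```latex
% A (crisp) antimatroid on a finite ground set E is a family
% \mathcal{F} \subseteq 2^{E} of feasible sets such that:
%   (A1) \emptyset \in \mathcal{F};
%   (A2) \mathcal{F} is closed under union:
%        A, B \in \mathcal{F} \implies A \cup B \in \mathcal{F};
%   (A3) accessibility: every nonempty A \in \mathcal{F} contains
%        some a \in A with A \setminus \{a\} \in \mathcal{F}.
```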


2021 ◽  
Vol 20 ◽  
pp. 170-177
Author(s):  
Wang Jianhong

In this short note, a data-driven model predictive control scheme is studied to design the optimal control sequence. Data-driven here means that the actual output value in the cost function for model predictive control is identified from input-output observed data, in the case of unknown-but-bounded noise forming a martingale difference sequence. After substituting the identified actual output into the cost function, the total cost function of model predictive control is reformulated in another standard form, so that dynamic programming can be applied directly. Since dynamic programming is mostly used in optimization theory, a dynamic programming algorithm is proposed to construct the optimal control sequence, thereby extending its advantages to control theory. Furthermore, a stability analysis for data-driven model predictive control is also given based on the dynamic programming strategy. Overall, the goal of this short note is to bridge dynamic programming, system identification, and model predictive control. Finally, a simulation example is used to demonstrate the efficiency of the proposed theory.
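For illustration, the dynamic-programming step that such a scheme relies on can be sketched for a linear model with quadratic cost; the backward Riccati recursion below is a standard construction under assumed matrices A, B, Q, R, and it omits the note's data-driven identification of the output:

```python
# Minimal sketch: finite-horizon dynamic programming (backward Riccati
# recursion) for a linear model with quadratic cost, as used inside an
# MPC loop. A, B, Q, R are illustrative assumptions, not from the paper.
import numpy as np

def dp_gains(A, B, Q, R, horizon):
    """Backward pass: value function x' P_t x, feedback u_t = -K_t x_t."""
    P = Q.copy()
    gains = []
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]  # gains ordered t = 0 .. horizon-1

if __name__ == "__main__":
    A = np.array([[1.0, 1.0], [0.0, 1.0]])  # double integrator (assumed)
    B = np.array([[0.0], [1.0]])
    Q = np.eye(2)
    R = np.array([[0.1]])
    Ks = dp_gains(A, B, Q, R, horizon=10)
    x = np.array([[1.0], [0.0]])
    for K in Ks:                             # forward rollout
        u = -K @ x
        x = A @ x + B @ u
    print("terminal state:", x.ravel())
```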


Entropy ◽  
2020 ◽  
Vol 22 (1) ◽  
pp. 100 ◽  
Author(s):  
Giulio Franzese ◽  
Monica Visintin

We describe a classifier made of an ensemble of decision trees, designed using information theory concepts. In contrast to algorithms such as C4.5 or ID3, each tree is built from the leaves instead of the root. Each tree is made of nodes trained independently of the others, to minimize a local cost function (information bottleneck). The trained tree outputs the estimated probabilities of the classes given the input datum, and the outputs of many trees are combined to decide the class. We show that the system is able to provide results comparable to those of the tree classifier in terms of accuracy, while it shows many advantages in terms of modularity, reduced complexity, and memory requirements.
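One way to picture training a single node by a local information criterion is sketched below; the snippet scores candidate thresholds by the mutual information I(Z;Y) between the node's binary output and the labels, a simplified stand-in for the local information-bottleneck cost described here, not the authors' exact objective:

```python
# Sketch: selecting one node's split by a local information criterion.
# Picks the threshold whose binary output Z = [x > t] retains the most
# information I(Z;Y) about the labels. Illustrative only; the paper's
# local cost also includes a compression term.
import numpy as np

def mutual_information(z, y):
    """I(Z;Y) in nats for discrete arrays z, y, from empirical counts."""
    mi = 0.0
    for zv in np.unique(z):
        for yv in np.unique(y):
            p_zy = np.mean((z == zv) & (y == yv))
            if p_zy > 0:
                p_z, p_y = np.mean(z == zv), np.mean(y == yv)
                mi += p_zy * np.log(p_zy / (p_z * p_y))
    return mi

def best_threshold(x, y):
    """Choose the cut on feature x maximizing I(Z;Y), with Z = [x > t]."""
    candidates = np.unique(x)[:-1]
    scores = [mutual_information(x > t, y) for t in candidates]
    return candidates[int(np.argmax(scores))]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, 500)
    x = y + rng.normal(0, 0.5, 500)  # feature correlated with the class
    print("selected threshold:", best_threshold(x, y))
```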


Mathematics ◽  
2020 ◽  
Vol 8 (12) ◽  
pp. 2203
Author(s):  
Ioannis S. Triantafyllou

In the present article, we introduce the m-consecutive-k-out-of-n:F structures with a single change point. The aforementioned system consists of n independent components, of which the first n1 units are identically distributed with common reliability p1, while the remaining ones share a different functioning probability p2. The general setup of the proposed reliability structures is presented in detail, while an explicit expression for determining the number of its path sets of a given size is derived. Additionally, closed formulae for the reliability function and mean time to failure of the aforementioned models are also provided. For illustration purposes, several numerical results and comparisons are presented in order to shed light on the performance of the proposed structure.
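A Monte Carlo cross-check of such a structure's reliability can be sketched as follows; the failure criterion (at least m non-overlapping runs of k consecutive failures) follows the standard definition of these systems, and the numerical values are assumptions for illustration rather than cases from the paper:

```python
# Monte Carlo sketch of the reliability of an m-consecutive-k-out-of-n:F
# structure with a single change point: components 1..n1 work with
# probability p1, components n1+1..n with probability p2. The paper gives
# closed formulae, so this simulation is only an illustrative cross-check.
import numpy as np

def system_works(states, m, k):
    """states[i] == 1 means component i works. Count non-overlapping
    failure runs of length k greedily, left to right."""
    runs, streak = 0, 0
    for s in states:
        streak = streak + 1 if s == 0 else 0
        if streak == k:
            runs += 1
            streak = 0  # non-overlapping: restart the count
    return runs < m  # the system fails iff it has at least m such runs

def reliability(n, n1, p1, p2, m, k, trials=100_000, seed=0):
    rng = np.random.default_rng(seed)
    ok = 0
    for _ in range(trials):
        states = np.concatenate([
            rng.random(n1) < p1,      # first n1 components
            rng.random(n - n1) < p2,  # remaining components
        ]).astype(int)
        ok += system_works(states, m, k)
    return ok / trials

if __name__ == "__main__":
    # Example values (assumed, not from the paper):
    print(reliability(n=10, n1=4, p1=0.9, p2=0.8, m=2, k=2))
```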


2021 ◽  
Vol 13 (3) ◽  
pp. 18035-18038
Author(s):  
Naren Sreenivasan ◽  
Joshua Barton

Fifty years after the first report of freshwater medusae (Limnocnida indica) from the Cauvery River in Krishnarajasagar Reservoir, there has been only one other published report of its occurrence in the Cauvery Basin, at Hemavathi Reservoir, Kodagu District. Recent interest in freshwater photography has revealed three more locations in the Cauvery Basin where medusae are found. Medusae are often observed at these locations but are erroneously identified as an invasive species. According to the published literature, that label applies to Craspedacusta sowerbii, a cosmopolitan species with only three confirmed reports from India, all from artificial structures such as ponds and aquaria. The native Limnocnida and the exotic Craspedacusta can be distinguished from each other visually and with respect to temporal variation in the occurrence of their free-swimming medusae. This short note is intended to shed light on the status, distribution, and field identification of L. indica, a species endemic to the Western Ghats of India.


Entropy ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. 999 ◽  
Author(s):  
Ian Fischer

Much of the field of Machine Learning exhibits a prominent set of failure modes, including vulnerability to adversarial examples, poor out-of-distribution (OoD) detection, miscalibration, and willingness to memorize random labelings of datasets. We characterize these as failures of robust generalization, which extends the traditional measure of generalization as accuracy or related metrics on a held-out set. We hypothesize that these failures to robustly generalize are due to the learning systems retaining too much information about the training data. To test this hypothesis, we propose the Minimum Necessary Information (MNI) criterion for evaluating the quality of a model. In order to train models that perform well with respect to the MNI criterion, we present a new objective function, the Conditional Entropy Bottleneck (CEB), which is closely related to the Information Bottleneck (IB). We experimentally test our hypothesis by comparing the performance of CEB models with deterministic models and Variational Information Bottleneck (VIB) models on a variety of different datasets and robustness challenges. We find strong empirical evidence supporting our hypothesis that MNI models improve on these problems of robust generalization.
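As a concrete point of reference, a variational CEB training loss can be sketched as follows; this is a minimal sketch with assumed Gaussian forward and backward encoders and hypothetical networks (enc_mu, back_mu, dec), not the paper's architecture or hyperparameters:

```python
# Minimal sketch of a variational CEB loss. Assumptions: a unit-variance
# Gaussian forward encoder e(z|x), a Gaussian class-conditional backward
# encoder b(z|y), and a softmax decoder c(y|z). All networks and sizes
# below are illustrative stand-ins.
import torch
import torch.nn.functional as F
from torch.distributions import Normal

Z_DIM, N_CLASSES, X_DIM = 8, 10, 32

enc_mu = torch.nn.Linear(X_DIM, Z_DIM)          # mean of e(z|x)
back_mu = torch.nn.Embedding(N_CLASSES, Z_DIM)  # mean of b(z|y)
dec = torch.nn.Linear(Z_DIM, N_CLASSES)         # logits of c(y|z)

def ceb_loss(x, y, gamma=0.1):
    e = Normal(enc_mu(x), 1.0)
    z = e.rsample()                              # reparameterized sample
    b = Normal(back_mu(y), 1.0)
    # Variational upper bound on the residual information I(X;Z|Y):
    residual = (e.log_prob(z) - b.log_prob(z)).sum(-1).mean()
    # Variational lower bound on I(Y;Z) via the decoder log-likelihood:
    ce = F.cross_entropy(dec(z), y)
    return gamma * residual + ce

x = torch.randn(16, X_DIM)
y = torch.randint(0, N_CLASSES, (16,))
loss = ceb_loss(x, y)
loss.backward()
print(float(loss))
```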


2017 ◽  
Vol 29 (6) ◽  
pp. 1611-1630 ◽  
Author(s):  
DJ Strouse ◽  
David J. Schwab

Lossy compression and clustering fundamentally involve a decision about which features are relevant and which are not. The information bottleneck method (IB) by Tishby, Pereira, and Bialek (1999) formalized this notion as an information-theoretic optimization problem and proposed an optimal trade-off between throwing away as many bits as possible and selectively keeping those that are most important. In the IB, compression is measured by mutual information. Here, we introduce an alternative formulation that replaces mutual information with entropy, which we call the deterministic information bottleneck (DIB) and which, we argue, better captures this notion of compression. As suggested by its name, the solution to the DIB problem turns out to be a deterministic encoder, or hard clustering, as opposed to the stochastic encoder, or soft clustering, that is optimal under the IB. We compare the IB and DIB on synthetic data, showing that the IB and DIB perform similarly in terms of the IB cost function, but that the DIB significantly outperforms the IB in terms of the DIB cost function. We also empirically find that the DIB offers a considerable gain in computational efficiency over the IB, over a range of convergence parameters. Our derivation of the DIB also suggests a method for continuously interpolating between the soft clustering of the IB and the hard clustering of the DIB.
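The relation between the two objectives, and the interpolation mentioned at the end, can be summarized as follows (standard IB-literature notation; the paper's own symbols may differ):

```latex
% Both objectives optimize over encoders q(t|x); beta > 0 trades
% compression against relevance:
\mathrm{IB}:\quad  \min_{q(t\mid x)} \; I(X;T) - \beta\, I(T;Y)
\mathrm{DIB}:\quad \min_{q(t\mid x)} \; H(T)   - \beta\, I(T;Y)
% Since I(X;T) = H(T) - H(T|X), the one-parameter family
%   \min_{q(t\mid x)} \; H(T) - \alpha\, H(T\mid X) - \beta\, I(T;Y)
% recovers IB at alpha = 1 and DIB at alpha = 0, giving the continuous
% interpolation between soft and hard clustering.
```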

