On the Information Bottleneck Problems: Models, Connections, Applications and Information Theoretic Views

Abdellatif Zaidi; Iñaki Estella-Aguerri; Shlomo Shamai (Shitz)

doi:10.3390/e22020151


Entropy ◽ 10.3390/e22020151 ◽ 2020 ◽ Vol 22 (2) ◽ pp. 151 ◽ Cited By ~ 6

Author(s): Abdellatif Zaidi ◽ Iñaki Estella-Aguerri ◽ Shlomo Shamai (Shitz)

Keyword(s): Gaussian Model ◽ Representation Learning ◽ Distributed Information ◽ Information Theoretic ◽ Information Bottleneck ◽ Radio Access ◽ Trade Offs ◽ Optimal Inputs ◽ Logarithmic Loss ◽ Complexity Constraints

This tutorial paper focuses on variants of the bottleneck problem from an information-theoretic perspective and discusses practical methods to solve it, as well as its connections to coding and learning. It highlights the intimate connections of this setting to remote source coding under the logarithmic-loss distortion measure, information combining, common reconstruction, the Wyner–Ahlswede–Körner problem, and the efficiency of investment information, as well as to generalization, variational inference, representation learning, autoencoders, and others. We discuss its extension to the distributed information bottleneck problem, with emphasis on the Gaussian model, and highlight the basic connections to uplink Cloud Radio Access Networks (CRAN) with oblivious processing. For this model, the optimal trade-offs between relevance (i.e., information) and complexity (i.e., rates) in the discrete and vector Gaussian frameworks are determined. The concluding outlook mentions some interesting problems, such as characterizing the optimal input (“feature”) distributions that, under power limitations, maximize the “relevance” for the Gaussian information bottleneck under “complexity” constraints.
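The relevance/complexity trade-off mentioned in this abstract can be illustrated for the discrete case with a minimal sketch. It is not taken from the paper: the function names `mutual_information` and `ib_tradeoff`, the symbol U for the representation, and the Markov chain U - X - Y are assumptions made for illustration. Given a joint pmf p(x, y) and a stochastic encoder p(u | x), it evaluates the relevance I(U;Y) and the complexity I(U;X) in nats.

```python
import numpy as np


def mutual_information(p_ab):
    """Mutual information I(A;B) in nats for a joint pmf given as a 2-D array."""
    p_ab = np.asarray(p_ab, dtype=float)
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal of A, shape (|A|, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal of B, shape (1, |B|)
    product = p_a @ p_b                     # product of marginals, shape (|A|, |B|)
    mask = p_ab > 0                         # skip zero-probability cells (0 log 0 = 0)
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / product[mask])))


def ib_tradeoff(p_xy, p_u_given_x):
    """Return (relevance I(U;Y), complexity I(U;X)) for an encoder p(u|x).

    Assumes the Markov chain U - X - Y, i.e. U depends on Y only through X.
    p_xy has shape (|X|, |Y|); p_u_given_x has shape (|X|, |U|) with rows summing to 1.
    """
    p_xy = np.asarray(p_xy, dtype=float)
    enc = np.asarray(p_u_given_x, dtype=float)
    p_x = p_xy.sum(axis=1)
    p_xu = p_x[:, None] * enc               # joint p(x, u) = p(x) p(u|x)
    p_uy = enc.T @ p_xy                     # joint p(u, y) = sum_x p(u|x) p(x, y)
    return mutual_information(p_uy), mutual_information(p_xu)


if __name__ == "__main__":
    p_xy = np.array([[0.4, 0.1], [0.1, 0.4]])        # toy joint pmf (illustrative)
    noisy_encoder = np.array([[0.9, 0.1], [0.2, 0.8]])
    relevance, complexity = ib_tradeoff(p_xy, noisy_encoder)
    print(f"relevance I(U;Y) = {relevance:.4f} nats, complexity I(U;X) = {complexity:.4f} nats")
```

Two sanity checks follow from the data-processing inequality: an identity encoder gives relevance I(X;Y) and complexity H(X), while a constant encoder gives zero for both.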


Information Bottleneck and Representation Learning

Information-Theoretic Methods in Data Science ◽ 10.1017/9781108616799.012 ◽ 2021 ◽ pp. 330-358

Author(s): Pablo Piantanida ◽ Leonardo Rey Vega

Keyword(s): Representation Learning ◽ Information Bottleneck


Model-Induced Generalization Error Bound for Information-Theoretic Representation Learning in Source-Data-Free Unsupervised Domain Adaptation

IEEE Transactions on Image Processing ◽ 10.1109/tip.2021.3130530 ◽ 2022 ◽ Vol 31 ◽ pp. 419-432

Author(s): Baoyao Yang ◽ Hao-Wei Yeh ◽ Tatsuya Harada ◽ Pong C. Yuen

Keyword(s): Error Bound ◽ Domain Adaptation ◽ Representation Learning ◽ Generalization Error ◽ Information Theoretic ◽ Unsupervised Domain Adaptation ◽ Source Data ◽ Generalization Error Bound


The Effect of Evidence Transfer on Latent Feature Relevance for Clustering

Informatics ◽ 10.3390/informatics6020017 ◽ 2019 ◽ Vol 6 (2) ◽ pp. 17

Author(s): Athanasios Davvetas ◽ Iraklis A. Klampanos ◽ Spiros Skiadopoulos ◽ Vangelis Karkaletsis

Keyword(s): Mutual Information ◽ Ground Truth ◽ Original Data ◽ Information Theoretic ◽ Information Bottleneck ◽ Latent Space ◽ Before And After ◽ Feature Relevance ◽ Latent Representations ◽ Transfer Method

Evidence transfer for clustering is a deep learning method that manipulates the latent representations of an autoencoder according to external categorical evidence in order to improve a clustering outcome. Applied to clustering, evidence transfer is designed to be robust when presented with low-quality evidence, while improving clustering accuracy when the evidence is relevant. We interpret the effects of evidence transfer on the latent representation of an autoencoder by comparing our method to the information bottleneck method. Information bottleneck is the optimisation problem of finding the best trade-off between maximising the mutual information between data representations and a task outcome and, at the same time, compressing the original data source effectively. We posit that the evidence transfer method has essentially the same objective regarding the latent representations produced by an autoencoder. We verify our hypothesis using information-theoretic metrics from feature selection to perform an empirical analysis of the information that is carried through the bottleneck of the latent space. We use the relevance metric to compare the overall mutual information between the latent representations and the ground truth labels before and after their incremental manipulation, as well as to study the effects of evidence transfer on the significance of each latent feature.
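The trade-off described in this abstract is commonly written as a Lagrangian. The form below is the standard textbook statement of the information bottleneck, not a formula quoted from the paper. The symbols are assumptions introduced here: X for the original data, Y for the task outcome, Z for the latent representation, and β ≥ 0 for the trade-off weight.

```latex
\min_{p(z \mid x)} \; I(X;Z) \;-\; \beta \, I(Z;Y),
\qquad \text{subject to the Markov chain } Z - X - Y .
```

A larger β favours keeping information about Y (relevance), while a smaller β favours compressing X more aggressively.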


Distributed information-theoretic biclustering of two memoryless sources

2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton) ◽ 10.1109/allerton.2015.7447035 ◽ 2015 ◽ Cited By ~ 3

Author(s): Georg Pichler ◽ Pablo Piantanida ◽ Gerald Matz

Keyword(s): Distributed Information ◽ Information Theoretic


Optimal prediction with resource constraints using the information bottleneck

10.1101/2020.04.29.069179 ◽ 2020

Author(s): Vedant Sachdeva ◽ Thierry Mora ◽ Aleksandra M. Walczak ◽ Stephanie Palmer

Keyword(s): Resource Constraints ◽ Internal Representation ◽ Motion Prediction ◽ Long Term Memory ◽ Resource Limitations ◽ Information Bottleneck ◽ Trade Offs ◽ Markovian Dynamics ◽ The Future

Responding to stimuli requires that organisms encode information about the external world. Not all parts of the signal are important for behavior, and resource limitations demand that signals be compressed. Predicting future input is beneficial in many biological systems. We compute the trade-offs between representing the past faithfully and predicting the future for input dynamics with different levels of complexity. For motion prediction, we show that, depending on the parameters of the input dynamics, either velocity or position coordinates prove more predictive. We identify the properties of global, transferable strategies for time-varying stimuli. For non-Markovian dynamics, we explore the role of long-term memory in the internal representation. Lastly, we show that prediction in evolutionary population dynamics is linked to clustering allele frequencies into non-overlapping memories, revealing a very different prediction strategy from motion prediction.


Energy-Spectral Efficiency Trade-Offs in Full-Duplex MU-MIMO Cloud-RANs with SWIPT

Wireless Communications and Mobile Computing ◽ 10.1155/2021/6678792 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-21

Author(s): Xuan-Xinh Nguyen ◽ Ha Hoang Kha

Keyword(s): Energy Efficiency ◽ Spectral Efficiency ◽ Optimization Problems ◽ Iterative Algorithms ◽ Full Duplex ◽ Total Power ◽ Achievable Rate ◽ Power Splitting ◽ Radio Access ◽ Trade Offs

The present paper investigates the trade-offs between energy efficiency (EE) and spectral efficiency (SE) in full-duplex (FD) multiuser multiple-input multiple-output (MU-MIMO) cloud radio access networks (CRANs) with simultaneous wireless information and power transfer (SWIPT). In the considered network, the central unit (CU) transfers both energy and information to downlink (DL) users using power-splitting structures while concurrently receiving signals from uplink (UL) users. This communication is carried out via FD radio units (RUs), which are distributed near the users and connected to the CU through limited-capacity fronthaul (FH) links. To reveal the trade-offs between the EE and SE metrics, we first introduce three conventional single-objective optimization problems (SOOPs): (i) system sum-rate maximization, (ii) total power minimization, and (iii) fractional energy efficiency maximization. Then, using the multiobjective optimization (MOO) framework, we address the MOO problem (MOOP) whose objective vector comprises the achievable rate and the power consumption. All considered problems are nonconvex with respect to the design variables, which comprise the precoding matrices, compression matrices, and DL power-splitting factors; thus, they are intractable to solve directly. To overcome this, we develop iterative algorithms that use the sequential convex approximation (SCA) approach for the first two SOOPs and the SCA-based Dinkelbach method for the fractional EE problem. For the MOOP, we first rewrite it as an SOOP by applying the modified weighted Tchebycheff method and then propose an SCA-based iterative algorithm to find its Pareto-optimal set. Various numerical simulations are conducted to study the system performance and the EE-SE trade-offs in the considered system.
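The Dinkelbach method named in this abstract can be illustrated on a much simpler problem than the paper's. The sketch below is a toy assumption, not the authors' SCA-based algorithm: it maximizes a single-link energy efficiency log2(1 + p) / (p + p_circuit) over a transmit power 0 <= p <= p_max. The function name `dinkelbach_energy_efficiency` and all parameter names are hypothetical. Each iteration solves the parametric problem max_p rate(p) - lam * power(p) in closed form and then updates lam to the current ratio.

```python
import math


def dinkelbach_energy_efficiency(p_circuit=1.0, p_max=10.0, tol=1e-10, max_iter=100):
    """Toy Dinkelbach iteration for max_p log2(1 + p) / (p + p_circuit), 0 <= p <= p_max.

    Assumes p_circuit > 0 and p_max > 0. Returns (p, lam), where lam approximates
    the optimal ratio (bits per unit of power in this toy model).
    """
    def rate(p):
        return math.log2(1.0 + p)

    def power(p):
        return p + p_circuit

    p = p_max
    lam = rate(p) / power(p)                 # initial ratio from a feasible point
    for _ in range(max_iter):
        # Inner problem: maximize rate(p) - lam * power(p), which is concave in p.
        # Its stationary point satisfies 1 / ((1 + p) ln 2) = lam; clip to [0, p_max].
        p = min(max(1.0 / (lam * math.log(2.0)) - 1.0, 0.0), p_max)
        gap = rate(p) - lam * power(p)       # non-negative; zero at the optimal ratio
        if gap < tol:
            break
        lam = rate(p) / power(p)             # Dinkelbach update of the ratio
    return p, lam


if __name__ == "__main__":
    p_opt, ee_opt = dinkelbach_energy_efficiency()
    print(f"power = {p_opt:.6f}, energy efficiency = {ee_opt:.6f}")
```

In the paper's setting the inner problem is itself nonconvex and is handled by SCA; here it has a closed-form solution, which keeps the sketch short.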


Heterogeneous Graph Information Bottleneck

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2021/226 ◽ 2021

Author(s): Liang Yang ◽ Fan Wu ◽ Zichen Zheng ◽ Bingxin Niu ◽ Junhua Gu ◽ ...

Keyword(s): Mutual Information ◽ Representation Learning ◽ Specific Information ◽ Heterogeneous Information ◽ Heterogeneous Information Networks ◽ Information Bottleneck ◽ Homogeneous Network ◽ Attributed Network ◽ Graph Neural Networks ◽ Attributed Networks

Most attempts to extend Graph Neural Networks (GNNs) to Heterogeneous Information Networks (HINs) implicitly assume that the multiple homogeneous attributed networks induced by different meta-paths are complementary. Doubts about this complementarity hypothesis motivate an alternative assumption of consensus. That is, the aggregated node attributes shared by multiple homogeneous attributed networks are essential for node representations, while the attributes specific to each homogeneous attributed network should be discarded. In this paper, a novel Heterogeneous Graph Information Bottleneck (HGIB) is proposed to implement the consensus hypothesis in an unsupervised manner. To this end, the information bottleneck (IB) is extended to unsupervised representation learning by leveraging a self-supervision strategy. Specifically, HGIB simultaneously maximizes the mutual information between one homogeneous network and the representation learned from another homogeneous network, while minimizing the mutual information between the specific information contained in one homogeneous network and the representation learned from that same network. Model analysis reveals that the two extreme cases of HGIB correspond to the supervised heterogeneous GNN and the infomax on a homogeneous graph, respectively. Extensive experiments on real datasets demonstrate that the consensus-based unsupervised HGIB significantly outperforms most semi-supervised SOTA methods based on the complementarity assumption.
