Two Measures of Dependence

Entropy ◽  
2019 ◽  
Vol 21 (8) ◽  
pp. 778 ◽  
Author(s):  
Amos Lapidoth ◽  
Christoph Pfister

Two families of dependence measures between random variables are introduced. They are based on the Rényi divergence of order α and the relative α-entropy, respectively, and both dependence measures reduce to Shannon’s mutual information when their order α is one. The first measure shares many properties with the mutual information, including the data-processing inequality, and can be related to the optimal error exponents in composite hypothesis testing. The second measure does not satisfy the data-processing inequality, but appears naturally in the context of distributed task encoding.
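
As a rough illustration of the building blocks named in this abstract, the sketch below computes the Rényi divergence of order α between two discrete distributions and checks numerically that it approaches the Kullback-Leibler divergence (and hence, for a joint distribution against the product of its marginals, Shannon's mutual information) as α tends to one. It does not implement the paper's dependence measures themselves. The function names and the example distributions are assumptions made for illustration, and the second distribution is assumed to have full support.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p||q) in nats for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def renyi_divergence(p, q, alpha):
    """Renyi divergence of order alpha (alpha > 0) in nats.

    Assumes q has full support. The case alpha == 1 is defined by
    continuity as the Kullback-Leibler divergence.
    """
    if alpha == 1:
        return kl_divergence(p, q)
    s = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(s) / (alpha - 1)


def mutual_information(joint):
    """Shannon mutual information: KL between a joint pmf and the product of its marginals."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    flat_joint = [v for row in joint for v in row]
    flat_prod = [a * b for a in px for b in py]
    return kl_divergence(flat_joint, flat_prod)


# Illustrative (made-up) distributions.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]
for alpha in (0.5, 0.999, 1.0, 1.001, 2.0):
    print(alpha, renyi_divergence(p, q, alpha))
```

The printed values should be nondecreasing in α, with the entries around α = 1 nearly equal to the Kullback-Leibler divergence.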

Entropy ◽  
2020 ◽  
Vol 22 (3) ◽  
pp. 316 ◽  
Author(s):  
Cédric Bleuler ◽  
Amos Lapidoth ◽  
Christoph Pfister

Motivated by a horse betting problem, a new conditional Rényi divergence is introduced. It is compared with the conditional Rényi divergences that appear in the definitions of the dependence measures by Csiszár and Sibson, and the properties of all three are studied with emphasis on their behavior under data processing. In the same way that Csiszár’s and Sibson’s conditional divergences lead to the respective dependence measures, so does the new conditional divergence lead to the Lapidoth–Pfister mutual information. Moreover, the new conditional divergence is also related to the Arimoto–Rényi conditional entropy and to Arimoto’s measure of dependence. In the second part of the paper, the horse betting problem is analyzed where, instead of Kelly’s expected log-wealth criterion, a more general family of power-mean utility functions is considered. The key role in the analysis is played by the Rényi divergence, and in the setting where the gambler has access to side information, the new conditional Rényi divergence is key. The setting with side information also provides another operational meaning to the Lapidoth–Pfister mutual information. Finally, a universal strategy for independent and identically distributed races is presented that, without knowing the winning probabilities or the parameter of the utility function, asymptotically maximizes the gambler’s utility function.
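
To make the betting setup more concrete, the following sketch evaluates Kelly's expected log-wealth criterion for a single race and a power-mean style utility of the form (1/β) log E[(b·o)^β]. That parametrization is an assumption chosen for illustration and may differ from the one used in the paper. The winning probabilities, odds, and function names are likewise hypothetical, and all bets are assumed to be strictly positive.

```python
import math


def expected_log_wealth(p, odds, bets):
    """Kelly criterion: expected log of the wealth factor, sum_i p_i * log(b_i * o_i)."""
    return sum(pi * math.log(bi * oi)
               for pi, oi, bi in zip(p, odds, bets) if pi > 0)


def power_mean_utility(p, odds, bets, beta):
    """Assumed power-mean utility (1/beta) * log E[(b_X * o_X)^beta].

    As beta -> 0 this tends to the expected log-wealth, which is used
    directly when beta == 0.
    """
    if beta == 0:
        return expected_log_wealth(p, odds, bets)
    moment = sum(pi * (bi * oi) ** beta for pi, oi, bi in zip(p, odds, bets))
    return math.log(moment) / beta


# Hypothetical race: three horses, fair uniform odds of 3-for-1.
p = [0.5, 0.3, 0.2]
odds = [3.0, 3.0, 3.0]
uniform_bets = [1 / 3, 1 / 3, 1 / 3]

# Proportional betting (bets equal to the winning probabilities)
# maximizes the expected log-wealth.
print(expected_log_wealth(p, odds, p))
print(expected_log_wealth(p, odds, uniform_bets))
```

With these numbers, proportional betting yields a strictly positive growth rate, while spreading the wealth uniformly over the horses yields a growth rate of zero.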


Entropy ◽  
2021 ◽  
Vol 23 (2) ◽  
pp. 199
Author(s):  
Sergio Verdú

Over the last six decades, the representation of error exponent functions for data transmission through noisy channels at rates below capacity has seen three distinct approaches: (1) Through Gallager’s E0 functions (with and without cost constraints); (2) large deviations form, in terms of conditional relative entropy and mutual information; (3) through the α-mutual information and the Augustin–Csiszár mutual information of order α derived from the Rényi divergence. While a fairly complete picture has emerged in the absence of cost constraints, there have remained gaps in the interrelationships between the three approaches in the general case of cost-constrained encoding. Furthermore, no systematic approach has been proposed to solve the attendant optimization problems by exploiting the specific structure of the information functions. This paper closes those gaps and proposes a simple method to maximize Augustin–Csiszár mutual information of order α under cost constraints by means of the maximization of the α-mutual information subject to an exponential average constraint.
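
For readers unfamiliar with the first of the three approaches, here is a minimal sketch of Gallager's E0 function for a discrete memoryless channel without cost constraints, E0(ρ, P) = −log Σ_y (Σ_x P(x) W(y|x)^{1/(1+ρ)})^{1+ρ}. The binary symmetric channel, its crossover probability, and the function name are assumptions chosen for illustration. The cost-constrained variants discussed in the paper are not covered.

```python
import math


def gallager_e0(rho, input_dist, channel):
    """Gallager's E0 function in nats, without cost constraints.

    channel[x][y] is the transition probability W(y|x) and
    input_dist[x] is the input probability P(x).
    """
    num_outputs = len(channel[0])
    total = 0.0
    for y in range(num_outputs):
        inner = sum(px * channel[x][y] ** (1.0 / (1.0 + rho))
                    for x, px in enumerate(input_dist))
        total += inner ** (1.0 + rho)
    return -math.log(total)


# Illustrative binary symmetric channel with crossover probability 0.1.
delta = 0.1
bsc = [[1 - delta, delta], [delta, 1 - delta]]
uniform = [0.5, 0.5]

for rho in (0.0, 0.25, 0.5, 1.0):
    print(rho, gallager_e0(rho, uniform, bsc))
```

Two standard sanity checks are that E0 vanishes at ρ = 0 and that its slope at ρ = 0 equals the mutual information of the channel under the chosen input distribution.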


1978 ◽  
Vol 17 (01) ◽  
pp. 36-40 ◽  
Author(s):  
J.-P. Durbec ◽  
Jaqueline Cornée ◽  
P. Berthezene

The practice of systematic examinations in hospitals and the increasing development of automatic data processing permit the storing of a great deal of information about a large number of patients belonging to different diagnosis groups. To predict or to characterize these diagnosis groups, some descriptors are particularly useful, while others carry no information. Data screening based on the properties of mutual information and on the log cross-product ratios in contingency tables is developed. The most useful descriptors are selected. For each one, the characterized groups are specified. This approach has been applied to a set of binary (presence/absence) radiological variables. Four diagnosis groups are concerned: cancer of the pancreas, chronic calcifying pancreatitis, non-calcifying pancreatitis and probable pancreatitis. Only twenty of the three hundred and forty initial radiological variables are selected. The presence of each corresponding sign is associated with one or more diagnosis groups.
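
The two screening quantities mentioned in this abstract can be sketched for a single binary descriptor as follows. The code computes the mutual information between the rows and columns of a contingency table of counts, and the log cross-product ratio (log odds ratio) of a 2×2 table. The counts are invented for illustration and are not taken from the study. The function names are assumptions, and the authors' actual screening procedure may differ.

```python
import math


def mutual_information_from_counts(table):
    """Mutual information (nats) between row and column variables of a count table."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    mi = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing
                mi += (count / n) * math.log(count * n / (row_totals[i] * col_totals[j]))
    return mi


def log_cross_product_ratio(table):
    """log((a*d)/(b*c)) for a 2x2 table [[a, b], [c, d]] with positive cells."""
    (a, b), (c, d) = table
    return math.log((a * d) / (b * c))


# Made-up counts: rows = sign present/absent, columns = two diagnosis groups.
informative = [[40, 10], [10, 40]]
uninformative = [[10, 20], [30, 60]]

for name, table in (("informative", informative), ("uninformative", uninformative)):
    print(name, mutual_information_from_counts(table), log_cross_product_ratio(table))
```

A descriptor whose table is close to independence has mutual information and log cross-product ratio near zero, which is the sense in which it "carries no information" about the groups.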


Entropy ◽  
2021 ◽  
Vol 23 (5) ◽  
pp. 533
Author(s):  
Milan S. Derpich ◽  
Jan Østergaard

We present novel data-processing inequalities relating the mutual information and the directed information in systems with feedback. The internal deterministic blocks within such systems are restricted only to be causal mappings, but are allowed to be non-linear and time varying and, when randomized by their own external random input, can yield any stochastic mapping. These randomized blocks can, for example, represent source encoders, decoders, or even communication channels. Moreover, the involved signals can be arbitrarily distributed. Our first main result relates mutual and directed information and can be interpreted as a law of conservation of information flow. Our second main result is a pair of data-processing inequalities (one the conditional version of the other) between nested pairs of random sequences entirely within the closed loop. Our third main result introduces and characterizes the notion of in-the-loop (ITL) transmission rate for channel coding scenarios in which the messages are internal to the loop. Interestingly, in this case the conventional notions of transmission rate associated with the entropy of the messages and of channel capacity based on maximizing the mutual information between the messages and the output turn out to be inadequate. Instead, as we show, the ITL transmission rate is the unique notion of rate for which a channel code attains zero error probability if and only if such an ITL rate does not exceed the corresponding directed information rate from messages to decoded messages. We apply our data-processing inequalities to show that the supremum of achievable (in the usual channel coding sense) ITL transmission rates is upper bounded by the supremum of the directed information rate across the communication channel. Moreover, we present an example in which this upper bound is attained. Finally, we further illustrate the applicability of our results by discussing how they make possible the generalization of two fundamental inequalities known in networked control literature.


Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Ina Maria Deutschmann ◽  
Gipsi Lima-Mendez ◽  
Anders K. Krabberød ◽  
Jeroen Raes ◽  
Sergio M. Vallina ◽  
...  

Background: Ecological interactions among microorganisms are fundamental for ecosystem function, yet they are mostly unknown or poorly understood. High-throughput omics can indicate microbial interactions through associations across time and space, which can be represented as association networks. Associations could result from either ecological interactions between microorganisms, or from environmental selection, where the association is environmentally driven. Therefore, before downstream analysis and interpretation, we need to distinguish the nature of the association, particularly if it is due to environmental selection or not. Results: We present EnDED (environmentally driven edge detection), an implementation of four approaches as well as their combination to predict which links between microorganisms in an association network are environmentally driven. The four approaches are sign pattern, overlap, interaction information, and data processing inequality. We tested EnDED on networks from simulated data of 50 microorganisms. The networks contained on average 50 nodes and 1087 edges, of which 60 were true interactions but 1026 false associations (i.e., environmentally driven or due to chance). Applying each method individually, we detected a moderate to high number of environmentally driven edges: 87% (sign pattern and overlap), 67% (interaction information), and 44% (data processing inequality). Combining these methods in an intersection approach resulted in retaining more interactions, both true and false (32% of environmentally driven associations). After validation with the simulated datasets, we applied EnDED on a marine microbial network inferred from 10 years of monthly observations of microbial-plankton abundance. The intersection combination predicted that 8.3% of the associations were environmentally driven, while individual methods predicted 24.8% (data processing inequality), 25.7% (interaction information), and up to 84.6% (sign pattern as well as overlap). The fraction of environmentally driven edges among negative microbial associations in the real network increased rapidly with the number of environmental factors. Conclusions: To reach accurate hypotheses about ecological interactions, it is important to determine, quantify, and remove environmentally driven associations in marine microbial association networks. For that, EnDED offers up to four individual methods as well as their combination. However, especially for the intersection combination, we suggest using EnDED with other strategies to reduce the number of false associations and consequently the number of potential interaction hypotheses.
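
One plausible reading of the data processing inequality approach, sketched under stated assumptions: for a triplet formed by two microorganisms X and Y and an environmental factor E, the edge X-Y is flagged as environmentally driven when its mutual information is no larger than both of the associations with E. This is not EnDED's actual code or interface. The function names, the dictionary layout, and the mutual information values are hypothetical, and the real tool may apply additional criteria such as significance tests.

```python
def dpi_flags_edge(mi_xy, mi_xe, mi_ye):
    """True if the X-Y association is the weakest link of the X-Y-E triplet."""
    return mi_xy <= min(mi_xe, mi_ye)


def flag_environmentally_driven(edge_mi, env_mi):
    """Return the edges flagged by the data processing inequality check.

    edge_mi: {(x, y): mutual information between microorganisms x and y}
    env_mi:  {node: mutual information between that node and the
              environmental factor}; nodes without an entry are skipped.
    """
    flagged = []
    for (x, y), mi_xy in edge_mi.items():
        if x in env_mi and y in env_mi and dpi_flags_edge(mi_xy, env_mi[x], env_mi[y]):
            flagged.append((x, y))
    return flagged


# Hypothetical mutual information values (nats).
edge_mi = {("A", "B"): 0.10, ("A", "C"): 0.60, ("B", "D"): 0.05}
env_mi = {"A": 0.30, "B": 0.25, "C": 0.40}  # "D" is not associated with E

print(flag_environmentally_driven(edge_mi, env_mi))
```

In this toy network only the edge between A and B is flagged: A and C are more strongly associated with each other than with the environmental factor, and D has no recorded association with it.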


Author(s):  
Manabu Kimura ◽  
Masashi Sugiyama

Recently, statistical dependence measures such as mutual information and kernelized covariance have been successfully applied to clustering. In this paper, we follow this line of research and propose a novel dependence-maximization clustering method based on least-squares mutual information, which is an estimator of a squared-loss variant of mutual information. A notable advantage of the proposed method over existing approaches is that hyperparameters such as kernel parameters and regularization parameters can be objectively optimized based on cross-validation. Thus, subjective manual-tuning of hyperparameters is not necessary in the proposed method, which is a highly useful property in unsupervised clustering scenarios. Through experiments, we illustrate the usefulness of the proposed approach.
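
For orientation, the squared-loss variant of mutual information referred to here is commonly written as SMI = (1/2) Σ_{x,y} p(x) p(y) (p(x,y) / (p(x) p(y)) − 1)². The sketch below evaluates that population quantity for a known discrete joint distribution. It is not the least-squares estimator from the paper, which works from samples. The function name and the example distributions are assumptions for illustration.

```python
def squared_loss_mutual_information(joint):
    """Squared-loss mutual information of a discrete joint pmf.

    SMI = 0.5 * sum_{x,y} p(x) p(y) * (p(x,y) / (p(x) p(y)) - 1)^2.
    It is zero exactly when the two variables are independent.
    """
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    smi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            prod = px[i] * py[j]
            if prod > 0:  # cells with a zero marginal carry no mass
                smi += 0.5 * prod * (pxy / prod - 1.0) ** 2
    return smi


# Illustrative joint distributions over two binary variables.
independent = [[0.25, 0.25], [0.25, 0.25]]
perfectly_dependent = [[0.5, 0.0], [0.0, 0.5]]

print(squared_loss_mutual_information(independent))
print(squared_loss_mutual_information(perfectly_dependent))
```

In a dependence-maximization clustering setting, a quantity of this kind would be evaluated between the data and candidate cluster assignments, with larger values indicating a stronger dependence.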


2019 ◽  
Vol 32 (02) ◽  
pp. 2050005 ◽  
Author(s):  
Andreas Bluhm ◽  
Ángela Capel

In this work, we provide a strengthening of the data processing inequality for the relative entropy introduced by Belavkin and Staszewski (BS-entropy). This extends previous results by Carlen and Vershynina for the relative entropy and other standard f-divergences. To this end, we provide two new equivalent conditions for the equality case of the data processing inequality for the BS-entropy. Subsequently, we extend our result to a larger class of maximal f-divergences. Here, we first focus on quantum channels which are conditional expectations onto subalgebras and use the Stinespring dilation to lift our results to arbitrary quantum channels.


1965 ◽  
Vol 20 (1) ◽  
pp. 79-86 ◽  
Author(s):  
Joe B. Alexander ◽  
Howard E. Gudeman

This study was concerned with the relationship between perceptual and interpersonal measures of dependence for a sample of 60 male Ss. Four groups of alcoholics, one group of hospitalized psychiatric patients, and a group of normals were compared on the Rod and Frame Test and three laboratory interpersonal tasks to evaluate the hypothesis that perceptual and interpersonal dependence measures are significantly related. The results only partially confirmed the hypothesis. The over-all correlation was significant, as was the over-all correlation for four groups of alcoholics. Only two of the six subgroup correlations, however, were significant. These results suggest the need for further study, using larger sample sizes, to determine the specific relationship of the two variables.

