A practical tool for Maximal Information Coefficient analysis

Mapping Intimacies ◽

10.1101/215855 ◽

2017 ◽

Cited By ~ 1

Author(s):

Davide Albanese ◽

Samantha Riccadonna ◽

Claudio Donati ◽

Pietro Franceschi

Keyword(s):

Statistical Significance ◽

Metagenomic Dataset ◽

Measures Of Association ◽

Step Procedure ◽

Practical Tool ◽

Two Measures ◽

Information Coefficient ◽

Bias Variance ◽

Synthetic Datasets ◽

Maximal Information Coefficient

Abstract

Background: The ability to find complex associations in large omics datasets, assess their significance, and prioritize them according to their strength can be of great help in the data exploration phase. Mutual-information-based measures of association are particularly promising, especially after the recent introduction of the TICe and MICe estimators, which combine computational efficiency with good bias/variance properties. Despite that, a complete software implementation of these two measures and of a statistical procedure to test the significance of each association is still missing.

Findings: In this paper we present MICtools, a comprehensive and effective pipeline that combines TICe and MICe into a multi-step procedure that allows the identification of relationships of various degrees of complexity. MICtools calculates their strength and assesses statistical significance using a permutation-based strategy. The performance of the proposed approach is assessed by an extensive investigation on synthetic datasets, and an example of a potential application to a metagenomic dataset is also illustrated.

Conclusions: We show that MICtools, combining TICe and MICe, is able to highlight associations that would not be captured by conventional strategies. MICtools is implemented in Python and is available for download at https://github.com/minepy/mictools.
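The permutation-based strategy mentioned in the abstract can be illustrated with a minimal sketch. This is not the MICtools implementation: it assumes a simple plug-in mutual information on an equal-frequency grid as a stand-in for the TICe statistic, and the function names (`rank_bins`, `binned_mi`, `permutation_pvalue`) are hypothetical, chosen only for this example.

```python
import math
import random
from collections import Counter


def rank_bins(values, n_bins):
    """Assign each value to one of n_bins equal-frequency bins by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def binned_mi(x, y, n_bins=4):
    """Plug-in mutual information (nats) on an equal-frequency grid.

    Stand-in association measure for this sketch; it is NOT TICe or MICe.
    """
    bx, by = rank_bins(x, n_bins), rank_bins(y, n_bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


def permutation_pvalue(x, y, stat=binned_mi, n_perm=200, seed=0):
    """Empirical p-value: shuffle y to break any association with x."""
    rng = random.Random(seed)
    observed = stat(x, y)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if stat(x, y_perm) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


# Non-linear (parabolic) relationship with a little Gaussian noise.
rng = random.Random(1)
x = [rng.uniform(-1, 1) for _ in range(200)]
y = [xi * xi + rng.gauss(0, 0.05) for xi in x]
obs, p = permutation_pvalue(x, y)
print(f"MI = {obs:.3f} nats, permutation p = {p:.4f}")
```

With `n_perm` permutations the smallest attainable p-value is 1/(n_perm + 1), so screening many variable pairs would need more permutations and a multiple-testing correction; neither is shown here.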

A practical tool for maximal information coefficient analysis

GigaScience ◽

10.1093/gigascience/giy032 ◽

2018 ◽

Vol 7 (4) ◽

Cited By ~ 14

Author(s):

Davide Albanese ◽

Samantha Riccadonna ◽

Claudio Donati ◽

Pietro Franceschi

Keyword(s):

Practical Tool ◽

Information Coefficient ◽

Maximal Information Coefficient

Income Status and Life Satisfaction

Journal of Happiness Studies ◽

10.1007/s10902-021-00397-y ◽

2021 ◽

Author(s):

Felix R. FitzRoy ◽

Michael A. Nolan

Keyword(s):

Life Satisfaction ◽

Household Income ◽

Statistical Significance ◽

Panel Survey ◽

Explanatory Variables ◽

Income Status ◽

Two Measures ◽

Reference Income ◽

Relative Form ◽

Retirement Status

Abstract: The importance of both income rank and relative income, as indicators of status, has long been recognised in the literature on life satisfaction and happiness. Recently, several authors have made explicit comparisons of the relative importance of these two measures of income status, and concluded that rank dominates to the extent that reference income becomes insignificant in regressions including both these explanatory variables, and that even absolute or household income, otherwise always positively related to happiness, may lose statistical significance. Here we test this hypothesis with a large UK panel (British Household Panel Survey and Understanding Society) for 1996–2017, split by age and retirement status, and find, contrary to previous results, that rank, household income and reference income are all usually important explanatory variables, but with significant differences between subgroups. This finding holds when rank is in its often-used relative form, and also with absolute rank.

Context likelihood of relatedness with maximal information coefficient for Gene Regulatory Network inference

2015 18th International Conference on Computer and Information Technology (ICCIT) ◽

10.1109/iccitechn.2015.7488088 ◽

2015 ◽

Author(s):

M. A. H. Akhand ◽

R. N. Nandi ◽

S. M. Amran ◽

K. Murase

Keyword(s):

Gene Regulatory Network ◽

Regulatory Network ◽

Network Inference ◽

Gene Regulatory Network Inference ◽

Information Coefficient ◽

Gene Regulatory ◽

Maximal Information Coefficient

Cleaning up the record on the maximal information coefficient and equitability

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1408920111 ◽

2014 ◽

Vol 111 (33) ◽

pp. E3362-E3363 ◽

Cited By ~ 16

Author(s):

D. N. Reshef ◽

Y. A. Reshef ◽

M. Mitzenmacher ◽

P. C. Sabeti

Keyword(s):

Information Coefficient ◽

Maximal Information Coefficient

Resolution dependence of the maximal information coefficient for noiseless relationship

Statistics and Computing ◽

10.1007/s11222-013-9405-5 ◽

2013 ◽

Vol 24 (5) ◽

pp. 845-852 ◽

Cited By ~ 2

Author(s):

Shih-Chang Lee ◽

Ning-Ning Pang ◽

Wen-Jer Tzeng

Keyword(s):

Information Coefficient ◽

Maximal Information Coefficient

Core–periphery structure in directed networks

Proceedings of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rspa.2019.0783 ◽

2020 ◽

Vol 476 (2241) ◽

pp. 20190783

Author(s):

Andrew Elliott ◽

Angus Chiu ◽

Marya Bazzi ◽

Gesine Reinert ◽

Mihai Cucuringu

Keyword(s):

Statistical Significance ◽

Ground Truth ◽

Directed Networks ◽

Political Blogs ◽

Novel Structure ◽

Two Measures ◽

Likelihood Approach ◽

Core Sets ◽

Empirical Networks

Empirical networks often exhibit different meso-scale structures, such as community and core–periphery structures. Core–periphery structure typically consists of a well-connected core and a periphery that is well connected to the core but sparsely connected internally. Most core–periphery studies focus on undirected networks. We propose a generalization of core–periphery structure to directed networks. Our approach yields a family of core–periphery block model formulations in which, contrary to many existing approaches, core and periphery sets are edge-direction dependent. We focus on a particular structure consisting of two core sets and two periphery sets, which we motivate empirically. We propose two measures to assess the statistical significance and quality of our novel structure in empirical data, where one often has no ground truth. To detect core–periphery structure in directed networks, we propose three methods adapted from two approaches in the literature, each with a different trade-off between computational complexity and accuracy. We assess the methods on benchmark networks where our methods match or outperform standard methods from the literature, with a likelihood approach achieving the highest accuracy. Applying our methods to three empirical networks—faculty hiring, a world trade dataset and political blogs—illustrates that our proposed structure provides novel insights in empirical networks.

Inferring joint sequence-structural determinants of protein functional specificity

eLife ◽

10.7554/elife.29880 ◽

2018 ◽

Vol 7 ◽

Cited By ~ 5

Author(s):

Andrew F Neuwald ◽

L Aravind ◽

Stephen F Altschul

Keyword(s):

Statistical Significance ◽

Structural Features ◽

DNA Glycosylases ◽

RNA Helicases ◽

Structural Determinants ◽

Functional Specificity ◽

Allosteric Sites ◽

Two Measures ◽

Direct Coupling Analysis ◽

Atomic Coordinates

Residues responsible for allostery, cooperativity, and other subtle but functionally important interactions remain difficult to detect. To aid such detection, we employ statistical inference based on the assumption that residues distinguishing a protein subgroup from evolutionarily divergent subgroups often constitute an interacting functional network. We identify such networks with the aid of two measures of statistical significance. One measure aids identification of divergent subgroups based on distinguishing residue patterns. For each subgroup, a second measure identifies structural interactions involving pattern residues. Such interactions are derived either from atomic coordinates or from Direct Coupling Analysis scores, used as surrogates for structural distances. Applying this approach to N-acetyltransferases, P-loop GTPases, RNA helicases, synaptojanin-superfamily phosphatases and nucleases, and thymine/uracil DNA glycosylases yielded results congruent with biochemical understanding of these proteins, and also revealed striking sequence-structural features overlooked by other methods. These and similar analyses can aid the design of drugs targeting allosteric sites.
