A practical tool for Maximal Information Coefficient analysis

2017 ◽  
Author(s):  
Davide Albanese ◽  
Samantha Riccadonna ◽  
Claudio Donati ◽  
Pietro Franceschi

Abstract. Background: The ability to find complex associations in large omics datasets, assess their significance, and prioritize them according to their strength can be of great help in the data exploration phase. Mutual information-based measures of association are particularly promising, especially after the recent introduction of the TICe and MICe estimators, which combine computational efficiency with good bias/variance properties. Despite this, a complete software implementation of these two measures and of a statistical procedure to test the significance of each association is still missing. Findings: In this paper we present MICtools, a comprehensive and effective pipeline which combines TICe and MICe into a multi-step procedure that allows the identification of relationships of various degrees of complexity. MICtools calculates their strength and assesses statistical significance using a permutation-based strategy. The performance of the proposed approach is assessed by an extensive investigation on synthetic datasets, and an example of a potential application to a metagenomic dataset is also illustrated. Conclusions: We show that MICtools, combining TICe and MICe, is able to highlight associations that would not be captured by conventional strategies. MICtools is implemented in Python, and is available for download at https://github.com/minepy/mictools.
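
The permutation-based significance testing mentioned in the abstract can be illustrated with a minimal sketch. This is not MICtools code and does not use its API: the function names `permutation_pvalue` and `abs_pearson` are hypothetical, and absolute Pearson correlation is swapped in as a stand-in statistic where the pipeline would use TICe. It only shows the general idea of comparing an observed association score against scores obtained after shuffling one variable.

```python
import numpy as np


def permutation_pvalue(x, y, statistic, n_perm=999, seed=0):
    """Empirical p-value of an association statistic under independence."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = statistic(x, y)
    # Shuffling y breaks its pairing with x while preserving both marginals.
    null = np.array([statistic(x, rng.permutation(y)) for _ in range(n_perm)])
    # The add-one correction keeps the empirical p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (n_perm + 1)


def abs_pearson(x, y):
    """Stand-in statistic (assumption): absolute Pearson correlation, not TICe."""
    return abs(np.corrcoef(x, y)[0, 1])


# Synthetic data: one dependent pair and one independent pair.
rng = np.random.default_rng(42)
x = rng.normal(size=200)
y_dep = x + 0.5 * rng.normal(size=200)
y_ind = rng.normal(size=200)

p_dep = permutation_pvalue(x, y_dep, abs_pearson)
p_ind = permutation_pvalue(x, y_ind, abs_pearson)
print(f"dependent pair p-value:   {p_dep:.3f}")
print(f"independent pair p-value: {p_ind:.3f}")
```

When many variable pairs are screened this way, the resulting p-values would normally also need a multiple-testing correction before associations are ranked by strength.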

GigaScience ◽  
2018 ◽  
Vol 7 (4) ◽  
Author(s):  
Davide Albanese ◽  
Samantha Riccadonna ◽  
Claudio Donati ◽  
Pietro Franceschi

Author(s):  
Felix R. FitzRoy ◽  
Michael A. Nolan

Abstract. The importance of both income rank and relative income, as indicators of status, has long been recognised in the literature on life satisfaction and happiness. Recently, several authors have made explicit comparisons of the relative importance of these two measures of income status, and concluded that rank dominates to the extent that reference income becomes insignificant in regressions including both these explanatory variables, and that even absolute or household income, otherwise always positively related to happiness, may lose statistical significance. Here we test this hypothesis with a large UK panel (British Household Panel Survey and Understanding Society) for 1996–2017, split by age and retirement status, and find, contrary to previous results, that rank, household income and reference income are all usually important explanatory variables, but with significant differences between subgroups. This finding holds when rank is in its often-used relative form, and also with absolute rank.


2014 ◽  
Vol 111 (33) ◽  
pp. E3362-E3363 ◽  
Author(s):  
D. N. Reshef ◽  
Y. A. Reshef ◽  
M. Mitzenmacher ◽  
P. C. Sabeti

2013 ◽  
Vol 24 (5) ◽  
pp. 845-852 ◽  
Author(s):  
Shih-Chang Lee ◽  
Ning-Ning Pang ◽  
Wen-Jer Tzeng

Author(s):  
Andrew Elliott ◽  
Angus Chiu ◽  
Marya Bazzi ◽  
Gesine Reinert ◽  
Mihai Cucuringu

Empirical networks often exhibit different meso-scale structures, such as community and core–periphery structures. Core–periphery structure typically consists of a well-connected core and a periphery that is well connected to the core but sparsely connected internally. Most core–periphery studies focus on undirected networks. We propose a generalization of core–periphery structure to directed networks. Our approach yields a family of core–periphery block model formulations in which, contrary to many existing approaches, core and periphery sets are edge-direction dependent. We focus on a particular structure consisting of two core sets and two periphery sets, which we motivate empirically. We propose two measures to assess the statistical significance and quality of our novel structure in empirical data, where one often has no ground truth. To detect core–periphery structure in directed networks, we propose three methods adapted from two approaches in the literature, each with a different trade-off between computational complexity and accuracy. We assess the methods on benchmark networks where our methods match or outperform standard methods from the literature, with a likelihood approach achieving the highest accuracy. Applying our methods to three empirical networks—faculty hiring, a world trade dataset and political blogs—illustrates that our proposed structure provides novel insights in empirical networks.


eLife ◽  
2018 ◽  
Vol 7 ◽  
Author(s):  
Andrew F Neuwald ◽  
L Aravind ◽  
Stephen F Altschul

Residues responsible for allostery, cooperativity, and other subtle but functionally important interactions remain difficult to detect. To aid such detection, we employ statistical inference based on the assumption that residues distinguishing a protein subgroup from evolutionarily divergent subgroups often constitute an interacting functional network. We identify such networks with the aid of two measures of statistical significance. One measure aids identification of divergent subgroups based on distinguishing residue patterns. For each subgroup, a second measure identifies structural interactions involving pattern residues. Such interactions are derived either from atomic coordinates or from Direct Coupling Analysis scores, used as surrogates for structural distances. Applying this approach to N-acetyltransferases, P-loop GTPases, RNA helicases, synaptojanin-superfamily phosphatases and nucleases, and thymine/uracil DNA glycosylases yielded results congruent with biochemical understanding of these proteins, and also revealed striking sequence-structural features overlooked by other methods. These and similar analyses can aid the design of drugs targeting allosteric sites.


2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Maria Sole Morelli ◽  
Alberto Greco ◽  
Gaetano Valenza ◽  
Alberto Giannoni ◽  
Michele Emdin ◽  
...  
