Information-theoretic model selection for optimal prediction of stochastic dynamical systems from data

David Darmon

doi:10.1103/physreve.97.032206

Information-Theoretic Model Selection for Independent Components

Latent Variable Analysis and Signal Separation - Lecture Notes in Computer Science ◽

10.1007/978-3-642-15995-4_32 ◽

2010 ◽

pp. 254-262

Author(s):

Claudia Plant ◽

Fabian J. Theis ◽

Anke Meyer-Baese ◽

Christian Böhm

Keyword(s):

Model Selection ◽

Theoretic Model ◽

Information Theoretic ◽

Independent Components ◽

Selection For


Information Theoretic Model Selection for Accurately Estimating Unreported COVID-19 Infections

10.1101/2021.09.14.21263467 ◽

2021 ◽

Author(s):

Jiaming Cui ◽

Arash Haddadan ◽

A S M Ahsan-Ul Haque ◽

Bijaya Adhikari ◽

Anil Vullikanti ◽

...

Keyword(s):

Model Selection ◽

Disease Spread ◽

Theoretic Model ◽

Theoretic Approach ◽

Information Theoretic ◽

The Us ◽

Serological Studies ◽

Selection For ◽

Information Theoretic Approach ◽

Pharmaceutical Interventions

Estimating the true extent of the outbreak was one of the major challenges in combating the COVID-19 outbreak early on. Our inability to do so allowed unreported/undetected infections to drive up disease spread in numerous regions in the US and worldwide. Accurately identifying the true magnitude of infections remains a major challenge, despite the use of surveillance-based methods such as serological studies, due to their costs and biases. In this paper, we propose an information-theoretic approach to accurately estimate the unreported infections. Our approach, built on top of an existing epidemiological model based on ordinary differential equations, aims to deduce an optimal parameterization of the epidemiological model and the true extent of the outbreak which "best describes" the observed reported infections. Our experiments show that the parameterization learned by our framework leads to a better estimation of unreported infections as well as more accurate forecasts of the reported infections compared to the baseline parameterization. We also demonstrate that our framework can be leveraged to simulate what-if scenarios with non-pharmaceutical interventions. Our results also support earlier findings that a large majority of COVID-19 infections were unreported and that non-pharmaceutical interventions indeed helped in mitigating the COVID-19 outbreak.


Characterizing the effects of sex, APOE ɛ4, and literacy on mid-life cognitive trajectories: Application of Information-Theoretic model-averaging and multi-model inference techniques to the Wisconsin Registry for Alzheimer’s Prevention Study

10.1101/229237 ◽

2017 ◽

Author(s):

Rebecca L. Koscik ◽

Derek L. Norton ◽

Samantha L. Allison ◽

Erin M. Jonaitis ◽

Lindsay R. Clark ◽

...

Keyword(s):

Model Selection ◽

Cognitive Decline ◽

Model Averaging ◽

Parameter Estimates ◽

Theoretic Model ◽

Traditional Model ◽

Test Model ◽

Prevention Study ◽

Information Theoretic ◽

Modifiable Factors

Objective: In this paper we apply Information-Theoretic (IT) model averaging to characterize a set of complex interactions in a longitudinal study on cognitive decline. Prior research has identified numerous genetic (including sex), education, health and lifestyle factors that predict cognitive decline. Traditional model selection approaches (e.g., backward or stepwise selection) attempt to find models that best fit the observed data; these techniques risk interpretations that only the selected predictors are important. In reality, several models may fit similarly well but result in different conclusions (e.g., about size and significance of parameter estimates); inference from traditional model selection approaches can lead to overly confident conclusions. Method: Here we use longitudinal cognitive data from ~1550 late-middle-aged adults in the Wisconsin Registry for Alzheimer’s Prevention study to examine the effects of sex, Apolipoprotein E (APOE) ɛ4 allele (non-modifiable factors), and literacy achievement (modifiable) on cognitive decline. For each outcome, we applied IT model averaging to a model set with combinations of interactions among sex, APOE, literacy, and age. Results: For a list-learning test, model-averaged results showed better performance for women vs men, with faster decline among men; increased literacy was associated with better performance, particularly among men. APOE had less of an effect on cognitive performance in this age range (~40-70). Conclusions: These results illustrate the utility of the IT approach and point to literacy as a potential modifier of decline. Whether the protective effect of literacy is due to educational attainment or intrinsic verbal intellectual ability is the topic of ongoing work.


Model Selection for Optimal Prediction in Statistical Machine Learning

Notices of the American Mathematical Society ◽

10.1090/noti2014 ◽

2020 ◽

Vol 67 (02) ◽

pp. 1

Author(s):

Ernest Fokoué

Keyword(s):

Machine Learning ◽

Model Selection ◽

Optimal Prediction ◽

Statistical Machine Learning ◽

Selection For


Information-theoretic bounds on model selection for Gaussian Markov random fields

2010 IEEE International Symposium on Information Theory ◽

10.1109/isit.2010.5513573 ◽

2010 ◽

Cited By ~ 20

Author(s):

Wei Wang ◽

Martin J. Wainwright ◽

Kannan Ramchandran

Keyword(s):

Model Selection ◽

Random Fields ◽

Markov Random Fields ◽

Information Theoretic ◽

Gaussian Markov Random Fields ◽

Markov Random ◽

Selection For


The interplay between communities and homophily in semi-supervised classification using graph neural networks

Applied Network Science ◽

10.1007/s41109-021-00423-1 ◽

2021 ◽

Vol 6 (1) ◽

Author(s):

Hussain Hussain ◽

Tomislav Duricic ◽

Elisabeth Lex ◽

Denis Helic ◽

Roman Kern

Keyword(s):

Neural Networks ◽

Community Structure ◽

Model Selection ◽

Graph Structure ◽

Information Theoretic ◽

Node Classification ◽

Selection For ◽

Graph Neural Networks ◽

Node Labels ◽

The Impact

Graph Neural Networks (GNNs) are effective in many applications. Still, there is a limited understanding of the effect of common graph structures on the learning process of GNNs. To fill this gap, we study the impact of community structure and homophily on the performance of GNNs in semi-supervised node classification on graphs. Our methodology consists of systematically manipulating the structure of eight datasets, and measuring the performance of GNNs on the original graphs and the change in performance in the presence and the absence of community structure and/or homophily. Our results show the major impact of both homophily and communities on the classification accuracy of GNNs, and provide insights on their interplay. In particular, by analyzing community structure and its correlation with node labels, we are able to make informed predictions on the suitability of GNNs for classification on a given graph. Using an information-theoretic metric for community-label correlation, we devise a guideline for model selection based on graph structure. With our work, we provide insights on the abilities of GNNs and the impact of common network phenomena on their performance. Our work improves model selection for node classification in semi-supervised settings.


Information theoretic model selection applied to supernovae data

Journal of Cosmology and Astroparticle Physics ◽

10.1088/1475-7516/2007/02/003 ◽

2007 ◽

Vol 2007 (02) ◽

pp. 003-003 ◽

Cited By ~ 44

Author(s):

Marek Biesiada

Keyword(s):

Model Selection ◽

Theoretic Model ◽

Information Theoretic


Information-Theoretic Model Selection and Model Averaging for Closed-Population Capture-Recapture Studies

Biometrical Journal ◽

10.1002/(sici)1521-4036(199808)40:4<475::aid-bimj475>3.0.co;2-# ◽

1998 ◽

Vol 40 (4) ◽

pp. 475-494 ◽

Cited By ~ 27

Author(s):

Thomas R. Stanley ◽

Kenneth P. Burnham

Keyword(s):

Model Selection ◽

Model Averaging ◽

Theoretic Model ◽

Closed Population ◽

Information Theoretic ◽

Capture Recapture


Information-theoretic model selection affects home-range estimation and habitat preference inference: a case study of male Reeves’s Pheasants Syrmaticus reevesii

Ibis ◽

10.1111/j.1474-919x.2012.01214.x ◽

2012 ◽

Vol 154 (2) ◽

pp. 273-284 ◽

Cited By ~ 6

Author(s):

Yong Wang ◽

Jiling Xu ◽

John P. Carpenter ◽

Zhengwang Zhang ◽

Guangmei Zheng

Keyword(s):

Model Selection ◽

Home Range ◽

Habitat Preference ◽

Theoretic Model ◽

Range Estimation ◽

Information Theoretic ◽

Syrmaticus Reevesii


Ontology Modularization with OAPT

Journal on Data Semantics ◽

10.1007/s13740-020-00114-7 ◽

2020 ◽

Vol 9 (2-3) ◽

pp. 53-83

Author(s):

Alsayed Algergawy ◽

Samira Babalou ◽

Friederike Klan ◽

Birgitta König-Ries

Keyword(s):

Model Selection ◽

Semantic Web ◽

Selection Method ◽

Theoretic Model ◽

New Approach ◽

Information Theoretic ◽

Model Selection Method ◽

Partitioning Algorithm

Ontologies are the backbone of the Semantic Web. As a result, the number of existing ontologies and the number of topics covered by them have increased considerably. With this, reusing these ontologies becomes preferable to constructing new ontologies from scratch. However, a user might be interested in only a part or a set of parts of a given ontology. Therefore, ontology modularization, i.e., splitting up an ontology into smaller parts that can be used independently, becomes a necessity. In this paper, we introduce a new approach to partitioning an ontology based on a seeding-based scheme, which is developed and implemented through the Ontology Analysis and Partitioning Tool (OAPT). This tool proceeds according to the following methodology: first, before a candidate ontology is partitioned, OAPT optionally analyzes the input ontology to determine whether it is worth considering, using a predefined set of criteria that quantify the semantic and structural richness of the ontology. After that, we apply the seeding-based partitioning algorithm to modularize it into a set of modules. To decide upon a suitable number of modules to be generated by partitioning the ontology, we provide the user with a recommendation based on an information-theoretic model selection method. We demonstrate the effectiveness of the OAPT tool and validate the performance of the partitioning approach by conducting an extensive set of experiments. The results demonstrate the quality and the efficiency of the proposed tool.
