Lagrangian Data Analysis in Mesoscale Prediction and Model Validation Studies

1999 ◽  
Author(s):  
Annalisa Griffa ◽  
Tamay Ozgokmen
Author(s):  
Conrado Chiarello ◽  
Hedilberto Barros ◽  
Moisés Marcelino Neto ◽  
Rigoberto Morales

2019 ◽  
Vol 579 ◽  
pp. 124209 ◽  
Author(s):  
Dangwei Wang ◽  
Yuanxu Ma ◽  
Xiaofang Liu ◽  
He Qing Huang ◽  
Li Huang ◽  
...  

2020 ◽  
Vol 69 (4) ◽  
pp. 795-812 ◽  
Author(s):  
Xiaodong Jiang ◽  
Scott V Edwards ◽  
Liang Liu

Abstract A statistical framework of model comparison and model validation is essential to resolving the debates over concatenation and coalescent models in phylogenomic data analysis. Here, a set of statistical tests is developed and applied to evaluate and compare the adequacy of substitution, concatenation, and multispecies coalescent (MSC) models across 47 phylogenomic data sets collected across the tree of life. Tests for substitution models and the concatenation assumption of topologically congruent gene trees suggest that a poor fit of substitution models, rejected by 44% of loci, and concatenation models, rejected by 38% of loci, is widespread. Logistic regression shows that the proportions of GC content and informative sites are both negatively correlated with the fit of substitution models across loci. Moreover, a substantial violation of the concatenation assumption of congruent gene trees is consistently observed across six major groups (birds, mammals, fish, insects, reptiles, and others, including other invertebrates). In contrast, among those loci adequately described by a given substitution model, the proportion of loci rejecting the MSC model is 11%, significantly lower than those rejecting the substitution and concatenation models. Although conducted on reduced data sets due to computational constraints, Bayesian model validation and comparison both strongly favor the MSC over concatenation across all data sets; the concatenation assumption of congruent gene trees rarely holds for phylogenomic data sets with more than 10 loci. Thus, for large phylogenomic data sets, model comparisons are expected to consistently and more strongly favor the coalescent model over the concatenation model. We also found that loci rejecting the MSC have little effect on species tree estimation. Our study reveals the value of model validation and comparison in phylogenomic data analysis, as well as the need for further improvements of multilocus models and computational tools for phylogenetic inference. [Bayes factor; Bayesian model validation; coalescent prior; congruent gene trees; independent prior; Metazoa; posterior predictive simulation.]


2019 ◽  
Author(s):  
Xiaodong Jiang ◽  
Scott V. Edwards ◽  
Liang Liu

Abstract A statistical framework of model comparison and model validation is essential to resolving the debates over concatenation and coalescent models in phylogenomic data analysis. Here, a set of statistical tests is developed and applied to evaluate and compare the adequacy of substitution, concatenation, and multispecies coalescent (MSC) models across 47 phylogenomic data sets collected across the tree of life. Tests for substitution models and the concatenation assumption of topologically concordant gene trees suggest that a poor fit of substitution models (44% of loci rejecting the substitution model) and concatenation models (38% of loci rejecting the hypothesis of topologically congruent gene trees) is widespread. Logistic regression shows that the proportions of GC content and informative sites are both negatively correlated with the fit of substitution models across loci. Moreover, a substantial violation of the concatenation assumption of congruent gene trees is consistently observed across 6 major groups (birds, mammals, fish, insects, reptiles, and others, including other invertebrates). In contrast, among those loci adequately described by a given substitution model, the proportion of loci rejecting the MSC model is 11%, significantly lower than those rejecting the substitution and concatenation models, and Bayesian model comparison strongly favors the MSC over concatenation across all data sets. Species tree inference suggests that loci rejecting the MSC have little effect on species tree estimation. Due to computational constraints, the Bayesian model validation and comparison analyses were conducted on reduced data sets. A complete analysis of phylogenomic data requires the development of efficient algorithms for phylogenetic inference. Nevertheless, the concatenation assumption of congruent gene trees rarely holds for phylogenomic data with more than 10 loci. Thus, for large phylogenomic data sets, model comparison analyses are expected to consistently and more strongly favor the coalescent model over the concatenation model. Our analysis reveals the value of model validation and comparison in phylogenomic data analysis, as well as the need for further improvements of multilocus models and computational tools for phylogenetic inference.


Author(s):  
Fernando Alarid-Escudero ◽  
Roman Gulati ◽  
Carolyn M. Rutter

This chapter discusses validation of simulation models used to inform health policy. Confidence in a model’s validity can be weaker or stronger depending on several factors. These factors include verifying whether model specifications were implemented correctly, evaluating the extent to which model-predicted results are consistent with empirical results, and examining whether model predictions are robust to alternative structural assumptions. Systematic evaluation of these factors can be used to gauge the extent to which a model is validated for a given application. The chapter reviews types of validation, discusses the related concepts of calibration and nonidentifiability, examines cancer model validation studies in more depth, and concludes with questions that consumers of models should ask (and modelers should answer) to inform judgment about a model’s fitness for purpose. Final judgments about when model results can be trusted ultimately rely on the evolving understanding of the disease and intervention effects, available data relevant to the application, and access to reporting of model validation exercises.


2017 ◽  
Vol 108 ◽  
pp. 2090-2099 ◽  
Author(s):  
Konstantin Ryabinin ◽  
Svetlana Chuprina

2012 ◽  
Vol 134 (3) ◽  
Author(s):  
Zhenfei Zhan ◽  
Yan Fu ◽  
Ren-Jye Yang ◽  
Yinghong Peng

Validation of computational models with multiple, repeated, and correlated functional responses for a dynamic system requires the consideration of uncertainty quantification and propagation, multivariate data correlation, and objective robust metrics. This paper presents a new method of model validation under uncertainty to address these critical issues. Three key technologies of this new method are uncertainty quantification and propagation using statistical data analysis, probabilistic principal component analysis (PPCA), and interval-based Bayesian hypothesis testing. Statistical data analysis is used to quantify the variabilities of the repeated tests and computer-aided engineering (CAE) model results. The differences between the mean values of test and CAE data are extracted as validation features, and the PPCA is employed to handle multivariate correlation and to reduce the dimension of the multivariate difference curves. The variabilities of the repeated test and CAE data are propagated through the data transformation to the PPCA space. In addition, physics-based thresholds are defined and transformed to the PPCA space. Finally, interval-based Bayesian hypothesis testing is conducted on the reduced difference data to assess the model validity under uncertainty. A real-world dynamic system example with one set of repeated test data and two stochastic CAE models is used to demonstrate this new approach.
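
The last step of this pipeline, interval-based Bayesian hypothesis testing, can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: a single scalar difference feature (as if already reduced), a normal likelihood with a flat prior on the mean, a plug-in sample standard deviation, and the hypothetical names `interval_bayes_test`, `differences`, and `threshold`. It is not the authors' implementation and it omits the PPCA step entirely.

```python
import math
from statistics import mean, stdev


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of Normal(mu, sigma) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def interval_bayes_test(differences, threshold):
    """Posterior probability that the mean test-vs-model difference lies
    inside [-threshold, threshold], plus the matching posterior odds.

    Assumptions (illustrative, not taken from the paper): normal
    likelihood, flat prior on the mean, and a plug-in sample standard
    deviation, so the posterior of the mean is approximately
    Normal(d_bar, s / sqrt(n)).
    """
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two difference samples")
    if threshold <= 0:
        raise ValueError("threshold must be positive")

    d_bar = mean(differences)
    std_err = stdev(differences) / math.sqrt(n)

    if std_err == 0.0:
        # Degenerate case: all samples identical, posterior is a point mass.
        p_inside = 1.0 if abs(d_bar) <= threshold else 0.0
    else:
        p_inside = (normal_cdf(threshold, d_bar, std_err)
                    - normal_cdf(-threshold, d_bar, std_err))
        p_inside = min(max(p_inside, 0.0), 1.0)

    # Posterior odds of "difference within threshold" vs. "outside".
    odds = math.inf if p_inside == 1.0 else p_inside / (1.0 - p_inside)
    return p_inside, odds


# Hypothetical difference samples for two candidate models.
good_model = [0.10, -0.05, 0.02, 0.08, -0.03]
poor_model = [2.00, 2.10, 1.90, 2.05]

p_good, odds_good = interval_bayes_test(good_model, threshold=0.5)
p_poor, odds_poor = interval_bayes_test(poor_model, threshold=0.5)
print(f"good model: P(within threshold) = {p_good:.4f}")
print(f"poor model: P(within threshold) = {p_poor:.4f}")
```

In this sketch the threshold plays the role of the physics-based tolerance mentioned in the abstract: a model is supported when most of the posterior mass of the mean difference falls inside the interval. A multivariate version would apply the same idea to each retained component after dimension reduction.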

