Evolutionary Sample Size and Consilience in Phylogenetic Comparative Analysis

2021
Author(s):
Jacob D Gardner
Chris L Organ

Abstract Phylogenetic comparative methods (PCMs) are commonly used to study evolution and adaptation. However, frequently used PCMs for discrete traits mishandle single evolutionary transitions. They erroneously detect correlated evolution in these situations. For example, hair and mammary glands cannot be said to have evolved in a correlated fashion because each evolved only once in mammals, but a commonly used model (Pagel’s Discrete) statistically supports correlated (dependent) evolution. Using simulations, we find that rate parameter estimation, which is central for model selection, is poor in these scenarios due to small effective (evolutionary) sample sizes of independent character state change. Pagel’s Discrete model also tends to favor dependent evolution in these scenarios, in part, because it forces evolution through state combinations unobserved in the tip data. This model prohibits simultaneous dual transitions along branches. Models with underlying continuous data distributions (e.g., Threshold and GLMM) are less prone to favor correlated evolution but are still susceptible when evolutionary sample sizes are small. We provide three general recommendations for researchers who encounter these common situations: i) create study designs that evaluate a priori hypotheses and maximize evolutionary sample sizes; ii) assess the suitability of evolutionary models—for discrete traits, we introduce the phylogenetic imbalance ratio; and iii) evaluate evolutionary hypotheses with a consilience of evidence from disparate fields, like biogeography and developmental biology. Consilience plays a central role in hypothesis testing within the historical sciences where experiments are difficult or impossible to conduct, such as many hypotheses about correlated evolution. These recommendations are useful for investigations that employ any type of PCM. [Class imbalance; consilience; correlated evolution; evolutionary sample size; phylogenetic comparative methods.]
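The structural constraint the abstract describes can be made concrete. In the dependent (correlated) variant of Pagel's Discrete model, the four joint states of two binary traits evolve under an 8-rate matrix in which simultaneous dual transitions are fixed at zero, so evolution must pass through an intermediate state combination even if that combination is never observed at the tips. A minimal sketch (the rate values are hypothetical placeholders, not estimates):

```python
import numpy as np

# The four joint states of two binary traits, indexed 0..3.
states = ["00", "01", "10", "11"]

# Eight transition rates of the dependent (correlated) model;
# numeric values are hypothetical placeholders.
rates = {(0, 1): 0.1, (0, 2): 0.2, (1, 0): 0.1, (1, 3): 0.3,
         (2, 0): 0.2, (2, 3): 0.3, (3, 1): 0.1, (3, 2): 0.2}

Q = np.zeros((4, 4))
for (i, j), q in rates.items():
    Q[i, j] = q
# Dual transitions (00 <-> 11 and 01 <-> 10) are structurally zero:
# the model must route evolution through an intermediate state
# combination, even one unobserved in the tip data.
np.fill_diagonal(Q, -Q.sum(axis=1))
```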

2020
Author(s):
Evangelia Christodoulou
Maarten van Smeden
Michael Edlinger
Dirk Timmerman
Maria Wanitschek
...

Abstract Background: We suggest an adaptive sample size calculation method for developing clinical prediction models, in which model performance is monitored sequentially as new data comes in. Methods: We illustrate the approach using data for the diagnosis of ovarian cancer (n=5914, 33% event fraction) and obstructive coronary artery disease (CAD; n=4888, 44% event fraction). We used logistic regression to develop a prediction model consisting only of a priori selected predictors and assumed linear relations for continuous predictors. We mimicked prospective patient recruitment by developing the model on 100 randomly selected patients, and we used bootstrapping to internally validate the model. We sequentially added 50 random new patients until we reached a sample size of 3000, and re-estimated model performance at each step. We examined the required sample size for satisfying the following stopping rule: obtaining a calibration slope ≥0.9 and optimism in the c-statistic (ΔAUC) ≤0.02 at two consecutive sample sizes. This procedure was repeated 500 times. We also investigated the impact of alternative modeling strategies: modeling nonlinear relations for continuous predictors, and applying Firth’s bias correction. Results: Better discrimination was achieved in the ovarian cancer data (c-statistic 0.9 with 7 predictors) than in the CAD data (c-statistic 0.7 with 11 predictors). Adequate calibration and limited optimism in discrimination was achieved after a median of 450 patients (interquartile range 450-500) for the ovarian cancer data (22 events per parameter (EPP), 20-24), and 750 patients (700-800) for the CAD data (30 EPP, 28-33). A stricter criterion, requiring ΔAUC ≤0.01, was met with a median of 500 (23 EPP) and 1350 (54 EPP) patients, respectively. These sample sizes were much higher than the well-known 10 EPP rule of thumb and slightly higher than a recently published fixed sample size calculation method by Riley et al. 
Higher sample sizes were required when nonlinear relationships were modeled, and lower sample sizes when Firth’s correction was used. Conclusions: Adaptive sample size determination can be a useful supplement to a priori sample size calculations, because it allows the sample size to be further tailored to the specific prediction modeling context in a dynamic fashion.
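Two ingredients of the monitoring loop described above can be sketched in a few lines: a rank-based c-statistic and a check of the two-consecutive-sample-sizes stopping rule. This is a simplified illustration, not the authors' code; the calibration slope itself (obtained by refitting the outcome on the model's linear predictor in validation data) is taken as given here, and the function names and thresholds simply mirror the abstract.

```python
import numpy as np

def c_statistic(y, p):
    """Rank-based concordance (AUC): the probability that a randomly
    chosen event has a higher predicted risk than a non-event."""
    pos, neg = p[y == 1], p[y == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

def stopping_rule_met(slopes, optimisms, slope_min=0.9, opt_max=0.02):
    """True once calibration slope >= slope_min and bootstrap AUC
    optimism <= opt_max both hold at two consecutive sample sizes."""
    ok = [s >= slope_min and o <= opt_max for s, o in zip(slopes, optimisms)]
    return any(a and b for a, b in zip(ok, ok[1:]))
```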


2021
Author(s):
Christopher McCrum
Jorg van Beek
Charlotte Schumacher
Sanne Janssen
Bas Van Hooren

Background: Context regarding how researchers determine the sample size of their experiments is important for interpreting the results and determining their value and meaning. Between 2018 and 2019, the journal Gait & Posture introduced a requirement for sample size justification in their author guidelines. Research Question: How frequently and in what ways are sample sizes justified in Gait & Posture research articles, and was the inclusion of a guideline requiring sample size justification associated with a change in practice? Methods: The guideline was not in place prior to May 2018 and was in place from 25th July 2019. All articles in the three most recent volumes of the journal (84-86) and the three most recent pre-guideline volumes (60-62) at time of preregistration were included in this analysis. This provided an initial sample of 324 articles (176 pre-guideline and 148 post-guideline). Articles were screened by two authors to extract author data, article metadata and sample size justification data. Specifically, screeners identified if (yes or no) and how sample sizes were justified. Six potential justification types (Measure Entire Population, Resource Constraints, Accuracy, A priori Power Analysis, Heuristics, No Justification) and an additional option of Other/Unsure/Unclear were used. Results: In most cases, authors of Gait & Posture articles did not provide a justification for their study’s sample size. The inclusion of the guideline was associated with a modest increase in the percentage of articles providing a justification (16.6% to 28.1%). A priori power calculations were the dominant type of justification, but many were not reported in enough detail to allow replication. Significance: Gait & Posture researchers should be more transparent in how they determine their sample sizes and carefully consider if they are suitable. 
Editors and journals may consider adding a similar guideline as a low-resource way to improve sample size justification reporting.


PeerJ
2017
Vol 5
pp. e3622
Author(s):
Tao Zhao
Di Liu
Zhiheng Li

The interplay between the pectoral module (the pectoral girdle and limbs) and the pelvic module (the pelvic girdle and limbs) plays a key role in shaping avian evolution, but prior empirical studies on trait covariation between the two modules are limited. Here we empirically test whether (size-corrected) sternal keel length and ilium length are correlated during avian evolution using phylogenetic comparative methods. Our analyses on extant birds and Mesozoic birds both recover a significantly positive correlation. The results provide new evidence regarding the integration between the pelvic and pectoral modules. The correlated evolution of sternal keel length and ilium length may serve as a mechanism to cope with the effect on performance caused by a tradeoff in muscle mass between the pectoral and pelvic modules, via changing moment arms of muscles that function in flight and in terrestrial locomotion.
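The correlation test used above can be illustrated with Felsenstein's independent contrasts on a toy balanced four-taxon tree. The trait values below are hypothetical stand-ins for size-corrected keel and ilium lengths, not the study's data; the tree and contrast arithmetic follow the standard PIC recipe.

```python
import math

def pic_balanced4(x):
    """Felsenstein's independent contrasts on the fixed tree
    ((A:1,B:1):1,(C:1,D:1):1), all branch lengths equal to 1."""
    xA, xB, xC, xD = x
    # Cherry contrasts, standardized by sqrt of summed branch lengths.
    c1 = (xA - xB) / math.sqrt(2.0)
    c2 = (xC - xD) / math.sqrt(2.0)
    # Ancestral values: branch-length-weighted means (equal lengths
    # here, so simple means).
    xAB, xCD = (xA + xB) / 2.0, (xC + xD) / 2.0
    # Each internal branch is lengthened by v1*v2/(v1+v2) = 0.5, so
    # the root contrast is standardized by sqrt(1.5 + 1.5).
    c3 = (xAB - xCD) / math.sqrt(3.0)
    return [c1, c2, c3]

def corr_through_origin(u, v):
    """Correlation of contrasts, constrained through the origin."""
    suv = sum(a * b for a, b in zip(u, v))
    return suv / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

keel  = [1.2, 0.8, 2.0, 1.5]   # hypothetical size-corrected keel lengths
ilium = [2.4, 1.6, 4.0, 3.0]   # hypothetical ilium lengths (exactly 2x keel)
r = corr_through_origin(pic_balanced4(keel), pic_balanced4(ilium))
```

With perfectly proportional traits, the contrast correlation is exactly 1; real data would of course scatter around the fitted relationship.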


2021
Vol 5 (1)
Author(s):
Evangelia Christodoulou
Maarten van Smeden
Michael Edlinger
Dirk Timmerman
Maria Wanitschek
...

Abstract Background We suggest an adaptive sample size calculation method for developing clinical prediction models, in which model performance is monitored sequentially as new data comes in. Methods We illustrate the approach using data for the diagnosis of ovarian cancer (n = 5914, 33% event fraction) and obstructive coronary artery disease (CAD; n = 4888, 44% event fraction). We used logistic regression to develop a prediction model consisting only of a priori selected predictors and assumed linear relations for continuous predictors. We mimicked prospective patient recruitment by developing the model on 100 randomly selected patients, and we used bootstrapping to internally validate the model. We sequentially added 50 random new patients until we reached a sample size of 3000 and re-estimated model performance at each step. We examined the required sample size for satisfying the following stopping rule: obtaining a calibration slope ≥ 0.9 and optimism in the c-statistic (or AUC) ≤ 0.02 at two consecutive sample sizes. This procedure was repeated 500 times. We also investigated the impact of alternative modeling strategies: modeling nonlinear relations for continuous predictors and correcting for bias on the model estimates (Firth’s correction). Results Better discrimination was achieved in the ovarian cancer data (c-statistic 0.9 with 7 predictors) than in the CAD data (c-statistic 0.7 with 11 predictors). Adequate calibration and limited optimism in discrimination was achieved after a median of 450 patients (interquartile range 450–500) for the ovarian cancer data (22 events per parameter (EPP), 20–24) and 850 patients (750–900) for the CAD data (33 EPP, 30–35). A stricter criterion, requiring AUC optimism ≤ 0.01, was met with a median of 500 (23 EPP) and 1500 (59 EPP) patients, respectively. 
These sample sizes were much higher than the well-known 10 EPP rule of thumb and slightly higher than a recently published fixed sample size calculation method by Riley et al. Higher sample sizes were required when nonlinear relationships were modeled, and lower sample sizes when Firth’s correction was used. Conclusions Adaptive sample size determination can be a useful supplement to fixed a priori sample size calculations, because it allows the sample size to be tailored to the specific prediction modeling context in a dynamic fashion.


Rangifer
2003
Vol 23 (5)
pp. 297
Author(s):
Robert D. Otto
Neal P.P. Simon
Serge Couturier
Isabelle Schmelzer

Wildlife radio-telemetry and tracking projects often determine a priori required sample sizes by statistical means or default to the maximum number that can be maintained within a limited budget. After initiation of such projects, little attention is focussed on effective sample size requirements, resulting in a lack of statistical power. The Department of National Defence operates a base in Labrador, Canada for low-level jet fighter training activities, and maintains a sample of satellite collars on the George River caribou herd (GRCH; Rangifer tarandus caribou) of the region for spatial avoidance mitigation purposes. We analysed existing location data, in conjunction with knowledge of life history, to develop estimates of satellite collar sample sizes required to ensure adequate mitigation of GRCH. We chose three levels of probability in each of six annual caribou seasons. The estimated number of collars required ranged from 15 to 52, 23 to 68, and 36 to 184 for 50%, 75%, and 90% probability levels, respectively, depending on season. These estimates can be used to make more informed decisions about mitigation of GRCH, and, more generally, our approach provides a means to adaptively assess radio collar sample sizes for ongoing studies.
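The flavor of such a probability-based collar requirement can be illustrated with a deliberately simplified calculation (not the authors' model, which draws on location data and life history): if collars are distributed at random across the herd, the smallest number of collars giving probability p of at least one collared animal in a subgroup holding a given fraction of the herd follows from 1 - (1 - fraction)^n ≥ p.

```python
import math

def collars_needed(fraction, prob):
    """Smallest n with P(at least one collared animal in a subgroup
    holding `fraction` of the herd) >= prob, assuming collars are
    assigned to animals uniformly at random:
    1 - (1 - fraction)**n >= prob."""
    return math.ceil(math.log(1.0 - prob) / math.log(1.0 - fraction))
```

For example, covering a subgroup that holds 10% of the herd with 90% probability already requires 22 collars under this toy model, which conveys why the authors' higher probability levels demand sample sizes in the dozens to low hundreds.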


Author(s):
Joseph P. Vitta
Christopher Nicklin
Stuart McLean

Abstract In this focused methodological synthesis, the sample construction procedures of 110 second language (L2) instructed vocabulary interventions were assessed in relation to effect size–driven sample-size planning, randomization, and multisite usage. These three areas were investigated because inferential testing makes better generalizations when researchers consider them during the sample construction process. Only nine reports used effect sizes to plan or justify sample sizes in any fashion, with only one engaging in an a priori power procedure referencing vocabulary-centric effect sizes from previous research. Randomized assignment was observed in 56% of the reports while no report involved randomized sampling. Approximately 15% of the samples observed were constructed from multiple sites and none of these empirically investigated the effect of site clustering. Leveraging the synthesized findings, we conclude by offering suggestions for future L2 instructed vocabulary researchers to consider a priori effect size–driven sample planning processes, randomization, and multisite usage when constructing samples.
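The a priori, effect size-driven sample planning recommended above can be sketched with a standard normal-approximation power calculation for a two-group comparison; the alpha and power defaults below are conventional choices, not values drawn from the synthesis.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for detecting a
    standardized mean difference d in a two-group comparison:
    n = 2 * ((z_{1-alpha/2} + z_{power}) / d)**2, rounded up."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)
```

Plugging in a vocabulary-centric effect size from prior research (rather than a generic "medium" effect) is exactly the kind of planning the synthesis found in only one report.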


2017
Author(s):
Jose D. Perezgonzalez

Research often necessitates samples, yet obtaining large enough samples is not always possible. When it is, the researcher may use one of two methods for deciding on the required sample size: rules of thumb, which are quick yet uncertain, and power estimations, which are mathematically precise yet liable to overestimate or underestimate sample sizes when effect sizes are unknown. Misestimated sample sizes have negative repercussions in the form of increased costs, abandoned projects, or abandoned publication of non-significant results. Here I describe a procedure for estimating sample sizes adequate for the testing approach most common in the behavioural, social, and biomedical sciences: Fisher’s tests of significance. The procedure focuses on a desired minimum effect size for the research at hand and finds the minimum sample size required for capturing that effect size as a statistically significant result. In a similar fashion to power analyses, sensitiveness analyses can also be extended to finding the minimum effect for a given sample size a priori, as well as to calculating sensitiveness a posteriori. The article provides a full tutorial for carrying out a sensitiveness analysis, as well as empirical support via simulation.
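The core idea can be sketched as follows: find the smallest sample size at which the chosen minimum effect size just reaches significance. This is a normal-approximation sketch of the idea for a two-group comparison, not the article's exact procedure; note that, unlike the power calculation, no power (z_beta) term appears.

```python
import math
from statistics import NormalDist

def min_n_for_significance(d, alpha=0.05):
    """Smallest per-group n at which a standardized mean difference d
    reaches two-sided significance under a normal approximation:
    the test statistic d * sqrt(n/2) must reach z_{1-alpha/2}."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(2 * (z / d) ** 2)
```

Comparing with a conventional power calculation shows the trade-off: for d = 0.5 the sensitiveness-style minimum is roughly half the sample required for 80% power, because it asks only that the effect be detectable, not that it be detected reliably.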


2021
Author(s):
Cong Liang
Yingjun Deng

Phylogenetic comparative methods are essential in studying the evolution of traits across a phylogeny. Felsenstein's phylogenetic independent contrast (PIC) method and generalized least squares (GLS) regression are often used to test whether evolutionary changes between traits are correlated. However, the PIC method assumes a neutral Brownian model, which degrades its performance when the trait is subject to adaptation. In recent years, the Ornstein-Uhlenbeck (OU) model has attracted increasing attention for studying the evolution of traits under stabilizing selection. In this study, we extended Felsenstein's PIC method to the OU model, which we termed OU-PIC. We simulated trait evolution under the OU model on phylogenetic trees with 8, 10, and 55 species. Compared to the PIC method, the OU-PIC method with correct stabilizing selection parameters achieved an appropriate type I error rate, the highest test power, and the lowest mean squared error. We presented a concise proof of the intrinsic connection between OU-PIC and GLS regression in evaluating correlated evolution under the OU model. The OU-PIC method has a broad range of applications whenever trait evolution can be sufficiently modeled by an OU process. Compared with other phylogenetic comparative methods, OU-PIC avoids inverting the covariance matrix, which facilitates the analysis of correlated evolution on large phylogenies.
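The OU model underlying this approach implies a specific covariance between tip values, which is what a GLS-based analysis would assemble into the covariance matrix that OU-PIC avoids inverting. A minimal sketch of that covariance (Hansen's 1997 form, assuming an ultrametric tree and a fixed root value; parameter defaults are placeholders):

```python
import math

def ou_cov(shared, dist, sigma2=1.0, alpha=1.0):
    """Covariance between two tip values under an OU process with
    stationary variance sigma2/(2*alpha) and selection strength alpha
    on an ultrametric tree with a fixed root value: `shared` is the
    time from the root jointly traversed by the two lineages and
    `dist` is their patristic distance."""
    return (sigma2 / (2 * alpha)) * math.exp(-alpha * dist) \
        * (1 - math.exp(-2 * alpha * shared))
```

As alpha approaches zero this expression recovers the Brownian covariance sigma2 * shared, which is why PIC is the neutral special case; large alpha instead shrinks covariance between distant tips toward zero, reflecting stabilizing selection erasing shared history.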


2020
Vol 64 (4)
pp. 40412-1-40412-11
Author(s):
Kexin Bai
Qiang Li
Ching-Hsin Wang

Abstract To address the issues of the relatively small size of brain tumor image datasets, severe class imbalance, and low precision in existing segmentation algorithms for brain tumor images, this study proposes a two-stage segmentation algorithm integrating convolutional neural networks (CNNs) and conventional methods. Four modalities of the original magnetic resonance images were first preprocessed separately. Next, preliminary segmentation was performed using an improved U-Net CNN containing deep supervision, residual structures, dense connection structures, and dense skip connections. The authors adopted a multiclass Dice loss function to deal with class imbalance and successfully prevented overfitting using data augmentation. The preliminary segmentation results subsequently served as the a priori knowledge for a continuous max-flow algorithm for fine segmentation of target edges. Experiments revealed that the mean Dice similarity coefficients of the proposed algorithm in whole tumor, tumor core, and enhancing tumor segmentation were 0.9072, 0.8578, and 0.7837, respectively. The proposed algorithm presents higher accuracy and better stability in comparison with some of the more advanced segmentation algorithms for brain tumor images.
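A multiclass soft Dice loss of the kind the abstract mentions can be sketched in a few lines; the array layout (classes, voxels) and the epsilon smoothing term are assumptions for illustration, not details from the paper.

```python
import numpy as np

def multiclass_dice_loss(probs, onehot, eps=1e-6):
    """Mean soft Dice loss over classes. `probs` holds predicted class
    probabilities and `onehot` the one-hot ground truth, both of shape
    (classes, voxels). Averaging the Dice score per class counteracts
    class imbalance: a small class (e.g. enhancing tumor) contributes
    as much to the loss as the dominant background class."""
    inter = (probs * onehot).sum(axis=1)
    denom = probs.sum(axis=1) + onehot.sum(axis=1)
    dice = (2.0 * inter + eps) / (denom + eps)
    return 1.0 - dice.mean()
```

A perfect prediction yields a loss of zero; an entirely empty prediction approaches one, regardless of how few voxels the missed class occupies, which is precisely the imbalance-robust behavior a voxel-wise cross-entropy lacks.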

