Evaluation of satellite collar sample size requirements for mitigation of low-level military jet disturbance of the George River caribou herd

Rangifer ◽  
2003 ◽  
Vol 23 (5) ◽  
pp. 297 ◽  
Author(s):  
Robert D. Otto ◽  
Neal P.P. Simon ◽  
Serge Couturier ◽  
Isabelle Schmelzer

Wildlife radio-telemetry and tracking projects often determine a priori required sample sizes by statistical means or default to the maximum number that can be maintained within a limited budget. After initiation of such projects, little attention is focussed on effective sample size requirements, resulting in a lack of statistical power. The Department of National Defence operates a base in Labrador, Canada for low-level jet fighter training activities, and maintains a sample of satellite collars on the George River caribou herd (GRCH; Rangifer tarandus caribou) of the region for spatial avoidance mitigation purposes. We analysed existing location data, in conjunction with knowledge of life history, to develop estimates of satellite collar sample sizes required to ensure adequate mitigation of the GRCH. We chose three levels of probability in each of six annual caribou seasons. The estimated number of collars required ranged from 15 to 52, 23 to 68, and 36 to 184 for the 50%, 75%, and 90% probability levels, respectively, depending on season. These estimates can be used to make more informed decisions about mitigation of the GRCH, and, more generally, our approach provides a means to adaptively assess radio collar sample sizes for ongoing studies.
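
One simple way such collar-count requirements can arise (an illustration, not necessarily the model used in the study) is a binomial argument: if collars are spread at random through the herd, the chance that a subgroup holding a fraction f of the animals carries at least one collar is 1 − (1 − f)^n. A minimal Python sketch, where the 5% group fraction is an assumed, illustrative value:

```python
import math

def collars_required(group_fraction: float, probability: float) -> int:
    """Smallest n such that P(at least one of n collars is in a subgroup
    holding `group_fraction` of the herd) >= `probability`, assuming
    collars are distributed at random among animals."""
    # P(no collar in the group) = (1 - f)^n  =>  n >= ln(1-p) / ln(1-f)
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - group_fraction))

# e.g. to be 90% sure that a subgroup containing 5% of the herd
# carries at least one collar:
n = collars_required(0.05, 0.90)   # 45 collars
```

Under these toy assumptions the answer (45) falls inside the 36-184 range reported for the 90% level, but the study's seasonal estimates rest on actual location data, not this simplification.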

2018 ◽  
Vol 53 (7) ◽  
pp. 716-719
Author(s):  
Monica R. Lininger ◽  
Bryan L. Riemann

Objective: To describe the concept of statistical power as related to comparative interventions and how various factors, including sample size, affect statistical power. Background: Having a sufficiently sized sample for a study is necessary for an investigation to demonstrate that an effective treatment is statistically superior. Many researchers fail to conduct and report a priori sample-size estimates, which then makes it difficult to interpret nonsignificant results and causes the clinician to question the planning of the research design. Description: Statistical power is the probability of statistically detecting a treatment effect when one truly exists. The α level, the effect size (a measure of the difference between groups), the variability of the data, and the sample size all affect statistical power. Recommendations: Authors should conduct and provide the results of a priori sample-size estimations in the literature. This will assist clinicians in determining whether the lack of a statistically significant treatment effect is due to an underpowered study or to a treatment's actually having no effect.
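
The interplay of α, effect size, variability, and sample size can be made concrete with a standard a priori calculation. The sketch below uses a normal approximation to the two-sample t-test, so it slightly underestimates the exact t-based n; the inputs are illustrative:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison of
    means (normal approximation to the t-test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value of the two-sided test
    z_beta = z.inv_cdf(power)            # quantile matching the target power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# A medium standardized effect (d = 0.5) at alpha = .05 and 80% power:
n = n_per_group(0.5)   # 63 per group (the exact t-based answer is 64)
```

Halving the assumed effect size roughly quadruples the required n, which is why an honest a priori effect size estimate matters so much.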


2019 ◽  
Author(s):  
Peter E Clayson ◽  
Kaylie Amanda Carbine ◽  
Scott Baldwin ◽  
Michael J. Larson

Methodological reporting guidelines for studies of event-related potentials (ERPs) were updated in Psychophysiology in 2014. These guidelines facilitate the communication of key methodological parameters (e.g., preprocessing steps). Failing to report key parameters represents a barrier to replication efforts, and difficulty with replicability increases in the presence of small sample sizes and low statistical power. We assessed whether the guidelines are followed and estimated the average sample size and power in recent research. Reporting behavior, sample sizes, and statistical designs were coded for 150 randomly sampled articles from five high-impact journals that frequently published ERP research from 2011 to 2017. An average of 63% of guidelines were reported, and reporting behavior was similar across journals, suggesting that gaps in reporting are a shortcoming of the field rather than of any specific journal. Publication of the guidelines paper had no impact on reporting behavior, suggesting that editors and peer reviewers are not enforcing these recommendations. The average sample size per group was 21. Statistical power was conservatively estimated as .72-.98 for a large effect size, .35-.73 for a medium effect, and .10-.18 for a small effect. These findings indicate that failing to report key guidelines is ubiquitous and that ERP studies are primarily powered to detect large effects. Such low power and insufficient following of reporting guidelines represent substantial barriers to replication efforts. The methodological transparency and replicability of studies can be improved by the open sharing of processing code and experimental tasks and by a priori sample size calculations to ensure adequately powered studies.
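
The reported power ranges can be approximated by running the calculation in reverse: fixing the group size at the observed average (n = 21) and solving for power at conventional effect sizes. A rough sketch using a normal approximation (which ignores the small contribution of the opposite tail):

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample test of means
    (normal approximation; the opposite-tail term is ignored)."""
    z = NormalDist()
    ncp = d * sqrt(n_per_group / 2)          # noncentrality for equal groups
    return z.cdf(ncp - z.inv_cdf(1 - alpha / 2))

# With the article's average of 21 participants per group, for
# conventional large, medium, and small effects:
powers = [power_two_sample(d, 21) for d in (0.8, 0.5, 0.2)]
```

The resulting values (≈.74, ≈.37, and ≈.09) sit near the lower ends of the ranges estimated above, consistent with the conclusion that such studies are powered mainly for large effects.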


2019 ◽  
Author(s):  
Rob Cribbie ◽  
Nataly Beribisky ◽  
Udi Alter

Many bodies recommend that a sample planning procedure, such as a traditional NHST a priori power analysis, be conducted during the planning stages of a study. Power analysis allows the researcher to estimate how many participants are required to detect a minimally meaningful effect size at a specific level of power and Type I error rate. However, there are several drawbacks to the procedure that render it “a mess.” Specifically, the identification of the minimally meaningful effect size is often difficult but unavoidable for conducting the procedure properly, the procedure is not precision oriented, and it does not guide the researcher to collect as many participants as is feasible. In this study, we explore how these three theoretical issues are reflected in applied psychological research in order to better understand whether they are concerns in practice. To investigate how power analysis is currently used, we reviewed the reporting of 443 power analyses in high-impact psychology journals in 2016 and 2017. We found that researchers rarely use the minimally meaningful effect size as the rationale for the effect size chosen in a power analysis. Further, precision-based approaches and collecting the maximum feasible sample size are almost never used in tandem with power analyses. In light of these findings, we suggest that researchers focus on tools beyond traditional power analysis when planning samples, such as collecting the maximum sample size that is feasible.
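
The precision-based alternative mentioned above plans the sample around confidence interval width rather than power. A minimal sketch under the simplifying assumption of a known, common SD (the half-width target is illustrative):

```python
from math import ceil
from statistics import NormalDist

def n_for_ci_halfwidth(sd: float, halfwidth: float, conf: float = 0.95) -> int:
    """Per-group n so that the confidence interval for a difference of
    two group means has roughly the requested half-width (known-SD sketch)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    # half-width h = z * sd * sqrt(2/n)  =>  n = 2 * (z * sd / h)^2
    return ceil(2 * (z * sd / halfwidth) ** 2)

# Aim for a 95% CI no wider than +/- 0.5 SD around the mean difference:
n = n_for_ci_halfwidth(sd=1.0, halfwidth=0.5)   # 31 per group
```

Note that no minimally meaningful effect size is needed here; the researcher instead commits to how precisely the effect should be estimated.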


2021 ◽  
Vol 3 (1) ◽  
pp. 61-89
Author(s):  
Stefan Geiß

Abstract This study uses Monte Carlo simulation techniques to estimate the minimum required levels of intercoder reliability in content analysis data for testing correlational hypotheses, depending on sample size, effect size, and coder behavior under uncertainty. The ensuing procedure is analogous to power calculations for experimental designs. In most widespread sample size/effect size settings, the rule of thumb that chance-adjusted agreement should be ≥.800 or ≥.667 corresponds to the simulation results, yielding acceptable α and β error rates. However, this simulation allows precise power calculations that consider the specifics of each study’s context, moving beyond one-size-fits-all recommendations. Studies with low sample sizes and/or low expected effect sizes may need coder agreement above .800 to test a hypothesis with sufficient statistical power. In studies with high sample sizes and/or high expected effect sizes, coder agreement below .667 may suffice. Such calculations can help in both evaluating and designing studies. Particularly in pre-registered research, higher sample sizes may be used to compensate for low expected effect sizes and/or borderline coding reliability (e.g. when constructs are hard to measure). I supply equations, easy-to-use tables, and R functions to facilitate use of this framework, along with example code as an online appendix.
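
The link between coding reliability and power can be illustrated without re-running the paper's simulations. Under classical measurement assumptions, unreliable coding attenuates an observed correlation by roughly the square root of the reliability, and a Fisher-z approximation then yields the power. This analytic shortcut is an assumption-laden stand-in for the Monte Carlo procedure described above, not a reproduction of it:

```python
from math import atanh, sqrt
from statistics import NormalDist

def correlation_power(r_true: float, reliability: float, n: int,
                      alpha: float = 0.05) -> float:
    """Approximate power to detect a correlation when one variable is
    coded with the given reliability (classical attenuation plus a
    Fisher-z normal approximation)."""
    z = NormalDist()
    r_obs = r_true * sqrt(reliability)   # attenuated (observable) correlation
    return z.cdf(sqrt(n - 3) * atanh(r_obs) - z.inv_cdf(1 - alpha / 2))

# Low reliability can be offset by a larger sample, as the abstract notes:
p_small = correlation_power(r_true=0.3, reliability=0.6, n=150)
p_large = correlation_power(r_true=0.3, reliability=0.6, n=400)
```

Here a true r of .30 coded at reliability .60 is detected with about 82% power at n = 150 but near-certain power at n = 400, mirroring the trade-off between sample size and coding reliability.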


2020 ◽  
Author(s):  
Chia-Lung Shih ◽  
Te-Yu Hung

Abstract Background: A small sample size (n < 30 for each treatment group) is usually enrolled to investigate the differences in efficacy between treatments for knee osteoarthritis (OA). The objective of this study was to use simulation to compare the power of four statistical methods for the analysis of small samples when detecting differences in efficacy between two treatments for knee OA. Methods: A total of 10,000 replicates of 5 sample sizes (n=10, 15, 20, 25, and 30 for each group) were generated based on previously reported measures of treatment efficacy. Four statistical methods were used to compare the differences in efficacy between treatments: the two-sample t-test (t-test), the Mann-Whitney U-test (M-W test), the Kolmogorov-Smirnov test (K-S test), and the permutation test (perm-test). Results: The bias of the simulated parameter means showed a decreasing trend with sample size, but the CV% of the simulated parameter means varied with sample size for all parameters. For the largest sample size (n=30), the CV% could achieve a small level (<20%) for almost all parameters but the bias could not. Among the non-parametric tests for the analysis of small samples, the perm-test had the highest statistical power, and its false positive rate was not affected by sample size. However, the power of the perm-test could not achieve a high value (80%) even at the largest sample size (n=30). Conclusion: The perm-test is suggested for the analysis of small samples when comparing the differences in efficacy between two treatments for knee OA.
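
A permutation test of the kind compared above is straightforward to implement. A self-contained sketch for the difference of group means, using random permutations with an add-one p-value correction (the two groups shown are made-up data, not the study's):

```python
import random

def perm_test(x, y, n_perm: int = 9999, seed: int = 1) -> float:
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                       # random relabeling
        xs, ys = pooled[:len(x)], pooled[len(x):]
        if abs(sum(xs) / len(xs) - sum(ys) / len(ys)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)              # add-one correction

# Two clearly separated "treatment arms" of n = 10 each:
p = perm_test(list(range(1, 11)), list(range(11, 21)))
```

Because the test conditions on the observed data, its false positive rate stays near the nominal level even at very small n, which matches the result reported above.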


Scientifica ◽  
2016 ◽  
Vol 2016 ◽  
pp. 1-5 ◽  
Author(s):  
R. Eric Heidel

Statistical power is the ability to detect a significant effect, given that the effect actually exists in a population. Like most statistical concepts, statistical power tends to induce cognitive dissonance in hepatology researchers. However, planning for statistical power by an a priori sample size calculation is of paramount importance when designing a research study. There are five specific empirical components that make up an a priori sample size calculation: the scale of measurement of the outcome, the research design, the magnitude of the effect size, the variance of the effect size, and the sample size. A framework grounded in the phenomenon of isomorphism, or interdependencies amongst different constructs with similar forms, will be presented to understand the isomorphic effects of decisions made on each of the five aforementioned components of statistical power.
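
The interdependence of these components can be demonstrated by varying one input at a time in a simple a priori calculation. A sketch using a normal approximation to the two-sample case, with illustrative inputs:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-sample comparison
    of means (normal approximation)."""
    z = NormalDist()
    return ceil(2 * ((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / d) ** 2)

# Each decision ripples through the required sample size:
base = n_per_group(0.5)                    # d=.5, alpha=.05, power=.80 -> 63
halved_effect = n_per_group(0.25)          # smaller effect -> ~4x the n
stricter_alpha = n_per_group(0.5, alpha=0.01)
more_power = n_per_group(0.5, power=0.90)
```

No component can be chosen in isolation: tightening α or raising the power target inflates n, and halving the expected effect size roughly quadruples it.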


2020 ◽  
Author(s):  
Evangelia Christodoulou ◽  
Maarten van Smeden ◽  
Michael Edlinger ◽  
Dirk Timmerman ◽  
Maria Wanitschek ◽  
...  

Abstract Background: We suggest an adaptive sample size calculation method for developing clinical prediction models, in which model performance is monitored sequentially as new data come in. Methods: We illustrate the approach using data for the diagnosis of ovarian cancer (n=5914, 33% event fraction) and obstructive coronary artery disease (CAD; n=4888, 44% event fraction). We used logistic regression to develop a prediction model consisting only of a priori selected predictors and assumed linear relations for continuous predictors. We mimicked prospective patient recruitment by developing the model on 100 randomly selected patients, and we used bootstrapping to internally validate the model. We sequentially added 50 random new patients until we reached a sample size of 3000, and re-estimated model performance at each step. We examined the sample size required to satisfy the following stopping rule: obtaining a calibration slope ≥0.9 and optimism in the c-statistic (ΔAUC) ≤0.02 at two consecutive sample sizes. This procedure was repeated 500 times. We also investigated the impact of alternative modeling strategies: modeling nonlinear relations for continuous predictors, and applying Firth’s bias correction. Results: Better discrimination was achieved in the ovarian cancer data (c-statistic 0.9 with 7 predictors) than in the CAD data (c-statistic 0.7 with 11 predictors). Adequate calibration and limited optimism in discrimination were achieved after a median of 450 patients (interquartile range 450-500) for the ovarian cancer data (22 events per parameter (EPP), 20-24), and 750 patients (700-800) for the CAD data (30 EPP, 28-33). A stricter criterion, requiring ΔAUC ≤0.01, was met with a median of 500 (23 EPP) and 1350 (54 EPP) patients, respectively. These sample sizes were much higher than the well-known rule of thumb of 10 EPP, and slightly higher than those from a recently published fixed sample size calculation method by Riley et al. Higher sample sizes were required when nonlinear relationships were modeled, and lower sample sizes when Firth’s correction was used. Conclusions: Adaptive sample size determination can be a useful supplement to a priori sample size calculations because it allows the sample size to be tailored dynamically to the specific prediction modeling context.
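
The stopping rule itself is easy to separate from the modeling machinery: given the sequence of interim performance estimates, find the first sample size at which both criteria hold at two consecutive steps. A sketch with made-up monitoring values; the model fitting and bootstrapping that would produce them are omitted:

```python
def first_adequate_n(steps, min_slope=0.9, max_dauc=0.02):
    """Return the first sample size at which the stopping rule holds at
    two consecutive steps. `steps` is a list of (n, calibration_slope,
    optimism_in_auc) tuples ordered by n; returns None if never met."""
    ok_prev = False
    for n, slope, dauc in steps:
        ok = slope >= min_slope and dauc <= max_dauc
        if ok and ok_prev:
            return n
        ok_prev = ok
    return None

# Hypothetical interim results at n = 100, 150, ...:
steps = [(100, 0.70, 0.08), (150, 0.85, 0.05), (200, 0.92, 0.02),
         (250, 0.91, 0.015), (300, 0.95, 0.01)]
n_stop = first_adequate_n(steps)   # 250: the second consecutive adequate step
```

Requiring two consecutive adequate steps, as the authors do, guards against stopping on a single lucky interim estimate.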


2021 ◽  
Author(s):  
Benjamin J Burgess ◽  
Michelle C Jackson ◽  
David J Murrell

1. Most ecosystems are subject to co-occurring, anthropogenically driven changes, and understanding how these multiple stressors interact is a pressing concern. Stressor interactions are typically studied using null models, with the additive and multiplicative null expectations being the most widely applied. Such approaches classify interactions as synergistic, antagonistic, reversal, or indistinguishable from the null expectation. Despite their widespread use, there has been no thorough analysis of these null models, nor a systematic test of the robustness of their results to sample size or to sampling error in the estimates of the responses to stressors. 2. We use data simulated from food web models where the true stressor interactions are known, together with analytical results based on the null model equations, to uncover how (i) sample size, (ii) variation in biological responses to the stressors, and (iii) statistical significance affect the ability to detect non-null interactions. 3. Our analyses lead to three main results. Firstly, it is clear that the additive and multiplicative null models are not directly comparable, and over one third of all simulated interactions had classifications that were model dependent. Secondly, both null models have weak power to correctly classify interactions at commonly implemented sample sizes (i.e., ≤6 replicates) unless data uncertainty is unrealistically low. This means all but the most extreme interactions are indistinguishable from the null model expectation. Thirdly, we show that increasing sample size increases the power to detect the true interactions, but only very slowly; the biggest gains come from increasing replication from 3 up to 25 replicates, and we provide an R function for users to determine the sample sizes required to detect a critical effect size of biological interest under the additive model. 4. Our results will aid researchers in the design of their experiments and the subsequent interpretation of results. We find no clear statistical advantage of using one null model over the other and argue that null model choice should be based on biological relevance rather than statistical properties. However, there is a pressing need to increase experimental sample sizes, otherwise many biologically important synergistic and antagonistic stressor interactions will continue to be missed.
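
The two null expectations, and the way a single observation can be classified differently under each, can be shown in a few lines. A toy sketch assuming both stressors reduce the response and ignoring sampling error, which the analysis above shows is the crux in practice:

```python
def null_expectations(control, only_a, only_b):
    """Expected joint-stressor response under the two common null models,
    for responses measured on the same scale as the control."""
    additive = only_a + only_b - control          # effects add
    multiplicative = only_a * only_b / control    # proportional effects multiply
    return additive, multiplicative

def classify(observed, expected, tol=0.0):
    """Assuming both stressors reduce the response: joint responses below
    the null expectation are synergistic, above it antagonistic. A real
    analysis must also test whether the deviation exceeds sampling error."""
    if abs(observed - expected) <= tol:
        return "indistinguishable from null"
    return "synergistic" if observed < expected else "antagonistic"

# Control biomass 100; stressor A alone leaves 80, B alone leaves 70:
add_exp, mult_exp = null_expectations(100, 80, 70)   # 50 vs 56
labels = (classify(52, add_exp), classify(52, mult_exp))
```

An observed joint response of 52 is antagonistic under the additive null (expected 50) yet synergistic under the multiplicative null (expected 56), illustrating why over a third of the simulated interactions were classified model-dependently.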


2021 ◽  
Author(s):  
Christopher McCrum ◽  
Jorg van Beek ◽  
Charlotte Schumacher ◽  
Sanne Janssen ◽  
Bas Van Hooren

Background: Context regarding how researchers determine the sample size of their experiments is important for interpreting the results and determining their value and meaning. Between 2018 and 2019, the journal Gait & Posture introduced a requirement for sample size justification in its author guidelines. Research Question: How frequently, and in what ways, are sample sizes justified in Gait & Posture research articles, and was the inclusion of a guideline requiring sample size justification associated with a change in practice? Methods: The guideline was not in place prior to May 2018 and was in place from 25th July 2019. All articles in the three most recent volumes of the journal (84-86) and the three most recent pre-guideline volumes (60-62) at the time of preregistration were included in this analysis. This provided an initial sample of 324 articles (176 pre-guideline and 148 post-guideline). Articles were screened by two authors to extract author data, article metadata, and sample size justification data. Specifically, screeners identified whether (yes or no) and how sample sizes were justified. Six potential justification types (Measure Entire Population, Resource Constraints, Accuracy, A priori Power Analysis, Heuristics, No Justification) and an additional option of Other/Unsure/Unclear were used. Results: In most cases, authors of Gait & Posture articles did not provide a justification for their study's sample size. The inclusion of the guideline was associated with a modest increase in the percentage of articles providing a justification (16.6% to 28.1%). A priori power calculations were the dominant type of justification, but many were not reported in enough detail to allow replication. Significance: Gait & Posture researchers should be more transparent in how they determine their sample sizes and should carefully consider whether they are suitable. Editors and journals may consider adding a similar guideline as a low-resource way to improve sample size justification reporting.

