Common Medical and Statistical Problems: The Dilemma of the Sample Size Calculation for Sensitivity and Specificity Estimation

Mathematics ◽  
2020 ◽  
Vol 8 (8) ◽  
pp. 1258
Author(s):  
M. Rosário Oliveira ◽  
Ana Subtil ◽  
Luzia Gonçalves

Sample size calculation in biomedical practice is typically based on the problematic Wald method for a binomial proportion, with potentially dangerous consequences. This work highlights the need to incorporate the concept of conditional probability into sample size determination, to avoid reduced sample sizes that lead to inadequate confidence intervals. Therefore, new definitions are proposed for the coverage probability and expected length of confidence intervals for conditional probabilities, such as sensitivity and specificity. The new definitions were used to assess seven confidence interval estimation methods. To determine the sample size, two procedures were developed for each estimation method: an optimal one, based on the new definitions, and an approximation. Our findings confirm the similarity of the approximated sample sizes to the optimal ones. R code is provided to disseminate these methodological advances and translate them into biomedical practice.
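
As a rough illustration of why conditioning matters, the sketch below applies the textbook Wald sample-size formula to an anticipated sensitivity and then inflates the total sample size by the expected prevalence, since only diseased subjects contribute to the sensitivity estimate. The numbers (sensitivity 0.90, half-width 0.05, prevalence 0.20) are illustrative, and the code is a minimal Python sketch rather than the R code accompanying the paper.

```python
from math import ceil
from scipy.stats import norm

def wald_n(p, half_width, conf=0.95):
    """Standard Wald sample size for a single binomial proportion:
    n = z^2 * p * (1 - p) / d^2, where d is the desired half-width."""
    z = norm.ppf(1 - (1 - conf) / 2)
    return ceil(z**2 * p * (1 - p) / half_width**2)

# Illustrative values (not from the paper): anticipated sensitivity 0.90,
# desired half-width 0.05, disease prevalence 0.20 in the sampled population.
sens, d, prevalence = 0.90, 0.05, 0.20

n_diseased = wald_n(sens, d)              # cases needed to estimate sensitivity
n_total = ceil(n_diseased / prevalence)   # total subjects, since only a fraction are diseased

print(n_diseased, n_total)                # 139 diseased subjects -> about 695 subjects overall
```

Ignoring the conditioning step (i.e., planning only for n_diseased) would understate the study size by a factor of roughly one over the prevalence, which is the kind of reduced sample size the abstract warns against.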

2020 ◽  
Author(s):  
Evangelia Christodoulou ◽  
Maarten van Smeden ◽  
Michael Edlinger ◽  
Dirk Timmerman ◽  
Maria Wanitschek ◽  
...  

Abstract Background: We suggest an adaptive sample size calculation method for developing clinical prediction models, in which model performance is monitored sequentially as new data comes in. Methods: We illustrate the approach using data for the diagnosis of ovarian cancer (n=5914, 33% event fraction) and obstructive coronary artery disease (CAD; n=4888, 44% event fraction). We used logistic regression to develop a prediction model consisting only of a priori selected predictors and assumed linear relations for continuous predictors. We mimicked prospective patient recruitment by developing the model on 100 randomly selected patients, and we used bootstrapping to internally validate the model. We sequentially added 50 random new patients until we reached a sample size of 3000, and re-estimated model performance at each step. We examined the required sample size for satisfying the following stopping rule: obtaining a calibration slope ≥ 0.9 and optimism in the c-statistic (ΔAUC) ≤ 0.02 at two consecutive sample sizes. This procedure was repeated 500 times. We also investigated the impact of alternative modeling strategies: modeling nonlinear relations for continuous predictors, and applying Firth’s bias correction. Results: Better discrimination was achieved in the ovarian cancer data (c-statistic 0.9 with 7 predictors) than in the CAD data (c-statistic 0.7 with 11 predictors). Adequate calibration and limited optimism in discrimination were achieved after a median of 450 patients (interquartile range 450–500) for the ovarian cancer data (22 events per parameter (EPP), 20–24), and 750 patients (700–800) for the CAD data (30 EPP, 28–33). A stricter criterion, requiring ΔAUC ≤ 0.01, was met with a median of 500 (23 EPP) and 1350 (54 EPP) patients, respectively. These sample sizes were much higher than the well-known 10 EPP rule of thumb and slightly higher than a recently published fixed sample size calculation method by Riley et al. Higher sample sizes were required when nonlinear relationships were modeled, and lower sample sizes when Firth’s correction was used. Conclusions: Adaptive sample size determination can be a useful supplement to a priori sample size calculations, because it makes it possible to further tailor the sample size to the specific prediction modeling context in a dynamic fashion.
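
The stopping rule hinges on two internally validated quantities: the calibration slope and the optimism in the c-statistic. Below is a minimal Python sketch of how these could be computed with bootstrapping for a logistic model with pre-specified predictors; the function name, the number of bootstrap replicates, and the use of a large C to approximate unpenalized maximum likelihood are our choices for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def bootstrap_validation(X, y, n_boot=200, seed=0):
    """Bootstrap internal validation of a logistic model with pre-specified predictors.
    X is an (n, p) array, y a 0/1 array. Returns (mean AUC optimism, mean calibration slope)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    optimism, slopes = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)                        # draw a bootstrap resample
        Xb, yb = X[idx], y[idx]
        if yb.min() == yb.max():                           # skip resamples with a single class
            continue
        # large C approximates unpenalized maximum likelihood
        m = LogisticRegression(C=1e6, max_iter=1000).fit(Xb, yb)
        auc_boot = roc_auc_score(yb, m.decision_function(Xb))   # apparent AUC in the resample
        lp_orig = m.decision_function(X)                   # linear predictor on the original data
        auc_orig = roc_auc_score(y, lp_orig)               # AUC tested on the original data
        optimism.append(auc_boot - auc_orig)
        # calibration slope: logistic regression of the outcome on the fixed linear predictor
        cal = LogisticRegression(C=1e6, max_iter=1000).fit(lp_orig.reshape(-1, 1), y)
        slopes.append(cal.coef_[0, 0])
    return float(np.mean(optimism)), float(np.mean(slopes))
```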


2021 ◽  
Vol 5 (1) ◽  
Author(s):  
Evangelia Christodoulou ◽  
Maarten van Smeden ◽  
Michael Edlinger ◽  
Dirk Timmerman ◽  
Maria Wanitschek ◽  
...  

Abstract Background We suggest an adaptive sample size calculation method for developing clinical prediction models, in which model performance is monitored sequentially as new data comes in. Methods We illustrate the approach using data for the diagnosis of ovarian cancer (n = 5914, 33% event fraction) and obstructive coronary artery disease (CAD; n = 4888, 44% event fraction). We used logistic regression to develop a prediction model consisting only of a priori selected predictors and assumed linear relations for continuous predictors. We mimicked prospective patient recruitment by developing the model on 100 randomly selected patients, and we used bootstrapping to internally validate the model. We sequentially added 50 random new patients until we reached a sample size of 3000 and re-estimated model performance at each step. We examined the required sample size for satisfying the following stopping rule: obtaining a calibration slope ≥ 0.9 and optimism in the c-statistic (or AUC) ≤ 0.02 at two consecutive sample sizes. This procedure was repeated 500 times. We also investigated the impact of alternative modeling strategies: modeling nonlinear relations for continuous predictors and correcting for bias in the model estimates (Firth’s correction). Results Better discrimination was achieved in the ovarian cancer data (c-statistic 0.9 with 7 predictors) than in the CAD data (c-statistic 0.7 with 11 predictors). Adequate calibration and limited optimism in discrimination were achieved after a median of 450 patients (interquartile range 450–500) for the ovarian cancer data (22 events per parameter (EPP), 20–24) and 850 patients (750–900) for the CAD data (33 EPP, 30–35). A stricter criterion, requiring AUC optimism ≤ 0.01, was met with a median of 500 (23 EPP) and 1500 (59 EPP) patients, respectively. These sample sizes were much higher than the well-known 10 EPP rule of thumb and slightly higher than a recently published fixed sample size calculation method by Riley et al. Higher sample sizes were required when nonlinear relationships were modeled, and lower sample sizes when Firth’s correction was used. Conclusions Adaptive sample size determination can be a useful supplement to fixed a priori sample size calculations, because it makes it possible to tailor the sample size to the specific prediction modeling context in a dynamic fashion.
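
A sketch of the adaptive recruitment loop itself is shown below, reusing the bootstrap_validation function sketched after the preceding entry. The stopping rule (calibration slope ≥ 0.9 and AUC optimism ≤ 0.02 at two consecutive sample sizes) follows the abstract; everything else (function names, the random recruitment order, step sizes) is an assumption for illustration.

```python
import numpy as np

def adaptive_sample_size(X_all, y_all, start=100, step=50, max_n=3000,
                         slope_min=0.9, optimism_max=0.02, seed=0):
    """Mimic prospective recruitment: grow the development set in steps of `step`
    patients and stop once the rule holds at two consecutive sample sizes."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y_all))          # random recruitment order
    n, met_before = start, False
    while n <= min(max_n, len(y_all)):
        idx = order[:n]
        optimism, slope = bootstrap_validation(X_all[idx], y_all[idx])
        met_now = (slope >= slope_min) and (optimism <= optimism_max)
        if met_now and met_before:
            return n                             # required sample size for this run
        met_before = met_now
        n += step
    return None                                  # rule never satisfied within max_n
```

Repeating this over many random recruitment orders (the paper uses 500 repetitions) yields the distribution of required sample sizes summarized in the Results.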


2020 ◽  
Author(s):  
David Douglas Newstein

Abstract Background: The assumption that the sampling distribution of the crude odds ratio (ORcrude) is a lognormal distribution with parameters mu and sigma leads to the incorrect conclusion that the expectation of the log of ORcrude is equal to the parameter mu. Here, the standard method of point and interval estimation (I) is compared with a modified method utilizing ORstar, where ln(ORstar) = ln(ORcrude) − sigma²/2. Methods: Confidence intervals are obtained utilizing ln(ORstar) by parametric bootstrap simulations with a percentile-derived confidence interval (II), by a simple calculation in which ln(ORcrude) is replaced with ln(ORstar) in the standard formula (III), and by a method proposed by Barendregt (IV), who also noted the bias present in estimating ORtrue by ORcrude. Simulations are conducted for a “protective” exposure (ORtrue < 1) as well as for a “harmful” exposure (ORtrue > 1). Results: In the simulations, estimation methods II and III exhibited the highest level of statistical conclusion validity for their confidence intervals, as indicated by one minus the coverage probability being close to alpha. The MC simulations also showed that these two methods produced the least biased point estimates and the narrowest confidence intervals of the four estimation approaches. Conclusions: Monte Carlo simulations prove useful in validating the inferential procedures used in data analysis. In the case of the odds ratio, the standard method of point and interval estimation is based on the assumption that the crude odds ratio has a lognormal sampling distribution. Utilizing this assumption, together with the formula for the expectation of this distribution, an alternative estimation method for ORtrue was obtained (different from the method in the earlier report by Barendregt) that yielded point and interval estimates that the MC simulations indicate are the most statistically valid.
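
As a minimal sketch of the corrected estimator and of a Wald-type interval built around it (in the spirit of method III), the Python code below simulates 2×2 tables from two binomials, applies ln(ORstar) = ln(ORcrude) − sigma²/2 with sigma² = 1/a + 1/b + 1/c + 1/d, and estimates coverage by Monte Carlo. The table probabilities, group sizes, and number of simulations are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.stats import norm

def or_star(a, b, c, d):
    """Crude log odds ratio, its standard error sigma, and the corrected
    log(ORstar) = log(ORcrude) - sigma^2 / 2."""
    log_or = np.log((a * d) / (b * c))
    sigma = np.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, sigma, log_or - sigma**2 / 2

def coverage_corrected_wald(p_exposed, p_unexposed, n_exp, n_unexp, or_true,
                            n_sim=5000, alpha=0.05, seed=1):
    """Monte Carlo coverage of the Wald-type interval centred on log(ORstar)."""
    rng = np.random.default_rng(seed)
    z = norm.ppf(1 - alpha / 2)
    hits = valid = 0
    for _ in range(n_sim):
        a = rng.binomial(n_exp, p_exposed)        # cases among the exposed
        c = n_exp - a                             # non-cases among the exposed
        b = rng.binomial(n_unexp, p_unexposed)    # cases among the unexposed
        d = n_unexp - b                           # non-cases among the unexposed
        if min(a, b, c, d) == 0:
            continue                              # drop tables with an empty cell
        _, sigma, log_or_corrected = or_star(a, b, c, d)
        lo, hi = log_or_corrected - z * sigma, log_or_corrected + z * sigma
        valid += 1
        hits += lo <= np.log(or_true) <= hi
    return hits / valid

# Illustrative "harmful exposure" scenario: risk 0.30 among exposed, 0.15 among unexposed
or_true = (0.30 / 0.70) / (0.15 / 0.85)           # about 2.43
print(coverage_corrected_wald(0.30, 0.15, n_exp=100, n_unexp=100, or_true=or_true))
```

The same loop, run with ORtrue below 1, covers the "protective exposure" scenario described in the abstract.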


2019 ◽  
Author(s):  
Atser Damsma ◽  
Nadine Schlichting ◽  
Hedderik van Rijn ◽  
Warrick Roseboom

In interval timing experiments, motor reproduction is the predominant method used when participants are asked to estimate an interval. However, it is unknown how its accuracy, precision and efficiency compare to alternative methods, such as indicating the duration by spatial estimation on a timeline. In two experiments, we compared different interval estimation methods. In the first experiment, participants were asked to reproduce an interval by means of motor reproduction, timeline estimation, or verbal estimation. We found that, on average, verbal estimates were more accurate and precise than timeline estimates and motor reproductions. However, we found a bias towards familiar whole-second units when giving verbal estimates. Motor reproductions were more precise, but not more accurate, than timeline estimates. In the second experiment, we used a more complex task: participants were presented with a stream of digits and one target letter and were subsequently asked to reproduce both the interval to target onset and the duration of the total stream, by means of motor reproduction and timeline estimation. We found that motor reproductions were more accurate, but not more precise, than timeline estimates. In both experiments, timeline estimates had the lowest reaction times. Overall, our results suggest that the transformation of time into space has only a relatively minor cost. In addition, they show that each estimation method comes with its own advantages, and that the choice of estimation method depends on choices in the experimental design: for example, when using whole-second (integer) durations, verbal estimates are superior, yet when testing long durations, motor reproductions are time-intensive, making timeline estimates a more sensible choice.
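
For readers who want to summarize accuracy and precision per estimation method on their own data, the short Python sketch below computes relative error by method. The column names, the toy data, and the convention of using mean relative error for accuracy and its standard deviation for precision are our illustrative choices, not the authors' analysis.

```python
import pandas as pd

# Hypothetical long-format data: one row per trial, with the presented target
# duration (s), the participant's estimate (s), and the estimation method used.
df = pd.DataFrame({
    "method":   ["motor", "motor", "timeline", "timeline", "verbal", "verbal"],
    "target":   [2.0, 3.5, 2.0, 3.5, 2.0, 3.5],
    "estimate": [2.3, 3.2, 1.7, 3.9, 2.0, 4.0],
})

df["rel_error"] = (df["estimate"] - df["target"]) / df["target"]
summary = df.groupby("method").agg(
    accuracy=("rel_error", "mean"),                      # bias: 0 means accurate on average
    precision=("rel_error", lambda e: e.std(ddof=1)),    # spread: smaller means more precise
)
print(summary)
```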


2021 ◽  
Author(s):  
Metin Bulus

A recent systematic review of experimental studies conducted in Turkey between 2010 and 2020 reported that small sample sizes had been a significant drawback (Bulus and Koyuncu, 2021). A small fraction of the studies were small-scale true experiments (subjects randomized into treatment and control groups). The remaining studies consisted of quasi-experiments (subjects in treatment and control groups matched on pretest scores or other covariates) and weak experiments (neither randomized nor matched, but with a control group). They had an average sample size below 70 across different domains and outcomes. These small sample sizes imply a strong (and perhaps erroneous) assumption about the minimum relevant effect size (MRES) of an intervention before an experiment is conducted; that is, that a standardized intervention effect of Cohen’s d < 0.50 is not relevant to education policy or practice. Thus, an introduction to sample size determination for pretest-posttest simple experimental designs is warranted. This study describes the nuts and bolts of sample size determination, derives expressions for optimal design under differential cost per treatment and control unit, provides convenient tables to guide sample size decisions for MRES values between 0.20 ≤ Cohen’s d ≤ 0.50, and describes the relevant software along with illustrations.
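
The basic machinery behind such tables can be sketched with the textbook formulas below: a per-group sample size for a two-arm pretest-posttest design that uses the pretest as a covariate, and the cost-optimal allocation ratio under differential unit costs. These are standard expressions assumed here for illustration, not necessarily the exact expressions derived in the study, and the numerical inputs are likewise illustrative.

```python
from math import ceil, sqrt
from scipy.stats import norm

def n_per_group(d, rho=0.0, alpha=0.05, power=0.80):
    """Textbook per-group sample size for a two-arm pretest-posttest design,
    treating the pretest as a covariate:
    n = 2 * (z_{1-alpha/2} + z_{1-beta})^2 * (1 - rho^2) / d^2,
    where d is the MRES (Cohen's d) and rho the pretest-posttest correlation."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return ceil(2 * z**2 * (1 - rho**2) / d**2)

def cost_optimal_ratio(cost_treatment, cost_control):
    """Optimal treatment:control allocation under differential unit costs
    (more subjects go to the cheaper arm): n_T / n_C = sqrt(c_C / c_T)."""
    return sqrt(cost_control / cost_treatment)

# Illustrative numbers: MRES d = 0.30, pretest-posttest correlation 0.5,
# treatment units costing twice as much as control units.
print(n_per_group(0.30, rho=0.5))               # ~131 per group vs ~175 without the pretest
print(round(cost_optimal_ratio(2.0, 1.0), 2))   # ~0.71: recruit fewer of the costlier treated units
```

Halving the assumed MRES from 0.50 to 0.25 roughly quadruples the required sample size, which is why the averages near 70 reported in the review implicitly assume effects of about d = 0.50 or larger.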


2018 ◽  
Vol 28 (7) ◽  
pp. 2179-2195 ◽  
Author(s):  
Chieh Chiang ◽  
Chin-Fu Hsiao

Multiregional clinical trials have been accepted in recent years as a useful means of accelerating the development of new drugs and abridging their approval time. The statistical properties of multiregional clinical trials are being widely discussed. In practice, the variance of a continuous response may differ from region to region, so assessment of the efficacy response becomes a Behrens–Fisher problem: there is no exact test or interval estimator for the mean difference when variances are unequal. As a solution, this study applies interval estimations of the efficacy response based on Howe’s, Cochran–Cox’s, and Satterthwaite’s approximations, which have been shown to have well-controlled type I error rates. However, traditional sample size determination cannot be applied to these interval estimators. A sample size determination procedure to achieve a desired power based on these interval estimators is therefore presented. Moreover, the consistency criteria suggested by the Japanese Ministry of Health, Labour and Welfare guidance were also applied, to decide whether the overall results from the multiregional clinical trial, obtained via the proposed interval estimation, hold across regions. A real example is used to illustrate the proposed method. The results of simulation studies indicate that the proposed method can correctly determine the required sample size and evaluate the assurance probability of the consistency criteria.
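
To make the Behrens–Fisher ingredients concrete, the Python sketch below builds a Satterthwaite-approximated confidence interval for the mean difference under unequal variances and searches, by simulation, for the smallest per-group sample size at which that interval excludes zero with a target probability. This mirrors the general idea of basing sample size on the interval estimator; the effect size, variances, search grid, and simulation settings are illustrative assumptions rather than the authors' exact procedure.

```python
import numpy as np
from scipy import stats

def satterthwaite_ci(x, y, alpha=0.05):
    """Welch/Satterthwaite CI for mean(x) - mean(y) with unequal variances."""
    nx, ny = len(x), len(y)
    vx, vy = x.var(ddof=1) / nx, y.var(ddof=1) / ny
    df = (vx + vy) ** 2 / (vx**2 / (nx - 1) + vy**2 / (ny - 1))   # Satterthwaite df
    t = stats.t.ppf(1 - alpha / 2, df)
    diff, se = x.mean() - y.mean(), np.sqrt(vx + vy)
    return diff - t * se, diff + t * se

def required_n(delta, sd1, sd2, target_power=0.80, n_sim=2000, seed=0):
    """Smallest per-group n such that the CI lies entirely above zero with the target probability."""
    rng = np.random.default_rng(seed)
    for n in range(10, 2001, 10):
        rejections = 0
        for _ in range(n_sim):
            x = rng.normal(delta, sd1, n)      # treatment responses
            y = rng.normal(0.0, sd2, n)        # control responses
            lo, hi = satterthwaite_ci(x, y)
            rejections += lo > 0               # interval excludes zero
        if rejections / n_sim >= target_power:
            return n
    return None

print(required_n(delta=0.5, sd1=1.0, sd2=1.5))   # roughly 100-110 per group under these settings
```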

