Quantifying the bias in the estimated treatment effect in randomized trials having interim analyses and a rule for early stopping for futility

S. D. Walter; H. Han; M. Briel; G. H. Guyatt

doi:10.1002/sim.7242

Do we need to adjust for interim analyses in a Bayesian adaptive trial design?

10.21203/rs.2.17854/v1 ◽

2019 ◽

Author(s):

Elizabeth Ryan ◽

Kristian Brock ◽

Simon Gates ◽

Daniel Slade

Keyword(s):

Clinical Trials ◽

Case Studies ◽

Type I Error ◽

Error Rates ◽

Adaptive Designs ◽

Alternative Methods ◽

Type I ◽

Early Stopping ◽

Interim Analyses ◽

Stopping For Futility

Abstract Background Bayesian adaptive methods are increasingly being used to design clinical trials and offer a number of advantages over traditional approaches. Decisions at analysis points are usually based on the posterior distribution of the parameter of interest. However, there is some confusion amongst statisticians and trialists as to whether control of type I error is required for Bayesian adaptive designs as this is a frequentist concept. Methods We discuss the arguments for and against adjusting for multiplicities in Bayesian trials with interim analyses. We present two case studies demonstrating the effect on type I/II error rates of including interim analyses in Bayesian clinical trials. We propose alternative approaches to adjusting stopping boundaries to control type I error, and also alternative methods for decision-making in Bayesian clinical trials. Results In both case studies we found that the type I error was inflated in the Bayesian adaptive designs through incorporation of interim analyses that allowed early stopping for efficacy and do not make adjustments to account for multiplicity. Incorporation of early stopping for efficacy also increased the power in some instances. An increase in the number of interim analyses that only allowed early stopping for futility decreased the type I error, but also decreased power. An increase in the number of interim analyses that allowed for either early stopping for efficacy or futility generally increased type I error and decreased power. Conclusions If one wishes to demonstrate control of type I error in Bayesian adaptive designs then adjustments to the stopping boundaries are usually required for designs that allow for early stopping for efficacy as the number of analyses increase. If the designs only allow for early stopping for futility then adjustments to the stopping boundaries are not needed to control type I error, but may be required to ensure adequate power. If one instead uses a strict Bayesian approach then type I errors could be ignored and the designs could instead focus on the posterior probabilities of treatment effects of particular values.

Download Full-text

Do we need to adjust for interim analyses in a Bayesian adaptive trial design?

10.21203/rs.2.17854/v2 ◽

2020 ◽

Author(s):

Elizabeth Ryan ◽

Kristian Brock ◽

Simon Gates ◽

Daniel Slade

Keyword(s):

Clinical Trials ◽

Case Studies ◽

Type I Error ◽

Error Rates ◽

Adaptive Designs ◽

Alternative Methods ◽

Type I ◽

Early Stopping ◽

Interim Analyses ◽

Stopping For Futility

Abstract Background: Bayesian adaptive methods are increasingly being used to design clinical trials and offer several advantages over traditional approaches. Decisions at analysis points are usually based on the posterior distribution of the treatment effect. However, there is some confusion as to whether control of type I error is required for Bayesian designs as this is a frequentist concept.Methods: We discuss the arguments for and against adjusting for multiplicities in Bayesian trials with interim analyses. With two case studies we illustrate the effect of including interim analyses on type I/II error rates in Bayesian clinical trials where no adjustments for multiplicities are made. We propose several approaches to control type I error, and also alternative methods for decision-making in Bayesian clinical trials.Results: In both case studies we demonstrated that the type I error was inflated in the Bayesian adaptive designs through incorporation of interim analyses that allowed early stopping for efficacy and without adjustments to account for multiplicity. Incorporation of early stopping for efficacy also increased the power in some instances. An increase in the number of interim analyses that only allowed early stopping for futility decreased the type I error, but also decreased power. An increase in the number of interim analyses that allowed for either early stopping for efficacy or futility generally increased type I error and decreased power.Conclusions: Currently, regulators require demonstration of control of type I error for both frequentist and Bayesian adaptive designs, particularly for late-phase trials. To demonstrate control of type I error in Bayesian adaptive designs, adjustments to the stopping boundaries are usually required for designs that allow for early stopping for efficacy as the number of analyses increase. If the designs only allow for early stopping for futility then adjustments to the stopping boundaries are not needed to control type I error. If one instead uses a strict Bayesian approach, which is currently more accepted in the design and analysis of exploratory trials, then type I errors could be ignored and the designs could instead focus on the posterior probabilities of treatment effects of clinically-relevant values.

Download Full-text

Generalizing Treatment Effect Estimates From Sample to Population: A Case Study in the Difficulties of Finding Sufficient Data

Evaluation Review ◽

10.1177/0193841x16660663 ◽

2016 ◽

Vol 41 (4) ◽

pp. 357-388 ◽

Cited By ~ 13

Author(s):

Elizabeth A. Stuart ◽

Anna Rhodes

Keyword(s):

Head Start ◽

Randomized Trial ◽

Treatment Effect ◽

External Validity ◽

Randomized Trials ◽

Target Population ◽

Data Sets ◽

Common Data Elements ◽

Ex Post

Background: Given increasing concerns about the relevance of research to policy and practice, there is growing interest in assessing and enhancing the external validity of randomized trials: determining how useful a given randomized trial is for informing a policy question for a specific target population. Objectives: This article highlights recent advances in assessing and enhancing external validity, with a focus on the data needed to make ex post statistical adjustments to enhance the applicability of experimental findings to populations potentially different from their study sample. Research design: We use a case study to illustrate how to generalize treatment effect estimates from a randomized trial sample to a target population, in particular comparing the sample of children in a randomized trial of a supplemental program for Head Start centers (the Research-Based, Developmentally Informed study) to the national population of children eligible for Head Start, as represented in the Head Start Impact Study. Results: For this case study, common data elements between the trial sample and population were limited, making reliable generalization from the trial sample to the population challenging. Conclusions: To answer important questions about external validity, more publicly available data are needed. In addition, future studies should make an effort to collect measures similar to those in other data sets. Measure comparability between population data sets and randomized trials that use samples of convenience will greatly enhance the range of research and policy relevant questions that can be answered.

Download Full-text

Randomization‐based inference for a marginal treatment effect in stepped wedge cluster randomized trials

Statistics in Medicine ◽

10.1002/sim.9040 ◽

2021 ◽

Author(s):

Dustin J. Rabideau ◽

Rui Wang

Keyword(s):

Treatment Effect ◽

Randomized Trials ◽

Cluster Randomized Trials ◽

Stepped Wedge ◽

Cluster Randomized

Download Full-text

Biased estimates of treatment effect in randomized trials

Controlled Clinical Trials ◽

10.1016/0197-2456(84)90075-8 ◽

1984 ◽

Vol 5 (3) ◽

pp. 303 ◽

Cited By ~ 2

Author(s):

Mitchell Gail ◽

Sam Wieand ◽

Steven Piantadosi

Keyword(s):

Treatment Effect ◽

Randomized Trials ◽

Biased Estimates

Download Full-text

Principles of Early Stopping of Randomized Trials for Efficacy: A Critique of Equipoise and an Alternative Nonexploitation Ethical Framework

Kennedy Institute of Ethics Journal ◽

10.1353/ken.2005.0010 ◽

2005 ◽

Vol 15 (2) ◽

pp. 161-178 ◽

Cited By ~ 19

Author(s):

David Buchanan ◽

Franklin G. Miller

Keyword(s):

Randomized Trials ◽

Ethical Framework ◽

Early Stopping

Download Full-text

Optimal Allocation of Interviews to Baseline and Endline Surveys in Place-Based Randomized Trials and Quasi-Experiments

Evaluation Review ◽

10.1177/0193841x18799128 ◽

2018 ◽

Vol 42 (4) ◽

pp. 391-422 ◽

Cited By ~ 2

Author(s):

Donald P. Green ◽

Winston Lin ◽

Claudia Gerber

Keyword(s):

Random Sample ◽

Treatment Effect ◽

Randomized Trials ◽

Optimal Allocation ◽

Average Treatment Effect ◽

Baseline Survey ◽

Baseline Measure ◽

Panel Surveys ◽

Planning Stage ◽

Average Treatment

Background: Many place-based randomized trials and quasi-experiments use a pair of cross-section surveys, rather than panel surveys, to estimate the average treatment effect of an intervention. In these studies, a random sample of individuals in each geographic cluster is selected for a baseline (preintervention) survey, and an independent random sample is selected for an endline (postintervention) survey. Objective: This design raises the question, given a fixed budget, how should a researcher allocate resources between the baseline and endline surveys to maximize the precision of the estimated average treatment effect? Results: We formalize this allocation problem and show that although the optimal share of interviews allocated to the baseline survey is always less than one-half, it is an increasing function of the total number of interviews per cluster, the cluster-level correlation between the baseline measure and the endline outcome, and the intracluster correlation coefficient. An example using multicountry survey data from Africa illustrates how the optimal allocation formulas can be combined with data to inform decisions at the planning stage. Another example uses data from a digital political advertising experiment in Texas to explore how precision would have varied with alternative allocations.

Download Full-text

Estimation of the Overall Treatment Effect in the Presence of Interference in Cluster-Randomized Trials of Infectious Disease Prevention

Epidemiologic Methods ◽

10.1515/em-2015-0016 ◽

2016 ◽

Vol 5 (1) ◽

Cited By ~ 4

Author(s):

Nicole Bohme Carnegie ◽

Rui Wang ◽

Victor De Gruttola

Keyword(s):

Infectious Disease ◽

Disease Prevention ◽

Social Interactions ◽

Treatment Effect ◽

Randomized Trials ◽

Effect Estimate ◽

Treatment Effect Estimate ◽

Partial Interference ◽

The Difference ◽

Cluster Randomized

AbstractAn issue that remains challenging in the field of causal inference is how to relax the assumption of no interference between units. Interference occurs when the treatment of one unit can affect the outcome of another, a situation which is likely to arise with outcomes that may depend on social interactions, such as occurrence of infectious disease. Existing methods to accommodate interference largely depend upon an assumption of “partial interference” – interference only within identifiable groups but not among them. There remains a considerable need for development of methods that allow further relaxation of the no-interference assumption. This paper focuses on an estimand that is the difference in the outcome that one would observe if the treatment were provided to all clusters compared to that outcome if treatment were provided to none – referred as the overall treatment effect. In trials of infectious disease prevention, the randomized treatment effect estimate will be attenuated relative to this overall treatment effect if a fraction of the exposures in the treatment clusters come from individuals who are outside these clusters. This source of interference – contacts sufficient for transmission that are with treated clusters – is potentially measurable. In this manuscript, we leverage epidemic models to infer the way in which a given level of interference affects the incidence of infection in clusters. This leads naturally to an estimator of the overall treatment effect that is easily implemented using existing software.

Download Full-text

Early stopping, interim analyses, and monitoring committees: what are the tradeoffs?

Journal of Clinical Oncology ◽

10.1200/jco.1987.5.9.1314 ◽

1987 ◽

Vol 5 (9) ◽

pp. 1314-1315 ◽

Cited By ~ 7

Author(s):

M Zelen

Keyword(s):

Early Stopping ◽

Interim Analyses

Download Full-text

Reporting of data monitoring committees and adverse events in paediatric trials: a descriptive analysis

BMJ Paediatrics Open ◽

10.1136/bmjpo-2018-000426 ◽

2019 ◽

Vol 3 (1) ◽

pp. e000426

Author(s):

Allison Gates ◽

Patrina Caldwell ◽

Sarah Curtis ◽

Leonila Dans ◽

Ricardo M Fernandes ◽

...

Keyword(s):

Adverse Events ◽

Descriptive Analysis ◽

Data Monitoring Committee ◽

Stopping Rules ◽

Data Monitoring ◽

Cochrane Central Register ◽

Drug Trials ◽

Early Stopping ◽

Cross Sectional ◽

Interim Analyses

ObjectivesFor 300 paediatric trials, we evaluated the reporting of: a data monitoring committee (DMC); interim analyses, stopping rules and early stopping; and adverse events and harm-related endpoints.MethodsFor this cross-sectional evaluation, we randomly selected 300 paediatric trials published in 2012 from the Cochrane Central Register of Controlled Trials. We collected data on the reporting of a DMC; interim analyses, stopping rules and early stopping; and adverse events and harm-related endpoints. We reported the findings descriptively and stratified by trial characteristics.ResultsEighty-five (28%) of the trials investigated drugs, and 18% (n=55/300) reported a DMC. The reporting of a DMC was more common among multicentre than single centre trials (n=41/132, 31% vs n=14/139, 10%, p<0.001) and industry-sponsored trials compared with those sponsored by other sources (n=16/50, 32% vs n=39/250, 16%, p=0.009). Trials that reported a DMC enrolled more participants than those that did not (median [range]): 224 (10–60480) vs 91 (10–9528) (p<0.001). Only 25% of these trials reported interim analyses, and 42% reported stopping rules. Less than half (n=143/300, 48%) of trials reported on adverse events, and 72% (n=215/300) reported on harm-related endpoints. Trials that reported a DMC compared with those that did not were more likely to report adverse events (n=43/55, 78% vs 100/245, 41%, p<0.001) and harm-related endpoints (n=52/55, 95% vs. 163/245, 67%, p<0.001). Only 32% of drug trials reported a DMC; 18% and 19% did not report on adverse events or harm-related endpoints, respectively.ConclusionsThe reporting of a DMC was infrequent, even among drug trials. Few trials reported stopping rules or interim analyses. Reporting of adverse events and harm-related endpoints was suboptimal.

Download Full-text