Sample size calculation for the one-sample log-rank test

2014 · Vol 14 (1) · pp. 26-33
Author(s): Jianrong Wu
2014 · Vol 34 (6) · pp. 1031-1040
Author(s): René Schmidt, Robert Kwiecien, Andreas Faldum, Frank Berthold, Barbara Hero, ...

Author(s): Patrick Royston

Most randomized controlled trials with a time-to-event outcome are designed and analyzed assuming proportional hazards of the treatment effect. The sample-size calculation is based on a log-rank test or the equivalent Cox test. Nonproportional hazards are seen increasingly in trials and are recognized as a potential threat to the power of the log-rank test. To address the issue, Royston and Parmar (2016, BMC Medical Research Methodology 16: 16) devised a new “combined test” of the global null hypothesis of identical survival curves in each trial arm. The test combines the conventional Cox test with a new test based on the maximal standardized difference in restricted mean survival time (RMST) between the arms, evaluated over several preselected time points. The combined test takes the minimum p-value across the Cox and RMST-based tests, appropriately standardized to have the correct null distribution. In this article, I outline the combined test and introduce a command, stctest, that implements it. I point the way to additional tools currently under development for power and sample-size calculation for the combined test.
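The combination logic can be sketched generically: compute each component test, take the minimum p-value, and calibrate it against a joint permutation null so the combined test keeps its nominal size. The sketch below is illustrative only, not the stctest implementation: it uses a Mann-Whitney rank test as a stand-in for the Cox/log-rank component and, for uncensored toy data, the difference in means truncated at a horizon tau as a crude RMST difference (all names and these substitutions are assumptions of the sketch).

```python
import numpy as np
from scipy import stats

def combined_test(t1, t2, tau=2.0, n_perm=499, seed=0):
    """Min-p combination of two component tests, calibrated against a
    joint permutation null so the combined test keeps its nominal size."""
    rng = np.random.default_rng(seed)
    n1, pooled = len(t1), np.concatenate([t1, t2])

    def pair(a, b):
        # Component 1: centered Mann-Whitney U, a rank-based stand-in
        # for the Cox/log-rank component (an assumption of this sketch).
        u = abs(stats.mannwhitneyu(a, b).statistic - len(a) * len(b) / 2)
        # Component 2: absolute difference in truncated means, a crude
        # RMST difference at horizon tau (valid only without censoring).
        d = abs(np.mean(np.minimum(a, tau)) - np.mean(np.minimum(b, tau)))
        return u, d

    obs = pair(t1, t2)
    perm = [pair(*np.split(rng.permutation(pooled), [n1]))
            for _ in range(n_perm)]
    s = np.vstack(perm + [obs])          # observed statistics go last
    # Per-component permutation p-values for every dataset (permuted
    # and observed alike), then the min-p combined statistic.
    p = np.mean(s[None, :, :] >= s[:, None, :], axis=1)
    min_p = p.min(axis=1)
    # Combined p-value: how extreme is the observed min-p under the null?
    return float(np.mean(min_p <= min_p[-1]))
```

Because both component p-values are computed on the same permutations, their correlation is handled automatically, which is the point of standardizing the minimum p-value rather than using it directly.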


2020 · Vol 29 (10) · pp. 2814-2829
Author(s): Laura Kerschke, Andreas Faldum, Rene Schmidt

The one-sample log-rank test allows one to compare the survival of a single sample with a prefixed reference survival curve. It naturally applies in single-arm phase IIa trials with a time-to-event endpoint. Several authors have described that the original one-sample log-rank test is conservative when the sample size is small and have proposed strategies to correct this conservativeness. Here, we propose an alternative approach to improve the one-sample log-rank test. Our new one-sample log-rank statistic is based on the unique transformation of the underlying counting-process martingale such that the moments of the limiting normal distribution have no shared parameters. Simulation results show that the new one-sample log-rank test gives a type I error rate and power close to the nominal levels even when the sample size is small, while substantially reducing the sample size required to achieve the desired power compared with current approaches for designing studies that compare the survival outcome of a sample with a reference.
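For reference, the original (uncorrected) one-sample log-rank statistic that the authors set out to improve compares the observed number of events O with the number expected E under the reference cumulative hazard, via Z = (O - E)/sqrt(E). A minimal sketch, assuming complete observed follow-up times and a user-supplied reference cumulative hazard function (not the authors' transformed statistic):

```python
import numpy as np
from scipy.stats import norm

def one_sample_logrank(time, event, cum_haz):
    """Original (uncorrected) one-sample log-rank test: compare the
    observed number of events O with the number expected E under a
    reference cumulative hazard, via Z = (O - E) / sqrt(E)."""
    time = np.asarray(time, dtype=float)
    O = np.sum(event)            # observed events in the sample
    E = np.sum(cum_haz(time))    # expected events under the reference
    z = (O - E) / np.sqrt(E)
    return z, 2 * norm.sf(abs(z))

# Example: reference cumulative hazard of an exponential with rate 0.5.
z, p = one_sample_logrank([1.0, 2.0, 3.0], [1, 1, 0], lambda t: 0.5 * t)
```

The small-sample conservativeness discussed in the abstract arises because the normal approximation for Z is poor when E is small; the authors' transformation targets exactly that.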


Author(s): Patrick Royston

Randomized controlled trials with a time-to-event outcome are usually designed and analyzed assuming proportional hazards (PH) of the treatment effect. The sample-size calculation is based on a log-rank test or the nearly identical Cox test, henceforth called the Cox/log-rank test. Nonproportional hazards (non-PH) have become more common in trials and are recognized as a potential threat to interpreting the trial treatment effect and the power of the log-rank test, and hence to the success of the trial. To address the issue, in 2016, Royston and Parmar (BMC Medical Research Methodology 16: 16) proposed a “combined test” of the global null hypothesis of identical survival curves in each trial arm. The Cox/log-rank test is combined with a new test derived from the maximal standardized difference in restricted mean survival time (RMST) between the trial arms. The test statistic is based on evaluations of the between-arm difference in RMST over several preselected time points. The combined test involves the minimum p-value across the Cox/log-rank and RMST-based tests, appropriately standardized to have the correct distribution under the global null hypothesis. In this article, I introduce a new command, power_ct, that uses simulation to implement power and sample-size calculations for the combined test. power_ct supports designs with PH or non-PH of the treatment effect. I provide examples in which the power of the combined test is compared with that of the Cox/log-rank test under PH and non-PH scenarios. I conclude by offering guidance for sample-size calculations in time-to-event trials to allow for possible non-PH.
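power_ct itself works by simulation, but under PH its results can be benchmarked against Schoenfeld's classical formula for the required number of events, d = (z_{1-a/2} + z_{1-b})^2 / (pi (1 - pi) (log HR)^2), where pi is the allocation fraction. A sketch of that benchmark (not part of power_ct):

```python
import numpy as np
from scipy.stats import norm

def schoenfeld_events(hr, alpha=0.05, power=0.80, alloc=0.5):
    """Schoenfeld's formula for the number of events required by a
    two-sided log-rank (Cox) test under proportional hazards, with
    allocation fraction `alloc` in the experimental arm."""
    za, zb = norm.ppf(1 - alpha / 2), norm.ppf(power)
    return (za + zb) ** 2 / (alloc * (1 - alloc) * np.log(hr) ** 2)

# Total sample size then follows as events / P(event by end of follow-up).
```

For the textbook case HR = 0.7, two-sided alpha = 0.05, power = 0.80, equal allocation, this gives roughly 247 events. Under non-PH this formula no longer applies, which is exactly why a simulation-based tool is needed.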


Author(s): Jonathan A. Cook, Steven A. Julious, William Sones, Lisa V. Hampson, Catherine Hewitt, ...

The aim of this document is to provide practical guidance on the choice of the target difference used in the sample size calculation of a randomised controlled trial (RCT). Guidance is provided with a definitive trial in mind, one that seeks to provide a useful answer, rather than trials of a more exploratory nature. The term “target difference” is taken throughout to refer to the difference that is used in the sample size calculation (the one that the study formally “targets”). Please see the glossary for definitions and clarification with regard to other relevant concepts. In order to address the specification of the target difference, it is appropriate, and to some degree necessary, to touch on related statistical aspects of conducting a sample size calculation. Generally, the discussion of other aspects and more technical details is kept to a minimum, with more technical aspects covered in the appendices and references to relevant sources provided for further reading.

The main body of this guidance assumes a standard RCT design is used; formally, this can be described as a two-arm parallel-group superiority trial. Most RCTs test for superiority of the interventions, that is, whether or not one of the interventions is superior to the other (see Box 1 for a formal definition of superiority and of the two most common alternative approaches). Some common alternative trial designs are considered in Appendix 3. Additionally, it is assumed in the main body of the text that the conventional (Neyman-Pearson) approach to the sample size calculation of an RCT is being used. Other approaches (Bayesian, precision, and value of information) are briefly considered in Appendix 2 with reference to the specification of the target difference.
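Under the conventional Neyman-Pearson approach the guidance refers to, the target difference feeds directly into the standard formula: for a normally distributed outcome in a two-arm parallel-group superiority trial, the per-arm sample size is n = 2 sigma^2 (z_{1-a/2} + z_{1-b})^2 / delta^2, where delta is the target difference. A minimal sketch of that calculation:

```python
import math
from scipy.stats import norm

def n_per_arm(target_diff, sd, alpha=0.05, power=0.90):
    """Per-arm sample size for a two-arm parallel-group superiority
    trial with a normally distributed outcome, two-sided alpha
    (conventional Neyman-Pearson calculation)."""
    za, zb = norm.ppf(1 - alpha / 2), norm.ppf(power)
    return math.ceil(2 * (sd * (za + zb) / target_diff) ** 2)
```

Because delta enters squared in the denominator, halving the target difference quadruples the sample size, which is why the choice of target difference deserves the careful treatment this guidance gives it.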


2020 · Vol 29 (10) · pp. 2958-2971
Author(s): Maria Stark, Antonia Zapf

Introduction: In a confirmatory diagnostic accuracy study, sensitivity and specificity are considered co-primary endpoints. For the sample size calculation, the prevalence in the target population must be taken into account to obtain a representative sample. In this context, a general problem arises: with a low or high prevalence, the study may be overpowered in one subpopulation. A further issue is the correct pre-specification of the true prevalence; with an incorrect assumption about the prevalence, an over- or underestimated sample size will result.

Methods: To obtain the desired power independent of the prevalence, a method for an optimal sample size calculation for the comparison of a diagnostic experimental test against prespecified minimum sensitivity and specificity is proposed. To address the problem of an incorrectly pre-specified prevalence, a blinded one-time re-estimation design and a blinded repeated re-estimation design of the sample size, both based on the prevalence, are evaluated in a simulation study. Both designs are compared with a fixed design and, additionally, with each other.

Results: The type I error rates of both blinded re-estimation designs are not inflated. Their empirical overall power equals the desired theoretical power, and both designs offer unbiased estimates of the prevalence. The repeated re-estimation design reveals no advantages concerning the mean squared error of the re-estimated prevalence or sample size compared with the one-time re-estimation design. The appropriate size of the internal pilot study in the one-time re-estimation design is 50% of the initially calculated sample size.

Conclusions: A one-time re-estimation design of the prevalence based on the optimal sample size calculation is recommended in single-arm diagnostic accuracy studies.
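The prevalence-aware idea can be sketched in simplified form (an illustration of the general principle, not the authors' optimal method): size each co-primary comparison as a one-sample test of a proportion against its prespecified minimum, then inflate by the prevalence so that the expected diseased and non-diseased subgroups are both large enough. All formulas below use the normal approximation and are assumptions of this sketch:

```python
import math
from scipy.stats import norm

def n_one_prop(p0, p1, alpha=0.025, power=0.80):
    """One-sample test of a proportion (normal approximation),
    one-sided: H0: p <= p0 versus H1: p = p1 > p0."""
    za, zb = norm.ppf(1 - alpha), norm.ppf(power)
    num = za * math.sqrt(p0 * (1 - p0)) + zb * math.sqrt(p1 * (1 - p1))
    return math.ceil((num / (p1 - p0)) ** 2)

def total_n(prev, se0, se1, sp0, sp1, alpha=0.025, power=0.80):
    """Overall sample size such that the expected diseased and
    non-diseased subgroups each power their co-primary comparison
    (sensitivity vs. se0, specificity vs. sp0)."""
    n_se = n_one_prop(se0, se1, alpha, power)   # diseased needed
    n_sp = n_one_prop(sp0, sp1, alpha, power)   # non-diseased needed
    return math.ceil(max(n_se / prev, n_sp / (1 - prev)))
```

The max() makes the problem described in the abstract visible: at low prevalence the sensitivity arm drives the total, so the specificity arm ends up overpowered, and a wrongly pre-specified prevalence shifts the whole calculation, which is what the blinded re-estimation designs correct.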


2016 · Vol 27 (7) · pp. 2132-2141
Author(s): Guogen Shan

In an agreement test between two raters with binary endpoints, existing methods for sample size calculation are based on asymptotic approaches that use limiting distributions of a test statistic under the null and alternative hypotheses. The calculated sample sizes may not be reliable because of the unsatisfactory type I error control of asymptotic approaches. We propose a new sample size calculation based on exact approaches that control the type I error rate. Two exact approaches are considered: one based on maximization and the other based on estimation and maximization. We found that the latter approach is generally more powerful than the one based on maximization; therefore, we present the sample size calculation based on estimation and maximization. A real example from a clinical trial on diagnosing low back pain in patients is used to illustrate the two exact testing procedures and the sample size determination.
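The "maximization" principle the authors refer to can be illustrated in a setting simpler than rater agreement: the exact size of an asymptotic test is obtained by enumerating every possible outcome and maximizing the exact rejection probability over the nuisance parameter. The sketch below applies this to a pooled two-proportion z-test; the agreement-statistic version follows the same pattern, but this is an illustration under simplifying assumptions, not the authors' procedure:

```python
import numpy as np
from scipy.stats import binom, norm

def exact_size(n1, n2, alpha=0.05, grid=199):
    """Exact size of the asymptotic pooled two-proportion z-test via the
    maximization (M) principle: enumerate every outcome, then maximize
    the exact rejection probability over the nuisance parameter p."""
    x1 = np.arange(n1 + 1)[:, None]      # all outcomes in group 1
    x2 = np.arange(n2 + 1)[None, :]      # all outcomes in group 2
    phat = (x1 + x2) / (n1 + n2)         # pooled estimate under H0
    se = np.sqrt(phat * (1 - phat) * (1 / n1 + 1 / n2))
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (x1 / n1 - x2 / n2) / se     # NaN only when all 0s or all 1s
    reject = np.abs(np.nan_to_num(z)) >= norm.ppf(1 - alpha / 2)
    size = 0.0
    for p in np.linspace(0.005, 0.995, grid):   # nuisance-parameter grid
        prob = binom.pmf(x1, n1, p) * binom.pmf(x2, n2, p)
        size = max(size, float(prob[reject].sum()))
    return size
```

The estimation-and-maximization (E+M) variant the authors favor first estimates the nuisance parameter, then maximizes only over a confidence set around that estimate, which typically yields a less conservative test and hence smaller required sample sizes.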

