Sample size determinations using logistic regression with pilot data

1993 ◽  
Vol 12 (11) ◽  
pp. 1079-1084 ◽  
Author(s):  
Virginia F. Flack ◽  
T. Lynn Eudey
2021 ◽  
pp. 174077452110101
Author(s):  
Jennifer Proper ◽  
John Connett ◽  
Thomas Murray

Background: Bayesian response-adaptive designs, which data-adaptively alter the allocation ratio in favor of the better-performing treatment, are often criticized for engendering a non-trivial probability of a subject imbalance in favor of the inferior treatment, inflating the type I error rate, and increasing sample size requirements. In the literature, implementations of these designs using Thompson sampling methods have generally assumed a simple beta-binomial probability model; however, the effect of these choices on the resulting design operating characteristics relative to other reasonable alternatives has not been fully examined. Motivated by the Advanced R2 Eperfusion STrategies for Refractory Cardiac Arrest trial, we posit that a logistic probability model coupled with an urn or permuted block randomization method will alleviate some of the practical limitations engendered by the conventional implementation of a two-arm Bayesian response-adaptive design with binary outcomes. In this article, we discuss to what extent this solution works and when it does not. Methods: A computer simulation study was performed to evaluate the relative merits of a Bayesian response-adaptive design for the Advanced R2 Eperfusion STrategies for Refractory Cardiac Arrest trial using Thompson sampling methods based on a logistic regression probability model coupled with either an urn or permuted block randomization method that limits deviations from the evolving target allocation ratio. The different implementations of the response-adaptive design were evaluated for type I error rate control across various null response rates and for power, among other performance metrics.
Results: The logistic regression probability model engenders smaller average sample sizes with similar power, better control over type I error rate, and more favorable treatment arm sample size distributions than the conventional beta-binomial probability model, and designs using the alternative randomization methods have a negligible chance of a sample size imbalance in the wrong direction. Conclusion: Pairing the logistic regression probability model with either of the alternative randomization methods results in a much improved response-adaptive design in regard to important operating characteristics, including type I error rate control and the risk of a sample size imbalance in favor of the inferior treatment.
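
The conventional beta-binomial implementation that the abstract uses as its comparator can be sketched in a few lines. This is a minimal illustration and not the authors' design: the uniform Beta(1, 1) priors, the tempering exponent `c`, and the function names `prob_b_better` and `allocation_prob_b` are all assumptions made for the example.

```python
import random


def prob_b_better(succ_a, n_a, succ_b, n_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(p_B > p_A | data).

    Assumes independent Beta(1, 1) priors on each arm's response rate,
    so each posterior is Beta(1 + successes, 1 + failures).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_a = rng.betavariate(1 + succ_a, 1 + n_a - succ_a)
        p_b = rng.betavariate(1 + succ_b, 1 + n_b - succ_b)
        wins += p_b > p_a
    return wins / draws


def allocation_prob_b(post_b_better, c=0.5):
    """Tempered Thompson allocation probability for arm B.

    The exponent c (an assumption here) shrinks the allocation toward
    1:1; c = 1 allocates with the posterior probability itself.
    """
    num = post_b_better ** c
    return num / (num + (1 - post_b_better) ** c)


if __name__ == "__main__":
    post = prob_b_better(succ_a=5, n_a=20, succ_b=12, n_b=20)
    print("P(B better):", post)
    print("Next-subject allocation to B:", allocation_prob_b(post))
```

In a design of the kind the abstract describes, the target allocation computed this way would be updated at each interim look, and an urn or permuted block scheme would then limit how far the realized allocation can drift from it.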


1998 ◽  
Vol 40 (4) ◽  
pp. 307-312 ◽  
Author(s):  
Maxia Dong ◽  
Martin R. Petersen ◽  
Mark J. Mendell

2017 ◽  
Vol 28 (3) ◽  
pp. 822-834
Author(s):  
Mitchell H Gail ◽  
Sebastien Haneuse

Sample size calculations are needed to design and assess the feasibility of case-control studies. Although such calculations are readily available for simple case-control designs and univariate analyses, there is limited theory and software for multivariate unconditional logistic analysis of case-control data. Here we outline the theory needed to detect scalar exposure effects or scalar interactions while controlling for other covariates in logistic regression. Both analytical and simulation methods are presented, together with links to the corresponding software.
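
The simulation approach mentioned in the abstract can be illustrated for the simplest setting only: a single binary exposure with no covariate adjustment. In that special case the unconditional logistic regression estimate of the log odds ratio equals the log of the sample odds ratio, with the usual sum-of-reciprocals standard error. The function name `simulate_power`, its parameters, and the 0.5 zero-cell correction are assumptions for this sketch, which is not the software the authors link to.

```python
import math
import random


def simulate_power(n_cases, n_controls, p0, odds_ratio, reps=1000, seed=1):
    """Estimate the power of a two-sided 5% Wald test of the log odds ratio.

    p0 is the exposure prevalence among controls; the prevalence among
    cases follows from the assumed odds ratio.
    """
    rng = random.Random(seed)
    odds1 = odds_ratio * p0 / (1 - p0)
    p1 = odds1 / (1 + odds1)
    rejections = 0
    for _ in range(reps):
        # Exposed counts among cases (a) and controls (c).
        a = sum(rng.random() < p1 for _ in range(n_cases))
        c = sum(rng.random() < p0 for _ in range(n_controls))
        b, d = n_cases - a, n_controls - c
        if 0 in (a, b, c, d):
            # Add 0.5 to every cell when any cell is empty.
            a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
        log_or = math.log(a * d / (b * c))
        se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        if abs(log_or / se) > 1.959964:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    for n in (50, 100, 200):
        print(n, simulate_power(n, n, p0=0.3, odds_ratio=2.0))
```

Increasing `n_cases` and `n_controls` until the estimated power crosses a target such as 0.8 gives a simulation-based sample size. Handling the adjusted, multivariate case the abstract targets would require fitting a full logistic model in each replicate.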


Author(s):  
El-Housainy A. Rady ◽  
Mohamed R. Abonazel ◽  
Mariam H. Metawe’e

Goodness-of-fit (GOF) tests for logistic regression assess how well the model suits the data. The null hypothesis of every GOF test is that the model fits. R, a free software environment, offers many GOF tests across different packages. A Monte Carlo simulation was conducted to study two situations: first, the ability of each test, under its default settings, to accept the null hypothesis when the model truly fits; second, the power of these tests when the assumption of a sufficient linear combination of the explanatory variables is violated (by omitting a linear covariate term, a quadratic term, or an interaction term). We also checked whether the same test gives the same results in different R packages. Because sample size is expected to affect the simulation results, the pattern of change in GOF test results was estimated under different sample sizes as well as different model settings. All tests accepted the null hypothesis (in more than 95% of simulation trials) when the model truly fitted, except the modified Hosmer-Lemeshow test in the "LogisticDx" package under all model settings and Osius and Rojek's (OsRo) test when the true model had an interaction term between binary and categorical covariates. In addition, the le Cessie-van Houwelingen-Copas-Hosmer unweighted sum of squares (CHCH) test unexpectedly gave different results in different packages. In the power study, all tests had very low power when a covariate was missing. Generally, Stukel's test (package "LogisticDx") and the CHCH test (package "rms") reached a power greater than 80% for detecting a missing quadratic term at smaller sample sizes, while the OsRo test (package "LogisticDx") was better at detecting a missing interaction term. Besides the simulation study, we evaluated the performance of the GOF tests using the breast cancer dataset.
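
For orientation, the standard (unmodified) Hosmer-Lemeshow statistic can be sketched as follows. This is not the modified variant from the "LogisticDx" package, nor any of the R implementations compared in the study; the function names, the equal-count grouping rule, and the closed-form chi-square tail (valid only for even degrees of freedom) are assumptions chosen to keep the example free of third-party dependencies.

```python
import math


def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, degrees of freedom) for 0/1 outcomes y and fitted probabilities p.

    Observations are sorted by fitted probability and split into
    `groups` bins of nearly equal size; df is groups - 2.
    """
    pairs = sorted(zip(p, y), key=lambda t: t[0])  # stable sort on p only
    n = len(pairs)
    stat = 0.0
    for k in range(groups):
        chunk = pairs[k * n // groups:(k + 1) * n // groups]
        n_k = len(chunk)
        expected = sum(pk for pk, _ in chunk)
        observed = sum(yk for _, yk in chunk)
        stat += (observed - expected) ** 2 / (expected * (1 - expected / n_k))
    return stat, groups - 2


def chi2_sf_even_df(x, df):
    """Upper tail probability of a chi-square variable; valid for even df only."""
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    y = [0, 1] * 10
    p = [0.5] * 20
    stat, df = hosmer_lemeshow(y, p, groups=4)
    print(stat, df, chi2_sf_even_df(stat, df))
```

A simulation study of the kind described would repeat this calculation over many generated datasets and record how often the p-value falls below 0.05, both when the fitted model is correct and when a term has been omitted.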


2019 ◽  
Vol 29 (Supplement_4) ◽  
Author(s):  
B Mete ◽  
E Pehlivan ◽  
V Söyiler

Abstract. Background: The aim of this study was to determine the prevalence of smoking and substance abuse among young people aged 14-18 in a city in Turkey and to determine the relationship between smoking and the risk of substance abuse. Methods: This cross-sectional study was conducted on high school students studying in Bingöl city center. The study population consisted of 14000 students in 14 high schools. The minimum sample size, calculated with reference to 80% power and a 99% confidence level, was found to be 1235. Following a stratified sampling method, students were selected at random within schools, and questionnaires were administered under supervision after obtaining their consent. The chi-square test and binary logistic regression were used for data analysis. Results: The mean age of the students was 15.71 ± 1.16 (min-max: 14-18) and 49.5% were male. The prevalence of smoking among all students was 15.8%, and the frequency of using or trying addictive substances other than cigarettes was 5%. The prevalence of smoking was 24.1% among male students and 7.7% among female students. The rate of addictive substance use, excluding smoking, was 8.2% for male students and 1.9% for female students. According to the logistic regression results, the risk of substance abuse was 8-fold higher (95% CI: 3.32-19.95) in smokers (p = 0.001) and 2.5-fold higher (95% CI: 1.10-5.38) in males (p = 0.027). The risk of substance use increased 1.05-fold (95% CI: 1.02-1.08) with the number of cigarettes smoked daily (p = 0.001). The risk of substance abuse in 18-year-olds was 1.5-fold higher (95% CI: 1.06-1.93) than in 14-year-olds (p = 0.021). Conclusions: Smoking and addictive substance use in adolescents are particularly notable in male students (8.2%). This result is higher than the figure reported for İstanbul (7%). This may be because the province is located at a crossing point for drug trafficking. Smoking increases the risk of using other addictive substances (marijuana, heroin, etc.). Key messages: Smoking and substance abuse are an important health problem in adolescents according to this study. Male students who smoke are at greater risk of substance abuse than female students.


2009 ◽  
Vol 9 (1) ◽  
Author(s):  
Szilard Nemes ◽  
Junmei Miao Jonasson ◽  
Anna Genell ◽  
Gunnar Steineck

2016 ◽  
Vol 2016 ◽  
pp. 1-8 ◽  
Author(s):  
Elahe Allahyari ◽  
Peyman Jafari ◽  
Zahra Bagheri

Objective. The present study uses simulated data to find the optimal number of response categories needed to achieve adequate power in the ordinal logistic regression (OLR) model for differential item functioning (DIF) analysis in psychometric research. Methods. A hypothetical ten-item quality of life scale with three, four, and five response categories was simulated. The power and type I error rates of the OLR model for detecting uniform DIF were investigated under different combinations of ability distribution (θ), sample size, sample size ratio, and the magnitude of uniform DIF across reference and focal groups. Results. When θ was distributed identically in the reference and focal groups, increasing the number of response categories from 3 to 5 resulted in an increase of approximately 8% in the power of the OLR model for detecting uniform DIF. The power of OLR was less than 0.36 when the ability distribution in the reference and focal groups was highly skewed to the left and right, respectively. Conclusions. The clearest conclusion from this research is that the minimum number of response categories for DIF analysis using OLR is five. However, the impact of the number of response categories in detecting DIF was lower than might be expected.

