Reporting Frequency and Sample Size: Effects on Prediction, Confidence Levels and Confidence Intervals

Author(s):  
Terence J. Pitre
Biometrika ◽  
2020 ◽  
Author(s):  
Oliver Dukes ◽  
Stijn Vansteelandt

Summary: Eliminating the effect of confounding in observational studies typically involves fitting a model for an outcome adjusted for covariates. When, as often, these covariates are high-dimensional, this necessitates the use of sparse estimators, such as the lasso, or other regularization approaches. Naïve use of such estimators yields confidence intervals for the conditional treatment effect parameter that are not uniformly valid. Moreover, as the number of covariates grows with the sample size, correctly specifying a model for the outcome is nontrivial. In this article we deal with both of these concerns simultaneously, obtaining confidence intervals for conditional treatment effects that are uniformly valid, regardless of whether the outcome model is correct. This is done by incorporating an additional model for the treatment selection mechanism. When both models are correctly specified, we can weaken the standard conditions on model sparsity. Our procedure extends to multivariate treatment effect parameters and complex longitudinal settings.
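The abstract pairs a sparse outcome model with a model for treatment selection. The sketch below illustrates that general idea in the spirit of augmented inverse-probability weighting on simulated data; it is not the authors' estimator (and carries none of their uniformity guarantees), and the data-generating process and every name in it are illustrative assumptions.

```python
# Sketch: doubly robust treatment-effect estimate combining a lasso
# outcome model with a lasso-penalised propensity model (AIPW style).
# Simulated data; names and constants are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegressionCV

rng = np.random.default_rng(0)
n, p = 500, 50
X = rng.normal(size=(n, p))
ps_true = 1 / (1 + np.exp(-0.5 * X[:, 0]))     # treatment depends on X[:,0]
A = rng.binomial(1, ps_true)
Y = 1.0 * A + X[:, 0] + rng.normal(size=n)     # true treatment effect = 1.0

# Nuisance models: sparse outcome regressions and a sparse propensity score
out1 = LassoCV(cv=5).fit(X[A == 1], Y[A == 1])
out0 = LassoCV(cv=5).fit(X[A == 0], Y[A == 0])
prop = LogisticRegressionCV(cv=5, penalty="l1", solver="liblinear").fit(X, A)

m1, m0 = out1.predict(X), out0.predict(X)
e = np.clip(prop.predict_proba(X)[:, 1], 0.01, 0.99)

# AIPW influence-function values: consistent if either nuisance model is right
psi = m1 - m0 + A * (Y - m1) / e - (1 - A) * (Y - m0) / (1 - e)
ate, se = psi.mean(), psi.std(ddof=1) / np.sqrt(n)
print(f"ATE estimate {ate:.2f}, 95% CI ({ate - 1.96*se:.2f}, {ate + 1.96*se:.2f})")
```

The doubly robust form is what motivates adding the treatment model: misspecifying one of the two nuisance models still leaves the effect estimate consistent.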


1989 ◽  
Vol 38 (1-2) ◽  
pp. 65-69 ◽  
Author(s):  
Yoko Imaizumi

Abstract: Nation-wide data in Japan on births and prenatal deaths of 16 sets of quintuplets during 1974-1985 were analysed. Among the 16 sets, 3 were liveborn, 8 were stillborn, and 5 were mixed, giving a stillbirth rate of 0.64 (51/80). Effects of sex, maternal age and birth order on the stillbirth rate were not considered because of the small sample size. Effects of gestational age and birthweight on the stillbirth rate were also examined. The mean weight of the 40 quintuplet individuals was 1,048 g.
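A quick arithmetic check (not from the article) makes the small-sample caveat concrete: at this sample size, even the overall rate carries a wide confidence interval. The Wilson interval used here is a standard choice, not the authors' method.

```python
# Wilson 95% interval for the quoted stillbirth rate of 51/80,
# illustrating the uncertainty attached to a small sample.
from math import sqrt

deaths, n, z = 51, 80, 1.96
p = deaths / n                                   # 51/80 = 0.6375, quoted as 0.64
centre = (p + z**2 / (2 * n)) / (1 + z**2 / n)
half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
print(f"rate {p:.2f}, 95% CI ({centre - half:.2f}, {centre + half:.2f})")
```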


PEDIATRICS ◽  
1989 ◽  
Vol 83 (3) ◽  
pp. A72-A72
Author(s):  
Student

The believer in the law of small numbers practices science as follows: 1. He gambles his research hypotheses on small samples without realizing that the odds against him are unreasonably high. He overestimates power. 2. He has undue confidence in early trends (e.g., the data of the first few subjects) and in the stability of observed patterns (e.g., the number and identity of significant results). He overestimates significance. 3. In evaluating replications, his or others', he has unreasonably high expectations about the replicability of significant results. He underestimates the breadth of confidence intervals. 4. He rarely attributes a deviation of results from expectations to sampling variability, because he finds a causal "explanation" for any discrepancy. Thus, he has little opportunity to recognize sampling variation in action. His belief in the law of small numbers, therefore, will forever remain intact.
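Point 1 above, the overestimation of power, is easy to demonstrate by simulation. The sketch below (illustrative, not from the article) runs many small two-group studies with a moderate true effect and counts how often the t-test reaches significance; the empirical power falls far short of the 0.80 researchers often assume.

```python
# Simulation: power of a two-sample t-test with small groups.
# n per group, effect size, and trial count are illustrative choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, effect, trials = 15, 0.5, 2000        # n per group, true Cohen's d = 0.5
hits = 0
for _ in range(trials):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(effect, 1.0, n)
    if stats.ttest_ind(a, b).pvalue < 0.05:
        hits += 1
print(f"empirical power {hits / trials:.2f}")    # well below 0.80
```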


2005 ◽  
Vol 35 (1) ◽  
pp. 1-20 ◽  
Author(s):  
G. K. Huysamen

Criticisms of traditional null hypothesis significance testing (NHST) became more pronounced during the 1960s and reached a climax during the past decade. Among other shortcomings, NHST says nothing about the size of the population parameter of interest, and its result is influenced by sample size. Estimation of confidence intervals around point estimates of the relevant parameters, model fitting and Bayesian statistics represent some major departures from conventional NHST. Testing non-nil null hypotheses, determining optimal sample size to uncover only substantively meaningful effect sizes, and reporting effect-size estimates may be regarded as minor extensions of NHST. Although there seems to be growing support for the estimation of confidence intervals around point estimates of the relevant parameters, it is unlikely that NHST-based procedures will disappear in the near future. In the meantime, it is widely accepted that effect-size estimates should be reported as a mandatory adjunct to conventional NHST results.
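The recommendation to report effect-size estimates alongside NHST results can be illustrated in a few lines. The example below (simulated data; the formulas are standard textbook approximations, not taken from the article) reports Cohen's d with an approximate 95% interval next to the p-value.

```python
# Reporting an effect size (Cohen's d) with an approximate 95% CI
# as an adjunct to the t-test p-value. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
a = rng.normal(0.0, 1.0, 40)
b = rng.normal(0.6, 1.0, 40)

t, pval = stats.ttest_ind(a, b)
na, nb = len(a), len(b)
sp = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2))
d = (b.mean() - a.mean()) / sp                   # Cohen's d (pooled SD)
se_d = np.sqrt((na + nb) / (na * nb) + d**2 / (2 * (na + nb)))
print(f"p = {pval:.3f}, d = {d:.2f}, "
      f"95% CI ({d - 1.96*se_d:.2f}, {d + 1.96*se_d:.2f})")
```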


1987 ◽  
Vol 22 (3) ◽  
pp. 123-127 ◽  
Author(s):  
P.V. Kirch ◽  
M.S. Allen ◽  
V.L. Butler ◽  
T.L. Hunt

2020 ◽  
Vol 34 (10) ◽  
pp. 1487-1505
Author(s):  
Katja Polotzek ◽  
Holger Kantz

Abstract: Correlations in models for daily precipitation are often generated by elaborate numerics that employ a high number of hidden parameters. We propose a parsimonious, parametric stochastic model for European mid-latitude daily precipitation amounts, with a focus on the influence of correlations on the statistics. Our method is meta-Gaussian: a truncated-Gaussian-power (tGp) transformation is applied to a Gaussian ARFIMA model. A speciality of this approach is that ARFIMA(1, d, 0) processes provide synthetic time series with both long-range correlations (LRC), meaning the sum of all autocorrelations is infinite, and short-range correlations (SRC), each governed by a single parameter. Our model requires the fit of only five parameters overall, each with a clear interpretation. For model time series of finite length we deduce an effective sample size for the sample mean, whose variance is increased due to correlations. For example, the statistical uncertainty of the mean daily amount over 103 years of daily records at the Fichtelberg mountain in Germany equals that of about 14 years of independent daily data. Our effective sample size approach also yields theoretical confidence intervals for annual total amounts and allows for proper model validation in terms of the empirical mean and fluctuations of annual totals. We evaluate probability plots for the daily amounts, confidence intervals based on the effective sample size for the daily mean and annual totals, and the Mahalanobis distance for the annual maxima distribution. For reproducing annual maxima, the way the marginal distribution is fitted is more crucial than the presence of correlations; the reverse holds for annual totals. Our alternative to rainfall simulation proves capable of modeling daily precipitation amounts, as the statistics of a random selection of 20 data sets are well reproduced.
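The effective-sample-size idea is worth seeing numerically. The sketch below uses a simple short-range AR(1) process rather than the paper's tGp-ARFIMA model (an assumption for brevity): positive autocorrelation inflates the variance of the sample mean, so n correlated points carry only about n_eff points' worth of information.

```python
# Effective sample size for a correlated series: for AR(1) with
# coefficient phi, n_eff = n * (1 - phi) / (1 + phi). We verify by
# Monte Carlo that Var(mean) is inflated by roughly n / n_eff.
import numpy as np

phi, n = 0.8, 5000
n_eff = n * (1 - phi) / (1 + phi)         # about 556 of 5000 points

rng = np.random.default_rng(3)
means = []
for _ in range(300):                       # Monte Carlo over series
    x = np.empty(n)
    x[0] = rng.normal()
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    means.append(x.mean())

# Var(mean) for iid data with the same marginal variance 1 / (1 - phi^2)
iid_var = (1 / (1 - phi**2)) / n
ratio = np.var(means) / iid_var            # close to n / n_eff = 9
print(f"n_eff {n_eff:.0f}, variance inflation {ratio:.1f}")
```

The same logic underlies the paper's confidence intervals for means and annual totals: the interval width scales with n_eff, not n.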


2020 ◽  
Vol 82 (6) ◽  
pp. 396-401
Author(s):  
Michael Calver ◽  
Timothy Blake

Estimating population size is essential for many applications in population ecology, so capture–recapture techniques to do this are often taught in secondary school classrooms and introductory university units. However, few classroom simulations of capture–recapture consider the sensitivity of results to sampling intensity, the important concept that the population size calculated is an estimate with error attached, or the consequences of violating assumptions underpinning particular capture–recapture models. We describe a simple approach to teaching the Lincoln index method of capture–recapture using packs of playing cards. Students can trial different sampling intensities, calculate 95% confidence intervals for population estimates, and explore the consequences of violating specific assumptions.
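The card exercise can itself be simulated in a few lines. The sketch below is one possible rendering, not the article's protocol: the sample sizes and the normal-approximation variance formula are assumptions chosen for illustration.

```python
# Lincoln index with a deck of cards: "mark" M cards, draw a second
# sample of C, count recaptures R, and estimate N_hat = M * C / R.
import random

random.seed(4)
N = 52                                     # true "population": one deck
deck = list(range(N))
M = 20                                     # first capture: mark 20 cards
marked = set(random.sample(deck, M))
C = 20                                     # second capture
recaptured = sum(card in marked for card in random.sample(deck, C))

N_hat = M * C / recaptured                 # Lincoln index estimate
# Normal-approximation 95% CI, Var(N_hat) ~ M^2 * C * (C - R) / R^3
se = (M**2 * C * (C - recaptured) / recaptured**3) ** 0.5
print(f"recaptures {recaptured}, estimate {N_hat:.0f} "
      f"(95% CI roughly {N_hat - 1.96*se:.0f} to {N_hat + 1.96*se:.0f})")
```

Re-running with different seeds or different M and C mirrors the classroom point about sampling intensity: small second samples give few recaptures and wildly unstable estimates.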

