Reporting Frequency and Sample Size: Effects on Prediction, Confidence Levels and Confidence Intervals

Author(s):  
Terence J. Pitre
Biometrika ◽  
2020 ◽  
Author(s):  
Oliver Dukes ◽  
Stijn Vansteelandt

Summary: Eliminating the effect of confounding in observational studies typically involves fitting a model for an outcome adjusted for covariates. When, as often, these covariates are high-dimensional, this necessitates the use of sparse estimators, such as the lasso, or other regularization approaches. Naïve use of such estimators yields confidence intervals for the conditional treatment effect parameter that are not uniformly valid. Moreover, as the number of covariates grows with the sample size, correctly specifying a model for the outcome is nontrivial. In this article we deal with both of these concerns simultaneously, obtaining confidence intervals for conditional treatment effects that are uniformly valid, regardless of whether the outcome model is correct. This is done by incorporating an additional model for the treatment selection mechanism. When both models are correctly specified, we can weaken the standard conditions on model sparsity. Our procedure extends to multivariate treatment effect parameters and complex longitudinal settings.
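The abstract pairs a sparse outcome model with a model for treatment selection. The sketch below illustrates that general idea in the spirit of augmented inverse-probability weighting on simulated data; it is not the authors' estimator (and carries none of their uniformity guarantees), and the data-generating process and every name in it are illustrative assumptions.

```python
# Sketch: doubly robust treatment-effect estimate combining a lasso
# outcome model with a lasso-penalised propensity model (AIPW style).
# Simulated data; names and constants are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegressionCV

rng = np.random.default_rng(0)
n, p = 500, 50
X = rng.normal(size=(n, p))
ps_true = 1 / (1 + np.exp(-0.5 * X[:, 0]))     # treatment depends on X[:,0]
A = rng.binomial(1, ps_true)
Y = 1.0 * A + X[:, 0] + rng.normal(size=n)     # true treatment effect = 1.0

# Nuisance models: sparse outcome regressions and a sparse propensity score
out1 = LassoCV(cv=5).fit(X[A == 1], Y[A == 1])
out0 = LassoCV(cv=5).fit(X[A == 0], Y[A == 0])
prop = LogisticRegressionCV(cv=5, penalty="l1", solver="liblinear").fit(X, A)

m1, m0 = out1.predict(X), out0.predict(X)
e = np.clip(prop.predict_proba(X)[:, 1], 0.01, 0.99)

# AIPW influence-function values: consistent if either nuisance model is right
psi = m1 - m0 + A * (Y - m1) / e - (1 - A) * (Y - m0) / (1 - e)
ate, se = psi.mean(), psi.std(ddof=1) / np.sqrt(n)
print(f"ATE estimate {ate:.2f}, 95% CI ({ate - 1.96*se:.2f}, {ate + 1.96*se:.2f})")
```

The doubly robust form is what motivates adding the treatment model: misspecifying one of the two nuisance models still leaves the effect estimate consistent.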


1989 ◽  
Vol 38 (1-2) ◽  
pp. 65-69 ◽  
Author(s):  
Yoko Imaizumi

Abstract: Nation-wide data in Japan on births and prenatal deaths of 16 sets of quintuplets during 1974-1985 were analysed. Among the 16 sets, 3 were liveborn, 8 were stillborn, and 5 were mixed, giving a stillbirth rate of 0.64 (51/80). Effects of sex, maternal age and birth order on the stillbirth rate were not considered because of the small sample size. Effects of gestational age and birthweight on the stillbirth rate were also examined. The mean weight of the 40 quintuplet individuals was 1,048 g.
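A quick arithmetic check (not from the article) makes the small-sample caveat concrete: at this sample size, even the overall rate carries a wide confidence interval. The Wilson interval used here is a standard choice, not the authors' method.

```python
# Wilson 95% interval for the quoted stillbirth rate of 51/80,
# illustrating the uncertainty attached to a small sample.
from math import sqrt

deaths, n, z = 51, 80, 1.96
p = deaths / n                                   # 51/80 = 0.6375, quoted as 0.64
centre = (p + z**2 / (2 * n)) / (1 + z**2 / n)
half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
print(f"rate {p:.2f}, 95% CI ({centre - half:.2f}, {centre + half:.2f})")
```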


PEDIATRICS ◽  
1989 ◽  
Vol 83 (3) ◽  
pp. A72-A72
Author(s):  
Student

The believer in the law of small numbers practices science as follows: 1. He gambles his research hypotheses on small samples without realizing that the odds against him are unreasonably high. He overestimates power. 2. He has undue confidence in early trends (e.g., the data of the first few subjects) and in the stability of observed patterns (e.g., the number and identity of significant results). He overestimates significance. 3. In evaluating replications, his or others', he has unreasonably high expectations about the replicability of significant results. He underestimates the breadth of confidence intervals. 4. He rarely attributes a deviation of results from expectations to sampling variability, because he finds a causal "explanation" for any discrepancy. Thus, he has little opportunity to recognize sampling variation in action. His belief in the law of small numbers, therefore, will forever remain intact.
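Point 1 above, the overestimation of power, is easy to demonstrate by simulation. The sketch below (illustrative, not from the article) runs many small two-group studies with a moderate true effect and counts how often the t-test reaches significance; the empirical power falls far short of the 0.80 researchers often assume.

```python
# Simulation: power of a two-sample t-test with small groups.
# n per group, effect size, and trial count are illustrative choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, effect, trials = 15, 0.5, 2000        # n per group, true Cohen's d = 0.5
hits = 0
for _ in range(trials):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(effect, 1.0, n)
    if stats.ttest_ind(a, b).pvalue < 0.05:
        hits += 1
print(f"empirical power {hits / trials:.2f}")    # well below 0.80
```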


2005 ◽  
Vol 35 (1) ◽  
pp. 1-20 ◽  
Author(s):  
G. K. Huysamen

Criticisms of traditional null hypothesis significance testing (NHST) became more pronounced during the 1960s and reached a climax during the past decade. Among other shortcomings, NHST says nothing about the size of the population parameter of interest, and its result is influenced by sample size. Estimation of confidence intervals around point estimates of the relevant parameters, model fitting and Bayesian statistics represent some major departures from conventional NHST. Testing non-nil null hypotheses, determining optimal sample size to uncover only substantively meaningful effect sizes, and reporting effect-size estimates may be regarded as minor extensions of NHST. Although there seems to be growing support for the estimation of confidence intervals around point estimates of the relevant parameters, it is unlikely that NHST-based procedures will disappear in the near future. In the meantime, it is widely accepted that effect-size estimates should be reported as a mandatory adjunct to conventional NHST results.
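The recommendation to report effect-size estimates alongside NHST results can be illustrated in a few lines. The example below (simulated data; the formulas are standard textbook approximations, not taken from the article) reports Cohen's d with an approximate 95% interval next to the p-value.

```python
# Reporting an effect size (Cohen's d) with an approximate 95% CI
# as an adjunct to the t-test p-value. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
a = rng.normal(0.0, 1.0, 40)
b = rng.normal(0.6, 1.0, 40)

t, pval = stats.ttest_ind(a, b)
na, nb = len(a), len(b)
sp = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2))
d = (b.mean() - a.mean()) / sp                   # Cohen's d (pooled SD)
se_d = np.sqrt((na + nb) / (na * nb) + d**2 / (2 * (na + nb)))
print(f"p = {pval:.3f}, d = {d:.2f}, "
      f"95% CI ({d - 1.96*se_d:.2f}, {d + 1.96*se_d:.2f})")
```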


1987 ◽  
Vol 22 (3) ◽  
pp. 123-127 ◽  
Author(s):  
P.V. Kirch ◽  
M.S. Allen ◽  
V.L. Butler ◽  
T.L. Hunt

2020 ◽  
Vol 34 (10) ◽  
pp. 1487-1505
Author(s):  
Katja Polotzek ◽  
Holger Kantz

Abstract: Correlations in models for daily precipitation are often generated by elaborate numerics that employ a high number of hidden parameters. We propose a parsimonious, parametric stochastic model for European mid-latitude daily precipitation amounts, with a focus on the influence of correlations on the statistics. Our method is meta-Gaussian: a truncated-Gaussian-power (tGp) transformation is applied to a Gaussian ARFIMA model. A speciality of this approach is that ARFIMA(1, d, 0) processes provide synthetic time series with both long-range correlations (LRC), meaning the sum of all autocorrelations is infinite, and short-range correlations (SRC), each governed by a single parameter. Our model requires the fit of only five parameters overall, each with a clear interpretation. For model time series of finite length we deduce an effective sample size for the sample mean, whose variance is increased due to correlations. For example, the statistical uncertainty of the mean daily amount over 103 years of daily records at the Fichtelberg mountain in Germany equals that of about 14 years of independent daily data. Our effective sample size approach also yields theoretical confidence intervals for annual total amounts and allows for proper model validation in terms of the empirical mean and fluctuations of annual totals. We evaluate probability plots for the daily amounts, confidence intervals based on the effective sample size for the daily mean and annual totals, and the Mahalanobis distance for the annual maxima distribution. For reproducing annual maxima, the way the marginal distribution is fitted is more crucial than the presence of correlations; the reverse holds for annual totals. Our alternative to rainfall simulation proves capable of modeling daily precipitation amounts, as the statistics of a random selection of 20 data sets are well reproduced.
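The effective-sample-size idea is worth seeing numerically. The sketch below uses a simple short-range AR(1) process rather than the paper's tGp-ARFIMA model (an assumption for brevity): positive autocorrelation inflates the variance of the sample mean, so n correlated points carry only about n_eff points' worth of information.

```python
# Effective sample size for a correlated series: for AR(1) with
# coefficient phi, n_eff = n * (1 - phi) / (1 + phi). We verify by
# Monte Carlo that Var(mean) is inflated by roughly n / n_eff.
import numpy as np

phi, n = 0.8, 5000
n_eff = n * (1 - phi) / (1 + phi)         # about 556 of 5000 points

rng = np.random.default_rng(3)
means = []
for _ in range(300):                       # Monte Carlo over series
    x = np.empty(n)
    x[0] = rng.normal()
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    means.append(x.mean())

# Var(mean) for iid data with the same marginal variance 1 / (1 - phi^2)
iid_var = (1 / (1 - phi**2)) / n
ratio = np.var(means) / iid_var            # close to n / n_eff = 9
print(f"n_eff {n_eff:.0f}, variance inflation {ratio:.1f}")
```

The same logic underlies the paper's confidence intervals for means and annual totals: the interval width scales with n_eff, not n.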


2020 ◽  
Vol 82 (6) ◽  
pp. 396-401
Author(s):  
Michael Calver ◽  
Timothy Blake

Estimating population size is essential for many applications in population ecology, so capture–recapture techniques to do this are often taught in secondary school classrooms and introductory university units. However, few classroom simulations of capture–recapture consider the sensitivity of results to sampling intensity, the important concept that the population size calculated is an estimate with error attached, or the consequences of violating assumptions underpinning particular capture–recapture models. We describe a simple approach to teaching the Lincoln index method of capture–recapture using packs of playing cards. Students can trial different sampling intensities, calculate 95% confidence intervals for population estimates, and explore the consequences of violating specific assumptions.
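The card exercise can itself be simulated in a few lines. The sketch below is one possible rendering, not the article's protocol: the sample sizes and the normal-approximation variance formula are assumptions chosen for illustration.

```python
# Lincoln index with a deck of cards: "mark" M cards, draw a second
# sample of C, count recaptures R, and estimate N_hat = M * C / R.
import random

random.seed(4)
N = 52                                     # true "population": one deck
deck = list(range(N))
M = 20                                     # first capture: mark 20 cards
marked = set(random.sample(deck, M))
C = 20                                     # second capture
recaptured = sum(card in marked for card in random.sample(deck, C))

N_hat = M * C / recaptured                 # Lincoln index estimate
# Normal-approximation 95% CI, Var(N_hat) ~ M^2 * C * (C - R) / R^3
se = (M**2 * C * (C - recaptured) / recaptured**3) ** 0.5
print(f"recaptures {recaptured}, estimate {N_hat:.0f} "
      f"(95% CI roughly {N_hat - 1.96*se:.0f} to {N_hat + 1.96*se:.0f})")
```

Re-running with different seeds or different M and C mirrors the classroom point about sampling intensity: small second samples give few recaptures and wildly unstable estimates.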

