Sample Size Choice: Charts for Experiments With Linear Models. (2nd ed.) (Vol. 122 in the Statistics: Textbooks and Monographs Series).

1992 ◽  
Vol 87 (418) ◽  
pp. 591
Author(s):  
Peter A. Lachenbruch ◽  
Robert E. Odeh ◽  
Martin Fox
Metrika ◽  
2019 ◽  
Vol 83 (2) ◽  
pp. 243-254
Author(s):  
Mathias Lindholm ◽  
Felix Wahl

In the present note we consider general linear models where the covariates may be both random and non-random, and where the only restrictions on the error terms are that they are independent and have finite fourth moments. For this class of models we analyse the variance parameter estimator. In particular, we obtain finite-sample bounds for the variance of the variance parameter estimator that are independent of covariate information, regardless of whether the covariates are random or not. For the case with random covariates this immediately yields bounds on the unconditional variance of the variance estimator, a quantity which in general is analytically intractable. The situation with random covariates is illustrated in an example where a certain vector autoregressive model, which appears naturally within the area of insurance mathematics, is analysed. Further, the obtained bounds are sharp in the sense that both the lower and upper bound converge to the same asymptotic limit when scaled with the sample size. Using the derived bounds, it is simple to show convergence in mean square of the variance parameter estimator for both random and non-random covariates. Moreover, the derivation of the bounds for the above general linear model is based on a lemma which applies in greater generality. This is illustrated by applying the same techniques to a class of mixed effects models.
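The idea of covariate-free bounds on the variance of the variance estimator can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a simple regression with a fixed design, centred exponential errors (variance 1, fourth central moment 9), and it uses the textbook quadratic-form identity Var(e'Me) = 2σ⁴(n − p) + (μ₄ − 3σ⁴) Σ mᵢᵢ² for a projection matrix M, together with (n − p)²/n ≤ Σ mᵢᵢ² ≤ n − p. All function names, the design, and the error distribution are illustrative assumptions.

```python
import random
import statistics


def residual_variance(x, y):
    """Fit y = a + b*x by ordinary least squares; return s^2 = RSS / (n - 2)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return rss / (n - 2)


def scaled_variance_range(n, p, sigma2, mu4):
    """Range of n * Var(s^2) implied by the textbook identity (an assumption
    of this sketch, not the paper's lemma), for any fixed full-rank design."""
    kappa = mu4 - 3 * sigma2 ** 2
    ends = []
    for sum_mii_sq in ((n - p) ** 2 / n, float(n - p)):
        var_s2 = (2 * sigma2 ** 2 * (n - p) + kappa * sum_mii_sq) / (n - p) ** 2
        ends.append(n * var_s2)
    return min(ends), max(ends)


def simulate_scaled_variance(n=40, reps=4000, seed=1):
    """Monte Carlo estimate of n * Var(s^2) and of E[s^2]."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]  # hypothetical fixed design
    estimates = []
    for _ in range(reps):
        # Centred exponential errors: mean 0, variance 1, fourth moment 9.
        errors = [rng.expovariate(1.0) - 1.0 for _ in range(n)]
        y = [2.0 + 0.5 * xi + e for xi, e in zip(x, errors)]
        estimates.append(residual_variance(x, y))
    return n * statistics.variance(estimates), statistics.fmean(estimates)


if __name__ == "__main__":
    lo, hi = scaled_variance_range(n=40, p=2, sigma2=1.0, mu4=9.0)
    scaled, mean_s2 = simulate_scaled_variance()
    print(f"range for n*Var(s^2): [{lo:.3f}, {hi:.3f}]")
    print(f"simulated n*Var(s^2): {scaled:.3f}, mean s^2: {mean_s2:.3f}")
```

Under these assumptions both ends of the range approach μ₄ − σ⁴ = 8 as n grows, which mirrors the abstract's remark that the lower and upper bounds share one asymptotic limit after scaling with the sample size. The Monte Carlo estimate is noisy because the errors are heavy-tailed, so it should only be expected to land near the range.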


2020 ◽  
Vol 99 (13) ◽  
pp. 1453-1460
Author(s):  
D. Qin ◽  
F. Hua ◽  
H. He ◽  
S. Liang ◽  
H. Worthington ◽  
...  

The objectives of this study were to assess the reporting quality and methodological quality of split-mouth trials (SMTs) published during the past 2 decades and to determine whether there has been an improvement in their quality over time. We searched the MEDLINE database via PubMed to identify SMTs published in 1998, 2008, and 2018. For each included SMT, we used the CONsolidated Standards Of Reporting Trials (CONSORT) 2010 guideline, CONSORT for within-person trial (WPT) extension, and a new 3-item checklist to assess its trial reporting quality (TRQ), WPT-specific reporting quality (WRQ), and SMT-specific methodological quality (SMQ), respectively. Multivariable generalized linear models were fitted to analyze the quality of SMTs over time, adjusting for potential confounding factors. A total of 119 SMTs were included. The mean overall score for the TRQ (score range, 0 to 32), WRQ (0 to 15), and SMQ (0 to 3) was 15.77 (SD 4.51), 6.06 (SD 2.06), and 1.12 (SD 0.70), respectively. The primary outcome was clearly defined in only 28 SMTs (23.5%), and only 27 (22.7%) presented a replicable sample size calculation. Only 45 SMTs (37.8%) provided the rationale for using a split-mouth design. The correlation between body sites was reported in only 5 studies (4.2%) for sample size calculation and 4 studies (3.4%) for statistical results. Only 2 studies (1.7%) performed an appropriate sample size calculation, and 46 (38.7%) chose appropriate statistical methods, both accounting for the correlation among treatment groups and the clustering/multiplicity of measurements within an individual. Results of regression analyses suggested that the TRQ of SMTs improved significantly with time (P < 0.001), while there was no evidence of improvement in WRQ or SMQ. Both the reporting quality and methodological quality of SMTs still have much room for improvement. Concerted efforts are needed to improve the execution and reporting of SMTs.


Technometrics ◽  
1993 ◽  
Vol 35 (2) ◽  
pp. 234-235
Author(s):  
Michael R. Emptage

PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11096
Author(s):  
Hannah L. Buckley ◽  
Nicola J. Day ◽  
Bradley S. Case ◽  
Gavin Lear

Effective and robust ways to describe, quantify, analyse, and test for change in the structure of biological communities over time are essential if ecological research is to contribute substantively towards understanding and managing responses to ongoing environmental changes. Structural changes reflect population dynamics, changes in biomass and relative abundances of taxa, and colonisation and extinction events observed in samples collected through time. Most previous studies of temporal changes in the multivariate datasets that characterise biological communities are based on short time series that are not amenable to data-hungry methods such as multivariate generalised linear models. Here, we present a roadmap for the analysis of temporal change in short-time-series, multivariate, ecological datasets. We discuss appropriate methods and important considerations for using them, such as sample size, assumptions, and statistical power. We illustrate these methods with four case studies analysed using the R data analysis environment.


2021 ◽  
Author(s):  
Wanderson Bucker Moraes ◽  
Laurence V Madden ◽  
Pierce A. Paul

Since Fusarium head blight (FHB) intensity is usually highly variable within a plot, the number of spikes rated for FHB index (IND) quantification must be considered when designing experiments. In addition, quantification of sources of IND heterogeneity is crucial for defining sampling protocols. Field experiments were conducted to quantify the variability of IND (‘field severity’) at different spatial scales and to investigate the effects of sample size on estimated plot-level mean IND and its accuracy. A total of 216 7-row × 6-m-long plots of a moderately resistant and a susceptible cultivar were spray-inoculated with different Fusarium graminearum spore concentrations at anthesis to generate a range of IND levels. A one-stage cluster sampling approach was used to estimate IND, with an average of 32 spikes rated at each of 10 equally spaced points per plot. Plot-level mean IND ranged from 0.9 to 37.9%. Heterogeneity of IND, quantified by fitting unconditional hierarchical linear models, was higher among spikes within clusters than among clusters within plots or among plots. The projected relative error of mean IND increased as mean IND decreased, and as sample size decreased below 100 spikes per plot. Simple random samples were drawn with replacement 50,000 times from the original dataset for each plot and used to estimate the effects of sample sizes on mean IND. Samples of 100 or more spikes resulted in more precise estimates of mean IND than smaller samples. Poor sampling may result in inaccurate estimates of IND and poor interpretation of results.
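The resampling step described in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' code: the plot data are synthetic (a hypothetical mix of healthy spikes and exponentially distributed severities), the function names are invented for the example, and far fewer than 50,000 resamples are drawn to keep it quick.

```python
import random
import statistics


def make_synthetic_plot(n_spikes=320, seed=42):
    """Hypothetical per-spike FHB index values (%) for one plot:
    about 40% healthy spikes, the rest skewed towards low severity."""
    rng = random.Random(seed)
    return [
        0.0 if rng.random() < 0.4 else min(100.0, rng.expovariate(1 / 15))
        for _ in range(n_spikes)
    ]


def relative_error_by_sample_size(values, sizes, reps=2000, seed=0):
    """For each sample size, draw `reps` simple random samples with
    replacement and return SD(sample mean) / plot mean."""
    rng = random.Random(seed)
    plot_mean = statistics.fmean(values)
    result = {}
    for k in sizes:
        means = [statistics.fmean(rng.choices(values, k=k)) for _ in range(reps)]
        result[k] = statistics.stdev(means) / plot_mean
    return result


if __name__ == "__main__":
    plot = make_synthetic_plot()
    for size, rel in relative_error_by_sample_size(plot, [20, 50, 100, 320]).items():
        print(f"{size:>4} spikes: relative error of mean IND = {rel:.3f}")
```

With sampling done with replacement, the relative error shrinks roughly in proportion to one over the square root of the number of spikes, which is consistent with the abstract's finding that larger samples give more precise estimates of mean IND.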


Author(s):  
Chung-I Li ◽  
Yu Shyr

As RNA-seq rapidly develops and costs continually decrease, the quantity and frequency of samples being sequenced will grow exponentially. With proteomic investigations becoming more multivariate and quantitative, determining a study’s optimal sample size is now a vital step in experimental design. Current methods for calculating a study’s required sample size are mostly based on the hypothesis testing framework, which assumes each gene count can be modeled through Poisson or negative binomial distributions; however, these methods are limited when it comes to accommodating covariates. To address this limitation, we propose an estimating procedure based on the generalized linear model. This easy-to-use method constructs a representative exemplary dataset and estimates the conditional power, all without requiring complicated mathematical approximations or formulas. Even more attractive, the downstream analysis can be performed with current R/Bioconductor packages. To demonstrate the practicability and efficiency of this method, we apply it to three real-world studies, and introduce our online calculator developed to determine the optimal sample size for an RNA-seq study.

