Mechanical Reliability Confidence Limits

1978 ◽  
Vol 100 (4) ◽  
pp. 607-612 ◽  
Author(s):  
D. Kececioglu ◽  
G. Lamarre

Charts are presented relating the lower one-sided confidence limit on the reliability, R_L1, to the effective sample size, n_e, calculated from the sample sizes used to estimate the failure-governing stress and strength distributions, f(s) and f(S) respectively, and to a factor K which is a function of the estimated means and standard deviations of f(s) and f(S). These charts cover an n_e range of 5 to 1000, confidence levels of 0.80, 0.90, 0.95, and 0.99, and lower one-sided limits on the reliability of 0.85 to 0.9145. The equations used to develop these charts are derived, and two examples of their application are given.
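
For orientation, in the normal-normal stress-strength model the factor K is the standardized distance between the strength and stress means, and Φ(K) gives the point estimate of the reliability; the lower limit R_L1 itself requires the paper's charts or derived equations. A minimal sketch with illustrative numbers (not taken from the paper):

```python
from math import sqrt
from statistics import NormalDist

def stress_strength_reliability(mu_s, sd_s, mu_S, sd_S):
    """Point estimate of R = P(strength > stress) for independent normal
    stress f(s) and strength f(S) distributions."""
    K = (mu_S - mu_s) / sqrt(sd_s**2 + sd_S**2)  # coupling factor K
    return K, NormalDist().cdf(K)                # R_hat = Phi(K)

# Illustrative values in psi, not from the paper:
K, R_hat = stress_strength_reliability(mu_s=40e3, sd_s=4e3, mu_S=60e3, sd_S=5e3)
print(f"K = {K:.3f}, estimated reliability R = {R_hat:.4f}")
```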

2012 ◽  
Vol 2012 ◽  
pp. 1-8 ◽  
Author(s):  
Louis M. Houston

We derive a general equation for the probability that a measurement falls within a range of n standard deviations from an estimate of the mean. In doing so, we provide a format compatible with a confidence interval centered about the mean that is naturally independent of the sample size. The equation is derived by interpolating between theoretical results for extreme sample sizes, and its intermediate values are confirmed with a computational test.
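
The interpolated equation itself is not reproduced here, but a sketch can show the two quantities it bridges: the infinite-sample normal coverage, erf(n/√2), and the empirical coverage when the mean and standard deviation must be estimated from a finite sample (all numbers illustrative):

```python
import math
import random

def coverage_large_sample(n_sd):
    """Infinite-sample limit: probability that a normal measurement lies
    within n_sd true standard deviations of the true mean."""
    return math.erf(n_sd / math.sqrt(2))

def coverage_finite_sample(n_sd, sample_size, trials=50_000):
    """Empirical coverage when the mean and SD are estimated from
    `sample_size` draws before checking a fresh measurement."""
    hits = 0
    for _ in range(trials):
        sample = [random.gauss(0, 1) for _ in range(sample_size)]
        mean = sum(sample) / sample_size
        sd = math.sqrt(sum((v - mean) ** 2 for v in sample) / (sample_size - 1))
        hits += abs(random.gauss(0, 1) - mean) <= n_sd * sd
    return hits / trials

print(coverage_large_sample(2))                   # ~0.9545
print(coverage_finite_sample(2, sample_size=10))  # lower at small n
```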


1999 ◽  
Vol 45 (6) ◽  
pp. 882-894 ◽  
Author(s):  
Kristian Linnet

Abstract Background: In method comparison studies, it is important to ensure that a difference of medical importance, if present, is detected. For a given difference, the necessary number of samples depends on the range of values and on the analytical standard deviations of the methods involved. For typical examples, the present study evaluates the statistical power of least-squares and Deming regression analyses applied to method comparison data. Methods: Theoretical calculations and simulations were used to consider the statistical power for detection of slope deviations from unity and intercept deviations from zero. For situations with proportional analytical standard deviations, weighted forms of regression analysis were evaluated. Results: In general, the sample sizes of 40–100 samples conventionally used in method comparison studies often must be reconsidered. A main factor is the range of values, which should be as wide as possible for the given analyte. For a range ratio (maximum value divided by minimum value) of 2, a total of 544 samples is required to detect one standardized slope deviation; the requirement decreases to 64 samples at a range ratio of 10 (proportional analytical error). For electrolytes, which have very narrow ranges of values, very large sample sizes usually are necessary. In the case of proportional analytical error, application of a weighted approach is important to assure an efficient analysis; e.g., for a range ratio of 10, the weighted approach reduces the sample requirement by >50%. Conclusions: Estimating the necessary sample size for a method comparison study assures a valid result: either no difference is found, or the existence of a relevant difference is confirmed.
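
As a sketch of the power argument (simplified: ordinary least squares with constant analytical error, whereas the paper evaluates Deming and weighted regression; all parameter values below are arbitrary), simulated power grows sharply with the range ratio:

```python
import numpy as np

rng = np.random.default_rng(1)

def power_slope(slope=1.05, n=100, lo=1.0, range_ratio=2.0,
                sd_x=0.03, sd_y=0.03, sims=2000):
    """Simulated power to detect a slope deviation from unity when
    regressing test method Y on comparison method X (OLS for brevity;
    Deming regression would also model the error in X)."""
    hi = lo * range_ratio
    rejections = 0
    for _ in range(sims):
        true = rng.uniform(lo, hi, n)               # true concentrations
        x = true + rng.normal(0, sd_x, n)           # comparison method
        y = slope * true + rng.normal(0, sd_y, n)   # test method
        b, a = np.polyfit(x, y, 1)
        resid = y - (a + b * x)
        sxx = (x - x.mean()) @ (x - x.mean())
        se_b = np.sqrt(resid @ resid / (n - 2) / sxx)
        rejections += abs(b - 1) > 1.96 * se_b
    return rejections / sims

print(power_slope(range_ratio=2))   # narrow range: low power
print(power_slope(range_ratio=10))  # wide range: much higher power
```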


Psych ◽  
2021 ◽  
Vol 3 (4) ◽  
pp. 751-779
Author(s):  
Martin Hecht ◽  
Sebastian Weirich ◽  
Steffen Zitzmann

Bayesian MCMC is a widely used model estimation technique, and software from the BUGS family, such as JAGS, has been popular for over two decades. Recently, Stan entered the market with promises of higher efficiency fueled by more advanced and sophisticated algorithms. With this study, we want to contribute empirical results to the discussion about the sampling efficiency of JAGS and Stan. We conducted three simulation studies in which we varied the number of warmup iterations, the prior informativeness, and the sample sizes, and we employed the multi-level intercept-only model in both the covariance- and mean-based parametrization and the classic parametrization. The target outcome was MCMC efficiency, measured as effective sample size per second (ESS/s). Based on our specific (and limited) study setup, we found that (1) MCMC efficiency is much higher for the covariance- and mean-based parametrization than for the classic parametrization, (2) Stan clearly outperforms JAGS when the covariance- and mean-based parametrization is used, and (3) JAGS clearly outperforms Stan when the classic parametrization is used.
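
The outcome metric can be made concrete: effective sample size divided by wall-clock sampling time. Below, a crude single-chain ESS estimator is applied to synthetic autocorrelated draws (the AR(1) chain merely stands in for MCMC output; real studies would use the refined ESS implementations in Stan, coda, or arviz):

```python
import time
import numpy as np

def effective_sample_size(chain):
    """Crude single-chain ESS via an initial-positive-sequence cutoff on
    the autocorrelation function (Geyer-style)."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    n = len(x)
    acf = np.correlate(x, x, mode="full")[n - 1:] / (np.arange(n, 0, -1) * x.var())
    tau = 1.0
    for k in range(1, n - 1, 2):       # add lag pairs while they stay positive
        pair = acf[k] + acf[k + 1]
        if pair < 0:
            break
        tau += 2 * pair
    return n / tau

start = time.perf_counter()
rng = np.random.default_rng(0)
draws = np.empty(20_000)               # synthetic AR(1) "posterior draws"
draws[0] = 0.0
noise = rng.normal(size=draws.size)
for i in range(1, draws.size):
    draws[i] = 0.9 * draws[i - 1] + noise[i]
elapsed = time.perf_counter() - start  # stands in for sampling time

ess = effective_sample_size(draws)
print(f"ESS = {ess:.0f}, ESS/s = {ess / elapsed:.0f}")
```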


Rangifer ◽  
2003 ◽  
Vol 23 (5) ◽  
pp. 297 ◽  
Author(s):  
Robert D. Otto ◽  
Neal P.P. Simon ◽  
Serge Couturier ◽  
Isabelle Schmelzer

Wildlife radio-telemetry and tracking projects often determine a priori required sample sizes by statistical means, or default to the maximum number of units that can be maintained within a limited budget. After such projects are initiated, little attention is focused on effective sample size requirements, resulting in a lack of statistical power. The Department of National Defence operates a base in Labrador, Canada for low-level jet fighter training activities, and maintains a sample of satellite collars on the George River caribou herd (GRCH; Rangifer tarandus caribou) of the region for spatial avoidance mitigation purposes. We analysed existing location data, in conjunction with knowledge of life history, to develop estimates of the satellite collar sample sizes required to ensure adequate mitigation for the GRCH. We chose three levels of probability in each of six annual caribou seasons. The estimated number of collars required ranged from 15 to 52, 23 to 68, and 36 to 184 for the 50%, 75%, and 90% probability levels, respectively, depending on season. These estimates can be used to make more informed decisions about mitigation for the GRCH, and, more generally, our approach provides a means to adaptively assess radio collar sample sizes for ongoing studies.
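
The paper's estimates come from GRCH location data and life-history knowledge; the flavor of the calculation can nevertheless be sketched under a simple independence assumption (ours, not the authors'): if each collared animal is in the area of interest with probability p, the number of collars needed for at least one to be present with probability P follows from 1 - (1 - p)^n >= P:

```python
from math import ceil, log

def collars_needed(detect_prob, presence_prob):
    """Smallest n with 1 - (1 - presence_prob)**n >= detect_prob, i.e.
    collars needed so that at least one collared animal is in the area
    with probability detect_prob, assuming independent presence."""
    return ceil(log(1 - detect_prob) / log(1 - presence_prob))

# Hypothetical presence probability of 0.05 per collared animal:
for p_level in (0.50, 0.75, 0.90):
    print(f"{p_level:.0%} probability level: {collars_needed(p_level, 0.05)} collars")
```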


2017 ◽  
Vol 33 (3) ◽  
pp. 781-799 ◽  
Author(s):  
Olena Kaminska ◽  
Peter Lynn

Abstract Adaptive survey designs can be used to allocate sample elements to alternative data collection protocols in order to achieve a desired balance between some quality measure and survey costs. We compare four alternative methods for allocating sample elements to one of two data collection protocols. The methods differ in terms of the quality measure that they aim to optimize: response rate, R-indicator, coefficient of variation of the participation propensities, or effective sample size. Costs are also compared for a range of sample sizes. The data collection protocols considered are single-mode CAPI and web-CAPI sequential mixed-mode. We use data from a large experiment with random allocation to one of these two protocols. For each allocation method, we predict outcomes in terms of several quality measures and costs. Although allocating the whole sample to single-mode CAPI produces a higher response rate than allocating the whole sample to the mixed-mode protocol, we find that two of the targeted allocations achieve a better response rate than single-mode CAPI at a lower cost. We also find that all four of the targeted designs outperform both single-protocol designs in terms of representativity and effective sample size. For all but the smallest sample sizes, the adaptive designs bring cost savings relative to CAPI-only, though these are fairly modest in magnitude.
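
A sketch of the four quality measures, computed from estimated response propensities under inverse-propensity weighting (an illustrative implementation: the R-indicator follows Schouten et al.'s 1 - 2·SD(ρ) form, and the propensities below are synthetic):

```python
import numpy as np

def quality_measures(propensities):
    """Response rate, R-indicator, CV of propensities, and Kish effective
    sample size, from estimated response propensities rho_i."""
    rho = np.asarray(propensities, dtype=float)
    response_rate = rho.mean()
    r_indicator = 1 - 2 * rho.std(ddof=1)      # Schouten et al.'s R-indicator
    cv = rho.std(ddof=1) / rho.mean()          # coefficient of variation
    w = 1 / rho                                # inverse-propensity weights
    n_eff = w.sum() ** 2 / (w ** 2).sum()      # Kish effective sample size
    return response_rate, r_indicator, cv, n_eff

rho = 0.1 + 0.8 * np.random.default_rng(2).beta(4, 4, size=1000)  # synthetic
rr, r_ind, cv, n_eff = quality_measures(rho)
print(f"RR = {rr:.3f}, R = {r_ind:.3f}, CV = {cv:.3f}, n_eff = {n_eff:.0f}")
```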


Pharmacology ◽  
2018 ◽  
Vol 101 (3-4) ◽  
pp. 170-175
Author(s):  
Chunsheng He ◽  
Amber Griffies ◽  
Xuan Liu ◽  
Robert Adamczyk ◽  
Shu-Pang Huang

Sample size estimates for drug-drug interaction (DDI) studies are often based on variability information from the literature or from historical studies, but small sample sizes in these sources may limit the precision of the estimates obtained. This project aimed to create an intra-subject variability library of the pharmacokinetic (PK) exposure parameters area under the curve (AUC) and maximum plasma concentration (Cmax) for probes commonly used in DDI studies. Data from 66 individual DDI studies in healthy subjects, relating to 18 common probe substrates, were pooled, increasing the effective sample size for the identified probes by 1.5- to 9-fold, with corresponding improvements in the precision of the intra-subject PK variability estimates in this library. These improved variability estimates will allow better assessment of the sample sizes needed for future DDI studies.
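
A sketch of how an intra-subject CV from such a library feeds a sample-size estimate, using the common normal approximation for a 2×2 crossover with 0.80-1.25 equivalence limits and a true geometric mean ratio of 1 (exact methods use the noncentral t distribution; the CV below is illustrative):

```python
from math import ceil, log, sqrt
from statistics import NormalDist

def crossover_total_n(cv, alpha=0.05, power=0.80, limit=1.25):
    """Approximate total subjects for a 2x2 crossover DDI/bioequivalence
    study, normal approximation, true geometric mean ratio = 1."""
    z = NormalDist().inv_cdf
    sw = sqrt(log(1 + cv ** 2))          # intra-subject SD on the log scale
    n = 2 * (z(1 - alpha) + z(1 - (1 - power) / 2)) ** 2 * (sw / log(limit)) ** 2
    return ceil(n)

# A more precise pooled CV estimate means less defensive padding of n:
print(crossover_total_n(cv=0.30))        # ~30 subjects if the CV is truly 0.30
```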


2017 ◽  
Author(s):  
Tim van der Zee ◽  
Jordan Anaya ◽  
Nicholas J L Brown

We present the initial results of a reanalysis of four articles from the Cornell Food and Brand Lab based on data collected from diners at an Italian restaurant buffet. On a first reading of these articles, we immediately noticed a number of apparent inconsistencies in the summary statistics. A thorough reading of the articles and careful reanalysis of the results revealed additional problems. The sample sizes for the number of diners in each condition are incongruous both within and between the four articles. In some cases, the degrees of freedom of between-participant test statistics are larger than the sample size, which is impossible. Many of the computed F and t statistics are inconsistent with the reported means and standard deviations. In some cases, the number of possible inconsistencies for a single statistic was such that we were unable to determine which of the components of that statistic were incorrect. We contacted the authors of the four articles, but they have thus far not agreed to share their data. The attached Appendix reports approximately 150 inconsistencies in these four articles, which we were able to identify from the reported statistics alone. We hope that our analysis will encourage readers, using and extending the simple methods that we describe, to undertake their own efforts to verify published results, and that such initiatives will improve the accuracy and reproducibility of the scientific literature.
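
The simplest of these checks can be reproduced directly: recompute a between-participants t statistic from the reported means, standard deviations, and group sizes, and verify that the reported degrees of freedom do not exceed n1 + n2 - 2 (the numbers below are hypothetical, not taken from the four articles):

```python
from math import sqrt

def t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t (pooled variance) and its df, recomputed
    from reported summary statistics."""
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(sp2 * (1 / n1 + 1 / n2)), n1 + n2 - 2

# Hypothetical reported values:
t, df = t_from_summary(m1=3.1, sd1=1.2, n1=20, m2=2.4, sd2=1.1, n2=22)
reported_t, reported_df = 4.02, 43
print(f"recomputed t({df}) = {t:.2f} vs reported t({reported_df}) = {reported_t}")
if reported_df > df:
    print("reported df exceed n1 + n2 - 2: impossible for a between-participants test")
```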


2021 ◽  
Vol 13 (3) ◽  
pp. 368
Author(s):  
Christopher A. Ramezan ◽  
Timothy A. Warner ◽  
Aaron E. Maxwell ◽  
Bradley S. Price

The size of the training data set is a major determinant of classification accuracy. Nevertheless, the collection of a large training data set for supervised classifiers can be a challenge, especially for studies covering a large area, which may be typical of many real-world applied projects. This work investigates how variations in training set size, ranging from a large sample size (n = 10,000) to a very small sample size (n = 40), affect the performance of six supervised machine-learning algorithms applied to classify large-area high-spatial-resolution (HR) (1–5 m) remotely sensed data within the context of a geographic object-based image analysis (GEOBIA) approach. GEOBIA, in which adjacent similar pixels are grouped into image-objects that form the unit of the classification, offers the potential benefit of allowing the use of multiple additional variables, such as measures of object geometry and texture, thus increasing the dimensionality of the classification input data. The six supervised machine-learning algorithms are support vector machines (SVM), random forests (RF), k-nearest neighbors (k-NN), single-layer perceptron neural networks (NEU), learning vector quantization (LVQ), and gradient-boosted trees (GBM). RF, the algorithm with the highest overall accuracy, was notable for its negligible decrease in overall accuracy, 1.0%, when training sample size decreased from 10,000 to 315 samples. GBM provided similar overall accuracy to RF; however, the algorithm was very expensive in terms of training time and computational resources, especially with large training sets. In contrast to RF and GBM, NEU and SVM were particularly sensitive to decreasing sample size, with NEU classifications generally producing overall accuracies that were on average slightly higher than SVM classifications for larger sample sizes, but lower than SVM for the smallest sample sizes. NEU, however, required a longer processing time. The k-NN classifier saw less of a drop in overall accuracy than NEU and SVM as training set size decreased; however, the overall accuracies of k-NN were typically lower than those of the RF, NEU, and SVM classifiers. LVQ generally had the lowest overall accuracy of all six methods, but was relatively insensitive to sample size, down to the smallest sample sizes. Overall, due to its relatively high accuracy with small training sets, its minimal variation in overall accuracy between very large and small training sets, and its relatively short processing time, RF was a good classifier for large-area land-cover classification of HR remotely sensed data, especially when training data are scarce. However, as the performance of different supervised classifiers varies in response to training set size, investigating multiple classification algorithms is recommended to achieve optimal accuracy for a project.
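
The experimental design reduces to a learning curve: train each classifier on nested subsets of the training pool and track overall accuracy on a fixed test set. A sketch with scikit-learn's random forest on synthetic data standing in for the GEOBIA feature table (training sizes mirror the study's 40-10,000 range; accuracies here are not the study's):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the image-object feature table
X, y = make_classification(n_samples=12_000, n_features=20, n_informative=10,
                           n_classes=5, n_clusters_per_class=1, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=2_000, random_state=0, stratify=y)

for n_train in (10_000, 315, 40):      # nested training subsets
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    rf.fit(X_pool[:n_train], y_pool[:n_train])
    acc = accuracy_score(y_test, rf.predict(X_test))
    print(f"n = {n_train:>6}: overall accuracy = {acc:.3f}")
```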

