Decisions Based on P-Values and Significance Levels

Cavalier Use of Inferential Statistics Is a Major Source of False and Irreproducible Scientific Findings

Mathematics ◽

10.3390/math9060603 ◽

2021 ◽

Vol 9 (6) ◽

pp. 603

Author(s):

Leonid Hanin

Keyword(s):

Sample Size ◽

Gaussian Approximation ◽

Statistical Significance ◽

Statistical Analyses ◽

Random Sample Size ◽

P Values ◽

The Central Limit Theorem ◽

Fixed Sample ◽

Large Numbers ◽

Significance Levels

I uncover previously underappreciated systematic sources of false and irreproducible results in natural, biomedical and social sciences that are rooted in statistical methodology. They include the inevitably occurring deviations from basic assumptions behind statistical analyses and the use of various approximations. I show through a number of examples that (a) arbitrarily small deviations from distributional homogeneity can lead to arbitrarily large deviations in the outcomes of statistical analyses; (b) samples of random size may violate the Law of Large Numbers and thus are generally unsuitable for conventional statistical inference; (c) the same is true, in particular, when random sample size and observations are stochastically dependent; and (d) the use of the Gaussian approximation based on the Central Limit Theorem has dramatic implications for p-values and statistical significance, essentially making the pursuit of small significance levels and p-values for a fixed sample size meaningless. The latter is proven rigorously in the case of the one-sided Z test. This article could serve as a cautionary guide for scientists and practitioners employing statistical methods in their work.
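For readers unfamiliar with the test named in this abstract, the following is a minimal sketch of a one-sided Z test in Python. It assumes independent observations, a known standard deviation, and a fixed sample size. The function name `one_sided_z_test` and the sample values are hypothetical and are not taken from the article; the sketch shows only how the p-value is computed and does not reproduce the article's proof.

```python
from math import sqrt
from statistics import NormalDist, fmean


def one_sided_z_test(sample, mu0, sigma):
    """Return (z, p) for H0: mean <= mu0 against H1: mean > mu0.

    Assumes independent observations with a known standard deviation sigma
    and a fixed (non-random) sample size.
    """
    n = len(sample)
    z = (fmean(sample) - mu0) / (sigma / sqrt(n))
    # Upper-tail probability under the standard normal distribution.
    p = 1.0 - NormalDist().cdf(z)
    return z, p


# Hypothetical measurements, not data from the article.
sample = [0.3, -0.1, 0.8, 0.5, 0.2, 0.9, -0.4, 0.6]
z, p = one_sided_z_test(sample, mu0=0.0, sigma=1.0)
print(f"z = {z:.3f}, p = {p:.3f}")
```

The p-value here relies on the normal distribution holding exactly; the abstract's point (d) concerns what happens when it holds only approximately through the Central Limit Theorem.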

Inference for econometric modeling in antidumping, countervailing duty and safeguard investigations

World Trade Review ◽

10.1017/s147474560999005x ◽

2009 ◽

Vol 8 (4) ◽

pp. 545-557 ◽

Cited By ~ 1

Author(s):

JAMES J. FETZER

Keyword(s):

Confidence Intervals ◽

Econometric Models ◽

Econometric Modeling ◽

P Values ◽

Flexible Approach ◽

Fixed Level ◽

Point Estimates ◽

Level Of Confidence ◽

Significance Levels

This paper examines how to make inferences from econometric models prepared for antidumping, countervailing duty, and safeguard investigations. Analysis of these models has typically entailed drawing inferences from point estimates that are significantly different from zero at a fixed level of confidence. This paper suggests a more flexible approach of drawing inferences using confidence intervals at various significance levels and reporting p-values for the relevant test of injury. Use of confidence intervals and p-values to identify insights and data patterns would have more impact on USITC trade remedy determinations than definitive conclusions about injury based on whether estimates are statistically significant.
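The contrast between a fixed-level verdict and the more flexible reporting described above can be sketched as follows. This is an illustration under a normal approximation, not the paper's procedure; the coefficient `estimate`, its standard error `std_err`, and the helper names are hypothetical.

```python
from statistics import NormalDist


def normal_ci(estimate, std_err, alpha):
    """Two-sided (1 - alpha) confidence interval under a normal approximation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - z * std_err, estimate + z * std_err


def two_sided_p(estimate, std_err):
    """Two-sided p-value for H0: the true coefficient equals zero."""
    z = abs(estimate / std_err)
    return 2 * (1 - NormalDist().cdf(z))


# Hypothetical coefficient and standard error, not values from the paper.
estimate, std_err = 0.42, 0.25

p_value = two_sided_p(estimate, std_err)
intervals = {alpha: normal_ci(estimate, std_err, alpha) for alpha in (0.10, 0.05, 0.01)}

print(f"p-value: {p_value:.3f}")
for alpha, (low, high) in intervals.items():
    print(f"{1 - alpha:.0%} CI: [{low:.3f}, {high:.3f}]")
```

With these made-up numbers the 90% interval excludes zero while the 95% and 99% intervals include it, so a single fixed-level verdict would hide how close the estimate sits to the threshold.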

Potential pitfalls in the use of p-values and in interpretation of significance levels

Radiotherapy and Oncology ◽

10.1016/0167-8140(94)90072-8 ◽

1994 ◽

Vol 33 (2) ◽

pp. 171-176 ◽

Cited By ~ 50

Author(s):

Hans-Peter Beck-Bornholdt ◽

Hans-Hermann Dubben

Keyword(s):

P Values ◽

Significance Levels

Visualization Strategies for Regression Estimates with Randomization Inference

10.31235/osf.io/bsd7g ◽

2019 ◽

Author(s):

Marshall A. Taylor

Keyword(s):

Confidence Interval ◽

Confidence Intervals ◽

Regression Models ◽

Statistical Significance ◽

Permutation Tests ◽

P Value ◽

P Values ◽

Alpha Level ◽

Significance Levels ◽

Nonprobability Sample

Coefficient plots are a popular tool for visualizing regression estimates. The appeal of these plots is that they visualize confidence intervals around the estimates and generally center the plot around zero, meaning that any estimate that crosses zero is statistically non-significant at least at the alpha-level around which the confidence intervals are constructed. For models with statistical significance levels determined via randomization models of inference and for which there is no standard error or confidence intervals for the estimate itself, these plots appear less useful. In this paper, I illustrate a variant of the coefficient plot for regression models with p-values constructed using permutation tests. These visualizations plot each estimate's p-value and its associated confidence interval in relation to a specified alpha-level. These plots can help the analyst interpret and report both the statistical and substantive significance of their models. Illustrations are provided using a nonprobability sample of activists and participants at a 1962 anti-Communism school.
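A permutation p-value for a regression coefficient can be sketched in a few lines of Python. This is a generic illustration for a simple one-predictor regression, not the author's Stata implementation; the function names, the data, and the choice of 999 shuffles are assumptions made for the example.

```python
import random
from statistics import fmean


def slope(x, y):
    """Ordinary least squares slope of y on x (simple regression)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def permutation_p_value(x, y, n_perm=999, seed=0):
    """Two-sided permutation p-value for the slope, shuffling the outcome."""
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(slope(x, shuffled)) >= observed:
            hits += 1
    # Adding 1 to both counts keeps the estimate strictly above zero.
    return (1 + hits) / (1 + n_perm)


# Hypothetical data: a clear linear trend plus small disturbances.
noise = [0.5, -1.2, 0.3, 1.1, -0.7, 0.2, -0.4, 0.9, -1.0, 0.6, 0.1, -0.3]
x = list(range(12))
y = [2.0 * xi + e for xi, e in zip(x, noise)]

p_value = permutation_p_value(x, y)
alpha = 0.05
print(f"slope = {slope(x, y):.3f}, permutation p = {p_value:.3f}, alpha = {alpha}")
```

Because the p-value is itself estimated from a finite number of shuffles, it carries Monte Carlo uncertainty, which is why it can be drawn with its own interval against the chosen alpha level.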

Branching characteristics of coronary arteries in rats

Canadian Journal of Physiology and Pharmacology ◽

10.1139/y84-241 ◽

1984 ◽

Vol 62 (12) ◽

pp. 1453-1459 ◽

Cited By ~ 25

Author(s):

M. Zamir ◽

S. Phipps ◽

B. L. Langille ◽

T. H. Wonnacott

Keyword(s):

Cardiovascular System ◽

Coronary Arteries ◽

P Values ◽

High Significance ◽

Significance Levels ◽

Very High

The purpose of this study is to examine quantitatively the branching characteristics of the coronary arteries. Branching angles and vessel diameters were measured in a total of 175 arterial bifurcations in the coronary beds of rats, and the results are compared with those of 350 bifurcations in other parts of the cardiovascular system of the same species. Significant differences are found in the values of branch diameters and branching angles, both being found generally lower in the coronary bed than in other parts of the system. On statistical grounds these differences are found to have very high significance levels, with P values less than 0.02 in the case of branching angles and much less than 0.001 in the case of branch diameters. On physiological grounds, the differences are such as to place the coronary arteries further away from the "theoretical optimum" than are vessels in other parts of the cardiovascular system. The theoretical optimum represents branching angles and branch diameters which make arterial bifurcations more efficient physiologically.

Visualization strategies for regression estimates with randomization inference

The Stata Journal: Promoting communications on statistics and Stata ◽

10.1177/1536867x20930999 ◽

2020 ◽

Vol 20 (2) ◽

pp. 309-335

Author(s):

Marshall A. Taylor

Keyword(s):

Confidence Interval ◽

Confidence Intervals ◽

Regression Models ◽

Statistical Significance ◽

Permutation Tests ◽

P Value ◽

P Values ◽

Alpha Level ◽

Significance Levels ◽

Nonprobability Sample

Coefficient plots are a popular tool for visualizing regression estimates. The appeal of these plots is that they visualize confidence intervals around the estimates and generally center the plot around zero, meaning that any estimate that crosses zero is statistically nonsignificant at least at the alpha level around which the confidence intervals are constructed. For models with statistical significance levels determined via randomization models of inference and for which there is no standard error or confidence intervals for the estimate itself, these plots appear less useful. In this article, I illustrate a variant of the coefficient plot for regression models with p-values constructed using permutation tests. These visualizations plot each estimate’s p-value and its associated confidence interval in relation to a specified alpha level. These plots can help the analyst interpret and report the statistical and substantive significances of their models. I illustrate using a nonprobability sample of activists and participants at a 1962 anticommunism school.

Identification of best indicators of peptide-spectrum match using a permutation resampling approach

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720014400010 ◽

2014 ◽

Vol 12 (05) ◽

pp. 1440001 ◽

Cited By ~ 3

Author(s):

Malik N. Akhtar ◽

Bruce R. Southey ◽

Per E. Andrén ◽

Jonathan V. Sweedler ◽

Sandra L. Rodriguez-Zas

Keyword(s):

Mass Spectra ◽

Statistical Significance ◽

Permutation Tests ◽

Database Search ◽

Theoretical Spectrum ◽

P Values ◽

Tandem Mass Spectra ◽

Wide Range ◽

Significance Levels ◽

Peptide Match

Various indicators of observed-theoretical spectrum matches were compared and the resulting statistical significance was characterized using permutation resampling. Novel decoy databases built by resampling the terminal positions of peptide sequences were evaluated to identify the conditions for accurate computation of peptide match significance levels. The methodology was tested on real and manually curated tandem mass spectra from peptides across a wide range of sizes. Spectra match indicators from complementary database search programs were profiled and optimal indicators were identified. The combination of the optimal indicator and permuted decoy databases improved the calculation of the peptide match significance compared to the approaches currently implemented in the database search programs that rely on distributional assumptions. Permutation tests using p-values obtained from software-dependent matching scores and E-values outperformed permutation tests using all other indicators. The higher overlap in matches between the database search programs when using end permutation compared to existing approaches confirmed the superiority of the end permutation method to identify peptides. The combination of effective match indicators and the end permutation method is recommended for accurate detection of peptides.

Inferential, Nonparametric Statistics to Assess the Quality of Probabilistic Forecast Systems

Monthly Weather Review ◽

10.1175/mwr3291.1 ◽

2007 ◽

Vol 135 (2) ◽

pp. 351-362 ◽

Cited By ~ 16

Author(s):

Alinede H. N. Maia ◽

Holger Meinke ◽

Sarah Lennox ◽

Roger Stone

Keyword(s):

Southern Oscillation ◽

Statistical Tests ◽

Quality Measures ◽

Nonparametric Tests ◽

P Values ◽

Skill Scores ◽

Significance Levels ◽

Nonparametric Statistical ◽

Forecast Quality

Many statistical forecast systems are available to interested users. To be useful for decision making, these systems must be based on evidence of underlying mechanisms. Once causal connections between the mechanism and its statistical manifestation have been firmly established, the forecasts must also provide some quantitative evidence of “quality.” However, the quality of statistical climate forecast systems (forecast quality) is an ill-defined and frequently misunderstood property. Often, providers and users of such forecast systems are unclear about what quality entails and how to measure it, leading to confusion and misinformation. A generic framework is presented that quantifies aspects of forecast quality using an inferential approach to calculate nominal significance levels (p values), which can be obtained either by directly applying nonparametric statistical tests such as Kruskal–Wallis (KW) or Kolmogorov–Smirnov (KS) or by using Monte Carlo methods (in the case of forecast skill scores). Once converted to p values, these forecast quality measures provide a means to objectively evaluate and compare temporal and spatial patterns of forecast quality across datasets and forecast systems. The analysis demonstrates the importance of providing p values rather than adopting some arbitrarily chosen significance levels such as 0.05 or 0.01, which is still common practice. This is illustrated by applying nonparametric tests (such as KW and KS) and skill scoring methods [linear error in the probability space (LEPS) and ranked probability skill score (RPSS)] to the five-phase Southern Oscillation index classification system using historical rainfall data from Australia, South Africa, and India. The selection of quality measures is solely based on their common use and does not constitute endorsement. It is found that nonparametric statistical tests can be adequate proxies for skill measures such as LEPS or RPSS. The framework can be implemented anywhere, regardless of dataset, forecast system, or quality measure. Eventually such inferential evidence should be complemented by descriptive statistical methods in order to fully assist in operational risk management.
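As a rough illustration of turning a quality measure into a p value, the sketch below computes a two-sample Kolmogorov–Smirnov statistic and estimates its p value by Monte Carlo label shuffling. It is not the authors' framework: the function names, the 999 shuffles, and the two rainfall samples are hypothetical, and shuffling is used here even for KS, where the abstract describes applying the test directly.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest gap between the two empirical distribution functions."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    # Compare integer counts first so tied statistics are exactly equal.
    largest = max(
        abs(bisect_right(a_sorted, v) * m - bisect_right(b_sorted, v) * n)
        for v in a_sorted + b_sorted
    )
    return largest / (n * m)


def monte_carlo_p_value(a, b, n_shuffles=999, seed=0):
    """Share of label shuffles whose KS statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        if ks_statistic(pooled[: len(a)], pooled[len(a):]) >= observed:
            hits += 1
    return observed, (1 + hits) / (1 + n_shuffles)


# Hypothetical seasonal rainfall totals (mm) under two climate phases.
phase_a = [12, 30, 45, 51, 60, 72, 85, 90]
phase_b = [5, 8, 15, 20, 22, 28, 33, 40]

statistic, p_value = monte_carlo_p_value(phase_a, phase_b)
print(f"KS statistic = {statistic:.2f}, Monte Carlo p = {p_value:.3f}")
```

Reporting `p_value` itself, instead of only whether it falls below 0.05 or 0.01, is the practice the abstract argues for.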
