Statistical Power and Effect Size in Information Retrieval Experiments

Proceedings of the Annual Conference of CAIS / Actes du congrès annuel de l'ACSI ◽

10.29173/cais437 ◽

2013 ◽

Author(s):

Michael J. Nelson

Keyword(s):

Information Retrieval ◽

Null Hypothesis ◽

Error Probability ◽

Statistical Power ◽

Retrieval System ◽

Statistical Tests ◽

Type II ◽

Type II Error ◽

Indexing Method ◽

Type II Error Probability

Statistical tests are used in information retrieval to test various hypotheses, such as which indexing method is better or which retrieval system is better. Sometimes when using these statistical tests there is not enough evidence to reject the null hypothesis. Then either we have correctly discovered a true null hypothesis or made a type II error (probability denoted by β) and falsely accepted a . . .

The Power of Replicated Measures to Increase Statistical Power

Advances in Methods and Practices in Psychological Science ◽

10.1177/2515245919849434 ◽

2019 ◽

Vol 2 (3) ◽

pp. 199-213 ◽

Cited By ~ 4

Author(s):

Marc-André Goulet ◽

Denis Cousineau

Keyword(s):

Sample Size ◽

Null Hypothesis ◽

Statistical Power ◽

Statistical Tests ◽

Cognitive Tasks ◽

Type II ◽

Type II Error ◽

Multiple Measures ◽

Multiple Measurements ◽

Sufficient Statistical Power

When running statistical tests, researchers can commit a Type II error, that is, fail to reject the null hypothesis when it is false. To diminish the probability of committing a Type II error (β), statistical power must be augmented. Typically, this is done by increasing sample size, as more participants provide more power. When the estimated effect size is small, however, the sample size required to achieve sufficient statistical power can be prohibitive. To alleviate this lack of power, a common practice is to measure participants multiple times under the same condition. Here, we show how to estimate statistical power by taking into account the benefit of such replicated measures. To that end, two additional parameters are required: the correlation between the multiple measures within a given condition and the number of times the measure is replicated. An analysis of a sample of 15 studies (total of 298 participants and 38,404 measurements) suggests that in simple cognitive tasks, the correlation between multiple measures is approximately .14. Although multiple measurements increase statistical power, this effect is not linear, but reaches a plateau past 20 to 50 replications (depending on the correlation). Hence, multiple measurements do not replace the added population representativeness provided by additional participants.
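The plateau described above can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes every measure has the same variance, that any two measures from the same participant share one correlation `rho`, and that power is approximated with a normal (z) test for two independent groups. The function names and the example values `d = 0.2` and `n_per_group = 30` are hypothetical; only the correlation of .14 comes from the abstract.

```python
from math import sqrt
from statistics import NormalDist


def variance_factor(m: int, rho: float) -> float:
    """Variance of the mean of m equicorrelated measures, relative to one measure.

    Tends to rho (not to 0) as m grows, which is why extra replications plateau.
    """
    return (1 + (m - 1) * rho) / m


def approx_power(d: float, n_per_group: int, m: int, rho: float,
                 alpha: float = 0.05) -> float:
    """Normal-approximation power of a two-sided, two-group comparison.

    d is the standardized effect size for a single measure (assumption);
    averaging m replicated measures shrinks the noise by variance_factor.
    """
    z = NormalDist()
    d_eff = d / sqrt(variance_factor(m, rho))      # effect size after averaging
    noncentrality = d_eff * sqrt(n_per_group / 2)  # two groups of equal size
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Probability of landing in either rejection region.
    return z.cdf(noncentrality - z_crit) + z.cdf(-noncentrality - z_crit)


if __name__ == "__main__":
    for m in (1, 5, 20, 50, 200):
        print(m, round(variance_factor(m, 0.14), 3),
              round(approx_power(0.2, 30, m, 0.14), 3))
```

Under these assumptions the variance factor can never fall below `rho`, so the effective effect size is capped at `d / sqrt(rho)` however many replications are added; adding participants has no such cap.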

When Studies are in Error: Basic Statistical Vocabulary Needed to Understand Clinical Studies

Journal of Cutaneous Medicine and Surgery ◽

10.1177/120347549600100108 ◽

1996 ◽

Vol 1 (1) ◽

pp. 25-28 ◽

Cited By ~ 1

Author(s):

Martin A. Weinstock

Keyword(s):

Null Hypothesis ◽

Statistical Power ◽

Critical Appraisal ◽

Type I Error ◽

Statistical Significance ◽

P Value ◽

Type I ◽

Type II ◽

Type II Error ◽

Error Type

Background: Accurate understanding of certain basic statistical terms and principles is key to critical appraisal of published literature. Objective: This review describes type I error, type II error, null hypothesis, p value, statistical significance, α, two-tailed and one-tailed tests, effect size, alternate hypothesis, statistical power, β, publication bias, confidence interval, standard error, and standard deviation, while including examples from reports of dermatologic studies. Conclusion: The application of the results of published studies to individual patients should be informed by an understanding of certain basic statistical concepts.
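For orientation, several of the terms listed above can be written compactly. These are the standard textbook definitions, not notation taken from the review itself; $H_0$ denotes the null hypothesis, $s$ the sample standard deviation, and $n$ the sample size.

```latex
\begin{align*}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ true}) && \text{(type I error probability)} \\
\beta &= P(\text{fail to reject } H_0 \mid H_0 \text{ false}) && \text{(type II error probability)} \\
\text{power} &= 1 - \beta \\
\mathrm{SE}(\bar{x}) &= \frac{s}{\sqrt{n}} && \text{(standard error of the mean)}
\end{align*}
```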

Statistical power and design requirements for environmental monitoring

Marine and Freshwater Research ◽

10.1071/mf9910555 ◽

1991 ◽

Vol 42 (5) ◽

pp. 555 ◽

Cited By ~ 191

Author(s):

PG Fairweather

Keyword(s):

Low Power ◽

Environmental Monitoring ◽

Null Hypothesis ◽

Power Analysis ◽

Statistical Power ◽

Type I ◽

Type II ◽

Type II Error ◽

Type I Errors ◽

Type II Errors

This paper discusses, from a philosophical perspective, the reasons for considering the power of any statistical test used in environmental biomonitoring. Power is inversely related to the probability of making a Type II error (i.e. low power indicates a high probability of Type II error). In the context of environmental monitoring, a Type II error is made when it is concluded that no environmental impact has occurred even though one has. Type II errors have been ignored relative to Type I errors (the mistake of concluding that there is an impact when one has not occurred), the rates of which are stipulated by the α values of the test. In contrast, power depends on the value of α, the sample size used in the test, the effect size to be detected, and the variability inherent in the data. Although power ideas have been known for years, only recently have these issues attracted the attention of ecologists and have methods been available for calculating power easily. Understanding statistical power gives three ways to improve environmental monitoring and to inform decisions about actions arising from monitoring. First, it allows the most sensitive tests to be chosen from among those applicable to the data. Second, preliminary power analysis can be used to indicate the sample sizes necessary to detect an environmental change. Third, power analysis should be used after any nonsignificant result is obtained in order to judge whether that result can be interpreted with confidence or the test was too weak to examine the null hypothesis properly. Power procedures are concerned with the statistical significance of tests of the null hypothesis, and they lend little insight, on their own, into the workings of nature. Power analyses are, however, essential to designing sensitive tests and correctly interpreting their results. The biological or environmental significance of any result, including whether the impact is beneficial or harmful, is a separate issue.
The most compelling reason for considering power is that Type II errors can be more costly than Type I errors for environmental management. This is because the commitment of time, energy and people to fighting a false alarm (a Type I error) may continue only in the short term until the mistake is discovered. In contrast, the cost of not doing something when in fact it should be done (a Type II error) will have both short- and long-term costs (e.g. ensuing environmental degradation and the eventual cost of its rectification). Low power can be disastrous for environmental monitoring programmes.
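The dependence of power on α, sample size, effect size, and variability, and the use of a preliminary power analysis to choose a sample size, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's procedure: it compares two groups of equal size (say, a control and an impact site) with a two-sided z-test and a known common standard deviation. The function names `power_two_sample` and `required_n`, and the numbers in the usage lines, are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def power_two_sample(delta: float, sigma: float, n: int,
                     alpha: float = 0.05) -> float:
    """Power of a two-sided z-test comparing two groups of n samples each.

    delta: true difference in means to be detected (effect size)
    sigma: common standard deviation (variability inherent in the data)
    """
    noncentrality = abs(delta) / (sigma * sqrt(2 / n))
    z_crit = _Z.inv_cdf(1 - alpha / 2)
    return _Z.cdf(noncentrality - z_crit) + _Z.cdf(-noncentrality - z_crit)


def required_n(delta: float, sigma: float, alpha: float = 0.05,
               target_power: float = 0.80) -> int:
    """Smallest per-group n from the usual normal-approximation formula."""
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    z_beta = _Z.inv_cdf(target_power)
    return ceil(2 * ((z_alpha + z_beta) * sigma / abs(delta)) ** 2)


if __name__ == "__main__":
    n = required_n(delta=1.0, sigma=2.0)
    print("samples per group:", n)
    print("power at that n:", round(power_two_sample(1.0, 2.0, n), 3))
    # Power after a nonsignificant result obtained with only 10 samples per group.
    print("power at n = 10:", round(power_two_sample(1.0, 2.0, 10), 3))
```

Raising α, increasing `n`, or targeting a larger `delta` increases power, while a larger `sigma` lowers it, which matches the four dependencies named in the abstract.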

Group sequential designs using both type I and type II error probability spending functions

Communications in Statistics - Theory and Methods ◽

10.1080/03610929808832161 ◽

1998 ◽

Vol 27 (6) ◽

pp. 1323-1339 ◽

Cited By ~ 28

Author(s):

Myron N. Chang ◽

Irving K. Hwang ◽

Weichung J. Shin

Keyword(s):

Error Probability ◽

Type I ◽

Type II ◽

Type II Error ◽

Sequential Designs ◽

Group Sequential ◽

Group Sequential Designs ◽

Type II Error Probability

On the Exponential Approximation of Type II Error Probability of Distributed Test of Independence

IEEE Transactions on Signal and Information Processing over Networks ◽

10.1109/tsipn.2021.3133192 ◽

2021 ◽

pp. 1-1

Author(s):

Sebastian Andres Espinosa ◽

Jorge F Silva ◽

Pablo Piantanida

Keyword(s):

Error Probability ◽

Type II ◽

Type II Error ◽

Exponential Approximation ◽

Test of Independence ◽

Type II Error Probability ◽

Distributed Test

The Type II Error Probability of a Group Sequential Test of Efficacy and Futility, and Considerations for Power and Sample Size

Journal of Biopharmaceutical Statistics ◽

10.1080/10543406.2011.617229 ◽

2013 ◽

Vol 23 (2) ◽

pp. 378-393 ◽

Cited By ~ 3

Author(s):

Thomas W. Dobbins

Keyword(s):

Sample Size ◽

Error Probability ◽

Sequential Test ◽

Type II ◽

Type II Error ◽

Group Sequential ◽

Group Sequential Test ◽

Type II Error Probability

Performance of Three-Stage Sequential Estimation of the Normal Inverse Coefficient of Variation Under Type II Error Probability: A Monte Carlo Simulation Study

Frontiers in Physics ◽

10.3389/fphy.2020.00071 ◽

2020 ◽

Vol 8 ◽

Cited By ~ 1

Author(s):

Ali Yousef

Keyword(s):

Monte Carlo Simulation ◽

Monte Carlo ◽

Simulation Study ◽

Error Probability ◽

Sequential Estimation ◽

Type II ◽

Type II Error ◽

Monte Carlo Simulation Study ◽

Type II Error Probability ◽

Inverse Coefficient

In Reply: Statistical Power

PEDIATRICS ◽

10.1542/peds.83.4.634a ◽

1989 ◽

Vol 83 (4) ◽

pp. 634-634

Author(s):

JOHN S. LOVERING

Keyword(s):

Statistical Analysis ◽

Statistical Power ◽

Type II ◽

Type II Error ◽

Negative Results ◽

Mean Variance

Dr. Mauro is obviously knowledgeable in the area of statistical analysis and raises a valid point regarding the importance of evaluating the likelihood of a type II error in studies with negative results. Although one does not wish to detract from the main point of a study with extensive details of the statistical analysis (two pages in this case), some readers may desire more mathematical information than values of mean, variance, t, and P, and do not wish to make their own calculations, to reassure themselves that a reasonable conclusion has been drawn by the authors and their statisticians.

Első- és másodfajú etikai kudarcok (Type I and type II ethical errors)

Vezetéstudomány / Budapest Management Review ◽

10.14267/veztud.2012.10.05 ◽

2012 ◽

pp. 56-63

Author(s):

Zsuzsanna Győri

Keyword(s):

Common Good ◽

Null Hypothesis ◽

Type I Error ◽

Holistic Approach ◽

Type I ◽

Type II ◽

Type II Error ◽

Opportunistic Behaviour ◽

Self Interest ◽

The Government

Starting from the failures of the market and of the government, the author identifies the failures of a third system aimed at attaining the common good: ethical responsibility. Using a statistical analogy, she identifies a Type I error as the case in which ethics is not considered although it should be (the null hypothesis is rejected although it is true). She treats the use of ethics to increase profit as a Type II error, which misleads stakeholders and makes even more room for opportunistic behaviour in business (the null hypothesis is accepted although it is false). In her opinion, the three systems (the market, the government and ethical management) not only complement but also mutually correct one another. This correction is more general in the case of the Type I error. To solve the Type II error, however, the core principles of economic life have to be redefined: instead of self-interest and one-dimensional performance evaluation, a new, more holistic approach to economics is needed.

Flaws in Study of Management of Children with Febrile Illness

PEDIATRICS ◽

10.1542/peds.71.5.867 ◽

1983 ◽

Vol 71 (5) ◽

pp. 867-867

Author(s):

D. G. LEDUC ◽

I. BARRY PLESS

Keyword(s):

Statistical Power ◽

Journal Club ◽

Febrile Illness ◽

Small Samples ◽

Type II ◽

Type II Error ◽

Bottom Line

In Reply.— In general, we agree with the criticism raised by Soman and by the Journal Club in Minneapolis. The "bottom line" of our paper is that there were no significant differences between the outcomes in the two groups. However, whenever a study involves relatively small samples the possibility of a type II error, due to lack of statistical power, must always be considered. The differences to which we called the readers' attention would have been statistically significant (with β set at .5) if the samples were as large as 375.