Statistical Significance, Effect Size Reporting, and Confidence Intervals: Best Reporting Strategies

2004 ◽  
Vol 35 (1) ◽  
pp. 57 ◽  
Author(s):  
Robert M. Capraro


2016 ◽  
Vol 51 (12) ◽  
pp. 1045-1048 ◽  
Author(s):  
Monica Lininger ◽  
Bryan L. Riemann

Objective: To describe confidence intervals (CIs) and effect sizes and provide practical examples to assist clinicians in assessing clinical meaningfulness. Background: As discussed in our first article in 2015, which addressed the difference between statistical significance and clinical meaningfulness, evaluating the clinical meaningfulness of a research study remains a challenge to many readers. In this paper, we will build on this topic by examining CIs and effect sizes. Description: A CI is a range estimated from sample data (the data we collect) that is likely to include the population parameter (value) of interest. Conceptually, this constitutes the lower and upper limits of the sample data, which would likely include, for example, the mean from the unknown population. An effect size is the magnitude of difference between 2 means. When a statistically significant difference exists between 2 means, effect size is used to describe how large or small that difference actually is. Confidence intervals and effect sizes enhance the practical interpretation of research results. Recommendations: Along with statistical significance, the CI and effect size can assist practitioners in better understanding the clinical meaningfulness of a research study.
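The two quantities this abstract defines can be illustrated with a minimal Python sketch using only the standard library. The sample values are invented for illustration; the CI uses a normal (z = 1.96) approximation rather than the t distribution:

```python
import math
import statistics

def mean_ci(sample, z=1.96):
    """Approximate 95% CI for the population mean (normal approximation)."""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (m - z * se, m + z * se)

def cohens_d(a, b):
    """Cohen's d: difference between two means in pooled-SD units."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd

treated = [12.1, 11.4, 13.0, 12.6, 11.9, 12.3]  # hypothetical outcome scores
control = [10.2, 10.9, 10.5, 11.1, 10.4, 10.7]
lo, hi = mean_ci(treated)   # range likely to include the population mean
d = cohens_d(treated, control)  # magnitude of the group difference
```

The CI answers "what population values are plausible?", while d answers "how large is the difference?": the pairing this abstract recommends reporting together.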


1998 ◽  
Vol 11 (3) ◽  
pp. 181-195 ◽  
Author(s):  
H. Glenn Anderson ◽  
Michael G. Kendrach ◽  
Shana Trice

This primer reviews a number of statistical concepts integral to the hypothesis testing process and its role in decision making. Concepts of variables, scales of measure, and measures of central tendency and dispersion are discussed, and a 5-step process of hypothesis testing is presented. Finally, a discussion of the statistical and clinical significance of research results is presented, along with the concept of confidence intervals as a method of conveying information about the effect size as well as the statistical significance of a difference between groups.
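The 5-step hypothesis-testing process the primer describes can be sketched in Python. The data and the equal-variance pooled t test are illustrative assumptions, not the primer's own example; the critical value 2.23 is the two-tailed t cutoff for df = 10 at alpha = 0.05:

```python
import math
import statistics

def two_sample_t(a, b):
    """Pooled two-sample t statistic (equal-variance form) and its df."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * statistics.variance(a)
           + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, na + nb - 2

# Step 1: state hypotheses  -- H0: mu_a == mu_b,  H1: mu_a != mu_b
# Step 2: set the significance level -- alpha = 0.05, two-tailed
# Step 3: choose the test statistic  -- pooled two-sample t
# Step 4: compute the statistic from the data (invented values)
group_a = [5.1, 4.8, 5.6, 5.2, 4.9, 5.4]
group_b = [4.2, 4.5, 4.1, 4.6, 4.3, 4.4]
t, df = two_sample_t(group_a, group_b)
# Step 5: decide -- compare |t| with the critical value for df = 10
reject_h0 = abs(t) > 2.23
```

As the primer notes, the decision in step 5 says nothing about how large the difference is; that is what an effect size or a CI on the mean difference adds.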


Author(s):  
Scott B. Morris ◽  
Arash Shokri

To understand and communicate research findings, it is important for researchers to consider two types of information provided by research results: the magnitude of the effect and the degree of uncertainty in the outcome. Statistical significance tests have long served as the mainstream method for statistical inferences. However, the widespread misinterpretation and misuse of significance tests has led critics to question their usefulness in evaluating research findings and to raise concerns about the far-reaching effects of this practice on scientific progress. An alternative approach involves reporting and interpreting measures of effect size along with confidence intervals. An effect size is an indicator of magnitude and direction of a statistical observation. Effect size statistics have been developed to represent a wide range of research questions, including indicators of the mean difference between groups, the relative odds of an event, or the degree of correlation among variables. Effect sizes play a key role in evaluating practical significance, conducting power analysis, and conducting meta-analysis. While effect sizes summarize the magnitude of an effect, the confidence intervals represent the degree of uncertainty in the result. By presenting a range of plausible alternate values that might have occurred due to sampling error, confidence intervals provide an intuitive indicator of how strongly researchers should rely on the results from a single study.
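Two of the effect size families named here, the relative odds of an event and the degree of correlation among variables, can be computed with a short stdlib-only sketch (the table counts and data points are invented):

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table: (events a / non-events b) in group 1
    divided by (events c / non-events d) in group 2."""
    return (a / b) / (c / d)

def pearson_r(x, y):
    """Pearson correlation: direction and strength of linear association."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

or_est = odds_ratio(10, 90, 5, 95)        # about 2.1: twice the odds
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])  # exactly linear: r = 1.0
```

Each is an indicator of magnitude and direction, as the text defines; attaching a CI to either one then carries the uncertainty component.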


2021 ◽  
pp. 1-2
Author(s):  
Sukhvinder Singh Oberoi ◽  
Mansi Atri

The interpretation of the p-value has long been debated and remains difficult for many researchers. The p-value was introduced by Pearson in 1900. The demerits of p-values and significance testing have gone largely undiscussed for a long time because of their practical convenience as a measure of interpretation in clinical research. Confidence intervals around sample statistics, together with effect sizes, should be given more weight than statistical significance alone. Researchers should consult a statistician in the early stages of study planning to avoid misinterpreting the p-value, especially if they are using statistical software for their data analysis.


Nutrients ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 404
Author(s):  
Emma Altobelli ◽  
Paolo Matteo Angeletti ◽  
Ciro Marziliano ◽  
Marianna Mastrodomenico ◽  
Anna Rita Giuliani ◽  
...  

Diabetes mellitus is an important and growing public health issue worldwide. In recent years, there has been growing research interest in evidence for the efficacy of curcumin in regulating glycemia and lipidemia. The molecular structure of curcumin allows it to intercept reactive oxygen species (ROS), which are particularly harmful in chronic inflammation and tumorigenesis models. The aim of our study was to perform a systematic review and meta-analysis evaluating the effect of curcumin on the glycemic and lipid profiles of subjects with uncomplicated type 2 diabetes. The papers included in the meta-analysis were sought in the MEDLINE, EMBASE, Scopus, ClinicalTrials.gov, Web of Science, and Cochrane Library databases as of October 2020. Effect sizes were pooled across studies to obtain an overall effect size. A random-effects model was used to account for different sources of variation among studies. Cohen's d with its 95% confidence interval (CI) was used as the measure of effect size. Heterogeneity was assessed using the Q statistic, and the ANOVA-Q test was used to evaluate differences among groups. Publication bias was analyzed and represented by a funnel plot. Curcumin treatment did not show a statistically significant reduction in treated relative to untreated patients. On the other hand, glycosylated hemoglobin, the homeostasis model assessment (HOMA), and low-density lipoprotein (LDL) showed statistically significant reductions in subjects treated with curcumin (p = 0.008, p < 0.001, and p = 0.021, respectively). For HbA1c, the meta-regressions showed statistical significance only for gender (p = 0.034). Our meta-analysis seems to confirm the benefits on glucose metabolism, with results that appear more solid than those for lipid metabolism. However, further studies are needed to test the efficacy and safety of curcumin in uncomplicated type 2 diabetes.
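The random-effects pooling this abstract describes is commonly done with the DerSimonian-Laird estimator; a stdlib-only sketch follows (the per-study effects and variances are invented, not the study's data):

```python
import math

def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling of per-study effect sizes.

    Returns the pooled estimate, its 95% CI, and the heterogeneity Q."""
    w = [1 / v for v in variances]           # inverse-variance weights
    fixed = sum(wi * ei for wi, ei in zip(w, effects)) / sum(w)
    # Q statistic: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (ei - fixed) ** 2 for wi, ei in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)            # between-study variance
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * ei for wi, ei in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), q

# Hypothetical Cohen's d values from four trials (negative = reduction)
d_pooled, ci, q = pool_random_effects(
    [-0.30, -0.10, -0.45, -0.20], [0.04, 0.05, 0.06, 0.03])
```

When the Q statistic exceeds its degrees of freedom, tau-squared becomes positive and the between-study variation widens the pooled CI, which is exactly why the random-effects model is the cautious choice here.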


1998 ◽  
Vol 21 (2) ◽  
pp. 221-222
Author(s):  
Louis G. Tassinary

Chow (1996) offers a reconceptualization of statistical significance that is reasoned and comprehensive. Despite a somewhat rough presentation, his arguments are compelling and deserve to be taken seriously by the scientific community. It is argued, however, that his characterizations of literal replication, types of research, effect size, and experimental control are in need of revision.


Circulation ◽  
2007 ◽  
Vol 116 (suppl_16) ◽  
Author(s):  
George A Diamond ◽  
Sanjay Kaul

Background: A highly publicized meta-analysis of 42 clinical trials comprising 27,844 diabetics ignited a firestorm of controversy by charging that treatment with rosiglitazone was associated with a "…worrisome…" 43% greater risk of myocardial infarction (p = 0.03) and a 64% greater risk of cardiovascular death (p = 0.06). Objective: The investigators excluded 4 trials from the infarction analysis and 19 trials from the mortality analysis in which no events were observed. We sought to determine whether these exclusions biased the results. Methods: We compared the index study to a Bayesian meta-analysis of the entire 42 trials (using the odds ratio as the measure of effect size) and to fixed-effects and random-effects analyses with and without a continuity correction that adjusts for values of zero. Results: The odds ratios and confidence intervals for the analyses are summarized in the Table. Odds ratios for infarction ranged from 1.43 to 1.22 and for death from 1.64 to 1.13. Corrected models resulted in substantially smaller odds ratios and narrower confidence intervals than did uncorrected models. Although the corrected risks remain elevated, none are statistically significant (p < 0.05). Conclusions: Given the fragility of the effect sizes and confidence intervals, the charge that rosiglitazone increases the risk of adverse events is not supported by these additional analyses. The exaggerated values observed in the index study are likely the result of excluding the zero-event trials from analysis. Continuity adjustments mitigate this error and provide more consistent and reliable assessments of the true effect size. Transparent sensitivity analyses should therefore be performed over a realistic range of the operative assumptions to verify the stability of such assessments, especially when outcome events are rare. Given the relatively wide confidence intervals, additional data will be required to adjudicate these inconclusive results.
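The continuity correction at issue here can be sketched in a few lines: adding a small constant (conventionally 0.5) to every cell of a trial's 2x2 table keeps zero-event trials in the analysis instead of dropping them. The counts below are invented, not taken from the rosiglitazone trials:

```python
import math

def corrected_log_or(events_t, n_t, events_c, n_c, cc=0.5):
    """Log odds ratio with a continuity correction: add cc to every cell
    so arms with zero events still yield a finite estimate and variance."""
    a = events_t + cc            # treated, event
    b = (n_t - events_t) + cc    # treated, no event
    c = events_c + cc            # control, event
    d = (n_c - events_c) + cc    # control, no event
    log_or = math.log((a * d) / (b * c))
    var = 1 / a + 1 / b + 1 / c + 1 / d  # Woolf variance of log OR
    return log_or, var

zero_lo, zero_var = corrected_log_or(0, 100, 0, 100)  # zero-event trial
some_lo, some_var = corrected_log_or(20, 100, 10, 100)
```

Without the correction, a zero-event trial has an undefined odds ratio and infinite variance, so excluding it silently removes exactly the trials that argue for no effect; the corrected version contributes a log OR of 0 with finite weight.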


2013 ◽  
Vol 12 (3) ◽  
pp. 345-351 ◽  
Author(s):  
Jessica Middlemis Maher ◽  
Jonathan C. Markey ◽  
Diane Ebert-May

Statistical significance testing is the cornerstone of quantitative research, but studies that fail to report measures of effect size are potentially missing a robust part of the analysis. We provide a rationale for why effect size measures should be included in quantitative discipline-based education research. Examples from both biological and educational research demonstrate the utility of effect size for evaluating practical significance. We also provide details about some effect size indices that are paired with common statistical significance tests used in educational research and offer general suggestions for interpreting effect size measures. Finally, we discuss some inherent limitations of effect size measures and provide further recommendations about reporting confidence intervals.
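Two of the ideas above, effect size indices paired with common tests and general interpretation guidelines, can be sketched as follows; the benchmark cutoffs are Cohen's (1988) conventions, which the literature treats as rough defaults rather than rules:

```python
def eta_squared(ss_effect, ss_total):
    """Eta-squared: proportion of total variance explained by the effect.
    A common effect size paired with the ANOVA F test."""
    return ss_effect / ss_total

def interpret_d(d):
    """Conventional benchmarks for |d| (Cohen, 1988); disciplinary
    context should always override these rough labels."""
    m = abs(d)
    if m < 0.2:
        return "negligible"
    if m < 0.5:
        return "small"
    if m < 0.8:
        return "medium"
    return "large"
```

As the article cautions, such labels are a starting point for judging practical significance, not a substitute for it: a "small" d can matter greatly when the outcome is cheap to change or the stakes are high.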


2005 ◽  
Vol 62 (12) ◽  
pp. 2716-2726 ◽  
Author(s):  
Michael J Bradford ◽  
Josh Korman ◽  
Paul S Higgins

There is considerable uncertainty about the effectiveness of fish habitat restoration programs, and reliable monitoring programs are needed to evaluate them. Statistical power analysis based on traditional hypothesis tests are usually used for monitoring program design, but here we argue that effect size estimates and their associated confidence intervals are more informative because results can be compared with both the null hypothesis of no effect and effect sizes of interest, such as restoration goals. We used a stochastic simulation model to compare alternative monitoring strategies for a habitat alteration that would change the productivity and capacity of a coho salmon (Oncorhynchus kisutch) producing stream. Estimates of the effect size using a freshwater stock–recruit model were more precise than those from monitoring the abundance of either spawners or smolts. Less than ideal monitoring programs can produce ambiguous results, which are cases in which the confidence interval includes both the null hypothesis and the effect size of interest. Our model is a useful planning tool because it allows the evaluation of the utility of different types of monitoring data, which should stimulate discussion on how the results will ultimately inform decision-making.
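The "ambiguous result" logic described here, a CI that contains both the null hypothesis and the restoration goal, can be sketched with a toy simulation. The effect scale, noise level, and target value are invented placeholders, not the authors' coho salmon model:

```python
import math
import random
import statistics

def simulated_effect_ci(true_effect, sigma, n, seed=1):
    """Simulate one monitoring outcome: an effect estimate and its 95% CI."""
    rng = random.Random(seed)
    obs = [true_effect + rng.gauss(0, sigma) for _ in range(n)]
    m = statistics.mean(obs)
    se = statistics.stdev(obs) / math.sqrt(n)
    return m, (m - 1.96 * se, m + 1.96 * se)

def classify(ci, target):
    """Ambiguous when the CI contains both 0 (no effect) and the target."""
    lo, hi = ci
    has_null = lo <= 0 <= hi
    has_target = lo <= target <= hi
    if has_null and has_target:
        return "ambiguous"
    if has_target:
        return "consistent with goal"
    if has_null:
        return "consistent with no effect"
    return "neither"

est, ci = simulated_effect_ci(true_effect=0.5, sigma=1.0, n=200)
verdict = classify(ci, target=0.5)
```

Running such a simulation across candidate sample sizes before monitoring begins shows how often each design would land in the "ambiguous" cell, which is the planning question the authors' model is built to answer.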

