Violating the normality assumption may be the lesser of two evils

2018 ◽  
Author(s):  
Ulrich Knief ◽  
Wolfgang Forstmeier

Abstract

When data are not normally distributed (e.g. skewed, zero-inflated, binomial, or count data), researchers are often uncertain whether it may be legitimate to use tests that assume Gaussian errors (e.g. regression, t-test, ANOVA, Gaussian mixed models), or whether one has to either model a more specific error structure or use randomization techniques. Here we use Monte Carlo simulations to explore the pros and cons of fitting Gaussian models to non-normal data in terms of risk of type I error, power and utility for parameter estimation. We find that Gaussian models are remarkably robust to non-normality over a wide range of conditions, meaning that P-values remain fairly reliable except for data with influential outliers judged at strict alpha levels. Gaussian models also perform well in terms of power and they can be useful for parameter estimation but usually not for extrapolation. Transformation of data before analysis is often advisable and visual inspection for outliers and heteroscedasticity is important for assessment. In strong contrast, some non-Gaussian models and randomization techniques bear a range of risks that are often insufficiently known. High rates of false-positive conclusions can arise for instance when overdispersion in count data is not controlled appropriately or when randomization procedures ignore existing non-independencies in the data. Overall, we argue that violating the normality assumption bears risks that are limited and manageable, while several more sophisticated approaches are relatively error prone and difficult to check during peer review. Hence, as long as scientists and reviewers are not fully aware of the risks, science might benefit from preferentially trusting Gaussian mixed models in which random effects account for non-independencies in the data in a transparent way.

Tweetable abstract

Gaussian models are remarkably robust to even dramatic violations of the normality assumption.
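The core claim — that P-values from Gaussian tests stay close to nominal even under strongly skewed errors — is easy to check in miniature. This is an illustrative sketch, not the authors' simulation code; the sample size, number of replicates, and choice of a lognormal distribution are assumptions for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n, n_sim, alpha = 50, 2000, 0.05

false_positives = 0
for _ in range(n_sim):
    # Two groups drawn from the SAME strongly skewed (lognormal) distribution,
    # so the null hypothesis is true and any rejection is a type I error.
    a = rng.lognormal(mean=0.0, sigma=1.0, size=n)
    b = rng.lognormal(mean=0.0, sigma=1.0, size=n)
    _, p = stats.ttest_ind(a, b)
    false_positives += p < alpha

rate = false_positives / n_sim
print(f"empirical type I error under lognormal data: {rate:.3f}")
```

With moderate samples the empirical rejection rate stays close to the nominal 5%, illustrating the robustness the abstract describes.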



2017 ◽  
Vol 94 ◽  
pp. 305-315 ◽  
Author(s):  
Hannes Matuschek ◽  
Reinhold Kliegl ◽  
Shravan Vasishth ◽  
Harald Baayen ◽  
Douglas Bates

Methodology ◽  
2010 ◽  
Vol 6 (4) ◽  
pp. 147-151 ◽  
Author(s):  
Emanuel Schmider ◽  
Matthias Ziegler ◽  
Erik Danay ◽  
Luzi Beyer ◽  
Markus Bühner

Empirical evidence for the robustness of the analysis of variance (ANOVA) to violation of the normality assumption is presented by means of Monte Carlo methods. High-quality samples from normally, rectangularly, and exponentially distributed basic populations are created by drawing random numbers from the respective generators, checking their goodness of fit, and allowing only the best 10% to take part in the investigation. A one-way fixed-effect design with three groups of 25 values each is chosen. Effect sizes are implemented in the samples and varied over a broad range. Comparing the outcomes of the ANOVA calculations for the different types of distributions gives reason to regard the ANOVA as robust. Both the empirical type I error α and the empirical type II error β remain constant under violation. Moreover, regression analysis identifies the factor “type of distribution” as not significant in explaining the ANOVA results.
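The design described here — a one-way ANOVA with three groups of 25 drawn from an exponential population under a true null — can be reproduced directly (without the authors' goodness-of-fit pre-screening step, which is omitted for brevity):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
k, n, n_sim, alpha = 3, 25, 2000, 0.05

rejections = 0
for _ in range(n_sim):
    # Three groups of 25 values, all from the same exponential population:
    # the null hypothesis of equal means is true by construction.
    groups = [rng.exponential(scale=1.0, size=n) for _ in range(k)]
    _, p = stats.f_oneway(*groups)
    rejections += p < alpha

print(f"empirical alpha for ANOVA on exponential data: {rejections / n_sim:.3f}")
```

The rejection rate lands near the nominal 5%, consistent with the abstract's conclusion that empirical α is stable under this violation.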


2013 ◽  
Vol 52 (04) ◽  
pp. 351-359 ◽  
Author(s):  
M. O. Scheinhardt ◽  
A. Ziegler

Summary Background: Gene, protein, or metabolite expression levels are often non-normally distributed, heavy-tailed and contain outliers. Standard statistical approaches may fail as location tests in this situation. Objectives: In three Monte Carlo simulation studies, we aimed at comparing the type I error levels and empirical power of standard location tests and three adaptive tests [O’Gorman, Can J Stat 1997; 25: 269–279; Keselman et al., Brit J Math Stat Psychol 2007; 60: 267–293; Szymczak et al., Stat Med 2013; 32: 524–537] for a wide range of distributions. Methods: We simulated two-sample scenarios using the g-and-k distribution family to systematically vary tail length and skewness with identical and varying variability between groups. Results: All tests kept the type I error level when groups did not vary in their variability. The standard non-parametric U-test performed well in all simulated scenarios. It was outperformed by the two non-parametric adaptive methods in case of heavy tails or large skewness. Most tests did not keep the type I error level for skewed data in the case of heterogeneous variances. Conclusions: The standard U-test was a powerful and robust location test for most of the simulated scenarios, and it is thus to be recommended except for very heavy-tailed or heavily skewed data. The non-parametric adaptive tests were powerful for both normal and non-normal distributions under sample variance homogeneity. But when sample variances differed, they did not keep the type I error level. The parametric adaptive test lacks power for skewed and heavy-tailed distributions.
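The g-and-k family used here has no closed-form density but is easy to sample from via its quantile function, where g controls skewness and k tail heaviness. The sketch below draws two identical g-and-k samples (null true) and checks the U-test's type I error; parameter values and sample sizes are assumptions for illustration, not the paper's settings.

```python
import numpy as np
from scipy import stats

def gk_sample(rng, size, a=0.0, b=1.0, g=0.5, k=0.5, c=0.8):
    """Sample from the g-and-k distribution via its quantile function
    applied to standard normal draws (standard parameterisation, c = 0.8)."""
    z = rng.standard_normal(size)
    return a + b * (1 + c * np.tanh(g * z / 2)) * z * (1 + z**2) ** k

rng = np.random.default_rng(7)
n, n_sim, alpha = 40, 2000, 0.05

rejections = 0
for _ in range(n_sim):
    # Identical skewed, heavy-tailed populations -> the null is true.
    x = gk_sample(rng, n)
    y = gk_sample(rng, n)
    _, p = stats.mannwhitneyu(x, y, alternative="two-sided")
    rejections += p < alpha

print(f"U-test empirical alpha on g-and-k data: {rejections / n_sim:.3f}")
```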


2009 ◽  
Vol 18 (1) ◽  
pp. 17-18 ◽  
Author(s):  
Eleonora Esposito ◽  
Andrea Cipriani ◽  
Corrado Barbui

Randomised controlled trials (RCTs) are designed and powered to measure one single outcome, called the primary outcome (Sibbald & Roland, 1998; Barbui et al., 2007). The primary outcome is the pre-specified outcome of greatest clinical importance and is usually the one used in the sample size calculation (Accordini, 2007). In addition to the primary outcome, RCTs may have several other outcomes, called secondary outcomes. In contrast with the analysis of the primary outcome, the analysis of secondary outcomes and its interpretation may be complicated by at least two factors: 1) the trial may not have enough statistical power to detect differences (so it is possible to commit a type II error, that is, failing to see a difference that is present); 2) increasing the number of secondary outcomes generates the problem of multiplicity of analyses, that is, the proliferation of possible comparisons in a trial (and increasing the number of comparisons increases the chance of a type I error, that is, detecting significant differences by chance). For all these reasons, the results of the analysis of primary outcomes are considered less susceptible to bias than the analysis of secondary outcomes.
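The multiplicity problem has a simple closed form: with m independent outcomes each tested at α, the family-wise error rate is 1 − (1 − α)^m. A short calculation makes the inflation concrete, together with the standard Bonferroni correction:

```python
# Family-wise false-positive probability with m independent secondary
# outcomes, each tested at alpha = 0.05, with and without Bonferroni.
alpha = 0.05
for m in (1, 5, 10, 20):
    fwer = 1 - (1 - alpha) ** m           # uncorrected: grows rapidly with m
    fwer_bonf = 1 - (1 - alpha / m) ** m  # Bonferroni: capped near alpha
    print(f"m={m:2d}: uncorrected FWER={fwer:.3f}, Bonferroni={fwer_bonf:.3f}")
```

With ten secondary outcomes the uncorrected chance of at least one spurious "significant" difference is already about 40%.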


Author(s):  
Patrick J. Rosopa ◽  
Alice M. Brawley ◽  
Theresa P. Atkinson ◽  
Stephen A. Robertson

Preliminary tests for homoscedasticity may be unnecessary in general linear models. Based on Monte Carlo simulations, results suggest that when testing for differences between independent slopes, the unconditional use of weighted least squares regression and HC4 regression performed the best across a wide range of conditions.
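The HC4 estimator referenced here (Cribari-Neto, 2004) rescales squared residuals by a leverage-dependent exponent. The sketch below is a minimal numpy implementation on simulated heteroscedastic data — an illustration of the estimator, not the authors' simulation code; the data-generating process is an assumption.

```python
import numpy as np

def hc4_se(X, y):
    """OLS point estimates with HC4 heteroscedasticity-consistent standard
    errors: squared residuals are inflated by (1 - h_i)^(-delta_i), where
    delta_i = min(4, n * h_i / p) depends on the leverage h_i."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)  # hat-matrix diagonal
    delta = np.minimum(4.0, n * h / p)
    omega = resid**2 / (1 - h) ** delta
    cov = XtX_inv @ (X.T * omega) @ X @ XtX_inv  # sandwich covariance
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(3)
n = 200
x = rng.uniform(-1, 1, n)
X = np.column_stack([np.ones(n), x])
# Error variance grows with |x|: a classic heteroscedastic violation.
y = 1.0 + 2.0 * x + rng.normal(scale=0.5 + np.abs(x), size=n)

beta, se = hc4_se(X, y)
print("slope:", beta[1], "HC4 SE:", se[1])
```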


1993 ◽  
Vol 30 (2) ◽  
pp. 246-255 ◽  
Author(s):  
Murali Chandrashekaran ◽  
Beth A. Walker

To enhance the utility of meta-analysis as an integrative tool for marketing research, heteroscedastic MLE (HMLE), a maximum-likelihood-based estimation procedure, is proposed as a method that overcomes heteroscedasticity, a problem known to impair OLS estimates and threaten the validity of meta-analytic findings. The results of a Monte Carlo simulation experiment reveal that, under a wide range of heteroscedastic conditions, HMLE is more efficient and powerful than OLS and achieves these performance advantages without inflating type I error. Further, the relative performance of HMLE increases as heteroscedasticity becomes more severe. An empirical analysis of a meta-analytic dataset in marketing confirmed and extended these findings by illustrating how the enhanced efficiency and power of HMLE improve the ability to detect moderator variables and by demonstrating how the theoretical generalizations emerging from a meta-analysis are affected by the choice of the analytic procedure.
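The intuition behind heteroscedasticity-aware ML estimation in meta-analysis can be sketched in a deliberately simplified setting: effect sizes with known, study-specific sampling variances. Maximising the heteroscedastic Gaussian likelihood then coincides with the inverse-variance-weighted estimate. This is a stylised illustration of the principle, not the HMLE procedure of the paper.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(11)
# Simulated meta-analytic dataset: one true effect, but study-specific
# sampling variances (heteroscedasticity across studies).
k = 30
true_effect = 0.4
v = rng.uniform(0.01, 0.5, size=k)       # known per-study sampling variances
d = rng.normal(true_effect, np.sqrt(v))  # observed effect sizes

def neg_loglik(theta):
    """Negative Gaussian log-likelihood with per-study variances v."""
    mu = theta[0]
    return 0.5 * np.sum(np.log(2 * np.pi * v) + (d - mu) ** 2 / v)

mle = minimize(neg_loglik, x0=[0.0]).x[0]
wls = np.sum(d / v) / np.sum(1 / v)  # closed-form inverse-variance estimate
print(f"MLE: {mle:.4f}, inverse-variance WLS: {wls:.4f}")
```

The two estimates agree, and both down-weight noisy studies — exactly what plain OLS fails to do under heteroscedasticity.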


2011 ◽  
Vol 2011 ◽  
pp. 1-12 ◽  
Author(s):  
Emily A. Blood ◽  
Debbie M. Cheng

Linear mixed models (LMMs) are frequently used to analyze longitudinal data. Although these models can be used to evaluate mediation, they do not directly model causal pathways. Structural equation models (SEMs) are an alternative technique that allows explicit modeling of mediation. The goal of this paper is to evaluate the performance of LMMs relative to SEMs in the analysis of mediated longitudinal data with time-dependent predictors and mediators. We simulated mediated longitudinal data from an SEM and specified delayed effects of the predictor. A variety of model specifications were assessed, and the LMMs and SEMs were evaluated with respect to bias, coverage probability, power, and Type I error. Models evaluated in the simulation were also applied to data from an observational cohort of HIV-infected individuals. We found that when carefully constructed, the LMM adequately models mediated exposure effects that change over time in the presence of mediation, even when the data arise from an SEM.
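The mediated structure studied here (predictor → mediator → outcome) can be illustrated with the standard product-of-coefficients decomposition in a single-timepoint sketch. This is a heavily simplified, cross-sectional stand-in for the longitudinal LMM/SEM comparison in the paper; all path values are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
# Simulated mediation: X -> M (a = 0.5), M -> Y (b = 0.7), direct effect 0.2,
# so the true indirect effect is a * b = 0.35.
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(scale=0.5, size=n)
y = 0.2 * x + 0.7 * m + rng.normal(scale=0.5, size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
a = ols(np.column_stack([ones, x]), m)[1]     # X -> M path
b = ols(np.column_stack([ones, x, m]), y)[2]  # M -> Y path, adjusting for X
indirect = a * b                              # product-of-coefficients estimate
print(f"estimated indirect effect: {indirect:.3f} (true value 0.35)")
```

An SEM models these two equations jointly; the paper's finding is that a carefully constructed LMM can recover the same time-varying mediated effects.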


2021 ◽  
Author(s):  
Sebastian Sosa ◽  
Cristian Pasquaretta ◽  
Ivan Puga-Gonzalez ◽  
F Stephen Dobson ◽  
Vincent A Viblanc ◽  
...  

Animal social network analyses (ASNA) have led to a foundational shift in our understanding of animal sociality that transcends the disciplinary boundaries of genetics, spatial movements, epidemiology, information transmission, evolution, species assemblages and conservation. However, some analytical protocols (i.e., permutation tests) used in ASNA have recently been called into question due to the unacceptable rates of false positives (type I errors) and false negatives (type II errors) they generate in statistical hypothesis testing. Here, we show that these rates are related to the way in which observation heterogeneity is accounted for in association indices. To solve this issue, we propose a method termed the "global index" (GI) that consists of computing the average of individual association indices per unit of time. In addition, we developed an "index of interactions" (II) that allows the use of the GI approach for directed behaviours. Our simulations show that GI: 1) returns more reasonable rates of false negatives and positives, with or without observational biases in the collected data, 2) can be applied to both directed and undirected behaviours, 3) can be applied to focal sampling, scan sampling or "gambit of the group" data collection protocols, and 4) can be applied to first- and second-order social network measures. Finally, we provide a method to control for non-social biological confounding factors using linear regression residuals. By providing a reliable approach for a wide range of scenarios, we propose a novel methodology in ASNA with the aim of better understanding social interactions from a mechanistic, ecological and evolutionary perspective.
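The averaging idea behind the "global index" — an association index computed per unit of time and then averaged, rather than pooled over the whole study — can be sketched for one dyad under heterogeneous sampling effort. This is a loose illustration of the averaging principle as stated in the abstract; the authors' exact index definitions may differ.

```python
import numpy as np

rng = np.random.default_rng(9)
T = 100
# Heterogeneous observation effort: sightings of the dyad per day vary widely.
effort = rng.integers(1, 20, size=T)   # observations per day (1..19)
p_together = 0.3                       # true association probability
together = rng.binomial(effort, p_together)

# Pooled ratio index over the whole study: implicitly weights days by effort.
pooled = together.sum() / effort.sum()

# "Global index" style estimate: one index per unit of time, then averaged,
# so each day contributes equally regardless of sampling effort.
gi = np.mean(together / effort)

print(f"pooled index: {pooled:.3f}, per-day averaged index: {gi:.3f}")
```

Both recover the true association rate here, but they diverge when observation effort correlates with behaviour — the heterogeneity the paper identifies as a driver of error rates.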


2019 ◽  
Author(s):  
Michael Seedorff ◽  
Jacob Oleson ◽  
Bob McMurray

Mixed effects models have become a critical tool in all areas of psychology and allied fields. This is due to their ability to account for multiple random factors, and their ability to handle proportional data in repeated measures designs. While substantial research has addressed how to structure fixed effects in such models, there is less understanding of appropriate random effects structures. Recent work with linear models suggests the choice of random effects structures affects Type I error in such models (Barr, Levy, Scheepers, & Tily, 2013; Matuschek, Kliegl, Vasishth, Baayen, & Bates, 2017). This has not been examined for between subject effects, which are crucial for many areas of psychology, nor has this been examined in logistic models. Moreover, mixed models expose a number of researcher degrees of freedom: the decision to aggregate data or not, the manner in which degrees of freedom are computed, and what to do when models do not converge. However, the implications of these choices for power and Type I error are not well known. To address these issues, we conducted a series of Monte Carlo simulations which examined linear and logistic models in a mixed design with crossed random effects. These suggest that a consideration of the entire space of possible models using simple information criteria such as AIC leads to optimal power while holding Type I error constant. They also suggest data aggregation and the d.f. computation have minimal effects on Type I error and power, and they suggest appropriate approaches for dealing with non-convergence.
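The AIC-based model-space search recommended here can be illustrated in a stripped-down form with ordinary regression (the paper itself compares random-effects structures in mixed models; plain OLS is used below only to keep the sketch self-contained):

```python
import numpy as np

rng = np.random.default_rng(13)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + rng.normal(size=n)  # x2 carries no signal

def aic(X, y):
    """Gaussian AIC for least squares: n*log(RSS/n) + 2*(k + 1),
    counting the regression coefficients plus the error variance."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return len(y) * np.log(rss / len(y)) + 2 * (X.shape[1] + 1)

ones = np.ones(n)
candidates = {
    "intercept only": np.column_stack([ones]),
    "x1": np.column_stack([ones, x1]),
    "x1 + x2": np.column_stack([ones, x1, x2]),
}
aics = {name: aic(X, y) for name, X in candidates.items()}
best = min(aics, key=aics.get)
print("AIC-preferred model:", best)
```

Scanning the whole candidate set and keeping the lowest-AIC model is the same logic the simulations apply to random-effects structures.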

