Perils and pitfalls of mixed-effects regression models in biology

PeerJ ◽

10.7717/peerj.9522 ◽

2020 ◽

Vol 8 ◽

pp. e9522 ◽

Cited By ~ 2

Author(s):

Matthew J. Silk ◽

Xavier A. Harrison ◽

David J. Hodgson

Keyword(s):

Biological Sciences ◽

Regression Models ◽

Mixed Model ◽

Mixed Effects ◽

Biological Research ◽

Functional Relationships ◽

Fixed And Random Effects ◽

Statistical Relationships ◽

Biological Literature ◽

Selection Of

Biological systems, at all scales of organisation from nucleic acids to ecosystems, are inherently complex and variable. Biologists therefore use statistical analyses to detect signal among this systemic noise. Statistical models infer trends, find functional relationships and detect differences that exist among groups or are caused by experimental manipulations. They also use statistical relationships to help predict uncertain futures. All branches of the biological sciences now embrace the possibilities of mixed-effects modelling and its flexible toolkit for partitioning noise and signal. The mixed-effects model is not, however, a panacea for poor experimental design, and should be used with caution when inferring or deducing the importance of both fixed and random effects. Here we describe a selection of the perils and pitfalls that are widespread in the biological literature, but can be avoided by careful reflection, modelling and model-checking. We focus on situations where incautious modelling risks exposure to these pitfalls and the drawing of incorrect conclusions. Our stance is that statements of significance, information content or credibility all have their place in biological research, as long as these statements are cautious and well-informed by checks on the validity of assumptions. Our intention is to reveal potential perils and pitfalls in mixed model estimation so that researchers can use these powerful approaches with greater awareness and confidence. Our examples are ecological, but translate easily to all branches of biology.


Factors associated with parasite dominance in fishes from Brazil

Revista Brasileira de Parasitologia Veterinária ◽

10.1590/s1984-29612016040 ◽

2016 ◽

Vol 25 (2) ◽

pp. 225-230

Author(s):

Cristina Fernandes do Amarante ◽

Wagner de Souza Tassinari ◽

Jose Luis Luque ◽

Maria Julia Salim Pereira

Keyword(s):

Linear Regression ◽

Body Length ◽

Regression Models ◽

Mixed Model ◽

Multiple Linear Regression Model ◽

Mixed Effects ◽

The Body ◽

Linear Regression Models ◽

Biological Aspects ◽

The Relationship

Abstract: The present study used regression models, within an epidemiological approach, to evaluate factors that may influence numerical parasite dominance. A database of 3,746 fish specimens and their respective parasites was used to evaluate the relationship between parasite dominance and biotic characteristics of the studied hosts and the parasite taxa. Multivariate, classical, and mixed effects linear regression models were fitted. The calculations were performed using R software, with 95% confidence intervals. In the classical multiple linear regression model, freshwater and planktivorous fish species and body length, as well as species of the taxa Trematoda, Monogenea, and Hirudinea, were associated with parasite dominance. However, the mixed effects model showed that the body length of the host and species of the taxa Nematoda, Trematoda, Monogenea, Hirudinea, and Crustacea were significantly associated with parasite dominance. Studies that consider specific biological aspects of the hosts and parasites should expand knowledge of the factors that influence the numerical dominance of parasites in fish from Brazil. The use of a mixed model shows, once again, the importance of choosing a model that matches the characteristics of the data in order to obtain consistent results.


Double Penalized H-Likelihood for Selection of Fixed and Random Effects in Mixed Effects Models

Statistics in Biosciences ◽

10.1007/s12561-013-9105-x ◽

2013 ◽

Vol 7 (1) ◽

pp. 108-128 ◽

Cited By ~ 2

Author(s):

Peirong Xu ◽

Tao Wang ◽

Hongtu Zhu ◽

Lixing Zhu

Keyword(s):

Random Effects ◽

Mixed Effects ◽

Mixed Effects Models ◽

Fixed And Random Effects ◽

Selection Of


An iterative algorithm for joint covariate and random effect selection in mixed effects models

The International Journal of Biostatistics ◽

10.1515/ijb-2019-0082 ◽

2020 ◽

Vol 0 (0) ◽

Author(s):

Maud Delattre ◽

Marie-Anne Poursat

Keyword(s):

Random Effects ◽

Nonlinear Models ◽

Random Effect ◽

Mixed Effects ◽

Information Criteria ◽

Mixed Effects Models ◽

Selection Algorithm ◽

Fixed And Random Effects ◽

Low Dimension ◽

Selection Of

Abstract: We consider joint selection of fixed and random effects in general mixed-effects models. The interpretation of estimated mixed-effects models is challenging since changing the structure of one set of effects can lead to different choices of important covariates in the model. We propose a stepwise selection algorithm to perform simultaneous selection of the fixed and random effects. It is based on Bayesian information criteria whose penalties are adapted to mixed-effects models. The proposed procedure performs model selection in both linear and nonlinear models. It should be used in the low-dimension setting, where the number of covariates and the number of random effects are moderate with respect to the total number of observations. The performance of the algorithm is assessed via a simulation study, which also includes a comparison with alternatives where these are available in the literature. The use of the method is illustrated in a clinical study of the kinetics of an antibiotic agent.


A Bayesian Approach to Mixed Gamma Regression Models

Revista Colombiana de Estadística ◽

10.15446/rce.v42n1.69334 ◽

2019 ◽

Vol 42 (1) ◽

pp. 81-99

Author(s):

Marta Lucia Corrales ◽

Edilberto Cepeda-Cuervo

Keyword(s):

Bayesian Approach ◽

Regression Models ◽

Shape Parameter ◽

Real Data ◽

Mixed Effects ◽

Continuous Variables ◽

Positive Real ◽

Fixed And Random Effects ◽

Computational Implementation ◽

The Mean

Gamma regression models are a suitable choice to model continuous variables that take positive real values. This paper presents a gamma regression model with mixed effects from a Bayesian approach. We use the parametrisation of the gamma distribution in terms of the mean and the shape parameter, both of which are modelled through regression structures that may involve fixed and random effects. A computational implementation via Gibbs sampling is provided and illustrative examples (simulated and real data) are presented.
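The mean and shape parametrisation described in this abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the log link, the single covariate, the group-level random intercept, the constant shape parameter and every name below (`simulate_mixed_gamma`, `beta0`, `beta1`, `sigma_b`) are hypothetical choices made for illustration. The abstract says the shape may also follow a regression structure, and that the paper fits the model by Gibbs sampling; neither is attempted here. A gamma variable with mean mu and shape alpha has variance mu^2 / alpha, which corresponds to NumPy's `gamma(shape=alpha, scale=mu / alpha)`.

```python
import numpy as np


def simulate_mixed_gamma(n_groups=50, n_per_group=40, beta0=1.0, beta1=0.5,
                         sigma_b=0.3, shape=4.0, seed=0):
    """Simulate positive responses from a gamma model with a random intercept.

    All parameter names and the log link are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Group label for every observation (balanced design).
    group = np.repeat(np.arange(n_groups), n_per_group)
    # One continuous covariate (fixed effect).
    x = rng.uniform(-1.0, 1.0, size=group.size)
    # Random intercepts, one per group.
    b = rng.normal(0.0, sigma_b, size=n_groups)
    # Conditional mean through an assumed log link: fixed part plus random part.
    mu = np.exp(beta0 + beta1 * x + b[group])
    # Mean/shape parametrisation: scale = mu / shape gives E[y] = mu
    # and Var[y] = mu**2 / shape.
    y = rng.gamma(shape=shape, scale=mu / shape)
    return group, x, mu, y


if __name__ == "__main__":
    group, x, mu, y = simulate_mixed_gamma()
    ratio = y / mu  # has mean 1 and variance 1/shape under the model
    print("mean of y/mu:", ratio.mean())
    print("variance of y/mu:", ratio.var())
```

Dividing each response by its conditional mean is a quick check on the parametrisation: the ratio should average close to 1 with variance close to 1/shape.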


Power of Modified Brown-Forsythe and Mixed-Model Approaches in Split-Plot Designs

Methodology ◽

10.1027/1614-2241/a000124 ◽

2017 ◽

Vol 13 (1) ◽

pp. 9-22 ◽

Cited By ~ 1

Author(s):

Pablo Livacic-Rojas ◽

Guillermo Vallejo ◽

Paula Fernández ◽

Ellián Tuero-Herrero

Keyword(s):

Repeated Measures ◽

Statistical Power ◽

Mixed Model ◽

Covariance Structure ◽

Simulation Method ◽

Future Research ◽

Repeated Measures Design ◽

Fixed And Random Effects ◽

Split Plot ◽

High Level

Abstract: Low precision of inferences from data analyzed with univariate or multivariate Analysis of Variance (ANOVA) models in repeated-measures designs is associated with non-normally distributed data, nonspherical covariance structures and free variation of the variances and covariances, lack of knowledge of the error structure underlying the data, and the wrong choice of covariance structure by different selectors. In this study, the levels of statistical power of the Modified Brown-Forsythe (MBF) procedure and of two mixed-model procedures (Akaike’s criterion and the Correctly Identified Model [CIM]) are compared. The data were analyzed using the Monte Carlo simulation method with the statistical package SAS 9.2, a split-plot design, and six manipulated variables. The results show that the procedures exhibit high statistical power for within and interaction effects, and moderate to low power for between-groups effects under the different conditions analyzed. For the latter, only the Modified Brown-Forsythe procedure shows a high level of power, mainly for groups with 30 cases and Unstructured (UN) and Autoregressive Heterogeneity (ARH) matrices. For this reason, we recommend using this procedure, since it exhibits higher levels of power for all effects and does not require specifying the matrix type that underlies the structure of the data. Future research should compare the power with corrected selectors using single-level and multilevel designs for fixed and random effects.


Inference about the fixed and random effects in a mixed-effects linear model: an approximate Bayesian approach

10.31274/rtd-180813-11736 ◽

1993 ◽

Author(s):

Alan George Zimmermann

Keyword(s):

Linear Model ◽

Random Effects ◽

Bayesian Approach ◽

Mixed Effects ◽

Fixed And Random Effects ◽

Approximate Bayesian


Deep Learning in Disease Diagnosis: Models and Datasets

Current Bioinformatics ◽

10.2174/1574893615999201002124021 ◽

2020 ◽

Vol 15 ◽

Author(s):

Deeksha Saxena ◽

Mohammed Haris Siddiqui ◽

Rajnish Kumar

Keyword(s):

Biological Sciences ◽

Machine Learning ◽

Deep Learning ◽

Disease Diagnosis ◽

Learning Models ◽

Data Types ◽

Related Data ◽

Abstract Level ◽

Experimental Validations ◽

Selection Of

Background: Deep learning (DL) is an artificial neural network-driven framework with multiple levels of representation, in which non-linear modules are combined so that the representation is raised from lower levels to a much more abstract level. Though DL is used widely in almost every field, it has brought a particular breakthrough in the biological sciences, where it is used in disease diagnosis and clinical trials. DL can be combined with machine learning, but at times the two are used separately. DL seems to be a better platform than machine learning, as it does not require an intermediate feature-extraction step and works well with larger datasets. DL is currently one of the most discussed fields among scientists and researchers for diagnosing and solving various biological problems. However, deep learning models need some improvement and experimental validation to be more productive. Objective: To review the available DL models and datasets that are used in disease diagnosis. Methods: Available DL models and their applications in disease diagnosis were reviewed, discussed and tabulated. Types of datasets and some of the popular disease-related data sources for DL were highlighted. Results: We have analyzed the frequently used DL methods and data types, and discussed some of the recent deep learning models used for solving different biological problems. Conclusion: The review presents useful insights into DL methods, data types, and the selection of DL models for disease diagnosis.


Bayesian quantile semiparametric mixed-effects double regression models

Statistical Theory and Related Fields ◽

10.1080/24754269.2021.1877961 ◽

2021 ◽

pp. 1-13

Author(s):

Duo Zhang ◽

Liucang Wu ◽

Keying Ye ◽

Min Wang

Keyword(s):

Regression Models ◽

Mixed Effects


Optimization of integrated fuzzy decision tree and regression models for selection of oil spill response method in the Arctic

Knowledge-Based Systems ◽

10.1016/j.knosys.2020.106676 ◽

2021 ◽

Vol 213 ◽

pp. 106676

Author(s):

Saeed Mohammadiun ◽

Guangji Hu ◽

Abdorreza Alavi Gharahbagh ◽

Reza Mirshahi ◽

Jianbing Li ◽

...

Keyword(s):

Decision Tree ◽

Oil Spill ◽

Regression Models ◽

The Arctic ◽

Fuzzy Decision ◽

Fuzzy Decision Tree ◽

Oil Spill Response ◽

Response Method ◽

Selection Of


Selection of coffee progenies for resistance to nematode Meloidogyne paranaensis in infested area

Crop Breeding and Applied Biotechnology ◽

10.1590/1984-70332014v14n2a17 ◽

2014 ◽

Vol 14 (2) ◽

pp. 94-101 ◽

Cited By ~ 11

Author(s):

Sonia Maria Lima Salgado ◽

Juliana Costa de Rezende ◽

José Airton Rodrigues Nunes

Keyword(s):

Genetic Variability ◽

Coffea Arabica ◽

Mixed Model ◽

High Rate ◽

Germplasm Bank ◽

Genotypic Association ◽

Coffee Plants ◽

Plagiotropic Branches ◽

Mixed Model Methodology ◽

Selection Of

The purpose of this study was to select Coffea arabica progenies for resistance to M. paranaensis in an infested coffee growing area using Henderson's mixed model methodology. Forty-one genotypes were selected at the Coffee Active Germplasm Bank of Minas Gerais and evaluated for stem diameter, number of plagiotropic branches, reaction to the nematode, and yield per plant. There was genetic variability among the genotypes studied for all the traits evaluated, and among the populations studied for yield and reaction to the nematode, indicating that genetic gains can be obtained through selection in this population. There was a high rate of genotypic association between all the traits studied. Coffee plants of the Timor Hybrid UFV408-01 population and F3 progenies derived from crossing Catuaí Vermelho and Amphillo MR 2161 were the most promising in the area infested by M. paranaensis.
