Abstract
Linear mixed-effects models are powerful tools for analysing complex datasets with repeated or clustered observations, a common data structure in ecology and evolution. Mixed-effects models involve complex fitting procedures and make several assumptions, in particular about the distribution of residual and random effects. Violations of these assumptions are common in real datasets, yet it is not always clear how much these violations matter to accurate and unbiased estimation. Here we address the consequences of violations in distributional assumptions and the impact of missing random effect components on model estimates. In particular, we evaluate the effects of skewed, bimodal and heteroscedastic random effect and residual variances, of missing random effect terms and of correlated fixed effect predictors. We focus on bias and prediction error in estimates of fixed and random effects. Model estimates were usually robust to violations of assumptions, with the exception of slight upward biases in estimates of random effect variance if the generating distribution was bimodal but was modelled by Gaussian error distributions. Further, estimates for (random effect) components that violated distributional assumptions became less precise but remained unbiased. However, this particular problem did not affect other parameters of the model. The same pattern was found for strongly correlated fixed effects, which led to imprecise, but unbiased estimates, with uncertainty estimates reflecting imprecision. Unmodelled sources of random effect variance had predictable effects on variance component estimates. The pattern is best viewed as a cascade of hierarchical grouping factors. Variances trickle down the hierarchy such that missing higher-level random effect variances pool at lower levels and missing lower-level and crossed random effect variances manifest as residual variance.
Overall, our results show remarkable robustness of mixed-effects models that should allow researchers to use mixed-effects models even if the distributional assumptions are objectively violated. However, this does not free researchers from careful evaluation of the model. Estimates that are based on data that show clear violations of key assumptions should be treated with caution because individual datasets might give highly imprecise estimates, even if they will be unbiased on average across datasets.
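The "trickle down" pattern described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: it uses a balanced nested design (individuals within sites) and simple method-of-moments (ANOVA) variance estimators instead of a fitted mixed model, and every name and number in it (the `site`/`individual` labels, the variances 1.0, 0.5 and 0.25, the sample sizes) is an assumption chosen for illustration.

```python
import random
from statistics import fmean


def simulate(n_sites=200, n_ind=10, n_rep=5,
             var_site=1.0, var_ind=0.5, var_res=0.25, seed=1):
    """Simulate y = site effect + individual effect + residual (all Gaussian).

    Returns rows of (site_id, individual_id, y). All values are hypothetical.
    """
    rng = random.Random(seed)
    rows, ind_id = [], 0
    for site in range(n_sites):
        site_eff = rng.gauss(0.0, var_site ** 0.5)
        for _ in range(n_ind):
            ind_eff = rng.gauss(0.0, var_ind ** 0.5)
            for _ in range(n_rep):
                y = site_eff + ind_eff + rng.gauss(0.0, var_res ** 0.5)
                rows.append((site, ind_id, y))
            ind_id += 1
    return rows


def group_by(rows, key_index):
    """Collect the response values by one grouping factor."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key_index], []).append(row[2])
    return list(groups.values())


def one_way_components(groups):
    """ANOVA estimators for a balanced one-way layout.

    Returns (between-group variance, within-group variance).
    """
    k, n = len(groups), len(groups[0])
    grand = fmean(y for g in groups for y in g)
    means = [fmean(g) for g in groups]
    ms_between = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ms_within = sum((y - m) ** 2
                    for g, m in zip(groups, means) for y in g) / (k * (n - 1))
    return (ms_between - ms_within) / n, ms_within


rows = simulate()

# Higher-level factor (site) left out: its variance pools at the
# individual level, so this estimate lands near var_site + var_ind.
ind_var, res_var_a = one_way_components(group_by(rows, 1))

# Lower-level factor (individual) left out: most of its variance
# shows up in the within-site (residual) term instead.
site_var, res_var_b = one_way_components(group_by(rows, 0))

print(f"site omitted:       individual = {ind_var:.2f}, residual = {res_var_a:.2f}")
print(f"individual omitted: site = {site_var:.2f}, residual = {res_var_b:.2f}")
```

With the assumed variances, leaving out the site factor should push the individual-level estimate towards 1.0 + 0.5 while the residual stays near 0.25. Leaving out the individual factor should push the residual towards roughly 0.25 + 0.5, while the site estimate stays near 1.0. A likelihood-based mixed model would normally be fitted in practice; the moment estimators are used here only to keep the example dependency-free.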
| Original language | English |
|---|---|
| Pages (from-to) | 1141-1152 |
| Number of pages | 12 |
| Journal | Methods in Ecology and Evolution |
| Volume | 11 |
| Issue number | 9 |
| State | Published - Sep 1 2020 |
Bibliographical note
Publisher Copyright: © 2020 The Authors. Methods in Ecology and Evolution published by John Wiley & Sons Ltd on behalf of British Ecological Society
Funding
This manuscript is a result of the SQuID Working Group that convened at multiple workshops, which were financially supported by the International Max Planck Research School for Organismal Biology, the Centre for Population Biology at the Norwegian University of Science and Technology, the Centre d'Ecologie Fonctionelle and Evolutive at the University of Montpellier and the Centre for Ecological Research of the Hungarian Academy of Sciences. We thank Roger Mundry, Henrik Singmann and an anonymous reviewer for very helpful comments on the manuscript. H.S. was supported by the German Research Foundation (DFG) as part of the SFB TRR 212 (NC3; funding INST 215/543-1, 396782608), NJD was funded by DFG grant DI 1694/1-1, DFW was supported by the U.S. National Science Foundation (IOS1257718), NAD was supported by the U.S. National Science Foundation (IOS1557951). LZG was supported by the Hungarian National Research, Development and Innovation Office (K129215).
| Funders | Funder number |
|---|---|
| International Max Planck Research School for Organismal Biology | |
| U.S. National Science Foundation (NSF) | |
| National Science Foundation Arctic Social Science Program | IOS1557951, IOS1257718 |
| Norges Teknisk-Naturvitenskapelige Universitet | |
| Deutsche Forschungsgemeinschaft | 396782608, SFB TRR 212, INST 215/543‐1, DI 1694/1‐1 |
| Magyar Tudományos Akadémia | |
| Université de Montpellier | |
| Nemzeti Kutatási Fejlesztési és Innovációs Hivatal | |
| International Max Planck Research School for Advanced Methods in Process and Systems Engineering | |
| Hungarian National Research Fund/National Research, Development and Innovation Office | K129215 |
Keywords
- biostatistics
- correlated predictors
- distributional assumptions
- linear mixed-effects models
- missing random effects
- statistical quantification of individual differences (SQuID)
ASJC Scopus subject areas
- Ecology, Evolution, Behavior and Systematics
- Ecological Modeling