Collision between biological process and statistical analysis revealed by mean centring

David F. Westneat, Yimen G. Araya-Ajoy, Hassen Allegue, Barbara Class, Niels Dingemanse, Ned A. Dochtermann, László Zsolt Garamszegi, Julien G.A. Martin, Shinichi Nakagawa, Denis Réale, Holger Schielzeth

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

Animal ecologists often collect hierarchically structured data and analyse these with linear mixed-effects models. Specific complications arise when the effect sizes of covariates vary on multiple levels (e.g. within vs. among subjects). Mean centring of covariates within subjects offers a useful approach in such situations, but is not without problems. A statistical model represents a hypothesis about the underlying biological process. Mean centring within clusters assumes that the lower level responses (e.g. within subjects) depend on the deviation from the subject mean (relative) rather than on the absolute scale of the covariate. This may or may not be biologically realistic. We show that mismatch between the nature of the generating (i.e. biological) process and the form of the statistical analysis produces major conceptual and operational challenges for empiricists. We explored the consequences of mismatches by simulating data with three response-generating processes differing in the source of correlation between a covariate and the response. These data were then analysed with three different analysis equations. We asked how robustly different analysis equations estimate key parameters of interest and under which circumstances biases arise. Mismatches between generating and analytical equations created several intractable problems for estimating key parameters. The most widely misestimated parameter was the among-subject variance in response. We found that no single analysis equation was robust in estimating all parameters generated by all equations. Importantly, even when response-generating and analysis equations matched mathematically, bias in some parameters arose when sampling across the range of the covariate was limited. Our results have general implications for how we collect and analyse data. They also remind us more generally that conclusions from statistical analysis of data are conditional on a hypothesis, sometimes implicit, for the process(es) that generated the attributes we measure. We discuss strategies for real data analysis in the face of uncertainty about the underlying biological process.
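As a rough illustration of the distinction the abstract draws between relative and absolute covariate effects, the following Python sketch splits a covariate into subject means and within-subject deviations, then fits both components to two simulated responses. It is not the authors' simulation or analysis code. All names (`within_subject_centre`, `fit_slopes`), the parameter values, and the two generating processes are assumptions made for illustration, and the fit uses ordinary least squares without random effects rather than a linear mixed-effects model.

```python
import numpy as np

def within_subject_centre(x, subject):
    """Split covariate x into subject means and within-subject deviations."""
    x = np.asarray(x, dtype=float)
    _, idx = np.unique(np.asarray(subject), return_inverse=True)
    means = np.bincount(idx, weights=x) / np.bincount(idx)
    x_between = means[idx]      # subject mean, repeated for each observation
    x_within = x - x_between    # deviation from the subject's own mean
    return x_between, x_within

def fit_slopes(y, x_between, x_within):
    """Least-squares slopes (within, between); no random effects (simplification)."""
    X = np.column_stack([np.ones_like(y), x_within, x_between])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1], coef[2]

# Hypothetical design: 200 subjects, 10 observations each.
rng = np.random.default_rng(1)
n_subjects, n_obs = 200, 10
subject = np.repeat(np.arange(n_subjects), n_obs)

# The covariate varies both among subjects and within subjects.
x = rng.normal(0, 1, n_subjects)[subject] + rng.normal(0, 1, subject.size)
x_between, x_within = within_subject_centre(x, subject)

beta = 1.0
noise = rng.normal(0, 0.5, subject.size)

# Assumed process A: the response depends on the absolute covariate value.
y_absolute = beta * x + noise
# Assumed process B: the response depends only on the deviation from the subject mean.
y_relative = beta * x_within + noise

slopes_absolute = fit_slopes(y_absolute, x_between, x_within)
slopes_relative = fit_slopes(y_relative, x_between, x_within)
print("absolute process (within, between):", slopes_absolute)
print("relative process (within, between):", slopes_relative)
```

Under these assumptions the within-subject slope should come out near `beta` for both processes, while the among-subject slope should be near `beta` only for the absolute process and near zero for the relative one. The same centred analysis therefore implies different biology depending on which process generated the data.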

Original language: English
Pages (from-to): 2813-2824
Number of pages: 12
Journal: Journal of Animal Ecology
Volume: 89
Issue number: 12
State: Published - Dec 2020

Bibliographical note

Funding Information:
This is publication No. 3 of the SQuID group, formed in 2013 to investigate the many ways linear mixed models can be used to reveal new biology and the variety of conditions where the models may go astray. We declare here that we have no conflicts of interest to report. We thank funders over multiple workshops we have held where the ideas in this paper were discussed, including the Volkswagen Foundation in 2014, the International Max Planck Research School for Organismal Biology in 2015, the Centre for Population Biology at the Norwegian University of Science and Technology in 2016, the Centre d'Ecologie Fonctionelle and Evolutive at the University of Montpellier in 2018 and the ‘Ecosystem GINOP’ project (2016–2020) of the Centre for Ecological Research, supported by the European Regional Development Fund and the Hungarian Government, in 2019. D.F.W. was supported by the U. S. National Science Foundation (IOS1257718), Y.G.A.‐A. by the Research Council of Norway through its Centres of Excellence funding scheme (SFF‐III 223257/F50), N.J.D. by the German Science Foundation (grant no. DI 1694/1‐1), N.A.D. by the US National Science Foundation (NSF IOS 1557951), S.N. by an ARC Discovery Fellowship (DP180100818), D.R. by a Discovery grant from the Natural Science and Engineering Research Council of Canada and H.S. by the German Science Foundation (DFG) as part of the SFB TRR 212 (NC) (INST 215/543‐1, 396782608). We thank the Westneat group, Jarrod Hadfield, Ally Phillimore and an anonymous reviewer for much helpful feedback improving the manuscript.

Publisher Copyright:
© 2020 British Ecological Society

Keywords

  • bivariate models
  • environmental effects
  • hierarchical causation
  • linear mixed-effects models
  • model design
  • parameter misestimation
  • phenotypic plasticity

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Animal Science and Zoology
