Collision between biological process and statistical analysis revealed by mean centring

David F. Westneat, Yimen G. Araya-Ajoy, Hassen Allegue, Barbara Class, Niels Dingemanse, Ned A. Dochtermann, László Zsolt Garamszegi, Julien G.A. Martin, Shinichi Nakagawa, Denis Réale, Holger Schielzeth

Research output: Contribution to journal › Article › peer-review

26 Scopus citations

Abstract

Animal ecologists often collect hierarchically structured data and analyse these with linear mixed-effects models. Specific complications arise when the effect sizes of covariates vary on multiple levels (e.g. within vs. among subjects). Mean centring of covariates within subjects offers a useful approach in such situations, but is not without problems. A statistical model represents a hypothesis about the underlying biological process. Mean centring within clusters assumes that the lower level responses (e.g. within subjects) depend on the deviation from the subject mean (relative) rather than on the absolute scale of the covariate. This may or may not be biologically realistic. We show that mismatch between the nature of the generating (i.e. biological) process and the form of the statistical analysis produces major conceptual and operational challenges for empiricists. We explored the consequences of mismatches by simulating data with three response-generating processes differing in the source of correlation between a covariate and the response. These data were then analysed with three different analysis equations. We asked how robustly different analysis equations estimate key parameters of interest and under which circumstances biases arise. Mismatches between generating and analytical equations created several intractable problems for estimating key parameters. The most widely misestimated parameter was the among-subject variance in response. We found that no single analysis equation was robust in estimating all parameters generated by all equations. Importantly, even when response-generating and analysis equations matched mathematically, bias in some parameters arose when sampling across the range of the covariate was limited. Our results have general implications for how we collect and analyse data. They also remind us more generally that conclusions from statistical analysis of data are conditional on a hypothesis, sometimes implicit, for the process(es) that generated the attributes we measure. We discuss strategies for real data analysis in the face of uncertainty about the underlying biological process.
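
As a rough illustration of what within-subject mean centring does, the sketch below splits a covariate into a subject mean and a deviation from that mean, then recovers separate among- and within-subject slopes. It is not the authors' simulation code: the function name `within_subject_centre`, the sample sizes, the slope values, and the generating process are all assumptions chosen for illustration, and ordinary least squares stands in for a linear mixed-effects model (the simulated response has no random subject intercept).

```python
import numpy as np

def within_subject_centre(x, subject):
    """Split x into its subject mean and the deviation from that mean."""
    x = np.asarray(x, dtype=float)
    # Map arbitrary subject labels to integer codes 0..k-1
    _, idx = np.unique(subject, return_inverse=True)
    subject_mean = (np.bincount(idx, weights=x) / np.bincount(idx))[idx]
    return subject_mean, x - subject_mean

rng = np.random.default_rng(1)
n_subjects, n_obs = 200, 10  # assumed design, not taken from the paper
subject = np.repeat(np.arange(n_subjects), n_obs)

# Covariate with an among-subject and a within-subject component
x = rng.normal(0, 1, n_subjects)[subject] + rng.normal(0, 1, subject.size)
x_mean, x_dev = within_subject_centre(x, subject)

# Assumed generating process: the response depends on the *relative*
# covariate (deviation from the subject mean), with a different slope
# for the subject mean itself.
beta_among, beta_within = 2.0, 0.5
y = 1.0 + beta_among * x_mean + beta_within * x_dev + rng.normal(0, 0.5, x.size)

# Analysis equation that matches the generating process
X = np.column_stack([np.ones_like(x), x_mean, x_dev])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("among-subject slope:", coef[1])
print("within-subject slope:", coef[2])
```

Here the analysis equation matches the assumed generating process, so both slopes should be recovered closely. Changing the generating line so that `y` depends on the absolute value of `x` instead, while keeping the same analysis, is one way to explore the kind of mismatch the abstract describes.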

Original language: English
Pages (from-to): 2813-2824
Number of pages: 12
Journal: Journal of Animal Ecology
Volume: 89
Issue number: 12
State: Published - Dec 2020

Bibliographical note

Publisher Copyright:
© 2020 British Ecological Society

Keywords

  • bivariate models
  • environmental effects
  • hierarchical causation
  • linear mixed-effects models
  • model design
  • parameter misestimation
  • phenotypic plasticity

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Animal Science and Zoology
