Finite mixture-of-gamma distributions: estimation, inference, and model-based clustering

Derek S. Young, Xi Chen, Dilrukshi C. Hewage, Ricardo Nilo-Poyanco

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

Finite mixtures of (multivariate) Gaussian distributions have broad utility, including their usage for model-based clustering. There is increasing recognition of mixtures of asymmetric distributions as powerful alternatives to traditional mixtures of Gaussian and mixtures of t distributions. The present work supports that view by addressing some facets of estimation and inference for mixtures-of-gamma distributions, including in the context of model-based clustering. Maximum likelihood estimation of mixtures of gammas is performed using an expectation–conditional–maximization (ECM) algorithm. The Wilson–Hilferty normal approximation is employed as part of an effective starting value strategy for the ECM algorithm, and it also provides insight into an effective model-based clustering strategy. Inference regarding the appropriateness of a common-shape mixture-of-gammas distribution is motivated by theory from research on infant habituation. We provide extensive simulation results that demonstrate the strong performance of our routines, and we analyze two real data examples: an infant habituation dataset and a whole genome duplication dataset.
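The abstract does not spell out how the Wilson–Hilferty approximation feeds the starting values, so the following is only a minimal sketch under stated assumptions. It uses the standard result that, for X ~ Gamma(shape, scale), the cube root of X is approximately normal with mean (shape·scale)^(1/3)·(1 − 1/(9·shape)) and variance (shape·scale)^(2/3)/(9·shape). The function names `wilson_hilferty_moments`, `gamma_from_cube_root_moments`, and `starting_values` are hypothetical and are not taken from the paper or any package. The idea illustrated is that a tentative cluster of observations could be cube-root transformed, summarized by its sample mean and variance, and mapped back to rough gamma parameters that might seed an iterative fit.

```python
import math


def wilson_hilferty_moments(shape, scale):
    """Approximate mean and variance of X ** (1/3) for X ~ Gamma(shape, scale)."""
    u = 1.0 / (9.0 * shape)
    root_mean = (shape * scale) ** (1.0 / 3.0)
    return root_mean * (1.0 - u), root_mean ** 2 * u


def gamma_from_cube_root_moments(m, v):
    """Invert the approximation: map mean m and variance v of X ** (1/3)
    back to (shape, scale). Assumes m > 0 and v > 0."""
    r = v / (m * m)
    # With u = 1 / (9 * shape), the approximation gives r = u / (1 - u)^2.
    # This is the root of r*u^2 - (2r + 1)*u + r = 0 with u < 1, written in
    # a form that avoids cancellation when r is small.
    u = 2.0 * r / ((2.0 * r + 1.0) + math.sqrt(4.0 * r + 1.0))
    shape = 1.0 / (9.0 * u)
    scale = (m / (1.0 - u)) ** 3 / shape
    return shape, scale


def starting_values(cluster):
    """Rough (shape, scale) for one tentative cluster of positive observations.
    Hypothetical helper: an assumed use of the approximation, not the paper's routine."""
    y = [x ** (1.0 / 3.0) for x in cluster]
    n = len(y)
    m = sum(y) / n
    v = sum((yi - m) ** 2 for yi in y) / (n - 1)
    return gamma_from_cube_root_moments(m, v)
```

In a mixture setting, one plausible use would be to cluster the cube-root-transformed data with a Gaussian-based method and then call `starting_values` on each resulting group. Whether this matches the authors' strategy cannot be determined from the abstract alone.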

Original language: English
Pages (from-to): 1053-1082
Number of pages: 30
Journal: Advances in Data Analysis and Classification
Volume: 13
Issue number: 4
State: Published - Dec 1 2019

Bibliographical note

Publisher Copyright:
© 2019, Springer-Verlag GmbH Germany, part of Springer Nature.

Keywords

  • ECM algorithms
  • Finite mixture models
  • Identifiability
  • Mixturegram
  • Multivariate Gaussian copula
  • Starting values

ASJC Scopus subject areas

  • Applied Mathematics
  • Computer Science Applications
