Abstract
Certain practical and theoretical challenges surround the estimation of finite mixture models. One such challenge is how to determine the number of components when this is not assumed a priori. Available methods in the literature are primarily numerical and lack any substantial visualization component. Traditional numerical methods include the calculation of information criteria and bootstrapping approaches; however, such methods have known technical issues regarding the necessary regularity conditions for testing the number of components. The ability to visualize an appropriate number of components for a finite mixture model could serve to supplement the results from traditional methods or provide visual evidence when results from such methods are inconclusive. Our research fills this gap through development of a visualization tool, which we call a mixturegram. This tool is easy to implement and provides a quick way for researchers to assess the number of components for their hypothesized mixture model. Mixtures of univariate or multivariate data can be assessed. We validate our visualization assessments by comparing with results from information criteria and an ad hoc selection criterion based on calculations used for the mixturegram. We also construct the mixturegram for two datasets.
Original language | English |
---|---|
Pages (from-to) | 564-575 |
Number of pages | 12 |
Journal | Journal of Computational and Graphical Statistics |
Volume | 27 |
Issue number | 3 |
State | Published - Jul 3 2018 |
Bibliographical note
Publisher Copyright: © 2018 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.
Keywords
- Cluster analysis
- EM algorithm
- Identifiability
- Parallel coordinates
- Principal components
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty
- Discrete Mathematics and Combinatorics