Convex clustering analysis for histogram-valued data

Cheolwoo Park, Hosik Choi, Chris Delcher, Yanning Wang, Young Joo Yoon

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

In recent years, there has been increased interest in symbolic data analysis, including for exploratory analysis, supervised and unsupervised learning, time series analysis, etc. Traditional statistical approaches that are designed to analyze single-valued data are not suitable because they cannot incorporate the additional information on data structure available in symbolic data, and thus new techniques have been proposed for symbolic data to bridge this gap. In this article, we develop a regularized convex clustering approach for grouping histogram-valued data. The convex clustering is a relaxation of hierarchical clustering methods, where prototypes are grouped by having exactly the same value in each group via penalization of parameters. We apply two different distance metrics to measure (dis)similarity between histograms. Various numerical examples confirm that the proposed method shows better performance than other competitors.
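The abstract does not name the two distance metrics, but the keywords list the Wasserstein-Kantorovich metric and quantiles. As a rough illustration only, the sketch below shows one way a quantile-based L2 Wasserstein distance between two histograms could be approximated. It is not taken from the article: the function names, the (edges, probabilities) representation, the assumption of uniform mass within each bin, and the midpoint grid of probability levels are all hypothetical choices.

```python
from math import sqrt


def histogram_quantile(edges, probs, p):
    """Quantile at level p of a histogram.

    Assumption (illustrative): mass is spread uniformly inside each bin,
    so the quantile function is piecewise linear between bin edges.
    """
    total = sum(probs)
    cum = 0.0
    for k, mass in enumerate(probs):
        share = mass / total  # normalise so the bin masses sum to 1
        if share > 0 and p <= cum + share:
            frac = (p - cum) / share  # position of p inside bin k
            return edges[k] + frac * (edges[k + 1] - edges[k])
        cum += share
    return edges[-1]  # guard against rounding when p is close to 1


def wasserstein2(hist_a, hist_b, n_levels=1000):
    """Approximate L2 Wasserstein distance between two histograms.

    Each histogram is a pair (edges, probs). The integral of the squared
    difference of the quantile functions over (0, 1) is approximated by a
    midpoint rule with n_levels points.
    """
    sq = 0.0
    for i in range(n_levels):
        p = (i + 0.5) / n_levels
        qa = histogram_quantile(*hist_a, p)
        qb = histogram_quantile(*hist_b, p)
        sq += (qa - qb) ** 2
    return sqrt(sq / n_levels)


# Two histograms with the same shape, the second shifted right by 3 units.
a = ([0.0, 1.0, 2.0], [0.5, 0.5])
b = ([3.0, 4.0, 5.0], [0.5, 0.5])
print(wasserstein2(a, b))
```

Because the second histogram is a pure shift of the first, every quantile differs by the same amount and the distance equals the size of the shift, which makes this a convenient sanity check for the sketch.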

Original language: English
Pages (from-to): 603-612
Number of pages: 10
Journal: Biometrics
Volume: 75
Issue number: 2
State: Published - 2019

Bibliographical note

Funding Information:
The authors thank Dr Jaejik Kim for providing us with the R code. Dr Choi's research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2017R1D1A1B05028565). Dr Delcher's research was supported by the Bureau of Justice Assistance (2016-PM-BX-K005). Dr Yoon's research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2017R1D1A1B03028121).

Publisher Copyright:
© 2019 International Biometric Society

Keywords

  • Wasserstein-Kantorovich metric
  • clustering
  • histogram-valued data
  • quantiles
  • regularization

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)
  • Agricultural and Biological Sciences (all)
  • Applied Mathematics
