Poclustering: Lossless clustering of dissimilarity data

Jinze Liu, Qi Zhang, Wei Wang, Leonard McMillan, Jan Prins

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

10 Scopus citations

Abstract

Given a set of objects V with a dissimilarity measure between pairs of objects in V, a PoCluster is a collection of sets P ⊂ powerset(V) partially ordered by the ⊂ relation such that S ⊂ T if the maximal dissimilarity among objects in S is less than the maximal dissimilarity among objects in T. PoClusters capture categorizations of objects that are not strictly hierarchical, such as those found in ontologies. PoClusters cannot, in general, be constructed using hierarchical clustering algorithms. In this paper, we examine the relationship between PoClusters and dissimilarity matrices and prove that PoClusters are in one-to-one correspondence with the set of dissimilarity matrices. The PoClustering problem is NP-Complete, and we present a heuristic algorithm for it in this paper. Experiments on both synthetic and real datasets demonstrate the quality and scalability of the algorithms.
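
The definition above can be illustrated with a small sketch. It is not the paper's algorithm: it brute-forces, for a given threshold, the maximal subsets whose maximal pairwise dissimilarity does not exceed that threshold, which is only feasible for tiny inputs. The function names (`diameter`, `maximal_sets_within`) and the three-object toy data are assumptions made for illustration.

```python
from itertools import combinations


def diameter(subset, dissim):
    """Maximal pairwise dissimilarity among the objects in `subset` (0 if fewer than two)."""
    pairs = combinations(sorted(subset), 2)
    return max((dissim[frozenset(p)] for p in pairs), default=0)


def maximal_sets_within(objects, dissim, threshold):
    """Brute force (hypothetical helper): maximal subsets of size >= 2 with diameter <= threshold."""
    candidates = [
        frozenset(c)
        for size in range(2, len(objects) + 1)
        for c in combinations(sorted(objects), size)
        if diameter(c, dissim) <= threshold
    ]
    # Keep only sets that are not strictly contained in another candidate.
    return [s for s in candidates if not any(s < t for t in candidates)]


# Toy data (assumed): b is close to both a and c, but a and c are far apart.
objects = {"a", "b", "c"}
dissim = {frozenset("ab"): 1, frozenset("bc"): 1, frozenset("ac"): 2}

level1 = maximal_sets_within(objects, dissim, 1)  # sets with diameter <= 1
level2 = maximal_sets_within(objects, dissim, 2)  # sets with diameter <= 2
```

In this toy case the threshold-1 sets {a, b} and {b, c} overlap in b, so they cannot both appear as siblings in a strict hierarchy, while each is contained in {a, b, c}, whose maximal dissimilarity is larger. That is the kind of non-hierarchical, partially ordered structure the abstract describes.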

Original language: English
Title of host publication: Proceedings of the 7th SIAM International Conference on Data Mining
Pages: 557-562
Number of pages: 6
State: Published - 2007
Event: 7th SIAM International Conference on Data Mining - Minneapolis, MN, United States
Duration: Apr 26 2007 → Apr 28 2007

ASJC Scopus subject areas

  • General Engineering
