Revealing true subspace clusters in high dimensions

Jinze Liu, Karl Strohmaier, Wei Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

Subspace clustering is one of the best approaches for discovering meaningful clusters in high dimensional space. One cluster in high dimensional space may be transcribed into multiple distinct maximal clusters by projecting onto different subspaces. A direct consequence of clustering independently in each subspace is an overwhelmingly large set of overlapping clusters which may be significantly similar. To reveal the true underlying clusters, we propose a similarity measurement of the overlapping clusters. We adopt the model of Gaussian tailed hyper-rectangles to capture the distribution of any subspace cluster. A set of experiments on a synthetic dataset demonstrates the effectiveness of our approach. Application to real gene expression data also reveals impressive meta-clusters expected by biologists.

Original language: English
Title of host publication: Proceedings - Fourth IEEE International Conference on Data Mining, ICDM 2004
Editors: R. Rastogi, K. Morik, M. Bramer, X. Wu
Pages: 463-466
Number of pages: 4
State: Published - 2004
Event: Proceedings - Fourth IEEE International Conference on Data Mining, ICDM 2004 - Brighton, United Kingdom
Duration: Nov 1 2004 → Nov 4 2004

Publication series

Name: Proceedings - Fourth IEEE International Conference on Data Mining, ICDM 2004

Conference

Conference: Proceedings - Fourth IEEE International Conference on Data Mining, ICDM 2004
Country/Territory: United Kingdom
City: Brighton
Period: 11/1/04 → 11/4/04

Keywords

  • Adhesion
  • Cluster Intersection
  • Gaussian Tails
  • Gene Expression
  • Local Grid
  • Overlapping Cluster
  • Subspace Clustering

ASJC Scopus subject areas

  • General Engineering
