Abstract
The proliferation of video content on the web makes similarity detection an indispensable tool in web data management, searching, and navigation. We have previously proposed a compact representation of video clips, called the video signature, for retrieving similar video clips in large databases. In this paper, we propose a new signature clustering algorithm to further improve retrieval performance. The algorithm treats all the signatures as an abstract threshold graph, where the threshold is determined based on local data statistics. Similar clusters are identified as highly connected regions in the graph. This algorithm outperforms simple thresholding and hierarchical clustering techniques in identifying a set of manually determined similar clusters from a dataset of 46,356 web video clips. At 95% precision, our algorithm attains 85% recall, while simple thresholding and the complete-link hierarchical scheme attain 67% and 75% recall, respectively. Applying our algorithm to the entire dataset identifies 6,900 similar clusters, with an average cluster size of 2.81 video clips. The cluster sizes follow a power-law distribution, which has been shown to describe many web phenomena.
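The threshold-graph idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a precomputed pairwise distance matrix `dist` between signatures, uses a single global `threshold` where the paper derives the threshold from local data statistics, and stands in edge density within a connected component (the hypothetical `min_density` parameter) for "highly connected region". All function names are illustrative.

```python
from itertools import combinations


def threshold_graph(dist, threshold):
    """Link items i and j when dist[i][j] <= threshold (global threshold: an assumption)."""
    n = len(dist)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if dist[i][j] <= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def connected_components(adj):
    """Return the connected components of the graph as sets of node indices."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        components.append(comp)
    return components


def edge_density(adj, comp):
    """Fraction of possible edges inside comp that are actually present."""
    n = len(comp)
    if n < 2:
        return 0.0
    edges = sum(len(adj[v] & comp) for v in comp) // 2
    return edges / (n * (n - 1) / 2)


def similar_clusters(dist, threshold, min_density=0.8):
    """Keep components with more than one item whose edge density is high enough."""
    adj = threshold_graph(dist, threshold)
    return [
        sorted(comp)
        for comp in connected_components(adj)
        if len(comp) > 1 and edge_density(adj, comp) >= min_density
    ]
```

In this sketch, a component whose members are linked only in a chain has low edge density and is rejected, whereas a group in which nearly every pair is within the threshold is kept. That is the intuition behind preferring highly connected regions over plain thresholding.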
Original language | English |
---|---|
Pages | 649-652 |
Number of pages | 4 |
State | Published - 2001 |
Event | IEEE International Conference on Image Processing (ICIP) - Thessaloniki, Greece; Duration: Oct 7 2001 → Oct 10 2001 |
Conference
Conference | IEEE International Conference on Image Processing (ICIP) |
---|---|
Country/Territory | Greece |
City | Thessaloniki |
Period | 10/7/01 → 10/10/01 |
ASJC Scopus subject areas
- Hardware and Architecture
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering