Abstract
The proliferation of video content on the Web makes similarity detection an indispensable tool in Web data management, searching, and navigation. In this paper, we propose a number of algorithms to efficiently measure video similarity. We define a video as a set of frames, which are represented as high-dimensional vectors in a feature space. Our goal is to measure ideal video similarity (IVS), defined as the percentage of clusters of similar frames shared between two video sequences. Since IVS is too complex to be deployed in large database applications, we approximate it with Voronoi video similarity (VVS), defined as the volume of the intersection between Voronoi cells of similar clusters. We propose a class of randomized algorithms to estimate VVS by first summarizing each video with a small set of its sampled frames, called the video signature (ViSig), and then calculating the distances between corresponding frames from the two ViSigs. By generating samples with a probability distribution that describes the video statistics, and ranking them based upon their likelihood of making an error in the estimation, we show analytically that ViSig can provide an unbiased estimate of IVS. Experimental results on a large dataset of Web video and a set of MPEG-7 test sequences with artificially generated similar versions are provided to demonstrate the retrieval performance of our proposed techniques.
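The signature-and-compare idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes uniformly drawn seed vectors, Euclidean distance, and a fixed threshold `eps`, and it leaves out the statistics-driven sampling and the ranking step mentioned above. The function names `make_seeds`, `video_signature`, and `visig_similarity` are hypothetical.

```python
import numpy as np


def make_seeds(m, dim, rng):
    """Draw m random seed vectors in a unit-cube feature space (an assumption)."""
    return rng.uniform(0.0, 1.0, size=(m, dim))


def video_signature(frames, seeds):
    """For each seed, keep the frame of the video that lies closest to it.

    frames: array of shape (n_frames, dim), one feature vector per frame.
    seeds:  array of shape (m, dim), shared by every video being compared.
    Returns an array of shape (m, dim): the signature frames.
    """
    # Pairwise Euclidean distances between seeds and frames, shape (m, n_frames).
    dists = np.linalg.norm(seeds[:, None, :] - frames[None, :, :], axis=2)
    return frames[np.argmin(dists, axis=1)]


def visig_similarity(sig_x, sig_y, eps):
    """Fraction of seeds whose two signature frames are within eps of each other."""
    d = np.linalg.norm(sig_x - sig_y, axis=1)
    return float(np.mean(d <= eps))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seeds = make_seeds(m=50, dim=8, rng=rng)
    video_a = rng.uniform(0.0, 1.0, size=(30, 8))
    video_b = video_a + rng.normal(0.0, 0.01, size=video_a.shape)  # near-duplicate
    sim = visig_similarity(
        video_signature(video_a, seeds), video_signature(video_b, seeds), eps=0.2
    )
    print(f"estimated similarity: {sim:.2f}")
```

Because both videos are summarized against the same seeds, comparing two videos only needs m frame-to-frame distances rather than a comparison of every frame pair.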
Original language | English |
---|---|
Pages (from-to) | 59-74 |
Number of pages | 16 |
Journal | IEEE Transactions on Circuits and Systems for Video Technology |
Volume | 13 |
Issue number | 1 |
DOIs | 10.1109/TCSVT.2002.808080 |
State | Published - Jan 2003 |
Bibliographical note
Funding Information: This work was supported by the Air Force Office of Scientific Research (AFOSR) Grant F49620-00-1-0327 and by National Science Foundation (NSF) Grant ANI-9905799. This paper was recommended by Associate Editor S. Panchanathan. S. S. Cheung was with the University of California, Berkeley, CA 94720 USA. He is now with the Center of Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA 94551 USA (e-mail: sccheung@ieee.org). A. Zakhor is with the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720 USA (e-mail: avz@eecs.berkeley.edu). Digital Object Identifier 10.1109/TCSVT.2002.808080

The amount of information on the World Wide Web has grown enormously since its creation in 1990. By February 2000, the Web had over one billion uniquely indexed pages and 30 million audio, video, and image links [1]. Since there is no central management on the Web, duplication of content is inevitable. A study performed in 1998 estimated that about 46% of all the text documents on the Web have at least one “near-
Keywords
- Randomized algorithm
- Video similarity
- Video summarization
- Voronoi diagram
ASJC Scopus subject areas
- Media Technology
- Electrical and Electronic Engineering