TY - GEN
T1 - Mining arbitrary-length repeated patterns in television broadcast
AU - Cheung, Sen Ching S.
AU - Nguyen, Thinh P.
N1 - Copyright 2011 Elsevier B.V., All rights reserved.
PY - 2005
Y1 - 2005
AB - Mining repeated patterns in television broadcast is important to advertisers in tracking a large number of television commercials. It can also benefit long-term archival of television because historically significant events are usually marked by repeated airing of the same video clips or sound bites. In this paper, we describe a system that can efficiently mine repeated patterns of arbitrary lengths from television broadcast. Compared with existing work, our system has two main innovations: first, our system is robust against minor temporal variations among repeated patterns. This is important as broadcasters often perform temporal editing on commercials so as to fit them into different time slots. Second, our system does not rely on any temporal segmentation algorithm, which may lead to over- or under-segmentation of important patterns. Instead, our system scans the television broadcast with a fixed-size sliding window, summarizes each window into a hash value, and maintains a running frequency count and a reference time-stamp on each hash value. The boundaries of a repeated pattern are identified by the changes in frequency counts and reference time-stamps. Initial experiments show that our system is very efficient in identifying all the repeated commercials from 12 hours of television broadcast.
UR - http://www.scopus.com/inward/record.url?scp=33749242242&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33749242242&partnerID=8YFLogxK
U2 - 10.1109/ICIP.2005.1530358
DO - 10.1109/ICIP.2005.1530358
M3 - Conference contribution
AN - SCOPUS:33749242242
SN - 0780391349
SN - 9780780391345
T3 - Proceedings - International Conference on Image Processing, ICIP
SP - 181
EP - 184
BT - IEEE International Conference on Image Processing 2005, ICIP 2005
T2 - IEEE International Conference on Image Processing 2005, ICIP 2005
Y2 - 11 September 2005 through 14 September 2005
ER -