Proceedings Paper

Procedure for audio-assisted browsing of news video using generalized sound recognition
Author(s): Ajay Divakaran; Regunathan Radhakrishnan; Ziyou Xiong; Michael Casey

Paper Abstract

Casey describes a generalized sound recognition framework based on reduced-rank spectra and minimum-entropy priors. This approach enables successful recognition of a wide variety of sounds, such as male speech, female speech, music, and animal sounds. In this work, we apply this recognition framework to news video to enable quick video browsing. We identify speaker change positions in the broadcast news using the sound recognition framework. We then combine the speaker change positions with color and motion cues from the video to locate the beginning of each topic covered by the news video. We can thus skim the video by playing only a small portion starting from each location where one of the principal cast members begins to speak. In combination with our motion-based video browsing approach, our technique provides simple automatic news video browsing. While similar work has been done before, our approach is simpler and faster than competing techniques, and provides a rich framework for further analysis and description of content.
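
The skimming step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a sound recognizer has already produced one speaker label per fixed-length audio window, and it uses a list of video shot-boundary times as a simple stand-in for the color and motion cues. The function names, the `principal` label set, and the `tolerance_sec` and `clip_sec` parameters are all hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def speaker_change_times(
    labels: Sequence[str], window_sec: float
) -> List[Tuple[float, str]]:
    """Return (time, new_label) wherever the label differs from the previous window.

    `labels` is assumed to be the per-window output of some sound recognizer.
    """
    return [
        (i * window_sec, labels[i])
        for i in range(1, len(labels))
        if labels[i] != labels[i - 1]
    ]


def skim_segments(
    changes: Sequence[Tuple[float, str]],
    shot_boundaries: Sequence[float],
    principal: Set[str],
    tolerance_sec: float = 1.0,
    clip_sec: float = 5.0,
) -> List[Tuple[float, float]]:
    """Keep changes where a principal speaker starts near a visual cut.

    Shot boundaries stand in for the color and motion cues (an assumption).
    Each kept change yields a short (start, end) clip to play while skimming.
    """
    segments = []
    for t, label in changes:
        near_cut = any(abs(t - b) <= tolerance_sec for b in shot_boundaries)
        if label in principal and near_cut:
            segments.append((t, t + clip_sec))
    return segments


# Toy input: six 2-second windows and two hypothetical shot boundaries.
labels = ["anchor", "anchor", "reporter", "reporter", "anchor", "anchor"]
changes = speaker_change_times(labels, window_sec=2.0)
segments = skim_segments(changes, shot_boundaries=[8.5, 20.0], principal={"anchor"})
print(segments)
```

In this toy input, only the change back to the anchor at 8.0 seconds lies within the tolerance of a shot boundary, so a single clip is selected for playback.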

Paper Details

Date Published: 10 January 2003
PDF: 7 pages
Proc. SPIE 5021, Storage and Retrieval for Media Databases 2003, (10 January 2003); doi: 10.1117/12.476294
Ajay Divakaran, Mitsubishi Electric Research Labs. (United States)
Regunathan Radhakrishnan, Mitsubishi Electric Research Labs. (United States)
Ziyou Xiong, Univ. of Illinois/Urbana-Champaign (United States)
Michael Casey, Mitsubishi Electric Research Labs. (United States)


Published in SPIE Proceedings Vol. 5021:
Storage and Retrieval for Media Databases 2003
Minerva M. Yeung; Rainer W. Lienhart; Chung-Sheng Li, Editor(s)
