
Proceedings Paper

Stochastic modeling of soundtrack for efficient segmentation and indexing of video

Paper Abstract

Efficient, intelligent tools for managing digital content are essential for digital video data management. An extremely challenging research area in this context is multimedia analysis and understanding. In particular, the capabilities of audio analysis for video data management are yet to be fully exploited. We present a novel scheme for indexing and segmenting video by analyzing the audio track, and apply it to the segmentation and indexing of movies. We build models for several events of interest in the motion picture soundtrack: music, human speech, and silence. We propose using hidden Markov models to model the dynamics of the soundtrack and to detect audio events. Using these models, we segment and index the soundtrack. A practical problem with motion picture soundtracks is that the audio is composite in nature: sounds from different sources are mixed together. Speech in the foreground with music in the background is a common example. The coexistence of multiple individual audio sources forces us to model such composite events explicitly. Experiments reveal that explicit modeling gives better results than modeling individual audio events separately.
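To make the segmentation idea concrete, here is a minimal sketch of Viterbi decoding over a small hidden Markov model whose states are audio-event classes, including an explicit composite "speech+music" state. It is an illustration only, not the authors' method: the abstract gives no features, topology, or parameters. The state list, the quantized observation symbols 0 to 3, the helper names (`toy_model`, `viterbi`, `segments`), and all probabilities are assumptions chosen for the example.

```python
import math

# Hypothetical event classes. "speech+music" stands in for an explicitly
# modeled composite event; the paper's actual model set may differ.
STATES = ["silence", "speech", "music", "speech+music"]


def toy_model(n=4, stay=0.9, hit=0.7):
    """Build illustrative log-probabilities (assumed values, not from the paper).

    Each state tends to persist (stay) and tends to emit the observation
    symbol with its own index (hit); the remaining mass is spread evenly.
    """
    other_t = (1 - stay) / (n - 1)
    other_e = (1 - hit) / (n - 1)
    log_start = [math.log(1 / n)] * n
    log_trans = [[math.log(stay if i == j else other_t) for j in range(n)]
                 for i in range(n)]
    log_emit = [[math.log(hit if i == k else other_e) for k in range(n)]
                for i in range(n)]
    return log_start, log_trans, log_emit


def viterbi(obs, log_start, log_trans, log_emit):
    """Return the most likely state-index sequence for the observation symbols."""
    n = len(log_start)
    score = [log_start[s] + log_emit[s][obs[0]] for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for o in obs[1:]:
        prev = score
        ptr = [max(range(n), key=lambda p: prev[p] + log_trans[p][s])
               for s in range(n)]
        score = [prev[ptr[s]] + log_trans[ptr[s]][s] + log_emit[s][o]
                 for s in range(n)]
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def segments(path):
    """Collapse per-frame labels into (label, first_frame, last_frame) runs."""
    out = []
    start = 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            out.append((STATES[path[start]], start, i - 1))
            start = i
    return out


if __name__ == "__main__":
    # Frame 5 carries a spurious "music" symbol inside a speech run.
    obs = [0, 0, 0, 1, 1, 2, 1, 1, 3, 3, 3, 3]
    print(segments(viterbi(obs, *toy_model())))
```

With these assumed parameters, the isolated symbol at frame 5 is absorbed into the surrounding speech segment, because one unlikely emission costs less than two state switches. The resulting index has three entries: silence for frames 0 to 2, speech for frames 3 to 7, and speech+music for frames 8 to 11. A real system would replace the discrete symbols with acoustic feature vectors and train the parameters from labeled soundtrack data.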

Paper Details

Date Published: 23 December 1999
PDF: 9 pages
Proc. SPIE 3972, Storage and Retrieval for Media Databases 2000, (23 December 1999); doi: 10.1117/12.373546
Milind Ramesh Naphade, Univ. of Illinois/Urbana-Champaign (United States)
Thomas S. Huang, Univ. of Illinois/Urbana-Champaign (United States)


Published in SPIE Proceedings Vol. 3972:
Storage and Retrieval for Media Databases 2000
Minerva M. Yeung; Boon-Lock Yeo; Charles A. Bouman, Editor(s)

© SPIE