Proceedings Paper

Extracting movie scenes based on multimodal information
Author(s): Ying Li; C.-C. Jay Kuo

Paper Abstract

This research addresses the problem of automatically extracting semantic video scenes from daily movies using multimodal information. A three-stage scene detection scheme is proposed. In the first stage, purely visual information is used to extract a coarse-level scene structure based on generated shot sinks. In the second stage, audio cues are integrated to refine the scene detection results by considering various kinds of audio scenarios. In the third stage, users interact directly with the system to fine-tune the detection results to their own satisfaction. The generated scene structure provides a compact yet meaningful abstraction of the video data, which should facilitate content access. Preliminary experiments on integrating multiple media cues for movie scene extraction have yielded encouraging results.

Paper Details

Date Published: 19 December 2001
PDF: 12 pages
Proc. SPIE 4676, Storage and Retrieval for Media Databases 2002 (19 December 2001); doi: 10.1117/12.451109
Ying Li, Univ. of Southern California (United States)
C.-C. Jay Kuo, Univ. of Southern California (United States)

Published in SPIE Proceedings Vol. 4676:
Storage and Retrieval for Media Databases 2002
Minerva M. Yeung; Chung-Sheng Li; Rainer W. Lienhart, Editor(s)
