
Proceedings Paper

Context-enhanced video understanding

Paper Abstract

Many recent efforts have been made to automatically index multimedia content with the aim of bridging the semantic gap between syntax and semantics. In this paper, we propose a novel framework to automatically index video using context for video understanding. First, we discuss the notion of context and how it relates to video understanding. Then we present the framework we are constructing, which is modeled as an expert system that uses a rule-based engine, domain knowledge, visual detectors (for objects and scenes), and different data sources available with the video (metadata, text from automatic speech recognition, etc.). We also describe our approach to aligning text from speech recognition with video segments, and present experiments using a simple implementation of our framework. Our experiments show that context can be used to improve the performance of visual detectors.

Paper Details

Date Published: 10 January 2003
PDF: 12 pages
Proc. SPIE 5021, Storage and Retrieval for Media Databases 2003 (10 January 2003); doi: 10.1117/12.479745
Alejandro Jaimes, Columbia Univ. and IBM Thomas J. Watson Research Ctr. (United States)
Columbia Univ. (United States)
Milind Ramesh Naphade, IBM Thomas J. Watson Research Ctr. (United States)
Harriet Nock, IBM Thomas J. Watson Research Ctr. (United States)
John R. Smith, IBM Thomas J. Watson Research Ctr. (United States)
Belle L. Tseng, IBM Thomas J. Watson Research Ctr. (United States)


Published in SPIE Proceedings Vol. 5021:
Storage and Retrieval for Media Databases 2003
Minerva M. Yeung; Rainer W. Lienhart; Chung-Sheng Li, Editors

© SPIE.