
Proceedings Paper

Improving multimedia retrieval with a video OCR
Author(s): Dipanjan Das; Datong Chen; Alexander G. Hauptmann

Paper Abstract

We present a set of experiments with a video OCR system (VOCR) tailored for video information retrieval and establish its importance in multimedia search in general and for some specific queries in particular. The system, inspired by existing work on text detection and recognition in images, analyzes individual video frames in detail to produce candidate text regions. The text regions are then binarized and sent to a commercial OCR engine, producing ASCII text that is finally used to create search indexes. The system is evaluated using the TRECVID data. We compare the system's performance from an information retrieval perspective with another VOCR developed using multi-frame integration and empirically demonstrate that deep analysis of individual video frames results in better video retrieval. We also evaluate the effect of various textual sources on multimedia retrieval by combining the VOCR outputs with automatic speech recognition (ASR) transcripts. For general search queries, the VOCR system coupled with ASR sources outperforms the other system by a large margin. For search queries that involve named entities, especially people's names, the VOCR system even outperforms speech transcripts, demonstrating that source selection for particular query types is essential.
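The last stage described in the abstract, turning recognized text into a search index and combining it with ASR transcripts, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-shot concatenation of the two sources, the boolean AND matching, and the sample data are all assumptions made for illustration, and the abstract does not say how the sources were actually fused or ranked.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def build_index(shot_texts):
    """Build an inverted index: token -> set of shot ids containing it."""
    index = defaultdict(set)
    for shot_id, text in shot_texts.items():
        for token in tokenize(text):
            index[token].add(shot_id)
    return index


def combine_sources(vocr, asr):
    """Concatenate VOCR and ASR text per shot (assumed fusion strategy)."""
    shots = set(vocr) | set(asr)
    return {
        s: " ".join(filter(None, [vocr.get(s, ""), asr.get(s, "")]))
        for s in shots
    }


def search(index, query):
    """Return the shot ids that contain every query token (boolean AND)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical per-shot text from the two sources.
vocr = {"shot1": "JANE DOE Press Secretary", "shot2": "Weather Forecast"}
asr = {"shot1": "the press secretary spoke today",
       "shot3": "jane doe arrived this morning"}

vocr_only = build_index(vocr)
combined = build_index(combine_sources(vocr, asr))
```

In this toy data a name query such as "jane doe" matches shot1 through the on-screen caption even though the speech transcript for that shot never mentions the name, which is the kind of case where the abstract reports VOCR being the more useful source.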

Paper Details

Date Published: 28 January 2008
PDF: 12 pages
Proc. SPIE 6820, Multimedia Content Access: Algorithms and Systems II, 68200B (28 January 2008); doi: 10.1117/12.766931
Dipanjan Das, Carnegie Mellon Univ. (United States)
Datong Chen, Carnegie Mellon Univ. (United States)
Alexander G. Hauptmann, Carnegie Mellon Univ. (United States)

Published in SPIE Proceedings Vol. 6820:
Multimedia Content Access: Algorithms and Systems II
Theo Gevers; Ramesh C. Jain; Simone Santini, Editor(s)
