
Proceedings Paper

Automatic categorization design for broadcast news
Author(s): Huitao Luo; Qian Huang

Paper Abstract

This paper discusses our work on automatic categorization of broadcast news based on closed-caption text. The multimedia news data under study are first segmented into story units based on video and audio signals, using our previously developed algorithms. Using the time stamp information, the closed-caption text is then segmented into text units, one for each story unit. A Bayes network is trained to automatically classify the story units into fourteen categories. The major contribution of this paper is the idea of a category, which represents a higher level of semantic generalization than traditional topics. We discuss in detail the administrated bottom-up clustering algorithm used to generate a semantically meaningful category framework, as well as the training procedures used to build the belief network that covers the large broadcast news data set. Using the CSR LM 1996 data set from the LDC (Linguistic Data Consortium), we designed a number of experiments to examine the relationship between categorization design and classification performance.
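The time-stamp alignment step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the input formats (story start times in seconds, captions as timestamp/text pairs), and the sample data are all assumptions made for illustration.

```python
from bisect import bisect_right


def assign_captions_to_stories(story_starts, captions):
    """Group caption lines by the story unit whose time span contains them.

    story_starts: sorted list of story start times in seconds
                  (hypothetical input format).
    captions:     list of (timestamp_seconds, text) pairs
                  (hypothetical input format).
    Returns one text string per story unit.
    """
    units = [[] for _ in story_starts]
    for timestamp, text in captions:
        # Index of the last story that starts at or before this caption.
        idx = bisect_right(story_starts, timestamp) - 1
        if idx >= 0:  # captions before the first story start are dropped
            units[idx].append(text)
    return [" ".join(parts) for parts in units]


# Illustrative data only, not taken from the paper.
story_starts = [0.0, 95.0, 210.0]
captions = [
    (3.2, "good evening"),
    (97.5, "in sports tonight"),
    (99.0, "the final score"),
    (215.0, "now the weather"),
]
text_units = assign_captions_to_stories(story_starts, captions)
```

Each resulting text unit would then be passed to the classifier; how the paper handles captions that lag behind the audio is not stated in the abstract, so this sketch ignores that delay.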

Paper Details

Date Published: 19 December 2001
PDF: 11 pages
Proc. SPIE 4676, Storage and Retrieval for Media Databases 2002, (19 December 2001); doi: 10.1117/12.451099
Huitao Luo, Hewlett-Packard Labs. (United States)
Qian Huang, AT&T Research Labs. (United States)

Published in SPIE Proceedings Vol. 4676:
Storage and Retrieval for Media Databases 2002
Minerva M. Yeung; Chung-Sheng Li; Rainer W. Lienhart, Editor(s)
