Proceedings Paper

Multimodal approach for speaker identification in news programs

Paper Abstract

Identifying speakers in a news program is difficult using text information alone. We propose a system that first processes text and video separately to identify where each speaker begins speaking. These start-of-speech locations are then aligned and used to identify speaker changes in the program. An analysis is performed to determine the contribution of the text and video information. We show that the speaker-change locations identified by our alignment algorithm are more accurate than those obtained from either mode individually.
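
The abstract does not describe the alignment algorithm itself, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each mode yields a list of candidate start-of-speech times in seconds, and that two candidates refer to the same event when they fall within a tolerance window. The function names, the `tolerance` parameter, and the averaging of matched times are all assumptions made for illustration.

```python
from typing import List, Tuple


def align_speech_starts(
    text_starts: List[float],
    video_starts: List[float],
    tolerance: float = 1.0,
) -> List[Tuple[float, float]]:
    """Greedily pair text and video start times that lie within `tolerance` seconds.

    Hypothetical helper: walks both sorted lists with two pointers and pairs
    the current candidates whenever they are close enough; otherwise it skips
    the earlier one as unmatched.
    """
    text_sorted = sorted(text_starts)
    video_sorted = sorted(video_starts)
    matches: List[Tuple[float, float]] = []
    i = j = 0
    while i < len(text_sorted) and j < len(video_sorted):
        t, v = text_sorted[i], video_sorted[j]
        if abs(t - v) <= tolerance:
            matches.append((t, v))  # both modes agree on this event
            i += 1
            j += 1
        elif t < v:
            i += 1  # text candidate has no video counterpart
        else:
            j += 1  # video candidate has no text counterpart
    return matches


def speaker_change_points(matches: List[Tuple[float, float]]) -> List[float]:
    """Estimate each speaker-change time as the mean of the matched pair (an assumption)."""
    return [(t + v) / 2.0 for t, v in matches]


if __name__ == "__main__":
    # Made-up example times in seconds, not data from the paper.
    text = [1.0, 10.5, 25.0, 40.0]
    video = [1.4, 11.0, 33.0, 39.5]
    pairs = align_speech_starts(text, video, tolerance=1.0)
    print(pairs)
    print(speaker_change_points(pairs))
```

In this sketch, candidates detected by only one mode are discarded, which is one simple way of letting the two modes confirm each other; a real system could instead weight each mode by its measured reliability.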

Paper Details

Date Published: 17 January 2005
PDF: 9 pages
Proc. SPIE 5682, Storage and Retrieval Methods and Applications for Multimedia 2005, (17 January 2005); doi: 10.1117/12.587870
Anthony F. Martone, Purdue Univ. (United States)
Cuneyt M. Taskiran, Purdue Univ. (United States)
Edward J. Delp, Purdue Univ. (United States)

Published in SPIE Proceedings Vol. 5682:
Storage and Retrieval Methods and Applications for Multimedia 2005
Rainer W. Lienhart; Noboru Babaguchi; Edward Y. Chang, Editor(s)
