Proceedings Paper

Video classification using speaker identification
Author(s): Nilesh V. Patel; Ishwar K. Sethi

Paper Abstract

Video content characterization is a challenging problem in video databases. The aim of such characterization is to generate indices that can describe a video clip in terms of objects and their actions in the clip. Generally, such indices are extracted by performing image analysis on the video clips. Many such indices can also be generated by analyzing the embedded audio information of video clips. Indices pertaining to context, scene emotion, and actors or characters present in a video clip appear especially suitable for generation via audio analysis techniques of keyword spotting, and speech and speaker recognition. In this paper, we examine the potential of speaker identification techniques for characterizing video clips in terms of actors present in them. We describe a three-stage processing system consisting of a shot boundary detection stage, an audio classification stage, and a speaker identification stage to determine the presence of different actors in isolated shots. Experimental results using the movie A Few Good Men are presented to show the efficacy of speaker identification for labeling video clips in terms of persons present in them.
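The three-stage structure described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the algorithms used at each stage, so everything below is an assumption made for illustration: shot boundaries are found by thresholding histogram differences between consecutive frames, the audio classification stage is reduced to a mean-energy test, and speaker identification is a nearest-centroid match. All function names, parameters, and thresholds are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def detect_shot_boundaries(histograms: Sequence[Sequence[float]],
                           threshold: float = 0.5) -> List[int]:
    """Stage 1 (assumed method): start a new shot wherever the difference
    between consecutive normalized frame histograms exceeds a threshold."""
    starts = [0]
    for i in range(1, len(histograms)):
        # Half the L1 distance lies in [0, 1] for normalized histograms.
        diff = 0.5 * sum(abs(a - b)
                         for a, b in zip(histograms[i - 1], histograms[i]))
        if diff > threshold:
            starts.append(i)
    return starts


def is_speech(energies: Sequence[float], energy_floor: float = 0.1) -> bool:
    """Stage 2 (toy stand-in): call a shot 'speech' if its mean audio
    energy is above a floor. A real classifier would use richer features."""
    return sum(energies) / len(energies) > energy_floor


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average a list of equal-length feature vectors component-wise."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def identify_speaker(features: Sequence[float],
                     speaker_models: Dict[str, Sequence[float]]) -> str:
    """Stage 3 (assumed method): return the speaker whose model centroid
    is closest to the shot's feature vector in squared Euclidean distance."""
    def sq_dist(u: Sequence[float], v: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(speaker_models,
               key=lambda name: sq_dist(features, speaker_models[name]))


def label_shots(histograms: Sequence[Sequence[float]],
                energies: Sequence[float],
                features: Sequence[Sequence[float]],
                speaker_models: Dict[str, Sequence[float]]
                ) -> List[Tuple[int, int, Optional[str]]]:
    """Run the three stages and return (start, end, speaker) per shot.
    The speaker is None when the shot is not classified as speech."""
    if not histograms:
        return []
    starts = detect_shot_boundaries(histograms)
    ends = starts[1:] + [len(histograms)]
    labels = []
    for start, end in zip(starts, ends):
        speaker = None
        if is_speech(energies[start:end]):
            speaker = identify_speaker(mean_vector(features[start:end]),
                                       speaker_models)
        labels.append((start, end, speaker))
    return labels


if __name__ == "__main__":
    # Synthetic per-frame data: two shots of three frames each.
    hists = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
    energy = [0.5, 0.5, 0.5, 0.0, 0.0, 0.0]
    feats = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
    models = {"actor_a": [1.0, 0.0], "actor_b": [0.0, 1.0]}
    print(label_shots(hists, energy, feats, models))
```

The sketch only gates speaker identification on the audio classification result, so non-speech shots (music, silence, effects) are left unlabeled rather than being forced onto the nearest speaker model.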

Paper Details

Date Published: 15 January 1997
PDF: 8 pages
Proc. SPIE 3022, Storage and Retrieval for Image and Video Databases V, (15 January 1997); doi: 10.1117/12.263411
Nilesh V. Patel, Wayne State Univ. (United States)
Ishwar K. Sethi, Wayne State Univ. (United States)

Published in SPIE Proceedings Vol. 3022:
Storage and Retrieval for Image and Video Databases V
Ishwar K. Sethi; Ramesh C. Jain, Editor(s)
