Proceedings Paper

Integrated audiovisual processing for object localization and tracking
Author(s): Gopal Sarma Pingali

Paper Abstract

This paper presents a system that combines audio and visual cues to locate and track an object, typically a person, in real time. Combining a speech source localization algorithm with a video-based head tracking algorithm is shown to yield a more accurate and robust tracker than either the audio or the visual modality alone. Performance evaluation results are presented for a system that runs in real time on a general-purpose processor. The multimodal tracker has several applications, including teleconferencing, multimedia kiosks, and interactive games.
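
The abstract does not say how the audio and visual estimates are combined. As a rough illustration of why two modalities can outperform either one, the sketch below fuses two independent position estimates by inverse-variance weighting and falls back to whichever modality is available when the other drops out (for example, a silent speaker or an occluded head). The function name `fuse_estimates`, the `(position, variance)` representation, and the weighting rule itself are assumptions made for this example, not the method used in the paper.

```python
from typing import Optional, Tuple

# Hypothetical representation: (position, variance), e.g. azimuth in degrees.
# Variances are assumed to be strictly positive.
Estimate = Tuple[float, float]


def fuse_estimates(audio: Optional[Estimate],
                   video: Optional[Estimate]) -> Optional[Estimate]:
    """Fuse audio and video position estimates by inverse-variance weighting.

    This is an illustrative rule, not the paper's algorithm. A modality that
    is unavailable is passed as None and simply ignored.
    """
    available = [e for e in (audio, video) if e is not None]
    if not available:
        return None  # nothing to track with in this frame

    # Each estimate is weighted by its precision (1 / variance).
    weights = [1.0 / var for _, var in available]
    total = sum(weights)
    position = sum(w * pos for w, (pos, _) in zip(weights, available)) / total

    # For independent estimates, the fused variance is never larger than
    # the smallest input variance.
    return position, 1.0 / total


if __name__ == "__main__":
    print(fuse_estimates((10.0, 4.0), (12.0, 1.0)))  # both modalities present
    print(fuse_estimates(None, (12.0, 1.0)))         # audio missing
```

Under these assumptions the fused variance is never larger than that of the better single estimate, which matches the accuracy claim, and the fallback to the surviving modality matches the robustness claim.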

Paper Details

Date Published: 29 December 1997
PDF: 8 pages
Proc. SPIE 3310, Multimedia Computing and Networking 1998, (29 December 1997); doi: 10.1117/12.298421
Gopal Sarma Pingali, Lucent Technologies/Bell Labs. (United States)

Published in SPIE Proceedings Vol. 3310:
Multimedia Computing and Networking 1998
Kevin Jeffay; Dilip D. Kandlur; Timothy Roscoe, Editor(s)
