
Proceedings Paper

Multiscale audio-video analysis and processing: segmentations and arrangements

Paper Abstract

We propose a multi-scale, multi-modal analysis and processing scheme for audio-video data. Using a non-linear scale-space technique, audio-video data are analyzed and processed so that the result is invariant under various imaging and hearing conditions. This scale-space technique suppresses degradations due to Lyapunov and structural instabilities without destroying essential semantic relations. On the basis of an audio-video segmentation, the arrangements of its segments are quantified in terms of spatio-temporal inclusion relations and dynamic ordering relations by means of scaling connectivity relations. These relations induce a topological structure on top of the audio-video scale-space, which in turn induces a unimodal and multi-modal semantics. Our scheme is illustrated separately for video, audio, and audio-video material, the last of which points out the added value of integrating audio and video.

Paper Details

Date Published: 20 July 2001
PDF: 12 pages
Proc. SPIE 4519, Internet Multimedia Management Systems II, (20 July 2001); doi: 10.1117/12.434277
Author Affiliations
Raango Aldershoff, Telematica Instituut (Netherlands)
Alfons H. Salden, Telematica Instituut (Netherlands)

Published in SPIE Proceedings Vol. 4519:
Internet Multimedia Management Systems II
John R. Smith; Sethuraman Panchanathan; C.-C. Jay Kuo; Chinh Le, Editor(s)

© SPIE