Proceedings Paper

Survey of compressed domain audio features and their expressiveness
Authors: Silvia Pfeiffer; Thomas Vincent

Paper Abstract

We give an overview of existing approaches to audio analysis in the compressed domain and incorporate them into a coherent formal structure. After examining the kinds of information accessible in an MPEG-1 compressed audio stream, we describe a consistent approach to deriving features from that information and report on a number of applications these features enable. Most of these applications aim to create an index to the audio stream by segmenting it into temporally coherent regions, which may then be classified into pre-specified types of sound such as music, speech, speakers, animal sounds, sound effects, or silence. Other applications centre on recognition tasks such as gender, beat, or speech recognition.

Paper Details

Date Published: 10 January 2003
PDF: 15 pages
Proc. SPIE 5021, Storage and Retrieval for Media Databases 2003 (10 January 2003); doi: 10.1117/12.476300
Silvia Pfeiffer, CSIRO (Australia)
Thomas Vincent, CSIRO (France)

Published in SPIE Proceedings Vol. 5021:
Storage and Retrieval for Media Databases 2003
Minerva M. Yeung; Rainer W. Lienhart; Chung-Sheng Li, Editors

© SPIE