Share Email Print

Proceedings Paper

Content-based classification and retrieval of audio
Format Member Price Non-Member Price
PDF $17.00 $21.00

Paper Abstract

An on-line audio classification and segmentation system is presented in this research, where audio recordings are classified and segmented into speech, music, several types of environmental sounds and silence based on audio content analysis. This is the first step of our continuing work towards a general content-based audio classification and retrieval system. The extracted audio features include temporal curves of the energy function,the average zero- crossing rate, the fundamental frequency of audio signals, as well as statistical and morphological features of these curves. The classification result is achieved through a threshold-based heuristic procedure. The audio database that we have built, details of feature extraction, classification and segmentation procedures, and experimental results are described. It is shown that, with the proposed new system, audio recordings can be automatically segmented and classified into basic types in real time with an accuracy of over 90 percent. Outlines of further classification of audio into finer types and a query-by-example audio retrieval system on top of the coarse classification are also introduced.

Paper Details

Date Published: 2 October 1998
PDF: 12 pages
Proc. SPIE 3461, Advanced Signal Processing Algorithms, Architectures, and Implementations VIII, (2 October 1998); doi: 10.1117/12.325703
Show Author Affiliations
Tong Zhang, Univ. of Southern California (United States)
C.-C. Jay Kuo, Univ. of Southern California (United States)

Published in SPIE Proceedings Vol. 3461:
Advanced Signal Processing Algorithms, Architectures, and Implementations VIII
Franklin T. Luk, Editor(s)

© SPIE. Terms of Use
Back to Top
Sign in to read the full article
Create a free SPIE account to get access to
premium articles and original research
Forgot your username?