
Proceedings Paper

The speech scale, the Mel scale, and the tube model for speech
Author(s): Srinivasan Umesh; Leon Cohen; Douglas J. Nelson

Paper Abstract

We use the tube model of speech production to study the speech-hearing connection. Recently, using real speech, we showed that sounds made by different individuals and perceived to be the same can be transformed into each other by a universal warping function. We call this transformation function the speech scale, and we have shown that it is similar to the Mel scale, thus experimentally establishing the speech-hearing connection. In this paper we explore the possible origins of the speech scale and attempt to understand it from the point of view of the tube model of speech. We use the two-tube model for various vowels and study the effect of varying the lengths of the tubes on the location of the formant frequencies. We show that, under the common assumption that the length of the front tube does not change significantly compared with that of the back tube for different individuals enunciating the same sound, the corresponding formant frequencies are non-uniformly scaled. Using the same method we used for real speech, we compute the warping function.
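
As a rough illustration of the two-tube argument in the abstract, the sketch below finds the resonances of a lossless two-tube model (back tube closed at the glottis, front tube open at the lips) from the textbook condition tan(k l_b) tan(k l_f) = A_f / A_b, and compares two hypothetical speakers whose back-tube lengths differ while the front tube is held fixed. The tube lengths, areas, speed of sound, and function names are all assumptions chosen for illustration, not values or code from the paper, and the paper's own procedure for computing the warping function is not reproduced here.

```python
import math

SPEED_OF_SOUND = 35000.0  # cm/s; an assumed round value, not from the paper


def resonance_condition(freq_hz, back_len, back_area, front_len, front_area):
    """Return a quantity that is zero at the resonances of a lossless two-tube model.

    This is tan(k*l_b) * tan(k*l_f) = A_f / A_b multiplied through by the
    cosines, so the function has no poles and is safe for root bracketing.
    """
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    return (front_area * math.cos(k * back_len) * math.cos(k * front_len)
            - back_area * math.sin(k * back_len) * math.sin(k * front_len))


def formants(back_len, back_area, front_len, front_area, f_max=4000.0, step=1.0):
    """Locate resonances below f_max by a grid scan followed by bisection."""
    args = (back_len, back_area, front_len, front_area)
    roots = []
    lo = step
    g_lo = resonance_condition(lo, *args)
    while lo < f_max:
        hi = lo + step
        g_hi = resonance_condition(hi, *args)
        if g_lo == 0.0:
            roots.append(lo)  # grid point landed exactly on a root
        elif g_lo * g_hi < 0.0:
            a, b, g_a = lo, hi, g_lo
            for _ in range(60):  # bisection within the bracketing interval
                mid = 0.5 * (a + b)
                g_mid = resonance_condition(mid, *args)
                if g_a * g_mid <= 0.0:
                    b = mid
                else:
                    a, g_a = mid, g_mid
            roots.append(0.5 * (a + b))
        lo, g_lo = hi, g_hi
    return roots


# Hypothetical /a/-like shape: narrow back tube, wide front tube (cm, cm^2).
reference = formants(back_len=9.0, back_area=1.0, front_len=8.0, front_area=7.0)
# Second hypothetical speaker: shorter back tube, same front tube.
shorter = formants(back_len=7.5, back_area=1.0, front_len=8.0, front_area=7.0)

for n, (f_ref, f_short) in enumerate(zip(reference[:3], shorter[:3]), start=1):
    print(f"F{n}: {f_ref:7.1f} Hz -> {f_short:7.1f} Hz, ratio {f_short / f_ref:.3f}")
```

If the two speakers differed by a uniform scaling of the whole tract, every formant would move by the same ratio. Printing the per-formant ratios for a fixed front tube gives a quick check of whether the scaling is uniform across formants in this toy setting.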

Paper Details

Date Published: 6 December 2002
PDF: 17 pages
Proc. SPIE 4791, Advanced Signal Processing Algorithms, Architectures, and Implementations XII, (6 December 2002); doi: 10.1117/12.456493
Srinivasan Umesh, Indian Institute of Technology Kanpur (India)
Leon Cohen, CUNY/Hunter College (United States)
Douglas J. Nelson, U.S. Dept. of Defense (United States)

Published in SPIE Proceedings Vol. 4791:
Advanced Signal Processing Algorithms, Architectures, and Implementations XII
Franklin T. Luk, Editor
