
Proceedings Paper

Warping functions in speech
Author(s): S. Umesh; Leon Cohen; Douglas J. Nelson

Paper Abstract

We describe experiments that address the relation between the same enunciations by different speakers. Our previous work indicated that frequencies are approximately scaled uniformly. In this paper we report results addressing possible corrections to uniform scaling. Our results show that the scaling is nonuniform; that is, the formant frequencies of different speakers scale differently at different frequencies. We discuss how this leads to the mathematical issue of separating the spectrum into speaker-dependent and speaker-independent parts. We introduce the concept of a universal scaling function that is aimed at achieving this separation. The fundamental idea is to find a frequency axis transformation (warping function) that transforms the energy density spectrum (the squared absolute value of the Fourier transform of the enunciation) in such a way that a further Fourier transform of the resulting function achieves this separation. We discuss this procedure and relate it to the scale transform. Using real speech data, we obtain such a transformation function. The resulting function is very similar to the Mel scale, which has previously been obtained only from psychoacoustic (hearing-based) experiments. That similar scales are obtained from both hearing and speech production (as reported here) is fundamental to the understanding of speech and hearing.
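The separation idea can be illustrated for the simpler uniform-scaling case that the abstract attributes to the earlier work. The sketch below is not the authors' procedure and does not reproduce the data-derived warping function. It assumes a synthetic spectral envelope (`envelope`, with made-up formant-like peaks), an assumed scale factor `alpha` between two speakers, and a logarithmic warping of the frequency axis. Under that warping, a uniform scaling of frequency becomes a shift, so the magnitude of a further Fourier transform is (numerically) the same for both speakers, while the speaker-dependent part is confined to the phase. On an unwarped linear axis the magnitudes differ.

```python
import numpy as np

def envelope(f):
    """Hypothetical smooth spectral envelope with three formant-like peaks (Hz)."""
    centers = np.array([500.0, 1500.0, 2500.0])   # illustrative values only
    widths = np.array([80.0, 120.0, 160.0])       # illustrative values only
    f = np.asarray(f, dtype=float)[..., None]
    return np.exp(-0.5 * ((f - centers) / widths) ** 2).sum(axis=-1)

def relative_magnitude_gap(spec_a, spec_b):
    """Largest difference between the two Fourier magnitudes, relative to the peak."""
    mag_a = np.abs(np.fft.rfft(spec_a))
    mag_b = np.abs(np.fft.rfft(spec_b))
    return np.max(np.abs(mag_a - mag_b)) / np.max(mag_a)

alpha = 1.2  # assumed uniform scale factor: S_B(f) = S_A(alpha * f)

# Warped axis: uniform samples in nu = log(f), so scaling f by alpha shifts nu.
nu = np.linspace(np.log(50.0), np.log(20000.0), 4096)
f_warped = np.exp(nu)
rel_err_log = relative_magnitude_gap(envelope(f_warped), envelope(alpha * f_warped))

# Unwarped axis: uniform samples in f, where scaling is a dilation, not a shift.
f_linear = np.linspace(0.0, 8000.0, 4096)
rel_err_lin = relative_magnitude_gap(envelope(f_linear), envelope(alpha * f_linear))

print(f"log-warped axis, relative magnitude gap: {rel_err_log:.2e}")
print(f"linear axis,     relative magnitude gap: {rel_err_lin:.2e}")
```

The logarithmic warping works here only because the scaling is assumed uniform. When formants scale differently at different frequencies, as the abstract reports, a different warping function is needed so that the speaker dependence again reduces to a shift. That is the role of the universal warping function estimated from speech data in the paper.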

Paper Details

Date Published: 19 October 1998
PDF: 16 pages
Proc. SPIE 3458, Wavelet Applications in Signal and Imaging Processing VI, (19 October 1998); doi: 10.1117/12.328137
S. Umesh, Indian Institute of Technology (India)
Leon Cohen, CUNY/Hunter College (United States)
Douglas J. Nelson, U.S. Dept. of Defense (United States)


Published in SPIE Proceedings Vol. 3458:
Wavelet Applications in Signal and Imaging Processing VI
Andrew F. Laine; Michael A. Unser; Akram Aldroubi, Editor(s)
