
Proceedings Paper

Combining LVQ with continuous-density hidden Markov models in speech recognition
Author(s): Mikko Kurimo; Kari Torkkola

Paper Abstract

We propose the use of self-organizing maps (SOMs) and learning vector quantization (LVQ) as an initialization method for training continuous observation density hidden Markov models (CDHMMs). We apply CDHMMs to model phonemes in the transcription of speech into phoneme sequences. The Baum-Welch maximum likelihood estimation method is very sensitive to the initial parameter values when the observation densities are represented by mixtures of many Gaussian density functions. We suggest training CDHMMs in two phases. First, vector quantization methods are applied to place the means of the Gaussian density functions so that they represent the observed training data. Maximum likelihood estimation is then used to find the mixture weights and state transition probabilities and to re-estimate the Gaussians to get the best possible models. Initializing the means of the distributions with SOMs or LVQ yields good recognition results with substantially fewer Baum-Welch iterations than are needed with random initial values. The number of iterations of the segmental K-means algorithm can likewise be reduced considerably with a suitable initialization. Furthermore, we experiment with enhancing the discriminatory power of the phoneme models by adaptively training the state output distributions using the LVQ algorithm.
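
The first phase described in the abstract, placing Gaussian means with a vector quantization method, can be illustrated with a minimal LVQ1 sketch. This is not the authors' implementation: the function names (`lvq1`, `init_codebook`, `make_toy_data`), the linearly decaying learning rate, the hyperparameter values, and the two-class 2-D toy data standing in for labeled speech feature vectors are all assumptions made for illustration.

```python
import numpy as np


def make_toy_data(seed=0):
    """Hypothetical stand-in for labeled feature vectors: two 2-D classes."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(loc=[-3.0, 0.0], scale=0.5, size=(200, 2))
    x1 = rng.normal(loc=[3.0, 0.0], scale=0.5, size=(200, 2))
    X = np.vstack([x0, x1])
    y = np.array([0] * 200 + [1] * 200)
    return X, y


def init_codebook(X, y, per_class, seed=0):
    """Pick `per_class` training samples of each class as starting codebook vectors."""
    rng = np.random.default_rng(seed)
    vectors, labels = [], []
    for cls in np.unique(y):
        Xc = X[y == cls]
        idx = rng.choice(len(Xc), size=per_class, replace=False)
        vectors.append(Xc[idx])
        labels.extend([cls] * per_class)
    return np.vstack(vectors), np.array(labels)


def lvq1(X, y, codebook, labels, lr=0.05, epochs=20, seed=0):
    """LVQ1: pull the nearest codebook vector toward a sample of its own
    class and push it away from a sample of a different class."""
    rng = np.random.default_rng(seed)
    m = codebook.astype(float).copy()
    n_steps = epochs * len(X)
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            alpha = lr * (1.0 - step / n_steps)  # assumed linear decay schedule
            nearest = np.argmin(np.sum((m - X[i]) ** 2, axis=1))
            sign = 1.0 if labels[nearest] == y[i] else -1.0
            m[nearest] += sign * alpha * (X[i] - m[nearest])
            step += 1
    return m


X, y = make_toy_data()
codebook, labels = init_codebook(X, y, per_class=2)
means = lvq1(X, y, codebook, labels)
```

In a two-phase scheme of the kind the abstract outlines, the rows of `means` belonging to one class would serve as the initial means of that class's Gaussian mixture, and a maximum likelihood procedure such as Baum-Welch would then estimate the mixture weights and transition probabilities and re-estimate the Gaussians.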

Paper Details

Date Published: 16 December 1992
PDF: 9 pages
Proc. SPIE 1766, Neural and Stochastic Methods in Image and Signal Processing, (16 December 1992); doi: 10.1117/12.130880
Mikko Kurimo, Helsinki Univ. of Technology (Finland)
Kari Torkkola, IDIAP (Switzerland)


Published in SPIE Proceedings Vol. 1766:
Neural and Stochastic Methods in Image and Signal Processing
Su-Shing Chen, Editor(s)
