Proceedings Paper

Combination of GMM-UBM and DTW for voice command authentication system
Author(s): Evelyn Kurniawati; Sasiraj Somarajan

Paper Abstract

In this paper, we present a combination of statistical and template-based pattern matching to solve the problem of authentication with very short command words. The same features are used in both methods to reduce the computational load. The first method uses GMM-UBM (Gaussian Mixture Model with Universal Background Model), which is well known in the speaker recognition field but lacks the ability to model the temporal aspect of speech. The second method remedies this with classical DTW (Dynamic Time Warping) on the cepstral features. Two schemes for combining the models are explored: first, a layered design in which the DTW distance is calculated only if GMM-UBM accepts the speaker, and second, weighting the DTW distance by the confidence of the GMM-UBM result. With these combinations, improvements in EER of 23% and 17% were observed, respectively, each with differing characteristics across the 3 error types investigated. The experiment was conducted on the evaluation set of Part 2 of the RSR2015 database, which contains short words meant for command-and-control tasks. Performance is analyzed using Detection Error Tradeoff (DET) curves and Equal Error Rate (EER).
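
The two combination schemes described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the weighting formula, the thresholds, or the distance normalisation, so the function names, the sigmoid mapping from GMM-UBM log-likelihood ratio to confidence, the `2 * (1 - confidence)` weighting, and the length normalisation below are all assumptions made purely for illustration.

```python
import math


def dtw_distance(a, b):
    """Classical DTW between two sequences of feature vectors.

    Each element of `a` and `b` is a list of floats (e.g. one cepstral frame).
    The result is divided by len(a) + len(b); this normalisation is an
    assumption, not something stated in the abstract.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m] / (n + m)


def layered_accept(llr, test, template, llr_threshold, dtw_threshold):
    """Layered scheme: run DTW only if GMM-UBM accepts the speaker.

    `llr` is a hypothetical GMM-UBM log-likelihood ratio computed elsewhere.
    """
    if llr < llr_threshold:
        return False  # rejected by GMM-UBM; DTW is never computed
    return dtw_distance(test, template) <= dtw_threshold


def weighted_accept(llr, test, template, dtw_threshold, scale=1.0):
    """Weighted scheme: scale the DTW distance by GMM-UBM confidence.

    The sigmoid confidence and the 2 * (1 - confidence) factor are
    hypothetical choices: a neutral score (llr = 0) leaves the distance
    unchanged, and higher confidence shrinks it.
    """
    confidence = 1.0 / (1.0 + math.exp(-scale * llr))
    weighted = dtw_distance(test, template) * 2.0 * (1.0 - confidence)
    return weighted <= dtw_threshold
```

In the layered sketch the DTW cost is paid only for utterances that pass the first stage, whereas the weighted sketch always computes both scores and lets a confident GMM-UBM result relax the DTW decision.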

Paper Details

Date Published: 17 April 2019
PDF: 9 pages
Proc. SPIE 11071, Tenth International Conference on Signal Processing Systems, 110710D (17 April 2019); doi: 10.1117/12.2520442
Evelyn Kurniawati, Merry Electronics, Pte. Ltd. (Singapore)
Sasiraj Somarajan, Merry Electronics, Pte. Ltd. (Singapore)


Published in SPIE Proceedings Vol. 11071:
Tenth International Conference on Signal Processing Systems
Kezhi Mao; Xudong Jiang, Editor(s)

© SPIE.