Proceedings Paper

I-vectors for image classification
Author(s): David C. Smith

Paper Abstract

Recent state-of-the-art work on speaker recognition and verification uses a simple factor analysis to derive a low-dimensional "total variability space" that simultaneously captures speaker and channel variability. This approach simplified earlier work that used joint factor analysis to model speaker and channel differences separately. Here we adapt this "i-vector" method to image classification by replacing speakers with image categories, voice cuts with images, and cepstral features with SURF local descriptors; the role of channel variability is attributed to differences in image backgrounds or lighting conditions. A Universal Gaussian mixture model (UGMM) is trained (unsupervised) on SURF descriptors extracted from a varied and extensive image corpus. Individual images are modeled by additively perturbing the supervector of stacked means of this UGMM by the product of a low-rank total variability matrix (TVM) and a normally distributed hidden random vector, X. The TVM is learned by applying an EM algorithm to maximize the sum of log-likelihoods of descriptors extracted from training images, where the likelihoods are computed with respect to the GMM obtained by perturbing the UGMM means via the TVM as above and leaving the UGMM covariances unchanged. Finally, the low-dimensional i-vector representation of an image is the expected value of the posterior distribution of X conditioned on the image's descriptors; it is computed via straightforward matrix manipulations involving the TVM and image-specific Baum-Welch statistics. We compare classification rates found with (i) i-vectors, (ii) PCA, (iii) Discriminant Attribute Projection (the last two trained on Gaussian MAP-adapted supervector image representations), and (iv) i-vector extraction in which the TVM is replaced with the matrix of dominant PCA eigenvectors.
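The final extraction step described in the abstract can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it assumes the formulation commonly used in the speaker-recognition literature, where the posterior mean of X is (I + T' S^-1 N T)^-1 T' S^-1 F, with T the TVM, S the UGMM covariances, N the zeroth-order Baum-Welch statistics, and F the first-order statistics centered on the UGMM means. The function names, the diagonal-covariance assumption, and the component-major supervector ordering are all choices made for this sketch.

```python
import numpy as np

def baum_welch_stats(X, weights, means, variances):
    """Zeroth-order and centered first-order statistics of descriptors X (n x d)
    under a diagonal-covariance GMM with C components (assumed UGMM form)."""
    diff = X[:, None, :] - means[None, :, :]                      # (n, C, d)
    # Log-density of every descriptor under every Gaussian component.
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_gauss                        # (n, C)
    log_post -= log_post.max(axis=1, keepdims=True)               # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                       # component posteriors
    N = post.sum(axis=0)                                          # (C,)
    F = post.T @ X - N[:, None] * means                           # (C, d), centered
    return N, F

def extract_ivector(N, F, T, variances):
    """Posterior mean of the hidden vector X given one image's statistics.
    T has shape (C*d, R); supervectors are stacked component by component."""
    C, d = F.shape
    R = T.shape[1]
    prec = (1.0 / variances).reshape(C * d)    # diagonal of S^-1
    N_exp = np.repeat(N, d)                    # N expanded to supervector length
    TtP = T.T * prec                           # T' S^-1, shape (R, C*d)
    L = np.eye(R) + (TtP * N_exp) @ T          # posterior precision of X
    return np.linalg.solve(L, TtP @ F.reshape(C * d))
```

In this sketch, `X` would hold the SURF descriptors of a single image (one row per descriptor), and the returned vector of length R is that image's low-dimensional representation. Solving the linear system instead of inverting the precision matrix explicitly is a numerical convenience, not something the abstract specifies.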

Paper Details

Date Published: 23 September 2014
PDF: 12 pages
Proc. SPIE 9217, Applications of Digital Image Processing XXXVII, 92170F (23 September 2014); doi: 10.1117/12.2060207
David C. Smith, U.S. Dept. of Defense (United States)

Published in SPIE Proceedings Vol. 9217:
Applications of Digital Image Processing XXXVII
Andrew G. Tescher, Editor(s)

© SPIE. Terms of Use