Proceedings Paper

Learning one-to-many mapping functions for audio-visual integrated perception
Author(s): Jung-Hui Lim; Do-Kwan Oh; Soo-Young Lee

Paper Abstract

In noisy environments, human speech perception utilizes visual lip-reading as well as audio phonetic classification. This audio-visual integration may be done by combining the two sensory features at an early stage. The top-down attention may also integrate the two modalities. For the sensory feature fusion we introduce mapping functions between the audio and visual manifolds. In particular, we present an algorithm that provides a one-to-many mapping function for the video-to-audio mapping. The top-down attention is also presented to integrate both the sensory features and the classification results of both modalities, which is able to explain the McGurk effect. Each classifier is separately implemented as a Hidden Markov Model (HMM), but the two classifiers are combined at the top level and interact via the top-down attention.
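To give a flavor of the top-level combination described in the abstract, the sketch below shows one simple way two separately trained per-modality classifiers could be fused: each modality contributes per-class HMM log-likelihood scores, and top-down attention is modeled as a gain on each modality before the scores are summed. This is an illustrative assumption, not the paper's formulation; the attention weights, class labels, and score values are made up for the example.

```python
import numpy as np

def fuse_scores(audio_loglik, video_loglik, alpha_audio=0.7, alpha_video=0.3):
    """Combine per-class log-likelihoods from an audio HMM classifier and a
    video (lip-reading) HMM classifier using illustrative attention gains."""
    return alpha_audio * audio_loglik + alpha_video * video_loglik

if __name__ == "__main__":
    classes = ["ba", "ga", "da"]                    # toy phoneme set
    audio_loglik = np.array([-10.2, -14.8, -12.5])  # hypothetical audio HMM scores
    video_loglik = np.array([-15.1, -9.7, -11.0])   # hypothetical video HMM scores

    fused = fuse_scores(audio_loglik, video_loglik)
    print("fused decision:", classes[int(np.argmax(fused))])
```

With conflicting audio and visual evidence (as in McGurk-style stimuli), shifting the attention gains toward one modality changes the fused decision, which is the kind of interaction the top-down attention mechanism is meant to capture.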

Paper Details

Date Published: 12 April 2010
PDF: 6 pages
Proc. SPIE 7703, Independent Component Analyses, Wavelets, Neural Networks, Biosystems, and Nanoengineering VIII, 77030E (12 April 2010); doi: 10.1117/12.855241
Author Affiliations:
Jung-Hui Lim, Korea Advanced Institute of Science and Technology (Korea, Republic of)
Do-Kwan Oh, Korea Advanced Institute of Science and Technology (Korea, Republic of)
Soo-Young Lee, Korea Advanced Institute of Science and Technology (Korea, Republic of)


Published in SPIE Proceedings Vol. 7703:
Independent Component Analyses, Wavelets, Neural Networks, Biosystems, and Nanoengineering VIII
Harold H. Szu; F. Jack Agee, Editor(s)

© SPIE.