
Proceedings Paper
Visual words for lip-reading
Paper Abstract
This paper investigates the automatic lip-reading problem and proposes a new approach to visual speech recognition
(VSR). The approach relies on the signature of the word itself, which is obtained from a hybrid feature extraction
method that combines geometric, appearance, and image-transform features. The proposed VSR approach is termed
"visual words".
The visual words approach consists of two main parts: 1) feature extraction/selection, and 2) visual speech feature
recognition. After localizing the face and lips, several visual features of the lips are extracted: the height and
width of the mouth; the mutual information and the quality measurement between the discrete wavelet transform (DWT)
of the current region of interest (ROI) and the DWT of the previous ROI; the ratio of vertical to horizontal features
taken from the DWT of the ROI; the ratio of vertical edges to horizontal edges of the ROI; the appearance of the
tongue; and the appearance of the teeth. Each spoken word is represented by 8 signals, one for each feature. These
signals preserve the dynamics of the spoken word, which carry a good portion of the information. The system is then
trained on these features using k-nearest neighbors (KNN) and dynamic time warping (DTW).
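
The recognition step can be illustrated with a minimal sketch of nearest-neighbor classification under a DTW distance. It is an illustration only, not the authors' implementation: the abstract does not state how the per-feature signals are combined, which local cost is used, or the value of k. Here the word distance is assumed to be the sum of the per-feature DTW distances, the local cost is the absolute difference, and the names dtw_distance, word_distance, and knn_classify are hypothetical.

```python
from collections import Counter


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local cost (assumed)
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def word_distance(word_a, word_b):
    """Sum of per-feature DTW distances (assumed combination rule).

    Each word is a list of feature signals, one sequence per feature.
    """
    return sum(dtw_distance(sa, sb) for sa, sb in zip(word_a, word_b))


def knn_classify(query, training_set, k=1):
    """Label a query word by majority vote among its k nearest training words.

    training_set is a list of (label, word) pairs.
    """
    ranked = sorted(training_set, key=lambda item: word_distance(query, item[1]))
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data with two feature signals per word instead of eight, for brevity.
train = [
    ("yes", [[1, 2, 3, 2, 1], [0, 0, 1, 0, 0]]),
    ("no", [[3, 3, 3, 3], [1, 1, 1, 1]]),
]
# The same "yes" shape spoken more slowly (one extra frame).
query = [[1, 2, 2, 3, 2, 1], [0, 0, 0, 1, 0, 0]]
predicted = knn_classify(query, train, k=1)
```

Because DTW aligns sequences of different lengths, the slower utterance in the toy query still matches the "yes" template exactly, which is the reason such a distance suits signals whose duration varies between utterances.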
This approach has been evaluated on a large database of different speakers and with large sets of experiments. The
evaluation demonstrated the effectiveness of the visual words approach and showed that VSR is a speaker-dependent problem.
Paper Details
Date Published: 28 April 2010
PDF: 12 pages
Proc. SPIE 7708, Mobile Multimedia/Image Processing, Security, and Applications 2010, 77080B (28 April 2010); doi: 10.1117/12.850635
Published in SPIE Proceedings Vol. 7708:
Mobile Multimedia/Image Processing, Security, and Applications 2010
Sos S. Agaian; Sabah A. Jassim, Editor(s)
Author Affiliations
Ahmad B. A. Hassanat, Univ. of Buckingham (United Kingdom)
Sabah Jassim, Univ. of Buckingham (United Kingdom)
© SPIE
