
Proceedings Paper

Complexity constrained rate-distortion optimization of sign language video using an objective intelligibility metric

Paper Abstract

Sign language users are eager for the freedom and convenience of video communication over cellular devices. Compression of sign language video in this setting poses unique challenges. The low bitrates available make encoding decisions extremely important, while the power constraints of the device limit the encoder complexity. The ultimate goal is to maximize the intelligibility of the conversation given the rate-constrained cellular channel and the power-constrained encoding device. This paper uses an objective measure of intelligibility, based on subjective testing with members of the Deaf community, for rate-distortion optimization of sign language video within the H.264 framework. Performance bounds are established by using the intelligibility metric in a Lagrangian cost function along with a trellis search to make optimal mode and quantizer decisions for each macroblock. The optimal QP values are analyzed, and the unique structure of sign language is exploited to reduce complexity by three orders of magnitude relative to the trellis search technique with no loss in rate-distortion performance. Further reductions in complexity are made by eliminating rarely occurring modes in the encoding process. Compared with a rate control algorithm designed for aesthetic distortion as measured by MSE, the low-complexity sign language optimization technique increases the measured intelligibility by up to 3.5 dB at fixed rates and reduces rate by as much as 60% at fixed levels of intelligibility.
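
The Lagrangian cost function mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard form J = D + λR, where D is a distortion value (here standing in for the paper's intelligibility-based measure), R is the rate in bits, and λ is the Lagrange multiplier. The `Candidate` class, the function names, and all numeric values are hypothetical and chosen only for illustration; they are not taken from the paper. The sketch also picks the cheapest option for a single macroblock in isolation, which is simpler than the trellis search the abstract describes and ignores dependencies between neighboring macroblocks.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    """One hypothetical (mode, QP) choice for a single macroblock."""
    mode: str
    qp: int
    distortion: float  # assumed stand-in for an intelligibility distortion
    rate_bits: float   # assumed bits needed to code the macroblock this way


def lagrangian_cost(c: Candidate, lam: float) -> float:
    """Standard Lagrangian rate-distortion cost J = D + lambda * R."""
    return c.distortion + lam * c.rate_bits


def best_candidate(candidates: Iterable[Candidate], lam: float) -> Candidate:
    """Return the candidate with the lowest Lagrangian cost.

    This treats the macroblock independently; it is not a trellis search.
    """
    return min(candidates, key=lambda c: lagrangian_cost(c, lam))


# Illustrative values only (not measurements from the paper).
CANDIDATES = [
    Candidate("SKIP", 30, distortion=9.0, rate_bits=1.0),
    Candidate("INTER16x16", 30, distortion=4.0, rate_bits=40.0),
    Candidate("INTRA4x4", 24, distortion=1.0, rate_bits=200.0),
]

if __name__ == "__main__":
    # A larger lambda penalizes rate more heavily, favoring cheaper modes.
    for lam in (0.01, 0.1, 1.0):
        choice = best_candidate(CANDIDATES, lam)
        print(f"lambda={lam}: {choice.mode} at QP {choice.qp}")
```

With these made-up numbers, a small λ favors the low-distortion, high-rate candidate, and increasing λ shifts the choice toward cheaper modes, which is the trade-off a rate-constrained encoder has to manage.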

Paper Details

Date Published: 28 January 2008
PDF: 10 pages
Proc. SPIE 6822, Visual Communications and Image Processing 2008, 682213 (28 January 2008); doi: 10.1117/12.768053
Frank M. Ciaramello, Cornell Univ. (United States)
Sheila S. Hemami, Cornell Univ. (United States)

Published in SPIE Proceedings Vol. 6822:
Visual Communications and Image Processing 2008
William A. Pearlman; John W. Woods; Ligang Lu, Editors

© SPIE.