
Proceedings Paper

Video and audio data integration for conferencing
Author(s): Thrasyvoulos N. Pappas; Raynard O. Hinds

Paper Abstract

In videoconferencing applications, the perceived quality of the video signal is affected by the presence of an audio signal (speech). To achieve high compression rates, video coders must compromise image quality in terms of spatial resolution, grayscale resolution, and frame rate, and may introduce various kinds of artifacts. We consider tradeoffs in grayscale resolution and frame rate, and use subjective evaluations to assess the perceived quality of the video signal in the presence of speech. In particular, we explore the importance of lip synchronization. In our experiment we used an original grayscale sequence at QCIF resolution, 30 frames/second, and 256 gray levels. We compared the 256-level sequence at different frame rates with a two-level version of the sequence at 30 frames/second. The viewing distance was 20 image heights, or roughly two feet from an SGI workstation. We used uncoded speech. To obtain the two-level sequence we used an adaptive clustering algorithm for segmentation of video sequences. The binary sketches it creates move smoothly and preserve the main characteristics of the face, so that it is easily recognizable. More importantly, the rendering of lip and eye movements is very accurate. The test results indicate that when the frame rate of the full grayscale sequence is low (less than 5 frames/second), most observers prefer the two-level sequence.
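The two test conditions described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: frames are plain nested lists of 8-bit values, a lower frame rate is simulated by repeating the last retained frame, and the two-level version is produced by a fixed global threshold. That threshold is only a stand-in and is not the adaptive clustering algorithm the authors used. The function names (`raw_bitrate`, `hold_frames`, `to_two_level`) are hypothetical, and the bit rates are uncompressed luma rates for the standard 176 x 144 QCIF frame, so they indicate nothing about coded bit rates.

```python
QCIF_WIDTH, QCIF_HEIGHT = 176, 144  # standard QCIF luma dimensions


def raw_bitrate(fps, gray_levels, width=QCIF_WIDTH, height=QCIF_HEIGHT):
    """Uncompressed luma bit rate in bits/second (illustrative only)."""
    bits_per_pixel = (gray_levels - 1).bit_length()  # 256 -> 8, 2 -> 1
    return width * height * bits_per_pixel * fps


def hold_frames(frames, source_fps, target_fps):
    """Simulate a lower frame rate by repeating the last retained frame.

    Assumption: source_fps is an integer multiple of target_fps.
    """
    if source_fps % target_fps != 0:
        raise ValueError("source_fps must be a multiple of target_fps")
    step = source_fps // target_fps
    return [frames[(i // step) * step] for i in range(len(frames))]


def to_two_level(frame, threshold=128):
    """Reduce one 8-bit frame to two levels with a fixed global threshold.

    Stand-in only: this is NOT the paper's adaptive clustering algorithm.
    """
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in frame]


if __name__ == "__main__":
    # Twelve tiny dummy "frames", each labelled by its index.
    frames = [[[i]] for i in range(12)]
    held = hold_frames(frames, source_fps=30, target_fps=5)
    print([f[0][0] for f in held])

    print(raw_bitrate(fps=30, gray_levels=256))
    print(raw_bitrate(fps=5, gray_levels=256))
    print(raw_bitrate(fps=30, gray_levels=2))
```

Under these assumptions, holding frames keeps the playback duration and audio alignment of the clip unchanged while reducing how often the picture updates, which is the property that matters when judging lip synchronization.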

Paper Details

Date Published: 20 April 1995
PDF: 8 pages
Proc. SPIE 2411, Human Vision, Visual Processing, and Digital Display VI, (20 April 1995); doi: 10.1117/12.207533
Thrasyvoulos N. Pappas, AT&T Bell Labs. (United States)
Raynard O. Hinds, Massachusetts Institute of Technology (United States)

Published in SPIE Proceedings Vol. 2411:
Human Vision, Visual Processing, and Digital Display VI
Bernice E. Rogowitz; Jan P. Allebach, Editor(s)
