
Proceedings Paper

Illumination-invariant video segmentation by hierarchical robust thresholding
Author(s): Jie Wei; Mark S. Drew; Ze-Nian Li

Paper Abstract

Many methods for video segmentation rely upon the setting and tuning of thresholds for classifying interframe distances under various difference measures. An approach that has been used with some success has been to establish statistical measures for each new video and identify camera cuts as difference values far from the mean. For this type of strategy the mean and dispersion for some interframe distance measure must be calculated for each new video as a whole. Here we eliminate this statistical characterization step and at the same time allow for segmentation of streaming video by introducing a preprocessing step for illumination-invariance that concomitantly reduces input values to a uniform scale. The preprocessing step provides a solution to the problem that simple changes of illumination in a scene, such as an actor emerging from a shadow, can trigger a false positive transition, no matter whether intensity alone or chrominance is used in a distance measure. Our means of discounting lighting change for color constancy consists of the simple yet effective operation of normalizing each color channel to length 1 (when viewed as a long, length-N vector). We then reduce the dimensionality of color to two-dimensional chromaticity, with values in [0, 1]. Chromaticity histograms can be treated as images, and effectively low-pass filtered by wavelet-based reduction, followed by DCT and zonal coding. This results in an indexing scheme based on only 36 numbers, and lends itself to a binary-search approach to transition detection. To this end we examine distributions for intra-clip and inter-clip distances separately, characterizing each using robust statistics, for temporal intervals from 32 frames to 1 frame by powers of 2. Then, combining transition and non-transition distributions for each frame interval, we seek the valley between them, again robustly, to set each threshold.
Using the present method, values of precision and recall are increased over those of previous methods. Moreover, illumination change produces very few false positives.
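The pipeline outlined in the abstract — per-channel unit normalization, reduction to 2-D chromaticity, a chromaticity histogram treated as an image, DCT with zonal coding down to 36 numbers, and a robust valley-seeking threshold — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 16-bin histogram size, and the median/MAD valley rule are assumptions, and the wavelet-based reduction is approximated by simply binning the histogram at low resolution.

```python
import numpy as np

def illumination_invariant_index(frame, hist_size=16, zone=8):
    """Reduce an RGB frame to a 36-number illumination-invariant index.

    Sketch of the pipeline described in the abstract; hist_size and the
    zonal-coding cutoff are assumed values, not taken from the paper.
    """
    pixels = np.asarray(frame, dtype=float).reshape(-1, 3)

    # 1. Color constancy: normalize each color channel (viewed as a long
    #    length-N vector) to unit length; uniform lighting gains cancel.
    norms = np.linalg.norm(pixels, axis=0)
    norms[norms == 0] = 1.0
    pixels = pixels / norms

    # 2. Reduce to 2-D chromaticity (r, g) = (R, G) / (R + G + B) in [0, 1].
    s = pixels.sum(axis=1, keepdims=True)
    s[s == 0] = 1.0
    chrom = pixels[:, :2] / s

    # 3. Chromaticity histogram treated as a small image. (The paper
    #    low-pass filters via wavelet-based reduction; binning directly
    #    at low resolution is a crude stand-in.)
    hist, _, _ = np.histogram2d(chrom[:, 0], chrom[:, 1],
                                bins=hist_size, range=[[0, 1], [0, 1]])
    hist /= hist.sum()

    # 4. 2-D DCT-II (unnormalized), via an explicit cosine matrix.
    n = np.arange(hist_size)
    C = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * hist_size))
    coeffs = C @ hist @ C.T

    # 5. Zonal coding: keep the lowest-frequency coefficients with
    #    u + v < zone; for zone = 8 that is 8 + 7 + ... + 1 = 36 numbers.
    u, v = np.meshgrid(n, n, indexing="ij")
    return coeffs[(u + v) < zone]

def robust_valley_threshold(intra, inter, k=2.5):
    """Hypothetical valley rule: midpoint between a robust upper fence on
    intra-clip distances and a robust lower fence on inter-clip
    (transition) distances, using the median and the MAD."""
    def fence(x, sign):
        med = np.median(x)
        mad = 1.4826 * np.median(np.abs(x - med))  # MAD scaled to sigma
        return med + sign * k * mad
    return 0.5 * (fence(np.asarray(intra), 1) + fence(np.asarray(inter), -1))
```

Because step 1 divides out any uniform per-channel gain, a frame and the same frame under, say, halved illumination map to the same 36-number index, so an interframe distance such as `np.linalg.norm(index_a - index_b)` stays near zero across pure lighting changes.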

Paper Details

Date Published: 23 December 1997
PDF: 14 pages
Proc. SPIE 3312, Storage and Retrieval for Image and Video Databases VI, (23 December 1997); doi: 10.1117/12.298442
Author Affiliations:
Jie Wei, Simon Fraser Univ. (Canada)
Mark S. Drew, Simon Fraser Univ. (Canada)
Ze-Nian Li, Simon Fraser Univ. (Canada)

Published in SPIE Proceedings Vol. 3312:
Storage and Retrieval for Image and Video Databases VI
Ishwar K. Sethi; Ramesh C. Jain, Editor(s)

© SPIE.