
Proceedings Paper

3D shape inferencing and modeling for semantic video retrieval
Author(s): Zhibin Lei; Yun-Tin Lin

Paper Abstract

In this paper we present a geometry-based indexing method for the semantic retrieval of large video databases. It combines two separate modules, 3D object shape inferencing from a video sequence and geometric modeling from the reconstructed shape, to achieve better performance. First, a motion-based segmentation algorithm employing feature block tracking and hierarchical principal component split is used for multi-moving-object motion classification and segmentation. After segmentation, the feature blocks for an individual moving scene or object can be used to reconstruct the 3D motion and shape structure of that scene or object by a factorization method. We assume the object is rigid and relatively far away from the camera, so that perspective distortion can be ignored. The estimated shape structure and motion parameters are then used to generate the implicit polynomial (IP) representation for the object. The system starts with a very coarse representation of the 3D shape. As more frames become available from the video stream and are properly segmented and classified, the IP representation is updated by varying the coefficients of the implicit polynomial to minimize the estimation error. This process stops when enough information has been obtained to generate a reliable IP shape representation, or when the video stream runs out. The semantic retrieval of the video databases is achieved by using the geometric structure of the objects and their spatial relationships. We generalize the 2D string concept to 3D to compactly encode the spatial relationships among objects. The algebraic invariants of the implicit polynomial are used as the geometric feature vector for the object. A similarity value can be computed for two sets of objects or two video sequences to allow fast retrieval of video databases.
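
The coarse-to-fine coefficient update described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm: it assumes the implicit polynomial is restricted to a sphere, x^2 + y^2 + z^2 + a*x + b*y + c*z + d = 0, so that the coefficients (a, b, c, d) follow from linear least squares, and it assumes the reconstructed 3D points arrive in batches as frames are processed. The names `IncrementalSphereFit`, `solve_linear`, and `sphere_points` are hypothetical and exist only for this illustration.

```python
import math


def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: points do not constrain the fit yet")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = aug[i][n] - sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = acc / aug[i][i]
    return solution


class IncrementalSphereFit:
    """Least-squares fit of x^2 + y^2 + z^2 + a*x + b*y + c*z + d = 0.

    Assumption for illustration: a sphere stands in for a general implicit
    polynomial. The normal equations are accumulated, so each new batch of
    points refines the coefficients without revisiting earlier points.
    """

    def __init__(self):
        self.ata = [[0.0] * 4 for _ in range(4)]  # running sum of A^T A
        self.atb = [0.0] * 4                      # running sum of A^T b

    def add_points(self, points):
        for x, y, z in points:
            row = (x, y, z, 1.0)
            target = -(x * x + y * y + z * z)
            for i in range(4):
                self.atb[i] += row[i] * target
                for j in range(4):
                    self.ata[i][j] += row[i] * row[j]

    def coefficients(self):
        """Return [a, b, c, d] minimizing the summed squared algebraic error."""
        return solve_linear(self.ata, self.atb)


def sphere_points(center, radius, thetas, phis):
    """Synthetic test data: points on a sphere at the given polar/azimuth angles."""
    cx, cy, cz = center
    return [
        (cx + radius * math.sin(t) * math.cos(p),
         cy + radius * math.sin(t) * math.sin(p),
         cz + radius * math.cos(t))
        for t in thetas for p in phis
    ]


if __name__ == "__main__":
    fit = IncrementalSphereFit()
    # First "frames": a handful of points gives a coarse estimate.
    fit.add_points(sphere_points((1.0, 2.0, 3.0), 2.0, [0.5, 1.0], [0.0, 1.0, 2.0]))
    print("after batch 1:", fit.coefficients())
    # Later "frames": more points refine the same coefficients.
    fit.add_points(sphere_points((1.0, 2.0, 3.0), 2.0, [1.5, 2.5], [3.0, 4.0, 5.0]))
    print("after batch 2:", fit.coefficients())
```

For a general implicit polynomial the same accumulate-and-re-solve pattern would need a different normalization of the coefficient vector, since fixing the quadratic terms as done here only works for spheres.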

Paper Details

Date Published: 1 November 1996
PDF: 12 pages
Proc. SPIE 2916, Multimedia Storage and Archiving Systems (1 November 1996); doi: 10.1117/12.257292
Zhibin Lei, Brown Univ. (United States)
Yun-Tin Lin, Princeton Univ. (United States)


Published in SPIE Proceedings Vol. 2916:
Multimedia Storage and Archiving Systems
C.-C. Jay Kuo, Editor(s)
