
Proceedings Paper

Influence of data set splitting method on similarity indexing performance
Author(s): Xuesheng Bai; Guang-you Xu; Yuanchun Shi; Shi-Qiang Yang

Paper Abstract

Similarity indexing is the supporting technology for fast content-based retrieval from large media databases, and many similarity index structures have been proposed. Compared with the number of structures available, less attention has been paid to evaluating the performance of index structures and to theoretical analysis of the factors that influence index performance. In this paper, we address part of this problem and focus on analyzing the influence of data splitting methods. To give a formal definition of index structure performance evaluation, we introduce the concept of query distribution probability and propose using the average search cost to evaluate the performance of a similarity indexing structure. We consider the simplest case of similarity indexing, nearest-neighbor search, and derive an expression for the average search cost function. Based on an analysis of this expression, we propose some criteria that may be useful in index design and implementation. We then extend these conclusions to the general similarity indexing case and use the criteria as general rules for index design and implementation. The underlying ideas and analysis are presented in detail, along with experimental results.
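The idea of an average search cost taken over a query distribution can be illustrated with a minimal sketch. It is not the expression derived in the paper, which the abstract does not reproduce. It assumes one-dimensional data, measures cost as the number of buckets a nearest-neighbor search has to read, and estimates the average by sampling queries. The function names (`split_equal_width`, `split_equal_count`, `search_cost`, `average_search_cost`) and the two splitting methods are hypothetical, chosen only for illustration.

```python
import random


def split_equal_width(points, k):
    """Assumed splitting method 1: k buckets of equal width over the data range."""
    lo, hi = min(points), max(points)
    width = (hi - lo) / k or 1.0  # fall back to 1.0 if all points coincide
    buckets = [[] for _ in range(k)]
    for p in points:
        i = min(int((p - lo) / width), k - 1)
        buckets[i].append(p)
    return [b for b in buckets if b]  # drop empty buckets


def split_equal_count(points, k):
    """Assumed splitting method 2: buckets holding roughly equal numbers of points."""
    pts = sorted(points)
    size = -(-len(pts) // k)  # ceiling division
    return [pts[i:i + size] for i in range(0, len(pts), size)]


def search_cost(buckets, q):
    """Number of buckets an exact nearest-neighbor search for q has to read."""
    def mindist(b):
        # Distance from q to the bucket's bounding interval (0 if q is inside).
        return max(min(b) - q, 0.0, q - max(b))

    best = float("inf")
    visited = 0
    # Visit buckets nearest-first; stop once a bucket cannot hold a closer point.
    for b in sorted(buckets, key=mindist):
        if mindist(b) > best:
            break
        visited += 1
        best = min(best, min(abs(p - q) for p in b))
    return visited


def average_search_cost(buckets, queries):
    """Monte Carlo estimate of the expected cost under the query distribution."""
    return sum(search_cost(buckets, q) for q in queries) / len(queries)


if __name__ == "__main__":
    rng = random.Random(0)
    # Skewed data; queries are assumed to follow the same distribution.
    data = [rng.expovariate(1.0) for _ in range(2000)]
    queries = [rng.expovariate(1.0) for _ in range(500)]
    for name, split in [("equal width", split_equal_width),
                        ("equal count", split_equal_count)]:
        cost = average_search_cost(split(data, 16), queries)
        print(f"{name}: average buckets read = {cost:.2f}")
```

Changing the distribution the queries are drawn from, while keeping the data fixed, shows how the same split can be cheap for one query workload and expensive for another, which is the reason for making the query distribution explicit in the cost definition.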

Paper Details

Date Published: 23 December 1999
PDF: 8 pages
Proc. SPIE 3972, Storage and Retrieval for Media Databases 2000, (23 December 1999); doi: 10.1117/12.373594
Xuesheng Bai, Tsinghua Univ. (China)
Guang-you Xu, Tsinghua Univ. (China)
Yuanchun Shi, Tsinghua Univ. (China)
Shi-Qiang Yang, Tsinghua Univ. (China)

Published in SPIE Proceedings Vol. 3972:
Storage and Retrieval for Media Databases 2000
Minerva M. Yeung; Boon-Lock Yeo; Charles A. Bouman, Editor(s)
