
Proceedings Paper

Probabilistic analysis of the RNN-CLINK clustering algorithm
Author(s): Sheau-Dong Lang; Li-Jen Mao; Wen-Lin Hsu

Paper Abstract

Clustering is among the oldest techniques used in data mining applications. Typical implementations of hierarchical agglomerative clustering methods (HACM) require O(N^2) space for N data objects, which makes them impractical for problems involving large datasets. The well-known clustering algorithm RNN-CLINK requires only O(N) space but O(N^3) time in the worst case, although its average time appears to be O(N^2 log N). We provide a probabilistic interpretation of the average time complexity of the algorithm. We also report experimental results, using randomly generated bit vectors and NETNEWS articles as input, to support our theoretical analysis.
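The space/time trade-off described in the abstract can be illustrated with a minimal sketch. It assumes that "RNN" refers to reciprocal nearest neighbors and that "CLINK" refers to complete-link clustering; it is not the authors' implementation, whose details are not given on this page. The function names (`rnn_clink`, `complete_link`, `hamming`), the tie-breaking rule, and the sample bit vectors are all illustrative assumptions. The sketch keeps no distance matrix: every cluster-to-cluster distance is recomputed from the raw points on demand, so memory stays linear in N while the running time grows.

```python
from itertools import product


def hamming(a, b):
    """Hamming distance between two equal-length bit vectors."""
    return sum(x != y for x, y in zip(a, b))


def complete_link(points, members_a, members_b, dist):
    """Complete-link distance: the largest pairwise distance between members.

    Recomputed from the raw points each time, so no O(N^2) matrix is stored.
    """
    return max(dist(points[i], points[j]) for i, j in product(members_a, members_b))


def rnn_clink(points, dist=hamming):
    """Merge clusters whenever two of them are reciprocal nearest neighbors.

    Returns a list of (cluster_id_a, cluster_id_b, distance) merges. Original
    points have ids 0..N-1; each merge creates the next unused id.
    """
    clusters = {i: [i] for i in range(len(points))}  # id -> member indices, O(N) total
    chain = []      # nearest-neighbor chain, at most N entries
    merges = []
    next_id = len(points)

    while len(clusters) > 1:
        if not chain:
            chain.append(min(clusters))  # arbitrary but deterministic start
        top = chain[-1]
        prev = chain[-2] if len(chain) > 1 else None

        # Linear scan for the nearest neighbor of the cluster on top of the chain.
        # Ties are broken in favor of `prev` so that the chain cannot cycle.
        best, best_d = None, None
        for cid, members in clusters.items():
            if cid == top:
                continue
            d = complete_link(points, clusters[top], members, dist)
            if best_d is None or d < best_d or (d == best_d and cid == prev):
                best, best_d = cid, d

        if best == prev:
            # `top` and `prev` are each other's nearest neighbor: merge them.
            chain.pop()
            chain.pop()
            merged = clusters.pop(prev) + clusters.pop(top)
            clusters[next_id] = merged
            merges.append((min(top, prev), max(top, prev), best_d))
            next_id += 1
        else:
            chain.append(best)  # keep following nearest neighbors

    return merges


# Illustrative input: four made-up 4-bit vectors forming two tight pairs.
points = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1), (1, 1, 1, 0)]
merges = rnn_clink(points)
print(merges)
```

Each nearest-neighbor search in this sketch rescans all current clusters and re-evaluates pairwise distances between their members, which is where the extra time goes in exchange for avoiding the stored distance matrix.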

Paper Details

Date Published: 25 February 1999
PDF: 8 pages
Proc. SPIE 3695, Data Mining and Knowledge Discovery: Theory, Tools, and Technology, (25 February 1999); doi: 10.1117/12.339988
Author Affiliations
Sheau-Dong Lang, Univ. of Central Florida (United States)
Li-Jen Mao, Univ. of Central Florida (United States)
Wen-Lin Hsu, Univ. of Central Florida (United States)

Published in SPIE Proceedings Vol. 3695:
Data Mining and Knowledge Discovery: Theory, Tools, and Technology
Belur V. Dasarathy, Editor

© SPIE