
Proceedings Paper

Probabilistic analysis of the RNN-CLINK clustering algorithm
Author(s): Sheau-Dong Lang; Li-Jen Mao; Wen-Lin Hsu

Paper Abstract

Clustering is among the oldest techniques used in data mining applications. Typical implementations of hierarchical agglomerative clustering methods (HACM) require O(N^2) space for N data objects, making such algorithms impractical for problems involving large datasets. The well-known clustering algorithm RNN-CLINK requires only O(N) space but O(N^3) time in the worst case, although the average time appears to be O(N^2 log N). We provide a probabilistic interpretation of the average time complexity of the algorithm. We also report experimental results, using randomly generated bit vectors and NETNEWS articles as input, to support our theoretical analysis.
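
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea usually associated with the name: a reciprocal-nearest-neighbor (RNN) chain combined with complete-link (CLINK) cluster distances, recomputed from the raw points rather than read from an N-by-N distance matrix. It is not the authors' implementation. The function names (`hamming`, `complete_link`, `rnn_clink`), the Hamming metric on bit vectors, and the tie-breaking rule are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Point = Sequence[int]


def hamming(a: Point, b: Point) -> int:
    """Number of positions at which two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def complete_link(points: Sequence[Point], a: List[int], b: List[int],
                  dist: Callable[[Point, Point], float]) -> float:
    """Complete-link distance: the largest pairwise distance between members.

    Recomputed on demand, so no distance matrix is ever stored.
    """
    return max(dist(points[i], points[j]) for i in a for j in b)


def rnn_clink(points: Sequence[Point],
              dist: Callable[[Point, Point], float] = hamming
              ) -> List[Tuple[int, int, float]]:
    """Return the merges as (cluster_id, cluster_id, distance) triples.

    Extra storage is O(N): member lists, the chain, and the merge list.
    """
    clusters: Dict[int, List[int]] = {i: [i] for i in range(len(points))}
    merges: List[Tuple[int, int, float]] = []
    chain: List[int] = []          # nearest-neighbor chain of cluster ids
    next_id = len(points)          # ids for newly formed clusters

    while len(clusters) > 1:
        if not chain:
            chain.append(min(clusters))   # arbitrary restart point
        top = chain[-1]
        prev: Optional[int] = chain[-2] if len(chain) > 1 else None

        # Find the nearest neighbor of the cluster at the end of the chain.
        best, best_d = None, None
        for cid, members in clusters.items():
            if cid == top:
                continue
            d = complete_link(points, clusters[top], members, dist)
            # On ties, prefer the chain predecessor so the chain cannot cycle.
            if best is None or d < best_d or (d == best_d and cid == prev):
                best, best_d = cid, d

        if best == prev:
            # top and prev are reciprocal nearest neighbors: merge them.
            chain.pop()
            chain.pop()
            clusters[next_id] = clusters.pop(top) + clusters.pop(prev)
            merges.append((top, prev, best_d))
            next_id += 1
        else:
            chain.append(best)     # keep extending the chain

    return merges
```

For example, calling `rnn_clink` on the four bit vectors `(0,0,0,0)`, `(0,0,0,1)`, `(1,1,1,1)`, `(1,1,1,0)` merges the two close pairs first, each at distance 1, and then joins the resulting clusters at complete-link distance 4. Because each nearest-neighbor search in this sketch recomputes cluster distances from the points, it trades time for space, which is consistent with the O(N) space and cubic worst-case time mentioned in the abstract.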

Paper Details

Date Published: 25 February 1999
PDF: 8 pages
Proc. SPIE 3695, Data Mining and Knowledge Discovery: Theory, Tools, and Technology, (25 February 1999); doi: 10.1117/12.339988
Sheau-Dong Lang, Univ. of Central Florida (United States)
Li-Jen Mao, Univ. of Central Florida (United States)
Wen-Lin Hsu, Univ. of Central Florida (United States)

Published in SPIE Proceedings Vol. 3695:
Data Mining and Knowledge Discovery: Theory, Tools, and Technology
Belur V. Dasarathy, Editor
