
Proceedings Paper

FP-tree approach for mining N-most interesting itemsets
Author(s): Yin Ling Cheung; Ada Wai Chee Fu

Paper Abstract

In classical association rule mining, a minimum support threshold is assumed to be available for mining frequent itemsets. However, setting such a threshold is typically hard: if it is set too high, nothing will be discovered, and if it is set too low, too many itemsets will be generated, which also makes mining inefficient. In this paper, we address a more practical problem: roughly speaking, mining the N k-itemsets with the highest support, for each k up to a certain kmax value. We call the results the N-most interesting itemsets. Generally, it is more straightforward for users to determine N and kmax than a support threshold. This approach also provides a solution to an open issue in the problem of subspace clustering. However, under this problem definition, which has no support threshold, the subset closure property used by the apriori-gen algorithm no longer holds. We propose three new algorithms, LOOPBACK, BOLB, and BOMO, for mining N-most interesting itemsets by variations of the FP-tree approach. Experiments show that all our methods outperform the previously proposed Itemset-Loop algorithm.
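
To make the problem statement concrete, the following is a minimal brute-force sketch in Python of what "the N k-itemsets with the highest support, for each k up to kmax" could mean. It is not LOOPBACK, BOLB, BOMO, or Itemset-Loop, and it does not build an FP-tree; it simply enumerates every k-itemset, which is only feasible for tiny inputs. The function name `n_most_interesting`, the toy transactions, and the lexicographic tie-breaking rule are assumptions made for illustration; the paper's exact handling of ties is not given in the abstract.

```python
from collections import Counter
from itertools import combinations


def n_most_interesting(transactions, n, kmax):
    """Return, for each k in 1..kmax, the n k-itemsets with highest support.

    Brute-force illustration only (hypothetical helper, not the paper's
    algorithm). Support is the number of transactions containing the itemset.
    Ties are broken lexicographically, which is an assumption.
    """
    result = {}
    for k in range(1, kmax + 1):
        counts = Counter()
        for t in transactions:
            # Each k-subset of a transaction contributes one to its support.
            for itemset in combinations(sorted(set(t)), k):
                counts[itemset] += 1
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        result[k] = ranked[:n]
    return result


# Toy data, invented for this example.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "b"],
    ["a", "c"],
    ["b", "d"],
]

top = n_most_interesting(transactions, n=2, kmax=2)
top_1_itemsets = {itemset for itemset, _ in top[1]}
top_2_itemsets = {itemset for itemset, _ in top[2]}
```

On this toy data, the 2-itemset (a, c) is among the top two 2-itemsets, yet its subset (c) is not among the top two 1-itemsets. This is a small instance of why candidate generation that relies on the subset closure property cannot be applied directly when no support threshold is given.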

Paper Details

Date Published: 12 March 2002
PDF: 12 pages
Proc. SPIE 4730, Data Mining and Knowledge Discovery: Theory, Tools, and Technology IV, (12 March 2002); doi: 10.1117/12.460253
Yin Ling Cheung, Chinese Univ. of Hong Kong (Hong Kong)
Ada Wai Chee Fu, Chinese Univ. of Hong Kong (Hong Kong)


Published in SPIE Proceedings Vol. 4730:
Data Mining and Knowledge Discovery: Theory, Tools, and Technology IV
Belur V. Dasarathy, Editor(s)
