Share Email Print
cover

Proceedings Paper

Using closed itemsets in association rule mining with taxonomies
Author(s): Petteri Sevon
Format Member Price Non-Member Price
PDF $14.40 $18.00

Paper Abstract

Taxonomies or item hierarchies are often useful in association rule mining. The most time consuming subtask in rule mining is the discovery of the frequent itemsets. A standard algorithm for frequent itemset discovery, such as Apriori, would however produce many redundant itemsets if taxonomies are used, because any two itemsets that share their most specific items always match the same set of transactions. An efficient algorithm must therefore chose a canonical representative for each equivalence class of itemsets. Srikant and Agrawal solved the problem of redundancy in their Cumulate and Stratify algorithms by not allowing an itemset to contain an item and its ancestor. In this paper I will present a new algorithm, Closed Sets, for finding frequent itemsets in the presence of taxonomies. The algorithm requires itemsets to be closed under the taxonomies. If an item is a member of a closed itemset, then all its ancestors also are. The algorithm makes fewer passes over the database than Stratify and it prunes the search space optimally. This is not the case with Cumulate, and not even with Stratify, if there are items in the taxonomy with multiple parents. Furthermore, only modest modifications to Apriori are needed.

Paper Details

Date Published: 6 April 2000
PDF: 8 pages
Proc. SPIE 4057, Data Mining and Knowledge Discovery: Theory, Tools, and Technology II, (6 April 2000); doi: 10.1117/12.381729
Show Author Affiliations
Petteri Sevon, Univ. of Helsinki (Finland)


Published in SPIE Proceedings Vol. 4057:
Data Mining and Knowledge Discovery: Theory, Tools, and Technology II
Belur V. Dasarathy, Editor(s)

© SPIE. Terms of Use
Back to Top