Proceedings Paper

Distributed data mining: an attribute-oriented key-preserving method
Author(s): Maybin Muyeba; John A. Keane

Paper Abstract

Data mining algorithms are constantly challenged by the need to process large data volumes efficiently. Attribute-Oriented Induction (AOI) is an inductive, set-oriented technique that mines large data sets by reducing the search space through attribute generalization and then forming summary rules. Most data mining techniques stop at producing rules for user analysis. The Key-Preserving method of AOI (AOI-KP) lets users efficiently query the data related to the learning task by using keys (attributes that index relations) to link the generated rules to relations in the database. However, an initial problem is that the whole data set must be loaded into memory on a single-memory machine, and as the input size increases, the preserved keys and the data itself use up memory. Further, to solve the file I/O bottleneck in writing preserved keys, concurrency mechanisms were used on a single cluster of a Windows NT machine, and improvements in execution time were obtained. One of the major solutions is to employ parallelism, i.e., to use a distributed-memory machine with explicit message passing. A Network of Workstations (NOW) offers attractive scalability in terms of computational power and memory availability. We analyze the performance of our algorithm on a NOW and compare speed-up and scalability; the results show significant improvements.
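
The idea of generalizing attributes while preserving keys can be illustrated with a minimal Python sketch. It is not the authors' implementation: the concept hierarchies, the sample rows, and the function names `generalize`, `aoi_kp`, and `merge_partitions` are all hypothetical, and the partition merge is run sequentially here as a stand-in for what distributed workers might exchange by message passing.

```python
from collections import defaultdict

# Hypothetical one-level concept hierarchies (illustrative, not from the paper).
HIERARCHY = {
    "city": {"Manchester": "UK", "Leeds": "UK", "Lyon": "France"},
    "degree": {"BSc": "undergraduate", "BA": "undergraduate",
               "MSc": "postgraduate", "PhD": "postgraduate"},
}


def generalize(row, attributes):
    """Replace each attribute value by its parent concept ("ANY" if unknown)."""
    return tuple(HIERARCHY[a].get(row[a], "ANY") for a in attributes)


def aoi_kp(rows, key, attributes):
    """Merge rows with identical generalized tuples, keeping their keys.

    Returns a dict mapping each generalized tuple (a summary rule) to the
    list of keys of the source rows it covers.
    """
    rules = defaultdict(list)
    for row in rows:
        rules[generalize(row, attributes)].append(row[key])
    return dict(rules)


def merge_partitions(partitions, key, attributes):
    """Run aoi_kp on each partition and combine the per-partition results.

    Assumption: each partition would live on a separate workstation; the
    loop below only simulates that sequentially.
    """
    merged = defaultdict(list)
    for part in partitions:
        for rule, keys in aoi_kp(part, key, attributes).items():
            merged[rule].extend(keys)
    return dict(merged)


# Hypothetical sample relation with "id" as the preserved key.
ROWS = [
    {"id": 101, "city": "Manchester", "degree": "BSc"},
    {"id": 102, "city": "Leeds", "degree": "BA"},
    {"id": 103, "city": "Lyon", "degree": "PhD"},
    {"id": 104, "city": "Manchester", "degree": "MSc"},
]

RULES = aoi_kp(ROWS, "id", ["city", "degree"])
PARTITIONED = merge_partitions([ROWS[:2], ROWS[2:]], "id", ["city", "degree"])
```

In this sketch, the number of keys stored under a rule is the count of tuples it summarizes, and the keys themselves let a user go back to the original rows behind a rule such as `("UK", "undergraduate")`. The key lists also grow with the input, which is the memory pressure the abstract describes.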

Paper Details

Date Published: 6 April 2000
PDF: 10 pages
Proc. SPIE 4057, Data Mining and Knowledge Discovery: Theory, Tools, and Technology II, (6 April 2000); doi: 10.1117/12.381730
Maybin Muyeba, Univ. of Manchester (United Kingdom)
John A. Keane, Univ. of Manchester (United Kingdom)


Published in SPIE Proceedings Vol. 4057:
Data Mining and Knowledge Discovery: Theory, Tools, and Technology II
Belur V. Dasarathy, Editor(s)

© SPIE.