Proceedings Paper

Genetic programming system for building block analysis to enhance data analysis and data mining techniques
Author(s): Christoph F. Eick; Walter D. Sanz; Ruijian Zhang

Paper Abstract

Recently, many computerized data mining tools and environments have been proposed for finding interesting patterns in large data collections. These tools employ techniques that originate from research in various areas, such as machine learning, statistical data analysis, and visualization. Each of these techniques makes assumptions concerning the composition of the data collection to be analyzed. If a particular data collection does not meet these assumptions well, the technique usually performs poorly. For example, decision tree tools, such as C4.5, rely on rectangular approximations, which do not perform well if the boundaries between different classes have other shapes, such as a 45-degree line or an ellipse. However, if we could find a transformation f that maps the original attribute space into one in which class boundaries are more rectangular, better rectangular approximations could be obtained. In this paper, we address the problem of finding such transformations f. We describe the features of a tool, WOLS, whose goal is the discovery of ingredients for such transformation functions f, which we call building blocks. The tool employs genetic programming and symbolic regression for this purpose. We also present and discuss the results of case studies that use the building block analysis tool in the areas of decision tree learning and regression analysis.
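
The 45-degree example in the abstract can be illustrated with a small sketch. It is not the WOLS tool and uses no genetic programming: the transformation f(x, y) = x + y is picked by hand, the data are synthetic, and the helper name best_stump_accuracy is hypothetical. It only shows why a single axis-parallel cut struggles on the original attributes and succeeds on a suitably transformed one.

```python
import random

def best_stump_accuracy(values, labels):
    """Accuracy of the best single threshold on one attribute (one axis-parallel cut)."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    total_pos = sum(label for _, label in pairs)
    best = max(total_pos, n - total_pos) / n  # baseline: always predict the majority class
    pos_left = 0
    for i, (_, label) in enumerate(pairs[:-1], start=1):
        pos_left += label
        # Cut after the i-th sorted point: predict 0 on the left, 1 on the right.
        correct = (i - pos_left) + (total_pos - pos_left)
        # Also consider the mirrored rule (1 on the left, 0 on the right).
        best = max(best, correct / n, (n - correct) / n)
    return best

# Synthetic data (an assumption for illustration): points in the unit square,
# with the class boundary on the 45-degree line x + y = 1.
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(500)]
labels = [1 if x + y > 1.0 else 0 for x, y in points]

acc_x = best_stump_accuracy([x for x, _ in points], labels)
acc_y = best_stump_accuracy([y for _, y in points], labels)
# Hand-picked transformed attribute f(x, y) = x + y; a building-block search
# would have to discover an expression like this rather than be given it.
acc_sum = best_stump_accuracy([x + y for x, y in points], labels)

print(f"cut on x: {acc_x:.2f}, cut on y: {acc_y:.2f}, cut on x + y: {acc_sum:.2f}")
```

In this sketch a cut on x or y alone leaves a sizable fraction of points misclassified, while a single cut on the transformed attribute x + y separates the two classes exactly, since the boundary becomes axis-parallel in the new space.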

Paper Details

Date Published: 25 February 1999
PDF: 8 pages
Proc. SPIE 3695, Data Mining and Knowledge Discovery: Theory, Tools, and Technology, (25 February 1999); doi: 10.1117/12.339976
Christoph F. Eick, Univ. of Houston (United States)
Walter D. Sanz, Univ. of Houston (United States)
Ruijian Zhang, Univ. of Houston (United States)

Published in SPIE Proceedings Vol. 3695:
Data Mining and Knowledge Discovery: Theory, Tools, and Technology
Belur V. Dasarathy, Editor

© SPIE.