Proceedings Paper

A GA-based clustering algorithm for large data sets with mixed numeric and categorical values
Author(s): Jie Li; Xinbo Gao; Licheng Jiao

Paper Abstract

In data mining, cluster analysis must often be performed on large data sets with mixed numeric and categorical values. However, most existing clustering algorithms are efficient only for numeric data, not for mixed data sets. To address this, this paper presents a novel clustering algorithm for mixed data sets that modifies the common cost function, the trace of the within-cluster dispersion matrix. A genetic algorithm (GA) is used to optimize the new cost function to obtain a valid clustering result. Experimental results illustrate that the new GA-based clustering algorithm is feasible for large data sets with mixed numeric and categorical values.
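
The abstract does not give the modified cost function or the GA operators, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a k-prototypes-style cost: squared Euclidean distance to the cluster mean for numeric attributes, plus a weight `gamma` times the number of mismatches against the cluster mode for categorical attributes. The label-based chromosome, binary tournament selection, uniform crossover, and every name and parameter (`mixed_cost`, `ga_cluster`, `gamma`, `mutation_rate`, and so on) are illustrative assumptions.

```python
import random
from collections import Counter


def mixed_cost(points, labels, k, gamma=1.0):
    """Assumed mixed-data cost (not taken from the paper).

    Each point is a pair (numeric_tuple, categorical_tuple). The numeric part
    contributes the squared distance to the cluster mean, which sums to the
    trace of the within-cluster dispersion matrix. The categorical part
    contributes gamma times the number of mismatches against the cluster mode.
    """
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        nums = [m[0] for m in members]
        cats = [m[1] for m in members]
        mean = [sum(col) / len(nums) for col in zip(*nums)]
        mode = [Counter(col).most_common(1)[0][0] for col in zip(*cats)]
        for num, cat in zip(nums, cats):
            total += sum((x - mu) ** 2 for x, mu in zip(num, mean))
            total += gamma * sum(a != b for a, b in zip(cat, mode))
    return total


def ga_cluster(points, k, pop_size=30, generations=60,
               mutation_rate=0.05, gamma=1.0, seed=0):
    """Hypothetical GA: each chromosome holds one cluster label per point."""
    rng = random.Random(seed)
    n = len(points)

    def cost(labels):
        return mixed_cost(points, labels, k, gamma)

    population = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    best = min(population, key=cost)
    history = [cost(best)]
    for _ in range(generations):
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            # Binary tournament selection of two parents.
            p1 = min(rng.sample(population, 2), key=cost)
            p2 = min(rng.sample(population, 2), key=cost)
            # Uniform crossover followed by random-reset mutation.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [rng.randrange(k) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        best = min(population, key=cost)
        history.append(cost(best))
    return best, history


# Tiny illustrative data set: one numeric and one categorical attribute.
demo_points = [((0.0,), ("a",)), ((2.0,), ("a",)),
               ((10.0,), ("b",)), ((12.0,), ("b",))]
demo_labels, demo_history = ga_cluster(demo_points, k=2)
print(demo_labels, demo_history[-1])
```

Because the best chromosome is carried over unchanged each generation, the recorded best cost never increases. A label-per-point encoding is easy to read but scales poorly with the number of points; for large data sets, encoding cluster prototypes in the chromosome would likely be a better fit.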

Paper Details

Date Published: 25 September 2003
PDF: 4 pages
Proc. SPIE 5286, Third International Symposium on Multispectral Image Processing and Pattern Recognition, (25 September 2003); doi: 10.1117/12.538864
Author Affiliations:
Jie Li, Xidian Univ. (China)
Xinbo Gao, Xidian Univ. (China)
Licheng Jiao, Xidian Univ. (China)


Published in SPIE Proceedings Vol. 5286:
Third International Symposium on Multispectral Image Processing and Pattern Recognition
Hanqing Lu; Tianxu Zhang, Editor(s)
