Share Email Print

Proceedings Paper

An algorithmic approach to mining unknown clusters in training data
Format Member Price Non-Member Price
PDF $14.40 $18.00
cover GOOD NEWS! Your organization subscribes to the SPIE Digital Library. You may be able to download this paper for free. Check Access

Paper Abstract

In this paper, unsupervised learning is utilized to develop a method for mining unknown clusters in training data. The approach is based on the Bayesian Data Reduction Algorithm (BDRA), which has recently been developed into a patented system called the Data Extraction and Mining Software Tool (DEMIST). In the BDRA, the modeling assumption is that the discrete symbol probabilities of each class are a priori uniformly Dirichlet distributed, and it employs a "greedy" approach to selecting and discretizing the relevant features of each class for best performance. The primary metric for selecting and discretizing all relevant features contained in each class is an analytic formula for the probability of error conditioned on the training data. Thus, the primary contribution of this work is to demonstrate an algorithmic approach to finding multiple unknown clusters in training data, which represents an extension to the original data clustering algorithm. To illustrate performance, results are demonstrated using simulated data that contains multiple clusters. In general, the results of this work will demonstrate an effective method for finding multiple clusters in data mining applications.

Paper Details

Date Published: 18 April 2006
PDF: 12 pages
Proc. SPIE 6241, Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2006, 624101 (18 April 2006); doi: 10.1117/12.664731
Show Author Affiliations
Robert S. Lynch, Naval Undersea Warfare Ctr. (United States)
Peter K. Willett, Univ. of Connecticut (United States)

Published in SPIE Proceedings Vol. 6241:
Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2006
Belur V. Dasarathy, Editor(s)

© SPIE. Terms of Use
Back to Top