Proceedings Paper

Utilizing unsupervised learning to cluster data in the Bayesian data reduction algorithm

Paper Abstract

In this paper, unsupervised learning is used to illustrate the ability of the Bayesian Data Reduction Algorithm (BDRA) to cluster unlabeled training data. The BDRA is based on the assumption that the discrete symbol probabilities of each class are a priori uniformly Dirichlet distributed, and it employs a "greedy" approach (similar to a backward sequential feature search) to remove irrelevant features from the training data of each class. Removing irrelevant features is synonymous here with selecting those features that provide the best classification performance; the metric for making data-reduction decisions is an analytic formula for the probability of error conditioned on the training data. The contribution of this work is to demonstrate how clustering performance varies with the method used for unsupervised training. Performance is illustrated with results on simulated data. In general, the results have implications for finding clusters in data mining applications.
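
The greedy backward search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the analytic probability-of-error formula, so the sketch substitutes a leave-one-out error estimate as the reduction metric. It models the uniform Dirichlet prior as add-one smoothing of the symbol counts, and it assumes labeled training data for simplicity. The function names `loo_error` and `greedy_backward_reduction` and the toy data are hypothetical.

```python
from collections import Counter


def loo_error(data_by_class, keep, cardinalities):
    """Leave-one-out error rate using only the features listed in `keep`.

    ASSUMPTION: this is a stand-in metric, not the analytic
    probability-of-error formula referred to in the abstract.
    """
    # Number of distinct discrete symbols formed by the kept features.
    n_symbols = 1
    for f in keep:
        n_symbols *= cardinalities[f]

    def project(row):
        return tuple(row[f] for f in keep)

    counts = {c: Counter(project(r) for r in rows)
              for c, rows in data_by_class.items()}
    sizes = {c: len(rows) for c, rows in data_by_class.items()}

    errors = total = 0
    for true_class, rows in data_by_class.items():
        for row in rows:
            sym = project(row)
            best_class, best_score = None, -1.0
            for c in data_by_class:
                # Remove the held-out sample from its own class counts.
                held = 1 if c == true_class else 0
                # A uniform Dirichlet prior yields add-one smoothing.
                score = (counts[c][sym] - held + 1) / (sizes[c] - held + n_symbols)
                if score > best_score:
                    best_class, best_score = c, score
            errors += best_class != true_class
            total += 1
    return errors / total


def greedy_backward_reduction(data_by_class, cardinalities):
    """Drop one feature at a time while the error estimate does not increase."""
    keep = list(range(len(cardinalities)))
    best = loo_error(data_by_class, keep, cardinalities)
    while len(keep) > 1:
        trials = [
            (loo_error(data_by_class, [f for f in keep if f != d], cardinalities), d)
            for d in keep
        ]
        err, drop = min(trials)
        if err > best:
            break  # every possible removal makes the estimate worse
        keep.remove(drop)
        best = err
    return keep, best


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
toy = {
    "a": [(0, 0), (0, 1), (0, 0), (0, 1)],
    "b": [(1, 0), (1, 1), (1, 0), (1, 1)],
}
kept, err = greedy_backward_reduction(toy, cardinalities=[2, 2])
```

On ties the sketch prefers the smaller feature set, which is one reasonable reading of "reducing irrelevant features"; the paper's own tie-breaking rule is not stated in the abstract.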

Paper Details

Date Published: 28 March 2005
PDF: 10 pages
Proc. SPIE 5812, Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2005, (28 March 2005); doi: 10.1117/12.603522
Robert S. Lynch, Naval Undersea Warfare Ctr. (United States)
Peter K. Willett, Univ. of Connecticut (United States)

Published in SPIE Proceedings Vol. 5812:
Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2005
Belur V. Dasarathy, Editor
