Share Email Print
cover

Proceedings Paper

Value-based customer grouping from large retail data sets
Author(s): Alexander Strehl; Joydeep Ghosh
Format Member Price Non-Member Price
PDF $14.40 $18.00
cover GOOD NEWS! Your organization subscribes to the SPIE Digital Library. You may be able to download this paper for free. Check Access

Paper Abstract

In this paper, we propose OPOSSUM, a novel similarity-based clustering algorithm using constrained, weighted graph- partitioning. Instead of binary presence or absence of products in a market-basket, we use an extended 'revenue per product' measure to better account for management objectives. Typically the number of clusters desired in a database marketing application is only in the teens or less. OPOSSUM proceeds top-down, which is more efficient and takes a small number of steps to attain the desired number of clusters as compared to bottom-up agglomerative clustering approaches. OPOSSUM delivers clusters that are balanced in terms of either customers (samples) or revenue (value). To facilitate data exploration and validation of results we introduce CLUSION, a visualization toolkit for high-dimensional clustering problems. To enable closed loop deployment of the algorithm, OPOSSUM has no user-specified parameters. Thresholding heuristics are avoided and the optimal number of clusters is automatically determined by a search for maximum performance. Results are presented on a real retail industry data-set of several thousand customers and products, to demonstrate the power of the proposed technique.

Paper Details

Date Published: 6 April 2000
PDF: 10 pages
Proc. SPIE 4057, Data Mining and Knowledge Discovery: Theory, Tools, and Technology II, (6 April 2000); doi: 10.1117/12.381756
Show Author Affiliations
Alexander Strehl, Univ. of Texas/Austin (United States)
Joydeep Ghosh, Univ. of Texas/Austin (United States)


Published in SPIE Proceedings Vol. 4057:
Data Mining and Knowledge Discovery: Theory, Tools, and Technology II
Belur V. Dasarathy, Editor(s)

© SPIE. Terms of Use
Back to Top