Proceedings Paper

Value-based customer grouping from large retail data sets
Author(s): Alexander Strehl; Joydeep Ghosh

Paper Abstract

In this paper, we propose OPOSSUM, a novel similarity-based clustering algorithm using constrained, weighted graph partitioning. Instead of recording the binary presence or absence of products in a market basket, we use an extended 'revenue per product' measure to better account for management objectives. Typically, the number of clusters desired in a database marketing application is only in the teens or fewer. OPOSSUM proceeds top-down, which is more efficient than bottom-up agglomerative clustering because it reaches the desired number of clusters in a small number of steps. OPOSSUM delivers clusters that are balanced in terms of either customers (samples) or revenue (value). To facilitate data exploration and validation of results, we introduce CLUSION, a visualization toolkit for high-dimensional clustering problems. To enable closed-loop deployment of the algorithm, OPOSSUM has no user-specified parameters. Thresholding heuristics are avoided, and the optimal number of clusters is determined automatically by a search for maximum performance. Results on a real retail industry data set of several thousand customers and products demonstrate the power of the proposed technique.
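
The 'revenue per product' representation and the two balance criteria can be illustrated with a minimal Python sketch. This is not the OPOSSUM implementation: the abstract does not name the similarity measure, so the extended Jaccard (Tanimoto) similarity used here is an assumption, and the transactions, the function names, and the cluster assignment are all hypothetical. The sketch only shows how revenue-weighted customer vectors could be compared and how cluster balance could be measured by customers or by revenue; it performs no graph partitioning.

```python
from collections import defaultdict

# Hypothetical transactions: (customer, product, revenue in dollars).
transactions = [
    ("c1", "milk", 4.0), ("c1", "bread", 2.0),
    ("c2", "milk", 8.0), ("c2", "bread", 4.0),
    ("c3", "wine", 30.0),
]

def revenue_vectors(rows):
    """Sum revenue per (customer, product); a binary basket would store 1 instead."""
    vectors = defaultdict(lambda: defaultdict(float))
    for customer, product, revenue in rows:
        vectors[customer][product] += revenue
    return {c: dict(v) for c, v in vectors.items()}

def extended_jaccard(a, b):
    """Extended Jaccard (Tanimoto) similarity of two sparse non-negative vectors.

    Assumed measure for illustration; the abstract does not specify one.
    """
    dot = sum(value * b.get(product, 0.0) for product, value in a.items())
    norm_a = sum(v * v for v in a.values())
    norm_b = sum(v * v for v in b.values())
    denom = norm_a + norm_b - dot
    return dot / denom if denom else 0.0

def cluster_weights(vectors, assignment, by="customers"):
    """Total weight per cluster: customer count (samples) or summed revenue (value)."""
    totals = defaultdict(float)
    for customer, cluster in assignment.items():
        weight = 1.0 if by == "customers" else sum(vectors[customer].values())
        totals[cluster] += weight
    return dict(totals)

vectors = revenue_vectors(transactions)
sim_c1_c2 = extended_jaccard(vectors["c1"], vectors["c2"])  # same products, different spend
sim_c1_c3 = extended_jaccard(vectors["c1"], vectors["c3"])  # no products in common

# Hypothetical assignment, chosen by hand rather than computed by a partitioner.
assignment = {"c1": 0, "c2": 0, "c3": 1}
print(cluster_weights(vectors, assignment, by="customers"))
print(cluster_weights(vectors, assignment, by="revenue"))
```

In this toy data, the hand-picked assignment is more even by revenue than by customer count, which is the kind of trade-off the two balance modes described in the abstract are meant to expose.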

Paper Details

Date Published: 6 April 2000
PDF: 10 pages
Proc. SPIE 4057, Data Mining and Knowledge Discovery: Theory, Tools, and Technology II, (6 April 2000); doi: 10.1117/12.381756
Alexander Strehl, Univ. of Texas/Austin (United States)
Joydeep Ghosh, Univ. of Texas/Austin (United States)

Published in SPIE Proceedings Vol. 4057:
Data Mining and Knowledge Discovery: Theory, Tools, and Technology II
Belur V. Dasarathy, Editor
