
Proceedings Paper

Simple linear regression model based data clustering
Author(s): Bingcheng Li

Paper Abstract

KMeans is one of the most popular algorithms in data mining (ranked number 2) and has been widely used in many fields. KMeans uses Euclidean distance to compare two data points. However, Euclidean distance is sensitive to linear transforms introduced in the data collection process. Because of these linear transforms, the distance between two data points of the same class (intra-class distance) may be larger than the distance between points of different classes (inter-class distance), which may cause low clustering performance for the KMeans algorithm. In this paper, we propose a simple linear regression approach to data clustering. Instead of using Euclidean distance to measure the difference, we recommend using the goodness of fit (or normalized cross correlation) to measure the similarity between two data points. Using this new data comparison technique, we introduce a linear regression approach to data clustering and demonstrate that the proposed method has higher performance and lower computational cost than KMeans methods.
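
The contrast between the two measures can be illustrated with a small sketch. It is not the paper's implementation: the function names `euclidean` and `ncc` and the sample vectors are assumptions chosen for illustration, and the normalized cross correlation is taken here to be the mean-centered (Pearson) form, whose square equals the R² goodness of fit of a simple linear regression of one vector on the other.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def ncc(a, b):
    """Mean-centered normalized cross correlation (Pearson r).

    Assumption: this is one common definition of NCC; the paper's
    exact formulation is not given in the abstract.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [p - mean_a for p in a]
    db = [q - mean_b for q in b]
    num = sum(p * q for p, q in zip(da, db))
    den = math.sqrt(sum(p * p for p in da) * sum(q * q for q in db))
    # A constant vector has zero variance; treat it as uncorrelated.
    return num / den if den else 0.0


# Hypothetical data: y is x after a linear transform (gain 3, offset 10),
# standing in for the "same class"; z is a different pattern.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0 * v + 10.0 for v in x]
z = [5.0, 4.0, 3.0, 2.0, 1.0]

print("Euclidean x-y:", euclidean(x, y), " x-z:", euclidean(x, z))
print("NCC       x-y:", ncc(x, y), " x-z:", ncc(x, z))
```

In this toy case the Euclidean distance from x to its linearly transformed copy y is larger than the distance from x to the unrelated vector z, while the correlation-based similarity ranks y as the closer match, which is the failure mode and remedy the abstract describes.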

Paper Details

Date Published: 14 May 2019
PDF: 8 pages
Proc. SPIE 10988, Automatic Target Recognition XXIX, 109880A (14 May 2019); doi: 10.1117/12.2518037
Bingcheng Li, Lockheed Martin MST (United States)

Published in SPIE Proceedings Vol. 10988:
Automatic Target Recognition XXIX
Riad I. Hammoud; Timothy L. Overman, Editor(s)

© SPIE