Proceedings Paper

Simple linear regression model based data clustering
Author(s): Bingcheng Li

Paper Abstract

KMeans is one of the most popular algorithms in data mining (ranked number 2) and has been widely used in many fields. KMeans uses Euclidean distance to compare two data points. However, Euclidean distance is sensitive to linear transforms introduced during the data collection process. Because of these linear transforms, the distance between two data points of the same class (intra-class distance) may be larger than the distance between points of different classes (inter-class distance), which can lead to low clustering performance for the KMeans algorithm. In this paper, we propose a simple linear regression approach for data clustering. Instead of using Euclidean distance to measure the difference, we recommend using the goodness of fit (or normalized cross-correlation) to measure the similarity between two data points. Using this new data comparison technique, we introduce a linear regression approach for data clustering and demonstrate that the proposed method has higher performance and lower computational cost than KMeans methods.
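
The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the author's implementation. It assumes that "goodness of fit" means the R² of a simple linear regression of one data vector on another, which equals the squared Pearson (normalized cross-) correlation. The function names and the toy vectors are hypothetical and chosen for illustration.

```python
import numpy as np


def euclidean_distance(x, y):
    """Plain Euclidean distance, the comparison used by standard KMeans."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.linalg.norm(x - y))


def goodness_of_fit(x, y):
    """R^2 of the simple linear regression y ~ a*x + b.

    Assumption: this is taken to be the squared Pearson correlation of
    x and y, which is unchanged when y is scaled or shifted.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.dot(xc, xc) * np.dot(yc, yc)
    if denom == 0.0:
        # A constant vector has no linear relationship to measure.
        return 0.0
    return float(np.dot(xc, yc) ** 2 / denom)


# Hypothetical toy data: same_class is a linear transform (gain and offset)
# of base; other_class has a different shape but similar magnitude.
base = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
same_class = 3.0 * base + 10.0
other_class = np.array([1.5, 2.5, 2.5, 4.0, 4.5])

d_same = euclidean_distance(base, same_class)
d_other = euclidean_distance(base, other_class)
r2_same = goodness_of_fit(base, same_class)
r2_other = goodness_of_fit(base, other_class)

print(f"Euclidean: same-class {d_same:.3f}, other-class {d_other:.3f}")
print(f"R^2:       same-class {r2_same:.3f}, other-class {r2_other:.3f}")
```

In this toy case the Euclidean distance ranks the linearly transformed copy as farther away than the differently shaped vector, while the regression-based similarity ranks it as the closer match. Note that R² as written also treats a negatively scaled copy as a perfect match; whether the paper handles the sign separately is not stated in the abstract.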

Paper Details

Date Published: 14 May 2019
PDF: 8 pages
Proc. SPIE 10988, Automatic Target Recognition XXIX, 109880A (14 May 2019); doi: 10.1117/12.2518037
Bingcheng Li, Lockheed Martin MST (United States)

Published in SPIE Proceedings Vol. 10988:
Automatic Target Recognition XXIX
Riad I. Hammoud; Timothy L. Overman, Editor(s)
