Share Email Print

Proceedings Paper

Automatic style clustering of printed characters in form images
Format Member Price Non-Member Price
PDF $17.00 $21.00

Paper Abstract

Style is an important feature of printed or handwritten characters. But it is not studied thoroughly compared with character recognition. In this paper, we try to learn how many typical styles exist in a kind of real world form images. A hierarchical clustering method has been developed and tested. A cross recognition error rate constraint is proposed to reduce the false combinations in the hierarchical clustering process, and a cluster selecting method is used to delete redundant or unsuitable clusters. Only a similarity measure between any patterns is needed by the algorithm. It is tested on a template matching based similarity measure which can be extended to any other feature and distance measure easily. The detailed comparing on every step’s effects is shown in the paper. Total 16 kinds of typical styles are found out, and by giving each character in each style a prototype for recognition, a 0.78% error rate is achieved by recognizing the testing set.

Paper Details

Date Published: 17 January 2005
PDF: 8 pages
Proc. SPIE 5676, Document Recognition and Retrieval XII, (17 January 2005); doi: 10.1117/12.588138
Show Author Affiliations
Changsong Liu, Tsinghua Univ. (China)
Xiaoqing Ding, Tsinghua Univ. (China)

Published in SPIE Proceedings Vol. 5676:
Document Recognition and Retrieval XII
Elisa H. Barney Smith; Kazem Taghva, Editor(s)

© SPIE. Terms of Use
Back to Top
Sign in to read the full article
Create a free SPIE account to get access to
premium articles and original research
Forgot your username?