Proceedings Paper

Form classification

Paper Abstract

The form classification problem is to assign a single-page form image to one of a set of predefined form types, or classes. We classify form images using low-level pixel density information from the binary images of the documents. In this paper, we solve the form classification problem with a classifier based on the k-means algorithm, supported by adaptive boosting. Our classification method is tested on the NIST scanned tax form databases (special forms databases 2 and 6), which include machine-typed and handwritten documents. Our method improves on published results on the same databases while still using a simple set of image features.
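
The abstract does not give implementation details, so the following is only a rough sketch of the general idea, not the authors' method. It assumes that "pixel density" means the fraction of foreground pixels in each cell of a fixed grid, and that the k-means step produces a few prototype vectors per form class, with a new form assigned to the class of its nearest prototype. The adaptive boosting stage is omitted. The function names (`density_features`, `kmeans`, `fit`, `predict`) and the grid size are hypothetical.

```python
import numpy as np

def density_features(img, grid=(4, 4)):
    """Fraction of foreground (nonzero) pixels in each cell of a grid.

    Assumption: this is one plausible reading of "low-level pixel
    density information"; the paper may define its features differently.
    """
    img = np.asarray(img) > 0
    rows = np.array_split(np.arange(img.shape[0]), grid[0])
    cols = np.array_split(np.arange(img.shape[1]), grid[1])
    # Cells are visited in row-major order.
    return np.array([img[np.ix_(r, c)].mean() for r in rows for c in cols])

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every sample to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def fit(features_by_class, k=2):
    """Learn k prototype vectors per form class (hypothetical design)."""
    return {c: kmeans(np.asarray(X, dtype=float), k)
            for c, X in features_by_class.items()}

def predict(model, x):
    """Return the class whose nearest prototype is closest to x."""
    return min(model, key=lambda c: ((model[c] - x) ** 2).sum(axis=1).min())
```

With feature vectors grouped by form type, `fit` builds the per-class prototypes and `predict` classifies a new form from `density_features(image)`. A boosted variant would combine several such weak classifiers, reweighting the training forms that earlier rounds misclassified.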

Paper Details

Date Published: 28 January 2008
PDF: 6 pages
Proc. SPIE 6815, Document Recognition and Retrieval XV, 68150Y (28 January 2008); doi: 10.1117/12.766737
K. V. Umamaheswara Reddy, Univ. of Buffalo (United States)
Venu Govindaraju, Univ. of Buffalo (United States)

Published in SPIE Proceedings Vol. 6815:
Document Recognition and Retrieval XV
Berrin A. Yanikoglu; Kathrin Berkner, Editor(s)

© SPIE.