
Proceedings Paper
Automated recognition and extraction of tabular fields for the indexing of census records
Paper Abstract
We describe a system for indexing census records in tabular documents, with the goal of recognizing the content
of each cell, including both headers and handwritten entries. Each document is automatically rectified, registered,
and scaled to a known template, after which lines and fields are detected and delimited as cells in a tabular
form. Whole-word or whole-phrase recognition of noisy machine-printed text is performed using a glyph library,
providing greatly increased efficiency and accuracy (approaching 100%) while avoiding the problems inherent
in traditional OCR approaches. Constrained handwriting recognition accuracy for a single author reaches as high
as 98% for the Gender field and 94.5% for the Birthplace field. Multi-author accuracy (currently 82%) can
be improved with a larger training set. Active integration of user feedback into the system will accelerate
the indexing of records while providing a tightly coupled learning mechanism for system improvement.
Paper Details
Date Published: 4 February 2013
PDF: 11 pages
Proc. SPIE 8658, Document Recognition and Retrieval XX, 86580J (4 February 2013); doi: 10.1117/12.2004788
Published in SPIE Proceedings Vol. 8658:
Document Recognition and Retrieval XX
Richard Zanibbi; Bertrand Coüasnon, Editor(s)
Author Affiliations
Robert Clawson, Brigham Young Univ. (United States)
Kevin Bauer, Brigham Young Univ. (United States)
Glen Chidester, Brigham Young Univ. (United States)
Milan Pohontsch, Brigham Young Univ. (United States)
Douglas Kennard, Brigham Young Univ. (United States)
Jongha Ryu, Brigham Young Univ. (United States)
William Barrett, Brigham Young Univ. (United States)
© SPIE
