Proceedings Paper

A super resolution framework for low resolution document image OCR
Author(s): Di Ma; Gady Agam

Paper Abstract

Optical character recognition (OCR) is widely used to convert document images into digital media. Existing OCR algorithms and tools produce good results on high-resolution, good-quality document images. In this paper, we propose a machine-learning-based super-resolution framework for OCR of low-resolution document images. The proposed approach relies on two main techniques: a document page segmentation algorithm and a modified K-means clustering algorithm. By exploiting coherence in the document, the approach reconstructs a higher-resolution image from a low-resolution document image and improves OCR results. Experimental results show substantial gains on low-resolution documents such as those captured from video.

Paper Details

Date Published: 4 February 2013
PDF: 9 pages
Proc. SPIE 8658, Document Recognition and Retrieval XX, 86580P (4 February 2013); doi: 10.1117/12.2008354
Di Ma, Illinois Institute of Technology (United States)
Gady Agam, Illinois Institute of Technology (United States)

Published in SPIE Proceedings Vol. 8658:
Document Recognition and Retrieval XX
Richard Zanibbi; Bertrand Coüasnon, Editor(s)

© SPIE.