
Proceedings Paper

Ancient documents bleed-through evaluation and its application for predicting OCR error rates
Author(s): V. Rabeux; N. Journet; J. P. Domenger

Paper Abstract

This article presents a way to evaluate the bleed-through defect in images of very old documents. We design measures to quantify and evaluate the verso ink bleeding through the paper onto the recto side. Measuring the bleed-through defect allows us to perform statistical analyses that can predict the feasibility of different post-scan tasks. We illustrate our measures by building two OCR error rate prediction models based on bleed-through evaluation: one for Abbyy FineReader, a very powerful commercial OCR engine, and one for OCRopus, which is sponsored by Google. Both prediction models appear to be very accurate according to various statistical indicators.

Paper Details

Date Published: 24 January 2011
PDF: 8 pages
Proc. SPIE 7874, Document Recognition and Retrieval XVIII, 78740Q (24 January 2011); doi: 10.1117/12.873368
V. Rabeux, Univ. de Bordeaux (France)
N. Journet, Univ. de Bordeaux (France)
J. P. Domenger, Univ. de Bordeaux (France)

Published in SPIE Proceedings Vol. 7874:
Document Recognition and Retrieval XVIII
Gady Agam; Christian Viard-Gaudin, Editor(s)

© SPIE