Proceedings Paper

A robust omnifont open-vocabulary Arabic OCR system using pseudo-2D-HMM
Author(s): Abdullah M. Rashwan; Mohsen A. Rashwan; Ahmed Abdel-Hameed; Sherif Abdou; A. H. Khalil

Paper Abstract

Recognizing old documents is highly desirable, since demand for quickly searching millions of archived documents has recently increased. Hidden Markov Models (HMMs) have proven to be a good solution to the main problems of recognizing typewritten Arabic characters. Although these attempts achieved remarkable success for omnifont OCR under very favorable conditions, they did not achieve the same performance under practical conditions, i.e., on noisy documents. In this paper we present an omnifont, large-vocabulary Arabic OCR system using the Pseudo Two-Dimensional Hidden Markov Model (P2DHMM), which is a generalization of the HMM. The P2DHMM offers a more efficient way to model Arabic characters; such a model offers both minimal dependency on font size and style (omnifont) and a high level of robustness against noise. The evaluation results of this system are very promising compared with a baseline HMM system and the best OCR systems available on the market (Sakhr and NovoDynamics). The recognition accuracy of the P2DHMM classifier is measured against the classic HMM classifier: the average word accuracy rates for the P2DHMM and HMM classifiers are 79% and 66%, respectively. The overall system accuracy is measured against the Sakhr and NovoDynamics OCR systems: the average word accuracy rates for P2DHMM, NovoDynamics, and Sakhr are 74%, 71%, and 61%, respectively.

Paper Details

Date Published: 23 January 2012
PDF: 8 pages
Proc. SPIE 8297, Document Recognition and Retrieval XIX, 829707 (23 January 2012);
Author Affiliations
Abdullah M. Rashwan, Cairo Univ. (Egypt); RDI (Egypt)
Mohsen A. Rashwan, Cairo Univ. (Egypt); RDI (Egypt)
Ahmed Abdel-Hameed, Cairo Univ. (Egypt); RDI (Egypt)
Sherif Abdou, Cairo Univ. (Egypt); RDI (Egypt)
A. H. Khalil, Cairo Univ. (Egypt)

Published in SPIE Proceedings Vol. 8297:
Document Recognition and Retrieval XIX
Christian Viard-Gaudin; Richard Zanibbi, Editor(s)

© SPIE