Proceedings Paper

Optical character recognition of handwritten Arabic using hidden Markov models
Author(s): Mohannad M. Aulama; Asem M. Natsheh; Gheith A. Abandah; Mohammed M. Olama

Paper Abstract

The problem of optical character recognition (OCR) of handwritten Arabic has not yet received a satisfactory solution. In this paper, an Arabic OCR algorithm is developed based on hidden Markov models (HMMs) combined with the Viterbi algorithm, which results in improved and more robust recognition of characters at the sub-word level. Integrating HMMs is a further step in the OCR trends currently being researched in the literature. The proposed approach exploits the structure of characters in the Arabic language, in addition to their extracted features, to achieve improved recognition rates. Useful statistical information about the Arabic language is first extracted and then used to estimate the probabilistic parameters of the HMM. A new custom implementation of the HMM is developed in this study, in which the transition matrix is built from a large collected corpus and the emission matrix is built from the results obtained via the extracted character features. Recognition is performed using the Viterbi algorithm, which selects the most probable sequence of sub-words. The model was implemented to recognize the sub-word unit of Arabic text, so that the recognition rate is no longer tied to the worst recognition rate of any single character but instead draws on the overall structure of the Arabic language. Numerical results show a potentially large improvement in recognition from using the proposed algorithms.
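
The abstract does not include the authors' implementation, so the following is only a minimal sketch of the general technique it describes: transition probabilities estimated from a text corpus, per-position emission scores standing in for the output of a character classifier, and Viterbi decoding over the two. All names (`estimate_transitions`, `viterbi`), the toy Latin alphabet, the tiny corpus, the emission values, and the add-one smoothing are illustrative assumptions, not details taken from the paper.

```python
import math

def _log(p):
    # Guard against log(0) for impossible events.
    return math.log(p) if p > 0 else float("-inf")

def estimate_transitions(corpus, alphabet, smoothing=1.0):
    """Estimate start and transition probabilities from character
    bigram counts in a corpus (additive smoothing is an assumption)."""
    idx = {ch: i for i, ch in enumerate(alphabet)}
    n = len(alphabet)
    start = [smoothing] * n
    trans = [[smoothing] * n for _ in range(n)]
    for word in corpus:
        if not word:
            continue
        start[idx[word[0]]] += 1
        for a, b in zip(word, word[1:]):
            trans[idx[a]][idx[b]] += 1
    total = sum(start)
    start = [c / total for c in start]
    trans = [[c / sum(row) for c in row] for row in trans]
    return start, trans

def viterbi(start, trans, emissions):
    """Return the most probable state sequence.
    emissions[t][s] is P(observed features at position t | character s)."""
    n = len(start)
    score = [_log(start[s]) + _log(emissions[0][s]) for s in range(n)]
    back = []
    for t in range(1, len(emissions)):
        prev, score, pointers = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda p: prev[p] + _log(trans[p][s]))
            pointers.append(best)
            score.append(prev[best] + _log(trans[best][s]) + _log(emissions[t][s]))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Toy stand-ins: a three-letter alphabet and a tiny "corpus" of sub-words.
alphabet = "abc"
corpus = ["ab", "abc", "ab", "cab"]
start, trans = estimate_transitions(corpus, alphabet)

# Hypothetical classifier scores for a two-character sub-word image:
# position 0 is clearly 'a'; position 1 is ambiguous, slightly favoring 'c'.
emissions = [
    [0.80, 0.10, 0.10],
    [0.02, 0.48, 0.50],
]

decoded = "".join(alphabet[s] for s in viterbi(start, trans, emissions))
greedy = "".join(alphabet[max(range(3), key=lambda s: e[s])] for e in emissions)
print(decoded, greedy)
```

In this toy setup, picking the best-scoring character at each position independently yields "ac", while Viterbi decoding yields "ab" because the corpus statistics favor "b" after "a". That is the kind of effect the abstract refers to when it says recognition draws on the overall structure of the language rather than on per-character recognition alone.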

Paper Details

Date Published: 26 April 2011
PDF: 12 pages
Proc. SPIE 8055, Optical Pattern Recognition XXII, 80550L (26 April 2011); doi: 10.1117/12.884087
Author Affiliations
Mohannad M. Aulama, The Univ. of Jordan (Jordan)
Asem M. Natsheh, The Univ. of Jordan (Jordan)
Gheith A. Abandah, The Univ. of Jordan (Jordan)
Mohammed M. Olama, Oak Ridge National Lab. (United States)


Published in SPIE Proceedings Vol. 8055:
Optical Pattern Recognition XXII
David P. Casasent; Tien-Hsin Chao, Editor(s)

© SPIE