Proceedings Paper

Word-level recognition of multifont Arabic text using a feature vector matching approach
Author(s): Erik J. Erlandson; John M. Trenkle; Robert C. Vogt

Paper Abstract

Many text recognition systems recognize text imagery at the character level and assemble words from the recognized characters. An alternative approach is to recognize text imagery at the word level, without analyzing individual characters. This approach avoids the problem of individual character segmentation, and can overcome local errors in character recognition. A word-level recognition system for machine-printed Arabic text has been implemented. Arabic is written in a cursive script, and is therefore difficult to segment at the character level. Character segmentation has been avoided by recognizing text imagery of complete words. The Arabic recognition system computes a vector of image-morphological features on a query word image. This vector is matched against a precomputed database of vectors from a lexicon of Arabic words. Vectors from the database with the highest match scores are returned as hypotheses for the unknown image. Several feature vectors may be stored for each word in the database. Database feature vectors generated using multiple fonts and noise models allow the system to be tuned to its input stream. Used in conjunction with database pruning techniques, this Arabic recognition system has obtained promising word recognition rates on low-quality multifont text imagery.
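
The matching step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the match score or the feature set, so the sketch assumes negative Euclidean distance as the score. The names `match_score` and `top_hypotheses`, the toy two-dimensional vectors, and the transliterated lexicon entries are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical database layout: each lexicon word maps to one or more
# feature vectors (for example, one per font or noise model).
Database = Dict[str, List[Sequence[float]]]


def match_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed score: negative Euclidean distance, so higher is better."""
    return -math.dist(a, b)


def top_hypotheses(
    query: Sequence[float], database: Database, k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k words whose best stored vector scores highest."""
    best: Dict[str, float] = {}
    for word, vectors in database.items():
        # A word may have several stored vectors; keep its best score.
        best[word] = max(match_score(query, v) for v in vectors)
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Toy data: placeholder words with made-up 2-D feature vectors.
database: Database = {
    "kitab": [[1.0, 0.0], [0.9, 0.1]],  # two variants of the same word
    "qalam": [[0.0, 1.0]],
    "bab": [[0.5, 0.5]],
}
query = [0.95, 0.05]
print([word for word, _ in top_hypotheses(query, database, k=2)])
```

Taking the maximum score over all stored vectors for a word is one simple way to use several vectors per word; the pruning techniques mentioned in the abstract are not modeled here.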

Paper Details

Date Published: 7 March 1996
PDF: 8 pages
Proc. SPIE 2660, Document Recognition III (7 March 1996); doi: 10.1117/12.234725
Author Affiliations:
Erik J. Erlandson, Environmental Research Institute of Michigan (United States)
John M. Trenkle, Environmental Research Institute of Michigan (United States)
Robert C. Vogt, Environmental Research Institute of Michigan (United States)

Published in SPIE Proceedings Vol. 2660:
Document Recognition III
Luc M. Vincent; Jonathan J. Hull, Editor(s)
