Proceedings Paper

Simultaneous segmentation and recognition of Arabic printed text using linguistic concepts of vocabulary
Author(s): Mohamed Ben Halima; Adel M. Alimi

Paper Abstract

In this paper, we propose a new approach to the analysis and recognition of printed Arabic text, based on linguistic concepts of Arabic vocabulary. Words in the text are categorized as decomposable (derived from a root) or indecomposable (not derived from a root), and morpho-syntactic characterization hypotheses are put forth for each word. For decomposable words, we attempt to recognize the basic morphemes of the word (antefix, prefix, infix, suffix, postfix, and root), in contrast to existing approaches, which usually recognize the word as a whole entity (holistic approach).
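
The split into decomposable and indecomposable words, and the morpheme slots named above, can be illustrated with a minimal string-level sketch. This is not the authors' method, which operates on printed text images. The affix inventories, the Latin transliteration, the word list, and the names `Decomposition` and `decompose` are all hypothetical and chosen only for illustration. Separating the infix from the root is left out, so the sketch keeps them together in a single `stem` field.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

# Hypothetical, deliberately tiny inventories in Latin transliteration.
# A real system would need far larger, linguistically validated lists.
ANTEFIXES = ("wa", "fa")
PREFIXES = ("al",)
SUFFIXES = ("at", "un")
POSTFIXES = ("ha", "hum")
INDECOMPOSABLE = {"min", "fi"}  # words treated as not derived from a root


@dataclass
class Decomposition:
    antefix: str = ""
    prefix: str = ""
    stem: str = ""  # root plus any infix; splitting these is out of scope here
    suffix: str = ""
    postfix: str = ""


def _strip_start(word: str, options: Sequence[str]) -> Tuple[str, str]:
    """Return (matched affix, remainder); never consume the whole word."""
    for opt in options:
        if word.startswith(opt) and len(word) > len(opt):
            return opt, word[len(opt):]
    return "", word


def _strip_end(word: str, options: Sequence[str]) -> Tuple[str, str]:
    """Return (matched affix, remainder); never consume the whole word."""
    for opt in options:
        if word.endswith(opt) and len(word) > len(opt):
            return opt, word[:-len(opt)]
    return "", word


def decompose(word: str) -> Optional[Decomposition]:
    """Return None for indecomposable words, else a morpheme hypothesis."""
    if word in INDECOMPOSABLE:
        return None
    # Peel affixes from the outside in: antefix, prefix, postfix, suffix.
    antefix, rest = _strip_start(word, ANTEFIXES)
    prefix, rest = _strip_start(rest, PREFIXES)
    postfix, rest = _strip_end(rest, POSTFIXES)
    suffix, rest = _strip_end(rest, SUFFIXES)
    return Decomposition(antefix, prefix, rest, suffix, postfix)


if __name__ == "__main__":
    print(decompose("waalkitab"))
    print(decompose("min"))
```

Each call yields a single hypothesis by greedy matching. A recognizer along the lines of the abstract would presumably keep several competing hypotheses per word and score them against the image evidence.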

Paper Details

Date Published: 19 January 2009
PDF: 10 pages
Proc. SPIE 7247, Document Recognition and Retrieval XVI, 72470T (19 January 2009); doi: 10.1117/12.805617
Mohamed Ben Halima, The High School of National Engineering of Sfax (Tunisia)
Adel M. Alimi, The High School of National Engineering of Sfax (Tunisia)


Published in SPIE Proceedings Vol. 7247:
Document Recognition and Retrieval XVI
Kathrin Berkner; Laurence Likforman-Sulem, Editor(s)
