Proceedings Paper

A neural-linguistic approach for the recognition of a wide Arabic word lexicon

Paper Abstract

Recently, we investigated the use of Arabic linguistic knowledge to improve the recognition of a wide Arabic word lexicon. A neural-linguistic approach was proposed, dealing mainly with the canonical vocabulary of decomposable words derived from tri-consonant healthy roots. The basic idea is to factorize words by their roots and schemes. To this end, we designed two neural networks, TNN_R and TNN_S, which recognize roots and schemes, respectively, from the structural primitives of words. The proposed approach achieved promising results. In this paper, we focus on how to reach better results in terms of accuracy and recognition rate. The current improvements mainly concern the training stage. They consist of 1) exploiting the order of letters in a word, 2) considering "sister letters" (letters having the same features), 3) supervising the networks' behavior, 4) splitting neurons to retain letter occurrences, and 5) resolving observed ambiguities. With these improvements, experiments carried out on a 1500-word vocabulary show a significant enhancement: the top-4 rate of TNN_R (resp. TNN_S) rose from 77% to 85.8% (resp. from 65% to 97.9%). Enlarging the vocabulary from 1000 to 1700 words, in steps of 100 words, again confirmed the results without affecting the networks' stability.
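The idea of factorizing words by roots and schemes can be illustrated with a minimal Python sketch. It is not the authors' method: the paper recognizes roots and schemes with neural networks from structural primitives, whereas this sketch only does string matching on Latin transliterations. The scheme notation (digits 1, 2, 3 marking the root consonant slots), the `SCHEMES` list, and the function names `derive` and `factorize` are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical transliterated schemes (not taken from the paper):
# digits 1, 2, 3 mark the slots of the first, second, and third root consonants.
SCHEMES: List[str] = ["1a2a3a", "1aa2i3", "ma12uu3"]


def derive(root: str, scheme: str) -> str:
    """Build a word by filling the three consonant slots of a scheme with a root."""
    if len(root) != 3:
        raise ValueError("expected a tri-consonant root")
    return "".join(root[int(ch) - 1] if ch in "123" else ch for ch in scheme)


def factorize(word: str, schemes: List[str] = SCHEMES) -> Optional[Tuple[str, str]]:
    """Return (root, scheme) for the first scheme whose fixed letters match the word."""
    for scheme in schemes:
        if len(scheme) != len(word):
            continue
        slots = {}
        for w, s in zip(word, scheme):
            if s in "123":
                slots[s] = w      # letter occupying a root slot
            elif s != w:
                break             # a fixed letter of the scheme does not match
        else:
            if len(slots) == 3:
                return slots["1"] + slots["2"] + slots["3"], scheme
    return None
```

For example, under these assumptions `derive("ktb", "ma12uu3")` builds the transliterated word `maktuub`, and `factorize("maktuub")` recovers the pair of root `ktb` and scheme `ma12uu3`. A lexicon stored this way grows with the number of roots plus the number of schemes rather than with their product, which is one plausible reason the factorization helps with a wide vocabulary.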

Paper Details

Date Published: 18 January 2010
PDF: 10 pages
Proc. SPIE 7534, Document Recognition and Retrieval XVII, 75340L (18 January 2010); doi: 10.1117/12.839975
Author Affiliations
I. Ben Cheikh, UTIC-ESSTT (Tunisia)
A. Kacem, UTIC-ESSTT (Tunisia)
A. Belaïd, LORIA (France)

Published in SPIE Proceedings Vol. 7534:
Document Recognition and Retrieval XVII
Laurence Likforman-Sulem; Gady Agam, Editors

© SPIE