
Proceedings Paper

Transcript mapping for handwritten Arabic documents

Paper Abstract

Handwriting recognition research requires large databases of word images, each labeled with the word it contains. Scanned page images, however, usually contain whole sentences or paragraphs of writing. Creating labeled databases of isolated word images is therefore tedious: a person must drag a rectangle around each word in the full image and type in its label. Transcript mapping is the automatic alignment of the words in a text file with the word locations in the full image, and it can ease the creation of such databases. We propose the first transcript mapping method for handwritten Arabic documents. Our approach is based on Dynamic Time Warping (DTW) and offers two primary algorithmic contributions. The first is an extension to DTW that uses true distances when mapping multiple entries from one series to a single entry in the second series. The second is a method to concurrently map elements of a partially aligned third series within the main alignment. Preliminary results are provided.
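
For readers unfamiliar with DTW, the following is a minimal sketch of the classic algorithm only. It does not reproduce either of the paper's two extensions, whose details the abstract does not give. The function name `dtw_align`, the use of one scalar feature per element, and the absolute-difference cost are all assumptions made for illustration.

```python
from typing import List, Tuple


def dtw_align(a: List[float], b: List[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Classic DTW between two non-empty 1-D series.

    Returns the total alignment cost and the warping path as a list of
    (index_in_a, index_in_b) pairs. Hypothetical helper, not the paper's method.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # one-to-one step
                cost[i - 1][j],      # several entries of a map to one entry of b
                cost[i][j - 1],      # several entries of b map to one entry of a
            )

    # Backtrack from the end, always moving to the cheapest predecessor cell.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Toy data: series a could stand for per-word features from the transcript,
# series b for per-segment features from the image (both invented here).
total, path = dtw_align([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
print(total, path)
```

In the toy call, the second entry of the first series is paired with two consecutive entries of the second series. This many-to-one situation is the one the abstract says the first contribution addresses; plain DTW, as sketched, simply sums the individual local distances along the path.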

Paper Details

Date Published: 29 January 2007
PDF: 8 pages
Proc. SPIE 6500, Document Recognition and Retrieval XIV, 65000W (29 January 2007); doi: 10.1117/12.696140
Liana M. Lorigo, Univ. at Buffalo (United States)
Venu Govindaraju, Univ. at Buffalo (United States)

Published in SPIE Proceedings Vol. 6500:
Document Recognition and Retrieval XIV
Xiaofan Lin; Berrin A. Yanikoglu, Editors

© SPIE