Proceedings Paper

Compound character recognition by run-number-based metric distance
Author(s): Uptal Garain; B. B. Chaudhuri

Paper Abstract

This paper concerns automatic OCR of Bangla, a major Indian language script and the fourth most popular script in the world. A Bangla OCR system has to recognize about 300 graphemic shapes, of which 250 are compound characters with quite complex stroke patterns. For recognition of such compound characters, feature-based approaches are less reliable, and template-based approaches are less flexible with respect to variations in font size and style. We combine the positive aspects of the feature-based and template-based approaches by proposing a run-number-based normalized template matching technique for compound character recognition. Run-number vectors are computed for both horizontal and vertical scanning. As the number of scans may vary from pattern to pattern, we normalize and abbreviate the vector. We prove that this normalized and abbreviated vector induces a metric distance. Moreover, the vector is invariant to scaling, insensitive to character style variation, and more effective for complex-shaped characters than for simple-shaped ones. We use this vector representation for matching within a group of compound characters. We observe that the matching is more efficient if the vector is reorganized with respect to the centroid of the pattern. We have tested our approach on a large set of segmented compound characters at different point sizes and in different styles. Italic characters are subject to preprocessing. The overall correct recognition rate is 99.69 percent.
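
The run-number idea in the abstract can be illustrated with a minimal sketch. It assumes a binary bitmap in which 1 marks ink, and it counts the number of ink runs along each horizontal and each vertical scan line. The abstract does not state the paper's normalization and abbreviation scheme, so the `abbreviate` function below (averaging over `k` nearly equal bins) and the L1 distance are illustrative assumptions, as are all function names. The centroid-based reorganization is not shown.

```python
from typing import List, Sequence, Tuple

Bitmap = List[List[int]]  # assumed convention: 1 = ink, 0 = background


def count_runs(line: Sequence[int]) -> int:
    """Count maximal runs of consecutive ink pixels in one scan line."""
    runs = 0
    prev = 0
    for px in line:
        if px == 1 and prev == 0:  # a new run starts at a 0 -> 1 transition
            runs += 1
        prev = px
    return runs


def run_number_vectors(img: Bitmap) -> Tuple[List[int], List[int]]:
    """Return run counts for every row (horizontal) and column (vertical)."""
    horizontal = [count_runs(row) for row in img]
    vertical = [count_runs(col) for col in zip(*img)]
    return horizontal, vertical


def abbreviate(vec: Sequence[float], k: int) -> List[float]:
    """Reduce a vector to length k by averaging over k nearly equal bins.

    This is an assumed stand-in for the paper's normalization and
    abbreviation step, chosen so that patterns with different numbers
    of scan lines become comparable.
    """
    n = len(vec)
    out = []
    for i in range(k):
        lo = i * n // k
        hi = max(lo + 1, (i + 1) * n // k)  # keep every bin non-empty
        chunk = vec[lo:hi]
        out.append(sum(chunk) / len(chunk))
    return out


def feature(img: Bitmap, k: int) -> List[float]:
    """Concatenate the abbreviated horizontal and vertical vectors."""
    horizontal, vertical = run_number_vectors(img)
    return abbreviate(horizontal, k) + abbreviate(vertical, k)


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """City-block distance between two feature vectors (an assumed choice)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def scale(img: Bitmap, f: int) -> Bitmap:
    """Enlarge a bitmap by an integer factor using pixel replication."""
    return [[px for px in row for _ in range(f)] for row in img for _ in range(f)]
```

Under these assumptions, a glyph and its pixel-replicated enlargement produce the same feature vector whenever `k` divides both heights and widths, because stretching a scan line does not change its number of runs. This is one simple way to see why a run-count representation can be insensitive to scaling.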

Paper Details

Date Published: 1 April 1998
PDF: 8 pages
Proc. SPIE 3305, Document Recognition V, (1 April 1998); doi: 10.1117/12.304622
Author Affiliations
Uptal Garain, Indian Statistical Institute (India)
B. B. Chaudhuri, Indian Statistical Institute (India)


Published in SPIE Proceedings Vol. 3305:
Document Recognition V
Daniel P. Lopresti; Jiangying Zhou, Editor(s)

© SPIE