
Proceedings Paper

Character recognition of Japanese newspaper headlines with graphical designs
Author(s): Minako Sawaki; Norihiro Hagita

Paper Abstract

Graphical designs are often used in Japanese newspaper headlines to indicate hot articles. However, conventional OCR software seldom recognizes characters in such headlines because of the difficulty of removing the designs. This paper proposes a method that recognizes these characters without removing the graphical designs. First, the number of text-line regions and the average character heights are roughly extracted from the local distribution of the black and white runs observed in a rectangular window while the window is shifted pixel-by-pixel along the direction of the text-line. Next, normalized text-line regions are obtained by normalizing their heights to the height of the binary reference patterns in a dictionary. Displacement matching is then applied to each normalized text-line region for character recognition: a square window is matched against the binary reference patterns at each position while being shifted pixel-by-pixel along the direction of the text-line. The complementary similarity measure, which is robust against graphical designs, is used as the discriminant function. When the maximum similarity value at a position exceeds a threshold, which is determined automatically from the degree of degradation in the square window, the character category that gives this maximum is taken as the recognized category. Experimental results for fifty Japanese newspaper headlines show that the method achieves recognition rates of over 90%, much higher than that of a conventional method (17%).
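The displacement-matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for the complementary similarity measure, so the form used here, (a*e - b*c) / sqrt(T*(n - T)), is an assumption based on how the measure is commonly stated. The function names, the toy 3x3 patterns, and the fixed threshold are hypothetical; in the paper the threshold is determined automatically from the degradation in the window, which this sketch does not attempt to reproduce.

```python
from math import sqrt


def complementary_similarity(window, reference):
    """Complementary similarity between two flat binary (0/1) sequences.

    Assumed form (not stated in the abstract):
        a = pixels black in both, b = black only in the reference,
        c = black only in the window, e = white in both,
        T = a + b, n = a + b + c + e,
        score = (a*e - b*c) / sqrt(T * (n - T)).
    """
    a = b = c = e = 0
    for f, t in zip(window, reference):
        if f and t:
            a += 1
        elif t:
            b += 1
        elif f:
            c += 1
        else:
            e += 1
    t_count = a + b
    n = a + b + c + e
    denom = sqrt(t_count * (n - t_count))
    if denom == 0:
        return 0.0  # all-black or all-white reference: measure undefined
    return (a * e - b * c) / denom


def displacement_match(line, references, threshold):
    """Slide a square window along a height-normalized binary text line.

    line:       list of rows (each a list of 0/1), height H
    references: dict mapping label -> H x H binary pattern (list of rows)
    threshold:  fixed cut-off (hypothetical stand-in for the paper's
                automatically determined threshold)
    Returns a list of (x, label, score) for positions whose best score
    exceeds the threshold.
    """
    height = len(line)
    width = len(line[0])
    flat_refs = {
        label: [p for row in ref for p in row]
        for label, ref in references.items()
    }
    hits = []
    for x in range(width - height + 1):
        window = [p for row in line for p in row[x:x + height]]
        best_label, best_score = None, float("-inf")
        for label, ref in flat_refs.items():
            score = complementary_similarity(window, ref)
            if score > best_score:
                best_label, best_score = label, score
        if best_label is not None and best_score > threshold:
            hits.append((x, best_label, best_score))
    return hits


# Toy dictionary of two 3x3 reference patterns (illustrative only).
REFERENCES = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}

# A 3-pixel-high text line containing the "I" pattern at x = 1.
LINE = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

if __name__ == "__main__":
    for x, label, score in displacement_match(LINE, REFERENCES, threshold=3.0):
        print(x, label, round(score, 3))
```

Under this assumed form, a window identical to a reference scores sqrt(T*(n - T)) and its pixel-wise complement scores the negative of that value, so a real system would need a threshold that adapts to the reference and to the window content rather than the single constant used here.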

Paper Details

Date Published: 7 March 1996
PDF: 9 pages
Proc. SPIE 2660, Document Recognition III, (7 March 1996); doi: 10.1117/12.234718
Minako Sawaki, NTT Basic Research Labs. (Japan)
Norihiro Hagita, NTT Basic Research Labs. (Japan)

Published in SPIE Proceedings Vol. 2660:
Document Recognition III
Luc M. Vincent; Jonathan J. Hull, Editor(s)
