Proceedings Paper

A general framework for multicharacter segmentation and its application in recognizing multilingual Asian documents
Author(s): Di Wen; Xiaoqing Ding

Paper Abstract

In this paper we propose a general framework for character segmentation in complex multilingual documents. The framework combines the traditionally separate segmentation and recognition processes into a single cooperative system. It consists of three basic steps: Dissection, Local Optimization and Global Optimization, which hierarchically fuse various properties of the segmentation hypotheses into a composite evaluation that decides the final recognition results. Experimental results show that the framework is general enough to be applied to a variety of documents. Finally, we describe a sample system based on this framework that recognizes Chinese, Japanese and Korean documents and report its experimental performance.
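
The abstract names the three steps but does not say how they are carried out. The sketch below is one possible reading, not the authors' implementation. It assumes dissection proposes cut points from a vertical projection profile, local optimization blends a recognizer confidence with a width-based geometry score, and global optimization is a dynamic program over the cut points. The function names (`dissect`, `local_score`, `segment_line`), the `recognize` callback, and the parameters `expected_width`, `weight` and `max_merge` are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical recognizer interface: given a column span [a, b),
# return (label, confidence in [0, 1]). Not taken from the paper.
Recognizer = Callable[[int, int], Tuple[str, float]]


def dissect(profile: Sequence[int]) -> List[int]:
    """Dissection (assumed): cut wherever ink is followed by a blank column."""
    n = len(profile)
    cuts = [0]
    for i in range(1, n):
        if profile[i] == 0 and profile[i - 1] > 0:
            cuts.append(i)
    if cuts[-1] != n:
        cuts.append(n)
    return cuts


def local_score(width: int, confidence: float,
                expected_width: float, weight: float = 0.5) -> float:
    """Local optimization (assumed): blend recognition and geometry evidence."""
    geometry = min(width, expected_width) / max(width, expected_width)
    return weight * confidence + (1.0 - weight) * geometry


def segment_line(profile: Sequence[int], recognize: Recognizer,
                 expected_width: float, max_merge: int = 3) -> List[str]:
    """Global optimization (assumed): best path through the cut points."""
    cuts = dissect(profile)
    best = [float("-inf")] * len(cuts)   # best total score ending at cut j
    back = [(-1, "")] * len(cuts)        # (previous cut index, label)
    best[0] = 0.0
    for j in range(1, len(cuts)):
        # A hypothesis may merge up to max_merge adjacent pieces.
        for i in range(max(0, j - max_merge), j):
            a, b = cuts[i], cuts[j]
            label, confidence = recognize(a, b)
            # Weight by width so paths with different piece counts compare fairly.
            total = best[i] + (b - a) * local_score(b - a, confidence, expected_width)
            if total > best[j]:
                best[j] = total
                back[j] = (i, label)
    # Trace the winning path back from the last cut.
    labels: List[str] = []
    j = len(cuts) - 1
    while j > 0:
        i, label = back[j]
        labels.append(label)
        j = i
    return labels[::-1]
```

To try it, pass a list of per-column ink counts and any function that returns a label and confidence for a column span. Weighting each hypothesis by its width is a design choice made for this sketch, so that merging pieces is not penalized simply for producing fewer terms in the sum.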

Paper Details

Date Published: 15 December 2003
PDF: 8 pages
Proc. SPIE 5296, Document Recognition and Retrieval XI, (15 December 2003); doi: 10.1117/12.528951
Di Wen, Tsinghua Univ. (China)
Xiaoqing Ding, Tsinghua Univ. (China)

Published in SPIE Proceedings Vol. 5296:
Document Recognition and Retrieval XI
Elisa H. Barney Smith; Jianying Hu; James Allan, Editor(s)

© SPIE.