
Proceedings Paper

A general framework for multicharacter segmentation and its application in recognizing multilingual Asian documents
Author(s): Di Wen; Xiaoqing Ding

Paper Abstract

In this paper we propose a general framework for character segmentation in complex multilingual documents. The framework aims to combine the traditionally separate segmentation and recognition processes into a single cooperative system. It consists of three basic steps: Dissection, Local Optimization and Global Optimization. These steps hierarchically fuse various properties of the segmentation hypotheses into a composite evaluation, which decides the final recognition results. Experimental results show that the framework is general enough to be applied to a variety of documents. Finally, we describe a sample system based on this framework that recognizes Chinese, Japanese and Korean documents, and we report its experimental performance.
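
The abstract names the three steps but gives no algorithmic detail, so the following is only a generic illustration of how a dissection, local scoring and global search pipeline can fit together. It is not the authors' implementation. The function names (dissect, segment_line), the projection-profile cut rule, the max_merge limit and the additive path score are all assumptions made for this sketch; in a real system the span score would presumably combine recognizer confidence with geometric and contextual properties.

```python
from typing import Callable, Dict, List, Tuple


def dissect(profile: List[int]) -> List[int]:
    """Dissection (assumed rule): cut wherever ink resumes after a blank column.

    `profile` is a vertical projection profile of a text line, i.e. the
    ink-pixel count of each column. Both line ends are always cut points.
    """
    cuts = {0, len(profile)}
    for x in range(1, len(profile)):
        if profile[x - 1] == 0 and profile[x] > 0:
            cuts.add(x)
    return sorted(cuts)


def segment_line(
    profile: List[int],
    score_span: Callable[[int, int], float],
    max_merge: int = 3,
) -> List[Tuple[int, int]]:
    """Pick the best-scoring sequence of character spans for one text line."""
    cuts = dissect(profile)
    n = len(cuts)

    # Local optimization (assumed form): score every hypothesis that merges
    # up to `max_merge` consecutive primitive segments into one character.
    local: Dict[Tuple[int, int], float] = {}
    for i in range(n - 1):
        for j in range(i + 1, min(i + max_merge, n - 1) + 1):
            local[(i, j)] = score_span(cuts[i], cuts[j])

    # Global optimization (assumed form): dynamic programming over cut
    # points, maximizing the sum of span scores along the whole line.
    best = [float("-inf")] * n
    back = [-1] * n
    best[0] = 0.0
    for j in range(1, n):
        for i in range(max(0, j - max_merge), j):
            candidate = best[i] + local[(i, j)]
            if candidate > best[j]:
                best[j], back[j] = candidate, i

    # Backtrack from the last cut point to recover the chosen spans.
    spans: List[Tuple[int, int]] = []
    j = n - 1
    while j > 0:
        i = back[j]
        spans.append((cuts[i], cuts[j]))
        j = i
    spans.reverse()
    return spans


if __name__ == "__main__":
    # Toy line: a character with a left-right gap, followed by a solid one.
    toy_profile = [1] * 3 + [0] + [1] * 4 + [0] * 2 + [1] * 8 + [0] * 2
    # Hypothetical stand-in for a recognizer: prefer spans about 10 columns wide.
    print(segment_line(toy_profile, lambda a, b: -abs((b - a) - 10)))
```

In this toy run the two halves of the first character are merged into a single span because the merged hypothesis scores better than either half alone, which is the kind of decision that dissection by itself cannot make.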

Paper Details

Date Published: 15 December 2003
PDF: 8 pages
Proc. SPIE 5296, Document Recognition and Retrieval XI, (15 December 2003); doi: 10.1117/12.528951
Author Affiliations
Di Wen, Tsinghua Univ. (China)
Xiaoqing Ding, Tsinghua Univ. (China)


Published in SPIE Proceedings Vol. 5296:
Document Recognition and Retrieval XI
Elisa H. Barney Smith; Jianying Hu; James Allan, Editor(s)
