
Proceedings Paper

SemiBoost-based Arabic character recognition method
Author(s): Bing Su; Liangrui Peng; Xiaoqing Ding

Paper Abstract

A SemiBoost-based character recognition method is introduced to incorporate information from unlabeled practical samples in the training stage. One of the key problems in semi-supervised learning is the criterion for selecting unlabeled samples. In this paper, a criterion based on pair-wise sample similarity is adopted to guide the SemiBoost learning process. At each iteration, unlabeled examples are selected and assigned labels. The selected samples are used along with the original labeled samples to train a new classifier. The trained classifiers are then combined to form the final classifier. An empirical study on several pairs of similar Arabic characters, with differing degrees of similarity, shows that the proposed method improves performance because the unlabeled samples reveal the distribution of practical samples.
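
The iteration described in the abstract can be illustrated with a minimal sketch for a single two-class problem (for example, one pair of similar characters). The abstract gives no formulas, so everything below is an assumption rather than the authors' implementation: the confidence terms `p` and `q` and the weight `alpha` follow the commonly cited form of the general SemiBoost algorithm, the pair-wise similarity is a Gaussian (RBF) kernel, the weak learner is a nearest-centroid classifier, and the names `semiboost`, `top_frac`, `C`, and `sigma` are hypothetical.

```python
import math

def rbf(a, b, sigma=1.0):
    """Pair-wise similarity (assumed here to be a Gaussian kernel)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2 * sigma ** 2))

def train_centroid(X, y):
    """Weak learner (an assumption): assign the label of the nearest class centroid."""
    cents = {}
    for label in (-1, 1):
        pts = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(c) / len(pts) for c in zip(*pts)]
    def predict(x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, cents[1]))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, cents[-1]))
        return 1 if d_pos <= d_neg else -1
    return predict

def semiboost(Xl, yl, Xu, rounds=5, top_frac=0.5, C=1.0, sigma=1.0):
    """Simplified SemiBoost-style loop; labels in yl must be -1 or +1."""
    n = len(Xu)
    ensemble = []            # list of (alpha, classifier) pairs
    H = [0.0] * n            # current ensemble score on each unlabeled sample
    S_ul = [[rbf(u, l, sigma) for l in Xl] for u in Xu]
    S_uu = [[rbf(u, v, sigma) for v in Xu] for u in Xu]
    for _ in range(rounds):
        p, q = [], []        # confidence that sample i is positive / negative
        for i in range(n):
            pos = sum(S_ul[i][j] for j, t in enumerate(yl) if t == 1)
            neg = sum(S_ul[i][j] for j, t in enumerate(yl) if t == -1)
            p.append(pos * math.exp(-2 * H[i]) + C / 2 *
                     sum(S_uu[i][j] * math.exp(H[j] - H[i]) for j in range(n)))
            q.append(neg * math.exp(2 * H[i]) + C / 2 *
                     sum(S_uu[i][j] * math.exp(H[i] - H[j]) for j in range(n)))
        # Select the most confident unlabeled samples and assign pseudo-labels.
        order = sorted(range(n), key=lambda i: -abs(p[i] - q[i]))
        chosen = order[:max(1, int(top_frac * n))]
        X_train = list(Xl) + [Xu[i] for i in chosen]
        y_train = list(yl) + [1 if p[i] >= q[i] else -1 for i in chosen]
        # Train a new classifier on labeled plus pseudo-labeled samples.
        h = train_centroid(X_train, y_train)
        hu = [h(x) for x in Xu]
        num = sum(p[i] if hu[i] == 1 else q[i] for i in range(n))
        den = sum(p[i] if hu[i] == -1 else q[i] for i in range(n))
        if num <= den:       # the new classifier would get a non-positive weight
            break
        alpha = 0.25 * math.log(num / den)
        ensemble.append((alpha, h))
        H = [H[i] + alpha * hu[i] for i in range(n)]
    def classify(x):
        """Final classifier: sign of the weighted vote of all trained classifiers."""
        return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
    return classify

# Toy data: two labeled points and four unlabeled points in two clusters.
Xl, yl = [(0.0, 0.0), (4.0, 4.0)], [-1, 1]
Xu = [(0.5, 0.2), (0.1, 0.6), (3.8, 4.1), (4.3, 3.6)]
classify = semiboost(Xl, yl, Xu)
```

Calling `classify` on a new feature vector returns -1 or +1. In a real recognizer the feature vectors would come from character images and the weak learner would be whatever base classifier the system already uses; neither is specified in the abstract.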

Paper Details

Date Published: 24 January 2011
PDF: 9 pages
Proc. SPIE 7874, Document Recognition and Retrieval XVIII, 787409 (24 January 2011); doi: 10.1117/12.876622
Bing Su, Tsinghua Univ. (China)
Liangrui Peng, Tsinghua Univ. (China)
Xiaoqing Ding, Tsinghua Univ. (China)

Published in SPIE Proceedings Vol. 7874:
Document Recognition and Retrieval XVIII
Gady Agam; Christian Viard-Gaudin, Editor(s)

© SPIE. Terms of Use