Proceedings Paper

Training a whole-book LSTM-based recognizer with an optimal training set
Author(s): Mohammad Reza Soheili; Mohammad Reza Yousefi; Ehsanollah Kabir; Didier Stricker

Paper Abstract

Despite recent progress in OCR technologies, whole-book recognition is still a challenging task, particularly for old and historical books, where unknown font faces and the low quality of paper and print add to the difficulty. Pre-trained recognizers and generic methods therefore do not usually perform up to the required standards, and their performance typically degrades on larger-scale recognition tasks such as a whole book. Methods with reportedly low error rates turn out to require a great deal of manual correction. Generally, such methodologies do not make effective use of concepts such as redundancy in whole-book recognition. In this work, we propose to train Long Short-Term Memory (LSTM) networks on a minimal training set obtained from the book to be recognized. We show that by clustering all the sub-words in the book and using the sub-word cluster centers as the training set for the LSTM network, we can train models that outperform any identical network trained on randomly selected pages of the book. In our experiments, we also show that although the sub-word cluster centers are equivalent to about 8 pages of text for a 101-page book, an LSTM network trained on such a set performs competitively with an identical network trained on a set of 60 randomly selected pages of the book.
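
The selection step described in the abstract (cluster the sub-words, keep one center per cluster as training data) can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm, features, or distance measure the authors use, so everything below is an assumption made for illustration: sub-word images are taken to be already reduced to fixed-length feature vectors, clustering is a simple greedy "leader" scheme with a hypothetical distance `threshold`, and each cluster center is taken to be the medoid (the member closest to all others).

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_subwords(features, threshold):
    """Greedy leader clustering (an assumed stand-in, not the paper's method).

    Each vector joins the first cluster whose leader (first member) lies
    within `threshold`; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of indices into `features`.
    """
    clusters = []
    for idx, vec in enumerate(features):
        for members in clusters:
            if euclidean(vec, features[members[0]]) <= threshold:
                members.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters


def cluster_centers(features, clusters):
    """Pick the medoid of each cluster: the member with the smallest
    summed distance to every other member of the same cluster."""
    centers = []
    for members in clusters:
        medoid = min(
            members,
            key=lambda i: sum(euclidean(features[i], features[j]) for j in members),
        )
        centers.append(medoid)
    return centers


# Toy data: six hypothetical sub-word feature vectors forming two groups.
features = [
    (0.0, 0.0), (0.1, 0.0), (0.2, 0.0),
    (5.0, 5.0), (5.2, 5.0), (5.1, 5.0),
]
clusters = cluster_subwords(features, threshold=1.0)
centers = cluster_centers(features, clusters)
print(clusters)  # indices grouped by cluster
print(centers)   # one representative index per cluster
```

Only the sub-words at the indices in `centers` would then need to be transcribed and passed to the recognizer as training samples, which is how a handful of representatives can stand in for many pages of redundant text.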

Paper Details

Date Published: 13 April 2018
PDF: 8 pages
Proc. SPIE 10696, Tenth International Conference on Machine Vision (ICMV 2017), 1069610 (13 April 2018); doi: 10.1117/12.2309615
Author Affiliations
Mohammad Reza Soheili, Kharazmi Univ. (Iran, Islamic Republic of)
Mohammad Reza Yousefi, German Research Ctr. for Artificial Intelligence (Germany)
Ehsanollah Kabir, Tarbiat Modares Univ. (Iran, Islamic Republic of)
Didier Stricker, German Research Ctr. for Artificial Intelligence (Germany)

Published in SPIE Proceedings Vol. 10696:
Tenth International Conference on Machine Vision (ICMV 2017)
Antanas Verikas; Petia Radeva; Dmitry Nikolaev; Jianhong Zhou, Editor(s)
