
Proceedings Paper

Recognizing characters of ancient manuscripts
Author(s): Markus Diem; Robert Sablatnig

Paper Abstract

For printed Latin text, the main issues of Optical Character Recognition (OCR) systems are solved. For degraded handwritten document images, however, basic preprocessing steps such as binarization yield poor results even with state-of-the-art methods. In this paper, ancient Slavonic manuscripts from the 11th century are investigated. To minimize the consequences of false character segmentation, a binarization-free approach based on local descriptors is proposed. In addition, local information allows the recognition of partially visible or washed-out characters. The proposed algorithm consists of two steps: character classification and character localization. First, Scale Invariant Feature Transform (SIFT) features are extracted and classified using Support Vector Machines (SVM). The interest points are then clustered according to their spatial information. In this way, characters are localized and finally recognized by a weighted voting scheme over the pre-classified local descriptors. Preliminary results show that the proposed system can handle highly degraded manuscript images with background clutter (e.g. stains, tears) and faded-out characters.
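The localization and voting stage described in the abstract can be illustrated with a minimal sketch. It assumes the interest points have already been extracted and pre-classified, so each point carries a position, a predicted character label, and a confidence weight. The abstract does not specify the clustering method or the weighting, so the single-linkage grouping by a distance threshold, the use of classifier confidence as the vote weight, and all names (`cluster_points`, `vote`, `recognize`, `radius`) are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict
from math import hypot


def cluster_points(points, radius):
    """Group point indices by single linkage: two points share a cluster
    if a chain of neighbours no farther apart than `radius` connects them.
    (Assumed clustering rule; the paper's method may differ.)"""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (xi, yi), (xj, yj) = points[i], points[j]
            if hypot(xi - xj, yi - yj) <= radius:
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i in range(len(points)):
        clusters[find(i)].append(i)
    return list(clusters.values())


def vote(labels, weights, members):
    """Return the label with the largest summed weight among `members`."""
    scores = defaultdict(float)
    for i in members:
        scores[labels[i]] += weights[i]
    return max(scores, key=scores.get)


def recognize(points, labels, weights, radius):
    """Localize characters as cluster centroids and label each by weighted vote."""
    results = []
    for members in cluster_points(points, radius):
        cx = sum(points[i][0] for i in members) / len(members)
        cy = sum(points[i][1] for i in members) / len(members)
        results.append(((cx, cy), vote(labels, weights, members)))
    return sorted(results)


# Hypothetical pre-classified interest points: two spatial groups.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
labels = ["a", "b", "a", "c", "d"]
weights = [0.4, 0.9, 0.6, 0.3, 0.8]
for centre, label in recognize(points, labels, weights, radius=2.0):
    print(centre, label)
```

In this toy input the first group contains one confident "b" descriptor and two weaker "a" descriptors; the summed weights still favour "a", which shows how several weak but consistent local votes can outweigh a single strong outlier.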

Paper Details

Date Published: 16 February 2010
PDF: 12 pages
Proc. SPIE 7531, Computer Vision and Image Analysis of Art, 753106 (16 February 2010); doi: 10.1117/12.843532
Markus Diem, Vienna Univ. of Technology (Austria)
Robert Sablatnig, Vienna Univ. of Technology (Austria)

Published in SPIE Proceedings Vol. 7531:
Computer Vision and Image Analysis of Art
David G. Stork; Jim Coddington; Anna Bentkowska-Kafel, Editor(s)

© SPIE