Proceedings Paper

Document reconstruction by layout analysis of snippets
Author(s): Florian Kleber; Markus Diem; Robert Sablatnig

Paper Abstract

Document analysis is typically used to analyze entire forms (e.g. intelligent form analysis, table detection) or to describe the layout or structure of a document. Skew detection of scanned documents is likewise performed to support OCR algorithms that are sensitive to skew. In this paper, document analysis is applied to snippets of torn documents to calculate features for their reconstruction. Documents may be destroyed deliberately to make the printed content unavailable (e.g. in cases relevant to tax fraud investigation or business crime) or may degenerate over time, as with ancient documents kept under bad storage conditions. Current reconstruction methods for manually torn documents rely on shape, inpainting, and texture synthesis techniques. This paper shows that document analysis of snippets can support the matching algorithm by providing additional features, obtained through a rotational analysis, a color analysis, and a line detection. As future work, it is planned to extend the feature set with the paper type (blank, checked, lined), the type of writing (handwritten vs. machine printed), and the text layout of a snippet (text size, line spacing). Preliminary results show that these pre-processing steps can be performed reliably on a real dataset consisting of 690 snippets.
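
The abstract does not say how the rotational analysis is computed, so the following is only a minimal sketch of one generic way to estimate the dominant text-line orientation of a snippet: a projection-profile search over candidate angles. It is not the authors' method. The function names `estimate_skew_deg` and `synthetic_snippet`, the search range of -45 to 45 degrees, and the 1-degree step are all assumptions made for illustration.

```python
import numpy as np


def estimate_skew_deg(mask, angles_deg=None):
    """Estimate the tilt (in degrees) of roughly parallel text lines.

    `mask` is a 2-D boolean array whose True pixels are ink. A positive
    result means the lines descend to the right in image coordinates
    (row index grows downward). Hypothetical helper, not from the paper.
    """
    if angles_deg is None:
        angles_deg = np.arange(-45.0, 46.0, 1.0)  # assumed search range
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask contains no foreground pixels")

    best_angle, best_score = 0.0, -1.0
    for angle in angles_deg:
        t = np.deg2rad(angle)
        # Coordinate perpendicular to lines tilted by `angle`: constant
        # along any line with direction (cos t, sin t).
        p = ys * np.cos(t) - xs * np.sin(t)
        edges = np.arange(np.floor(p.min()), np.ceil(p.max()) + 2.0)
        hist, _ = np.histogram(p, bins=edges)
        # Sum of squared bin counts is largest when ink piles up in a
        # few bins, i.e. when the candidate angle matches the lines.
        score = float(np.sum(hist.astype(np.float64) ** 2))
        if score > best_score:
            best_angle, best_score = float(angle), score
    return best_angle


def synthetic_snippet(angle_deg, size=300, half_len=80,
                      offsets=(-40, -20, 0, 20, 40)):
    """Draw a few one-pixel-wide parallel lines tilted by `angle_deg`."""
    mask = np.zeros((size, size), dtype=bool)
    t = np.deg2rad(angle_deg)
    c = size / 2.0
    s = np.arange(-half_len, half_len + 1)
    for off in offsets:
        xs = np.rint(c + s * np.cos(t) - off * np.sin(t)).astype(int)
        ys = np.rint(c + s * np.sin(t) + off * np.cos(t)).astype(int)
        mask[ys, xs] = True
    return mask


if __name__ == "__main__":
    print(estimate_skew_deg(synthetic_snippet(10.0)))
```

On the synthetic input above the estimate should land close to 10 degrees. A real snippet would first need binarization, and a torn fragment with only a few words may not give a clear peak, so the score itself could serve as a rough confidence value.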

Paper Details

Date Published: 16 February 2010
PDF: 11 pages
Proc. SPIE 7531, Computer Vision and Image Analysis of Art, 753107 (16 February 2010); doi: 10.1117/12.843687
Author Affiliations
Florian Kleber, Vienna Univ. of Technology (Austria)
Markus Diem, Vienna Univ. of Technology (Austria)
Robert Sablatnig, Vienna Univ. of Technology (Austria)


Published in SPIE Proceedings Vol. 7531:
Computer Vision and Image Analysis of Art
David G. Stork; Jim Coddington; Anna Bentkowska-Kafel, Editor(s)
