Proceedings Paper

An approach to the segmentation of multi-page document flow using binary classification
Author(s): Onur Agin; Cagdas Ulas; Mehmet Ahat; Can Bekar

Paper Abstract

In this paper, we present a method for segmenting a flow of document pages, applied to heterogeneous real-world bank documents. The approach is based on image content and also incorporates font-based features extracted from the documents. Our method applies a bag of visual words (BoVW) model to purpose-designed image-based feature descriptors, together with a novel way of combining consecutive pages of a document into a single feature vector that represents the transition between those pages. Each transition belongs to one of two classes: continuation of the same document or beginning of a new document. Using the transition feature vectors, we apply three different binary classifiers to predict the relationship between consecutive pages. Our initial results suggest that the proposed method shows promising performance for document flow segmentation.
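
The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not state how local descriptors are computed, how consecutive pages are combined, or which three classifiers are used, so everything below is an assumption made for illustration: `bovw_histogram`, `transition_vector`, and `segment_flow` are hypothetical names, the codebook is taken as given, and the transition vector is formed by concatenating the two page histograms with their absolute difference. The binary classifier is left out; `segment_flow` only shows how per-transition predictions would turn into document boundaries.

```python
import numpy as np


def bovw_histogram(descriptors, codebook):
    """Quantize local descriptors against a codebook.

    Returns an L1-normalized histogram of visual-word counts.
    """
    # Squared Euclidean distance from every descriptor to every codeword.
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)  # index of the nearest codeword
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist


def transition_vector(hist_a, hist_b):
    """Combine two consecutive page histograms into one feature vector.

    Assumed scheme (not taken from the paper): both histograms
    followed by their element-wise absolute difference.
    """
    return np.concatenate([hist_a, hist_b, np.abs(hist_a - hist_b)])


def segment_flow(num_pages, new_doc_flags):
    """Group page indices into documents from binary transition labels.

    new_doc_flags[i] is 1 if page i + 1 begins a new document and
    0 if it continues the document that page i belongs to.
    """
    if num_pages == 0:
        return []
    documents = [[0]]
    for i, flag in enumerate(new_doc_flags):
        if flag:
            documents.append([i + 1])
        else:
            documents[-1].append(i + 1)
    return documents
```

In use, each page would be reduced to a histogram, each pair of neighbouring pages to a transition vector, and a trained binary classifier would supply the flags: for five pages with predicted flags `[0, 1, 0, 1]`, `segment_flow(5, [0, 1, 0, 1])` groups the pages as `[[0, 1], [2, 3], [4]]`.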

Paper Details

Date Published: 4 March 2015
PDF: 7 pages
Proc. SPIE 9443, Sixth International Conference on Graphic and Image Processing (ICGIP 2014), 944311 (4 March 2015); doi: 10.1117/12.2178778
Author Affiliations
Onur Agin, Yapı Kredi Bank (Turkey)
Cagdas Ulas, Yapı Kredi Bank (Turkey)
Mehmet Ahat, Yapı Kredi Bank (Turkey)
Can Bekar, Yapı Kredi Bank (Turkey)

Published in SPIE Proceedings Vol. 9443:
Sixth International Conference on Graphic and Image Processing (ICGIP 2014)
Yulin Wang; Xudong Jiang; David Zhang, Editor(s)
