Journal of Electronic Imaging
Unified layout analysis and text localization framework
A technique for extracting textual information from documents with complex layouts, such as newspapers and journals, is presented. It combines a foreground analysis method with a text localization method. The former segments the page into text and nontext blocks, whereas the latter detects text that may be embedded inside images, charts, diagrams, tables, and similar elements. Detailed experiments on two public databases showed that combining layout analysis and text localization techniques can lead to improved page segmentation and text extraction results.
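
The two-stage combination described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Block` type, the `extract_text_regions` function, and the `stub_localizer` placeholder are hypothetical names introduced here, and the sketch assumes that layout analysis has already produced labeled rectangular blocks. It only shows the control flow, in which text blocks are kept as they are and nontext blocks are passed to a text localizer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumed representation: (x0, y0, x1, y1) in page pixel coordinates.
Box = Tuple[int, int, int, int]


@dataclass(frozen=True)
class Block:
    """A page block as produced by a (hypothetical) layout analysis stage."""
    box: Box
    is_text: bool


def extract_text_regions(
    blocks: List[Block],
    localize_text: Callable[[Box], List[Box]],
) -> List[Box]:
    """Collect text regions from both stages.

    Text blocks are accepted directly; nontext blocks (images, charts,
    tables, ...) are searched for embedded text by `localize_text`.
    """
    regions: List[Box] = []
    for block in blocks:
        if block.is_text:
            regions.append(block.box)
        else:
            regions.extend(localize_text(block.box))
    return regions


def stub_localizer(box: Box) -> List[Box]:
    """Placeholder localizer, for illustration only.

    It pretends that the top strip (at most 10 pixels high) of every
    nontext block contains text. A real localizer would inspect pixels.
    """
    x0, y0, x1, y1 = box
    return [(x0, y0, x1, min(y0 + 10, y1))]


if __name__ == "__main__":
    page = [
        Block((0, 0, 100, 50), is_text=True),     # a paragraph
        Block((0, 60, 100, 160), is_text=False),  # a figure
    ]
    print(extract_text_regions(page, stub_localizer))
```

Passing the localizer as a callable keeps the two stages independent, so either one could be replaced without touching the other.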