
Proceedings Paper

A hybrid intelligence approach to artifact recognition in digital publishing

Paper Abstract

The system presented integrates rule-based and case-based reasoning for artifact recognition in digital publishing. In Variable Data Printing (VDP), human proofing can be prohibitive because a job may contain millions of distinct instances, any of which may contain two types of artifacts: 1) evident defects, such as text overflow or overlapping, and 2) style-dependent artifacts, subtle defects that appear as inconsistencies with the original job design. We designed a Knowledge-Based Artifact Recognition tool for document segmentation, layout understanding, artifact detection, and document design quality assessment. Document evaluation is constrained by comparing the remaining instances of the VDP job against one instance that a human expert has proofed. The rule-based component applies fundamental rules of document design to document segmentation and layout understanding. Ambiguities in the design principles that the rule-based system does not cover are analyzed by case-based reasoning using the Nearest Neighbor Algorithm, in which features from previous jobs are used to detect artifacts and inconsistencies within the document layout. We used a subset of XSL-FO and assembled a set of 44 document samples. The system detected all the job layout changes and obtained an overall average accuracy of 84.56%, with the highest accuracy, 92.82%, for overlapping and the lowest, 66.7%, for lack of white space.
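
The case-based step described in the abstract can be illustrated with a minimal nearest-neighbor sketch. This is not the authors' implementation: the feature names (`overlap_ratio`, `whitespace_ratio`, `overflow_ratio`), the stored cases, and the Euclidean distance are all assumptions chosen for illustration, since the abstract does not specify the features or the distance measure used.

```python
import math

# Hypothetical case base: each case pairs a feature vector extracted from a
# previously proofed layout with the label a human expert assigned to it.
# Feature order (assumed): (overlap_ratio, whitespace_ratio, overflow_ratio)
CASE_BASE = [
    ((0.00, 0.30, 0.00), "ok"),
    ((0.25, 0.28, 0.00), "overlap"),
    ((0.00, 0.05, 0.00), "lack-of-white-space"),
    ((0.00, 0.30, 0.15), "text-overflow"),
]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(features, case_base=CASE_BASE):
    """Return the label of the stored case closest to `features` (1-NN)."""
    _, label = min(case_base, key=lambda case: euclidean(features, case[0]))
    return label


if __name__ == "__main__":
    # A new VDP instance whose features resemble the stored overlap case.
    print(classify((0.22, 0.27, 0.01)))
```

In a hybrid design like the one the abstract outlines, a function such as `classify` would be consulted only for layouts that the rule-based component could not resolve.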

Paper Details

Date Published: 10 February 2006
PDF: 10 pages
Proc. SPIE 6076, Digital Publishing, 60760B (10 February 2006); doi: 10.1117/12.646240
J. Fernando Vega-Riveros, Univ. of Puerto Rico, Mayagüez (United States)
Hector J. Santos Villalobos, Univ. of Puerto Rico, Mayagüez (United States)


Published in SPIE Proceedings Vol. 6076:
Digital Publishing
Jan P. Allebach; Hui Chao, Editor(s)
