Proceedings Paper

The aware toolbox for the detection of law infringements on web pages
Author(s): Asif Shahab; Thomas Kieninger; Andreas Dengel

Paper Abstract

In the Aware project we aim to develop an automatic assistant for the detection of law infringements on web pages. The motivation for this project is that many authors of web pages at some point infringe copyright or other laws, mostly without being aware of it, and are increasingly confronted with costly legal warnings. As the legal environment is constantly changing, an important requirement of Aware is that the domain knowledge can be defined initially and then maintained by numerous legal experts working remotely, without further assistance from computer scientists. Consequently, the software platform was chosen to be a web-based generic toolbox that can be configured to suit individual analysis experts, definitions of analysis flow, information gathering and report generation. The report generated by the system summarizes all critical elements of a given web page and provides case-specific hints to the page author, and thus forms a new type of service. For the analysis subsystems, Aware mainly builds on existing state-of-the-art technologies, whose usability has been evaluated for each intended task. To control the heterogeneous analysis components and to gather the information, a lightweight scripting shell has been developed. This paper describes the analysis technologies, ranging from text-based information extraction through optical character recognition and phonetic fuzzy string matching to a set of image analysis and retrieval tools, as well as the scripting language used to define the analysis flow.
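
To make the idea of phonetic fuzzy string matching concrete, the following is a minimal sketch only. The abstract does not say which phonetic algorithm Aware uses; Soundex is assumed here purely for illustration, and the names soundex, phonetic_matches, and the sample term lists are hypothetical, not part of the Aware toolbox.

```python
# Illustrative sketch: Soundex is an assumed stand-in for whatever
# phonetic matcher the Aware toolbox actually uses.

# Map consonants to Soundex digits; vowels, h, w and y have no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the four-character American Soundex code of a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
        if len(result) == 4:
            break
    return result.ljust(4, "0")


def phonetic_matches(tokens, protected_terms):
    """Pair each page token with the protected terms that sound alike."""
    index = {}
    for term in protected_terms:
        index.setdefault(soundex(term), []).append(term)
    return [(tok, term) for tok in tokens for term in index.get(soundex(tok), [])]


if __name__ == "__main__":
    # Hypothetical sample data: misspelled brand names found on a page.
    print(phonetic_matches(["Addidas", "Nyke", "shoes"], ["Adidas", "Nike"]))
```

In this sketch a token is flagged when its code equals that of a protected term, so spelling variants that sound alike are caught even though an exact string comparison would miss them. A production matcher would likely combine such a phonetic key with an edit-distance threshold to limit false positives.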

Paper Details

Date Published: 18 January 2010
PDF: 7 pages
Proc. SPIE 7534, Document Recognition and Retrieval XVII, 753407 (18 January 2010); doi: 10.1117/12.839950
Author Affiliations:
Asif Shahab, German Research Ctr. for Artificial Intelligence GmbH (Germany)
Thomas Kieninger, German Research Ctr. for Artificial Intelligence GmbH (Germany)
Andreas Dengel, German Research Ctr. for Artificial Intelligence GmbH (Germany)

Published in SPIE Proceedings Vol. 7534:
Document Recognition and Retrieval XVII
Editor(s): Laurence Likforman-Sulem; Gady Agam

© SPIE.