Proceedings Paper

Document image decoding in the UC Berkeley Digital Library
Author(s): Gary E. Kopec

Paper Abstract

The UC Berkeley Environmental Digital Library Project is one of six university-led projects that were initiated in the fall of 1994 as part of a four-year digital library initiative sponsored by the NSF, NASA, and ARPA. The Berkeley project is particularly interesting from a document image analysis perspective because its testbed collection consists almost entirely of scanned materials. As a result, the Berkeley project is making extensive use of document recognition and other image analysis technology to provide content-based access to the collection. The Document Image Decoding (DID) group at Xerox PARC is a member of the Berkeley team and is investigating the application of DID techniques to providing high-quality (accurate and properly structured) transcriptions of scanned documents in the collection. This paper briefly describes the Berkeley project, discusses some of its recognition requirements and presents examples of online structured documents created using DID technology.

Paper Details

Date Published: 7 March 1996
PDF: 12 pages
Proc. SPIE 2660, Document Recognition III, (7 March 1996); doi: 10.1117/12.234702
Gary E. Kopec, Xerox Palo Alto Research Ctr. (United States)

Published in SPIE Proceedings Vol. 2660:
Document Recognition III
Editor(s): Luc M. Vincent; Jonathan J. Hull
