Proceedings Paper

Google Books: making the public domain universally accessible
Author(s): Adam Langley; Dan S. Bloomberg

Paper Abstract

Google Book Search is working with libraries and publishers around the world to digitally scan books. Some of those works are now in the public domain and, in keeping with Google's mission to make all the world's information useful and universally accessible, we wish to allow users to download them all. For users, it is important that the files are as small as possible and of printable quality. This means that a single codec for both text and images is impractical. We use PDF as a container for a mixture of JBIG2 and JPEG2000 images which are composed into a final set of pages. We discuss both the implementation of an open source JBIG2 encoder, which we use to compress text data, and the design of the infrastructure needed to meet the technical, legal and user requirements of serving many scanned works. We also cover the lessons learnt about dealing with different PDF readers and how to write files that work on most of the readers, most of the time.
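
As a rough illustration of the container idea in the abstract (a sketch only, not the authors' implementation), the snippet below builds the dictionary header for a PDF image stream and selects the decode filter by codec. The filter names /JBIG2Decode and /JPXDecode come from the PDF specification; the function name, its parameters, and the choice of color spaces are assumptions made for this example.

```python
def image_xobject_dict(width: int, height: int, codec: str, length: int) -> str:
    """Return the dictionary text for a PDF image XObject stream.

    Hypothetical helper: `codec` is "jbig2" for bitonal text layers
    or "jpeg2000" for continuous-tone image layers.
    """
    if codec == "jbig2":
        # JBIG2 carries bitonal data: one bit per component.
        pdf_filter, bits, colorspace = "/JBIG2Decode", 1, "/DeviceGray"
    elif codec == "jpeg2000":
        # Assumption: 8-bit RGB; a real encoder would read this from the image.
        pdf_filter, bits, colorspace = "/JPXDecode", 8, "/DeviceRGB"
    else:
        raise ValueError(f"unsupported codec: {codec!r}")

    return (
        "<< /Type /XObject /Subtype /Image "
        f"/Width {width} /Height {height} "
        f"/ColorSpace {colorspace} /BitsPerComponent {bits} "
        f"/Filter {pdf_filter} /Length {length} >>"
    )


if __name__ == "__main__":
    # Example sizes and byte lengths are made up for illustration.
    print(image_xobject_dict(2550, 3300, "jbig2", 18432))
    print(image_xobject_dict(600, 400, "jpeg2000", 52110))
```

A page would then draw one or more such image objects on top of each other, so that text regions and photographic regions can each use the codec that suits them.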

Paper Details

Date Published: 29 January 2007
PDF: 10 pages
Proc. SPIE 6500, Document Recognition and Retrieval XIV, 65000H (29 January 2007); doi: 10.1117/12.710609
Author Affiliations
Adam Langley, Google Inc. (United States)
Dan S. Bloomberg, Google Inc. (United States)

Published in SPIE Proceedings Vol. 6500:
Document Recognition and Retrieval XIV
Editor(s): Xiaofan Lin; Berrin A. Yanikoglu

© SPIE.