Proceedings Paper

Online scientific data curation, publication, and archiving
Author(s): Jim Gray; Alexander S. Szalay; Ani R. Thakar; Christopher Stoughton; Jan vandenBerg

Paper Abstract

Science projects are data publishers. The scale and complexity of current and future science data change the nature of the publication process. Publication is becoming a major project component. At a minimum, a project must preserve the ephemeral data it gathers. Derived data can be reconstructed from metadata, but metadata is ephemeral. Longer term, a project should expect some archive to preserve the data. We observe that published scientific data needs to be available forever; this gives rise to the data pyramid of versions and to data inflation, where the derived data volumes explode. As an example, this article describes the Sloan Digital Sky Survey (SDSS) strategies for data publication, data access, curation, and preservation.

Paper Details

Date Published: 16 December 2002
PDF: 5 pages
Proc. SPIE 4846, Virtual Observatories, (16 December 2002); doi: 10.1117/12.461524
Author Affiliations:
Jim Gray, Microsoft Corp. (United States)
Alexander S. Szalay, The Johns Hopkins Univ. (United States)
Ani R. Thakar, The Johns Hopkins Univ. (United States)
Christopher Stoughton, Fermi National Accelerator Lab. (United States)
Jan vandenBerg, The Johns Hopkins Univ. (United States)

Published in SPIE Proceedings Vol. 4846:
Virtual Observatories
Alexander S. Szalay, Editor(s)

© SPIE.