Exploiting major trends in subject hierarchies for large-scale collection visualization
Paper Abstract
Many large digital collections are currently organized by subject; however, these subject structures, though useful,
are large and complex, making them difficult to browse. Current online tools and visualization prototypes
show small localized subsets and do not provide the ability to explore the predominant patterns of the overall subject
structure. This research addresses this issue by simplifying the subject structure using two techniques based on the
highly uneven distribution of real-world collections: level compression and child pruning. The approach is demonstrated
using a sample of 130K records organized by the Library of Congress Subject Headings (LCSH). The results are promising:
the subject hierarchy can be reduced to 42% of its initial size while maintaining access to 81% of the
collection. The visual impact is demonstrated using a traditional outline view that lets searchers dynamically adjust
the level of complexity they consider necessary for the task at hand.
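The abstract names child pruning and level compression without defining them, so the following Python sketch is only one plausible reading, not the paper's method. It assumes that child pruning drops a child heading whose subtree holds less than a chosen share of its parent's records, and that level compression merges a heading with no records of its own and exactly one child into that child. The `Node` class, the `min_share` threshold, and the toy subject tree are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    records: int = 0  # records assigned directly to this heading
    children: list = field(default_factory=list)


def total(node):
    """Records reachable at or below this heading."""
    return node.records + sum(total(c) for c in node.children)


def size(node):
    """Number of headings in the subtree rooted at this heading."""
    return 1 + sum(size(c) for c in node.children)


def prune_children(node, min_share):
    """Assumed rule: keep a child only if its subtree holds at least
    min_share of the records under its parent."""
    t = total(node)
    kept = [c for c in node.children if t and total(c) / t >= min_share]
    return Node(node.label, node.records,
                [prune_children(c, min_share) for c in kept])


def compress_levels(node):
    """Assumed rule: merge a heading that has no records of its own and
    exactly one child into that child, removing one level."""
    children = [compress_levels(c) for c in node.children]
    if node.records == 0 and len(children) == 1:
        child = children[0]
        return Node(f"{node.label} / {child.label}",
                    child.records, child.children)
    return Node(node.label, node.records, children)


# Hypothetical toy hierarchy (not LCSH data).
root = Node("Science", 0, [
    Node("Physics", 50, [Node("Optics", 30), Node("Acoustics", 2)]),
    Node("Chemistry", 0, [Node("Organic chemistry", 15)]),
    Node("Astrology", 3),
])

pruned = prune_children(root, min_share=0.05)
simplified = compress_levels(pruned)

print(f"headings kept: {size(simplified)} of {size(root)}")
print(f"records still reachable: {total(simplified)} of {total(root)}")
```

Raising or lowering `min_share` trades hierarchy size against collection coverage, which is the kind of adjustable complexity the outline view is described as exposing.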
This paper was published in SPIE Proceedings Vol. 8294