Proceedings Paper

Finding relevant PDF medical journal articles by the content of their figures
Author(s): Ammon Christiansen; Dah-Jye Lee; Yuchou Chang

Paper Abstract

Literature review is time-consuming because relevant articles are hard to find, yet it is important because it lets researchers find solutions to their questions and problems in work already performed and published by others. It is difficult to wade through documents quickly and assess their quality by looking only at the title, abstract, or even the full text. The human visual system lets us glance at images, infer the main subject of an article, and decide whether we are interested in reading more. In some cases, such as biology articles, figures showing photos of experimental results let a researcher in the literature review phase quickly judge the quality of the work by its results. This work describes a system for literature review that uses content-based image retrieval (CBIR) techniques to search for relevant documents by the content of their figures, with relevance feedback refinement, instead of keyword search guesswork. The long-term goal is to use it as a subsystem of a content-based document retrieval system in which the figures and their captions are combined with the document's body text. This paper describes how documents are processed to extract the available raster graphics as well as the text, with its layout and formatting information intact. The process of matching a figure to its caption using this layout information is then described. Figure-caption matching is complete; caption-based search is implemented but not yet merged into the system. Two novel modified tf-idf measures under consideration are detailed mathematically and explained intuitively; they take into account bold/italic text, font size, and document structure to infer text importance rather than relying on term frequency alone. CBIR queries that consist of multiple images are issued as separate queries, and their results are then merged.
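
The abstract does not state the two modified tf-idf measures, so the following is only an illustrative sketch of the general idea of letting formatting raise a term's weight. It is not the authors' formulation. The function names, the bonus constants, the font-size scaling, and the token tuple layout are all assumptions made for this example.

```python
import math
from collections import defaultdict

# Hypothetical emphasis bonuses; the paper's actual weights are not given here.
BOLD_BONUS = 0.5
ITALIC_BONUS = 0.25


def token_weight(bold, italic, font_size, body_font_size=10.0):
    """Weight of one token occurrence: 1.0 for plain body text, more if emphasized."""
    weight = 1.0
    if bold:
        weight += BOLD_BONUS
    if italic:
        weight += ITALIC_BONUS
    # Text larger than the body font (e.g. headings) is scaled up; smaller text is not penalized.
    weight *= max(font_size / body_font_size, 1.0)
    return weight


def weighted_tf_idf(docs):
    """Score terms with a formatting-weighted tf multiplied by the usual idf.

    docs maps doc_id -> list of (term, bold, italic, font_size) tuples.
    Returns doc_id -> {term: score}.
    """
    n_docs = len(docs)
    weighted_counts = {}
    doc_freq = defaultdict(int)
    for doc_id, tokens in docs.items():
        counts = defaultdict(float)
        for term, bold, italic, font_size in tokens:
            counts[term.lower()] += token_weight(bold, italic, font_size)
        weighted_counts[doc_id] = counts
        for term in counts:
            doc_freq[term] += 1

    scores = {}
    for doc_id, counts in weighted_counts.items():
        total = sum(counts.values()) or 1.0  # normalize by total weighted mass
        scores[doc_id] = {
            term: (weight / total) * math.log(n_docs / doc_freq[term])
            for term, weight in counts.items()
        }
    return scores


docs = {
    "a": [("tumor", True, False, 14.0), ("scan", False, False, 10.0)],
    "b": [("scan", False, False, 10.0), ("image", False, False, 10.0)],
}
scores = weighted_tf_idf(docs)
```

In this toy input, the bold 14-point "tumor" in document "a" outscores the plain "image" in document "b" even though each occurs once, while "scan", which appears in every document, scores zero because its idf is zero.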

Paper Details

Date Published: 21 March 2007
PDF: 12 pages
Proc. SPIE 6516, Medical Imaging 2007: PACS and Imaging Informatics, 65160K (21 March 2007); doi: 10.1117/12.709911
Ammon Christiansen, Brigham Young Univ. (United States)
Dah-Jye Lee, Brigham Young Univ. (United States)
Yuchou Chang, Brigham Young Univ. (United States)


Published in SPIE Proceedings Vol. 6516:
Medical Imaging 2007: PACS and Imaging Informatics
Steven C. Horii; Katherine P. Andriole, Editor(s)
