Proceedings Paper

Toward building a comprehensive data mart
Author(s): Douglas Boulware; John Salerno; Richard Bleich; Michael L. Hinman

Paper Abstract

To uncover new relationships or patterns, one must first build a corpus of data, or what some call a data mart. How can we make sure that we have collected all the pertinent data and maximized coverage? Hundreds of search engines are available on the Internet today. Which one is best? Is one better for one problem and another better for a different problem? Are meta-search engines better than individual search engines? In this paper we look at one possible approach to developing a methodology for comparing a number of search engines. Before presenting this methodology, we first explain our motivation for seeking increased coverage. We next investigate how ground truth can be obtained and what insight it can provide into the Internet and search engine capabilities. We conclude by developing a methodology for comparing a number of search engines and by considering how overall coverage can be increased, yielding a more comprehensive data mart.
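
The abstract does not state how coverage is measured, so the following is only an illustrative sketch, not the authors' method. It assumes coverage means the fraction of a known ground-truth set of URLs that a search engine returns, and that combining engines means taking the union of their results. The function names, engine names, and sample URLs are hypothetical.

```python
def coverage(results, ground_truth):
    """Fraction of ground-truth items found in one engine's results.

    Assumption: coverage is a recall-style ratio; the paper may define it differently.
    """
    truth = set(ground_truth)
    if not truth:
        raise ValueError("ground truth must not be empty")
    return len(truth & set(results)) / len(truth)


def combined_coverage(results_by_engine, ground_truth):
    """Coverage obtained by pooling (taking the union of) all engines' results."""
    pooled = set().union(*results_by_engine.values())
    return coverage(pooled, ground_truth)


# Hypothetical data: four ground-truth pages and two engines.
ground_truth = {"site/a", "site/b", "site/c", "site/d"}
results_by_engine = {
    "engine_1": ["site/a", "site/b", "other/x"],
    "engine_2": ["site/b", "site/c"],
}

for name, results in results_by_engine.items():
    print(name, coverage(results, ground_truth))
print("combined", combined_coverage(results_by_engine, ground_truth))
```

Under these assumptions, each engine alone finds half of the ground truth, while the pooled results find three quarters of it, which is the kind of gain the abstract refers to as increased overall coverage.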

Paper Details

Date Published: 12 April 2004
PDF: 10 pages
Proc. SPIE 5433, Data Mining and Knowledge Discovery: Theory, Tools, and Technology VI, (12 April 2004); doi: 10.1117/12.542928
Author Affiliations:
Douglas Boulware, Air Force Research Lab. (United States)
John Salerno, Air Force Research Lab. (United States)
Richard Bleich, Air Force Research Lab. (United States)
Michael L. Hinman, Air Force Research Lab. (United States)

Published in SPIE Proceedings Vol. 5433:
Data Mining and Knowledge Discovery: Theory, Tools, and Technology VI
Belur V. Dasarathy, Editor(s)