Share Email Print
cover

Proceedings Paper

Data mining approaches for information retrieval from genomic databases
Author(s): Donglin Liu; Gautam B. Singh
Format Member Price Non-Member Price
PDF $14.40 $18.00

Paper Abstract

Sequence retrieval in genomic databases is used for finding sequences related to a query sequence specified by a user. Comparison is the main part of the retrieval system in genomic databases. An efficient sequence comparison algorithm is critical in bioinformatics. There are several different algorithms to perform sequence comparison, such as the suffix array based database search, divergence measurement, methods that rely upon the existence of a local similarity between the query sequence and sequences in the database, or common mutual information between query and sequences in DB. In this paper we have described a new method for DNA sequence retrieval based on data mining techniques. Data mining tools generally find patterns among data and have been successfully applied in industries to improve marketing, sales, and customer support operations. We have applied the descriptive data mining techniques to find relevant patterns that are significant for comparing genetic sequences. Relevance feedback score based on common patterns is developed and employed to compute distance between sequences. The contigs of human chromosomes are used to test the retrieval accuracy and the experimental results are presented.

Paper Details

Date Published: 6 April 2000
PDF: 10 pages
Proc. SPIE 4057, Data Mining and Knowledge Discovery: Theory, Tools, and Technology II, (6 April 2000); doi: 10.1117/12.381750
Show Author Affiliations
Donglin Liu, Oakland Univ. (United States)
Gautam B. Singh, Oakland Univ. (United States)


Published in SPIE Proceedings Vol. 4057:
Data Mining and Knowledge Discovery: Theory, Tools, and Technology II
Belur V. Dasarathy, Editor(s)

© SPIE. Terms of Use
Back to Top