Proceedings Paper

Toward text understanding: classification of text documents by word map
Author(s): Ari J. E. Visa; Jarmo Toivanen; Barbro Back; Hannu Vanharanta

Paper Abstract

In many fields, for example business, engineering, and law, there is interest in searching and classifying text documents in large databases. Methods exist for information retrieval, but they are mainly based on keywords, and retrieval becomes problematic when keywords are lacking. One approach is to use the whole text document as a search key. Neural networks offer an adaptive tool for this purpose. This paper suggests a new adaptive approach to the problem of clustering and search in large text document databases. The approach is a multilevel one based on word-, sentence-, and paragraph-level maps. Here only the word map level is reported. The reported approach is based on smart encoding, on Self-Organizing Maps, and on document histograms. The results are very promising.
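
The abstract names three ingredients (word encoding, Self-Organizing Maps, and document histograms) without giving details. As a rough illustration only, the Python sketch below shows one way such a word-map pipeline might be put together: each word is encoded as a vector, a small map is trained on the word vectors, and each document is summarized by a histogram of the map units its words fall on. The character-folding `encode_word`, the one-dimensional map, its size, the learning schedule, and the sample documents are all assumptions made for this sketch; they are not taken from the paper and should not be read as the authors' method.

```python
import math
import random
from collections import Counter


def encode_word(word, dim=8):
    """Hypothetical encoding: fold character codes into a unit-length vector."""
    vec = [0.0] * dim
    for i, ch in enumerate(word.lower()):
        vec[i % dim] += (ord(ch) % 32) / 32.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def best_unit(som, vec):
    """Index of the map unit whose weight vector is closest to vec."""
    return min(
        range(len(som)),
        key=lambda j: sum((w - x) ** 2 for w, x in zip(som[j], vec)),
    )


def train_word_map(vectors, units=6, epochs=30, seed=0):
    """Train a small one-dimensional Self-Organizing Map on word vectors."""
    rng = random.Random(seed)
    som = [list(rng.choice(vectors)) for _ in range(units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs            # decays from 1 toward 0
        lr = 0.5 * frac                        # learning rate (assumed schedule)
        radius = max(1.0, units / 2 * frac)    # neighbourhood width
        for vec in vectors:
            bmu = best_unit(som, vec)
            for j in range(units):
                # Gaussian neighbourhood around the best-matching unit
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                som[j] = [w + lr * h * (x - w) for w, x in zip(som[j], vec)]
    return som


def document_histogram(som, text):
    """Normalized histogram of best-matching units over a document's words."""
    words = text.split()
    counts = Counter(best_unit(som, encode_word(w)) for w in words)
    total = len(words) or 1
    return [counts[j] / total for j in range(len(som))]


def histogram_distance(h1, h2):
    """Euclidean distance between two document histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


# Made-up sample documents, for illustration only.
docs = {
    "contract": "the contract binds the parties under the law",
    "statute": "the law binds the court and the parties",
    "engine": "the engine drives the pump through a gear",
}

vocab = sorted({w for text in docs.values() for w in text.split()})
som = train_word_map([encode_word(w) for w in vocab])
hists = {name: document_histogram(som, text) for name, text in docs.items()}

# Use one whole document as the search key and rank the others by distance.
query = "contract"
for name in sorted(docs, key=lambda n: histogram_distance(hists[query], hists[n])):
    print(name, round(histogram_distance(hists[query], hists[name]), 3))
```

Because the histogram is built from the whole text, a document can serve as its own search key without any keywords, which is the use case the abstract describes. How well the ranking reflects topical similarity depends entirely on the word encoding, which is only a placeholder here.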

Paper Details

Date Published: 6 April 2000
PDF: 7 pages
Proc. SPIE 4057, Data Mining and Knowledge Discovery: Theory, Tools, and Technology II (6 April 2000); doi: 10.1117/12.381745
Author Affiliations
Ari J. E. Visa, Lappeenranta Univ. of Technology (Finland)
Jarmo Toivanen, Lappeenranta Univ. of Technology (Finland)
Barbro Back, Åbo Akademi Univ. (Finland)
Hannu Vanharanta, Pori School of Technology and Economics (Finland)

Published in SPIE Proceedings Vol. 4057:
Data Mining and Knowledge Discovery: Theory, Tools, and Technology II
Belur V. Dasarathy, Editor(s)
