Proceedings Paper

Identification of comment-on sentences in online biomedical documents using support vector machines
Author(s): In Cheol Kim; Daniel X. Le; George R. Thoma

Paper Abstract

MEDLINE® is the premier online bibliographic database of the National Library of Medicine, containing approximately 14 million citations and abstracts from over 4,800 biomedical journals. This paper presents an automated method based on support vector machines to identify a "comment-on" list, a field in a MEDLINE citation that denotes previously published articles commented on by a given article. For comparison, we also introduce a second method based on scoring functions that estimate the significance of each sentence in a given article. Preliminary experiments on HTML-formatted online biomedical documents collected from 24 different journal titles show that the support vector machine with a polynomial kernel function performs best in terms of recall and F-measure.
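
For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how precision, recall, and the balanced F-measure are conventionally computed for a binary sentence classifier. The function name and the example labels are hypothetical and are not taken from the paper, whose exact evaluation setup is not described on this page.

```python
def precision_recall_f_measure(true_labels, predicted_labels):
    """Return (precision, recall, F-measure) for binary labels.

    A label of 1 marks a sentence as a "comment-on" sentence, 0 as any
    other sentence. This is the standard balanced F-measure (F1); it is
    an assumption that the paper uses this variant.
    """
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical labels for six sentences of one article (illustration only).
true_labels = [1, 0, 0, 1, 0, 1]
predicted_labels = [1, 0, 1, 1, 0, 0]
precision, recall, f_measure = precision_recall_f_measure(true_labels, predicted_labels)
```

Recall measures how many of the true comment-on sentences were found, while the F-measure balances recall against precision, so a classifier cannot score well simply by labeling every sentence as positive.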

Paper Details

Date Published: 29 January 2007
PDF: 8 pages
Proc. SPIE 6500, Document Recognition and Retrieval XIV, 65000O (29 January 2007); doi: 10.1117/12.704423
Author Affiliations
In Cheol Kim, National Library of Medicine (United States)
Daniel X. Le, National Library of Medicine (United States)
George R. Thoma, National Library of Medicine (United States)

Published in SPIE Proceedings Vol. 6500:
Document Recognition and Retrieval XIV
Xiaofan Lin; Berrin A. Yanikoglu, Editor(s)
