Proceedings Paper

Combining SVM classifiers to identify investigator name zones in biomedical articles
Author(s): Jongwoo Kim; Daniel X. Le; George R. Thoma

Paper Abstract

This paper describes an automated system that labels zones containing Investigator Names (IN) in biomedical articles; IN is a key item in a MEDLINE® citation. Correctly identifying these zones is a prerequisite for extracting the names from them. The proposed hierarchical classification model uses two Support Vector Machine (SVM) classifiers: the first identifies the IN zone with the highest confidence, and the second identifies the remaining IN zones. Eight sets of word lists are collected to train and test the classifiers, each set containing between 100 and 1,200 words. Experiments on a test set of 105 journal articles show a Precision of 0.88, Recall of 0.97, F-Measure of 0.92, and Accuracy of 0.99.
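
The two-stage decision flow described in the abstract can be sketched roughly as follows. This is an illustrative outline only, not the authors' implementation: the function names `label_zones`, `stage1_score`, and `stage2_is_in`, the `min_confidence` threshold, and the idea of passing the first-stage zone to the second stage are all assumptions, and plain callables stand in for the two trained SVM classifiers. The small `f_measure` helper only checks that the reported F-Measure agrees with the reported Precision and Recall under the standard harmonic-mean definition.

```python
from typing import Callable, List, Sequence


def label_zones(
    zones: Sequence[str],
    stage1_score: Callable[[str], float],
    stage2_is_in: Callable[[str, str], bool],
    min_confidence: float = 0.0,
) -> List[bool]:
    """Return one boolean per zone, True where the zone is labelled as IN.

    stage1_score stands in for the first classifier's confidence score;
    stage2_is_in stands in for the second classifier. Both are hypothetical
    placeholders for trained SVM models.
    """
    labels = [False] * len(zones)
    if not zones:
        return labels

    # Stage 1: score every zone and keep the single most confident one.
    scores = [stage1_score(zone) for zone in zones]
    anchor = max(range(len(zones)), key=scores.__getitem__)
    if scores[anchor] <= min_confidence:
        # Assumption: no zone is labelled if even the best score is too low.
        return labels
    labels[anchor] = True

    # Stage 2: classify the remaining zones, given the stage-1 zone as context.
    for i, zone in enumerate(zones):
        if i != anchor:
            labels[i] = stage2_is_in(zone, zones[anchor])
    return labels


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy stand-ins for the two classifiers, for demonstration only.
    zones = ["Introduction", "Investigators: A. Smith", "B. Jones, C. Lee", "References"]
    first = lambda zone: 1.0 if "Investigators" in zone else -1.0
    second = lambda zone, anchor_zone: "Jones" in zone
    print(label_zones(zones, first, second))
    print(round(f_measure(0.88, 0.97), 2))
```

With the reported Precision of 0.88 and Recall of 0.97, the harmonic mean is about 0.923, which matches the reported F-Measure of 0.92 after rounding.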

Paper Details

Date Published: 23 January 2012
PDF: 8 pages
Proc. SPIE 8297, Document Recognition and Retrieval XIX, 829704 (23 January 2012); doi: 10.1117/12.910517
Author Affiliations
Jongwoo Kim, National Library of Medicine (United States)
Daniel X. Le, National Library of Medicine (United States)
George R. Thoma, National Library of Medicine (United States)

Published in SPIE Proceedings Vol. 8297:
Document Recognition and Retrieval XIX
Christian Viard-Gaudin; Richard Zanibbi, Editor(s)
