
Proceedings Paper

The effect of data set size on computer-aided diagnosis of breast cancer: comparing decision fusion to a linear discriminant
Author(s): Jonathan L. Jesneck; Loren W. Nolte; Jay A. Baker; Joseph Y. Lo

Paper Abstract

Data sets with relatively few observations (cases) are common in medical research, especially when the data are expensive or difficult to collect. Such small samples usually do not provide enough information for computer models to learn data patterns well enough for good prediction and generalization. We used decision fusion as a model that may be able to maintain good classification performance when data are limited. In this study, we investigated the effect of sample size on the generalization ability of both linear discriminant analysis (LDA) and decision fusion. Subsets of large data sets were selected by a bootstrap sampling method, which allowed us to estimate the mean and standard deviation of the classification performance as a function of data set size. We applied the models to two breast cancer data sets and compared them using receiver operating characteristic (ROC) analysis, reporting the area under the ROC curve (AUC) and the partial AUC (pAUC). For the more challenging calcification data set, decision fusion reached its maximum classification performance of AUC = 0.80±0.04 at 50 samples and pAUC = 0.34±0.05 at 100 samples. The LDA reached a lower performance and required many more cases, with a maximum of AUC = 0.68±0.04 and pAUC = 0.12±0.05 at 450 samples. For the mass data set, the two classifiers performed more similarly, with AUC = 0.92±0.02 and pAUC = 0.48±0.02 at 50 samples for decision fusion and AUC = 0.92±0.03 and pAUC = 0.55±0.04 at 500 samples for the LDA.
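The bootstrap procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact resampling protocol, so everything below is an assumption made for illustration: the synthetic two-feature data, the helper names (`make_synthetic`, `fit_lda_2d`, `auc`, `learning_curve`), the use of out-of-bag cases as the test set, and the number of bootstrap repetitions. Only the LDA is sketched; decision fusion and pAUC are not implemented here.

```python
import random
import statistics


def auc(scores_neg, scores_pos):
    """Mann-Whitney estimate of the area under the ROC curve."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def fit_lda_2d(X, y):
    """Fisher linear discriminant for two features: w = S^-1 (mu1 - mu0)."""
    groups = {c: [x for x, t in zip(X, y) if t == c] for c in (0, 1)}
    mu = {c: [statistics.fmean(col) for col in zip(*pts)]
          for c, pts in groups.items()}
    # Pooled within-class scatter matrix (2x2).
    s00 = s01 = s11 = 0.0
    for c, pts in groups.items():
        for a, b in pts:
            da, db = a - mu[c][0], b - mu[c][1]
            s00 += da * da
            s01 += da * db
            s11 += db * db
    s00 += 1e-6  # small ridge (an assumption) to keep the matrix invertible
    s11 += 1e-6
    det = s00 * s11 - s01 * s01
    d0, d1 = mu[1][0] - mu[0][0], mu[1][1] - mu[0][1]
    return ((s11 * d0 - s01 * d1) / det, (-s01 * d0 + s00 * d1) / det)


def learning_curve(X, y, sizes, n_boot=30, seed=0):
    """Mean and standard deviation of test AUC for each training-set size."""
    rng = random.Random(seed)
    curve = {}
    for n in sizes:
        aucs = []
        while len(aucs) < n_boot:
            # Bootstrap: draw n training cases with replacement.
            train = [rng.randrange(len(X)) for _ in range(n)]
            labels = [y[i] for i in train]
            if min(labels.count(0), labels.count(1)) < 2:
                continue  # redraw if a class is nearly absent
            chosen = set(train)
            w = fit_lda_2d([X[i] for i in train], labels)
            # Evaluate on the cases that were not drawn (out-of-bag).
            neg, pos = [], []
            for i in range(len(X)):
                if i in chosen:
                    continue
                s = w[0] * X[i][0] + w[1] * X[i][1]
                (pos if y[i] == 1 else neg).append(s)
            aucs.append(auc(neg, pos))
        curve[n] = (statistics.fmean(aucs), statistics.stdev(aucs))
    return curve


def make_synthetic(n_cases=300, seed=1):
    """Hypothetical stand-in data: two Gaussian classes, two features."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_cases):
        label = i % 2
        X.append((rng.gauss(1.0 * label, 1.0), rng.gauss(0.5 * label, 1.0)))
        y.append(label)
    return X, y


X, y = make_synthetic()
curve = learning_curve(X, y, sizes=[10, 25, 50, 100])
for n, (mean_auc, sd_auc) in curve.items():
    print(f"n = {n:3d}  AUC = {mean_auc:.2f} ± {sd_auc:.2f}")
```

Repeating the draw many times at each size is what yields a mean and a standard deviation per point on the curve, in the same "AUC = mean±SD at n samples" form as the results quoted above. The numbers printed by this sketch come from synthetic data and say nothing about the paper's data sets.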

Paper Details

Date Published: 17 March 2006
PDF: 6 pages
Proc. SPIE 6146, Medical Imaging 2006: Image Perception, Observer Performance, and Technology Assessment, 614616 (17 March 2006); doi: 10.1117/12.655235
Author Affiliations
Jonathan L. Jesneck, Duke Univ. (United States); Duke Advanced Imaging Labs. (United States)
Loren W. Nolte, Duke Univ. (United States)
Jay A. Baker, Duke Advanced Imaging Labs. (United States)
Joseph Y. Lo, Duke Univ. (United States); Duke Advanced Imaging Labs. (United States)


Published in SPIE Proceedings Vol. 6146:
Medical Imaging 2006: Image Perception, Observer Performance, and Technology Assessment
Editor(s): Yulei Jiang; Miguel P. Eckstein
