Variance reduction for error estimation when classifying colon polyps from CT colonography
For cancer polyp detection based on CT colonography, we investigate the sample variance of two methods for estimating sensitivity and specificity. The goal is to reduce the sample variance of both error estimates, as a first step towards comparison with other detection schemes. Our detection scheme is based on a committee of support vector machines. The two estimators of sensitivity and specificity studied here are a smoothed bootstrap (the .632+ estimator) and ten-fold cross-validation. We show that the .632+ estimator generally has lower sample variance than the usual cross-validation estimator. When the number of nonpolyps in the training set is relatively small, we obtain approximately 80% sensitivity and 50% specificity with either method. When the number of nonpolyps in the training set is relatively large, on the other hand, the estimated sensitivity drops considerably, again with either method. Finally, we consider the intertwined roles of relative sample sizes (polyp/nonpolyp), misclassification costs, and bias-variance reduction.
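As a rough illustration of the .632+ estimator named above, the sketch below combines a resubstitution error with a leave-one-out bootstrap error in the standard Efron-Tibshirani form. The function name `err632plus` and its arguments are hypothetical, and applying the formula to a class-wise error rate (for example, 1 minus sensitivity computed on polyps only) is an assumption made here, not a detail stated in the abstract.

```python
def err632plus(resub_err, boot_err, gamma):
    """Sketch of the .632+ error estimate (illustrative, not the authors' code).

    resub_err: error rate on the training data itself (resubstitution)
    boot_err:  leave-one-out bootstrap error rate
    gamma:     no-information error rate (error expected if the labels
               were independent of the predictions)
    """
    # Cap the bootstrap error at the no-information rate.
    boot_err = min(boot_err, gamma)

    # Relative overfitting rate, kept in [0, 1].
    if boot_err > resub_err and gamma > resub_err:
        r = (boot_err - resub_err) / (gamma - resub_err)
    else:
        r = 0.0

    # The weight grows from 0.632 (no overfitting) to 1 (severe overfitting).
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * resub_err + w * boot_err
```

Under that assumption, a sensitivity estimate would be `1 - err632plus(...)` evaluated on the polyp class. Repeating the estimate over independent resamplings and passing the results to `statistics.variance` gives the sample variance that is compared against ten-fold cross-validation.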

Date Published: 2 May 2003
