Proceedings Volume 3981

Medical Imaging 2000: Image Perception and Performance

View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 14 April 2000
Contents: 8 Sessions, 36 Papers, 0 Presentations
Conference: Medical Imaging 2000
Volume Number: 3981

Table of Contents

  • Computer-Aided Diagnosis and Performance
  • Image Quality Assessment
  • Modeling the Visual System and Observer Performance I
  • Modeling the Visual System and Observer Performance II
  • Poster Session
  • Modeling the Visual System and Observer Performance II
  • ROC Analysis
  • Observer Performance and Expertise
  • Technology Evaluation and Observer Performance
  • Poster Session
Computer-Aided Diagnosis and Performance
Assessing the value of diagnostic imaging: the role of perception
E. James Potchen, Thomas G. Cooper
The value of diagnostic radiology rests in its ability to provide information. Information is defined as a reduction in randomness. Quality improvement in any system requires diminution in the variation in its performance. The major variation in performance of the system of diagnostic radiology occurs in observer performance and in the communication of information from the observer to someone who will apply that information to the benefit of the patient. The ability to provide information can be determined by observer performance studies using a receiver-operating characteristic (ROC) curve analysis. The amount of information provided by each observer can be measured in terms of the uncertainty they reduce. Using a set of standardized radiographs, some normal and some abnormal, sorting them randomly, and then asking an observer to redistribute them according to their probability of normality can measure the difference in the value added by different observers. By applying this observer performance measure, we have been able to characterize individual radiologists, groups of radiologists, and regions of the United States in their ability to add value in chest radiology. The use of these technologies in health care may improve upon the contribution of diagnostic imaging.
Relative gains in diagnostic accuracy between computer-aided diagnosis and independent double reading
Yulei Jiang, Robert M. Nishikawa, Robert A. Schmidt, et al.
Double readings of chest radiographs and mammograms made by two radiologists have been investigated as a way to improve diagnostic accuracy. Computer-aided diagnosis (CAD) also has been investigated as an alternative method to improve diagnostic accuracy. Our purpose was to compare the relative gains that can be obtained from independent double readings and from computer-aided diagnosis in the diagnosis of breast lesions in mammograms (determining a known lesion as malignant or benign). We conducted an observer study from which we obtained data of radiologists' unaided single-reading performance and their single-reading performance with a computer aid. We then derived their unaided double-reading performance according to six different methods. Our results show that computer-aided diagnosis can potentially improve radiologists' diagnostic accuracy more than independent double readings by two radiologists.
Evaluation of an automated segmentation method based on performances of an automated classification method
We have developed a computerized method for the automatic segmentation of mass lesions on digitized mammograms using gray-level region-growing. This segmentation technique has been incorporated into our automated classification scheme, which consists of (1) automated segmentation, (2) automated feature extraction, and (3) determination of likelihood of malignancy using an automated classifier. The feature-extraction techniques extract various features from the neighborhoods of the computer-grown mass region to characterize the margin, shape and density of the mass. The automated classifier is then used to merge these computer-extracted features into a number related to the likelihood of malignancy. To evaluate quantitatively the performance of the segmentation technique, we calculate the area of overlap between the computer-grown mass regions and radiologist-identified mass regions. In addition, we substitute the computer-identified margins with radiologist-identified margins in our classification scheme. The performances of individual features as well as the classification scheme in terms of their ability to differentiate between benign and malignant masses are evaluated using receiver operating characteristic (ROC) analysis. The performances obtained based on the mass regions identified by the automated segmentation technique and by radiologists are compared to evaluate the adequacy of the region growing. Results from this study show that the automated segmentation technique tends to undergrow the mass regions by approximately one quarter of the area identified by the radiologists. However, the superior performances of the computer-extracted features and the classification scheme based on the analysis of the computer-grown mass regions indicated that the computer-grown mass regions are sufficient for the subsequent techniques of feature-extraction and classification to accurately characterize mass lesions.
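Illustrative sketch for the overlap measure mentioned in the abstract above. The abstract does not say how regions were represented or how overlap was normalized, so the pixel-set representation, the function name `overlap_fraction`, and the normalization by the radiologist-identified area are all assumptions made here for illustration.

```python
def overlap_fraction(computer_region, radiologist_region):
    """Fraction of the radiologist-identified area that the computer-grown
    region also covers. Both regions are iterables of (x, y) pixel tuples.
    Normalizing by the radiologist area is an assumption of this sketch."""
    computer = set(computer_region)
    radiologist = set(radiologist_region)
    if not radiologist:
        raise ValueError("radiologist region is empty")
    return len(computer & radiologist) / len(radiologist)


# Hypothetical toy regions: a 4x4 radiologist outline (16 pixels) and a
# computer-grown region that misses one row of it (12 pixels).
radiologist = {(x, y) for x in range(4) for y in range(4)}
computer = {(x, y) for x in range(4) for y in range(3)}

print(overlap_fraction(computer, radiologist))  # 0.75
```

In this toy case the computer-grown region is undergrown by one quarter of the reference area, which mirrors the magnitude reported in the abstract but is not derived from the study's data.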
Image Quality Assessment
A quantitative method for visual phantom image quality evaluation
Dev Prasad Chakraborty, Xiong Liu, Michael O'Shea, et al.
This work presents an image quality evaluation technique for uniform-background target-object phantom images. The Degradation-Comparison-Threshold (DCT) method involves degrading the image quality of a target-containing region with a blocking processing and comparing the resulting image to a similarly degraded target-free region. The threshold degradation needed for 92% correct detection of the target region is the image quality measure of the target. Images of American College of Radiology (ACR) mammography accreditation program phantom were acquired under varying x-ray conditions on a digital mammography machine. Five observers performed ACR and DCT evaluations of the images. A figure-of-merit (FOM) of an evaluation method was defined which takes into account measurement noise and the change of the measure as a function of x-ray exposure to the phantom. The FOM of the DCT method was 4.1 times that of the ACR method for the specks, 2.7 times better for the fibers and 1.4 times better for the masses. For the specks, inter-reader correlations on the same image set increased significantly from 87% for the ACR method to 97% for the DCT method. The viewing time per target for the DCT method was 3 - 5 minutes. The observed greater sensitivity of the DCT method could lead to more precise Quality Control (QC) testing of digital images, which should improve the sensitivity of the QC process to genuine image quality variations. Another benefit of the method is that it can measure the image quality of high detectability target objects, which is impractical by existing methods.
Evaluation of lumbar spine images with added pathology
Anders Tingberg, Clemens Herrmann, Jack Besjakov, et al.
Optimization of radiographic procedures requires solid tools for evaluation of the image quality in order to ensure that it is sufficient to answer the clinical question at the lowest possible absorbed dose to the patient. Lumbar spine radiography is an examination giving a relatively high dose, and good methods for evaluation of image quality as well as dose are needed. We have developed and used a method for the addition of artificial pathological structures into clinical images. The new images were evaluated in a study of detectability (free-response forced error experiment). The results from the study showed that the methodology can be used to detect differences in the screen-film systems used to produce the images, indicating that the method can be used in a study of image quality. The results of the study of detectability were compared with the outcome of a visual grading analysis (VGA) based on the structures mentioned in the European Quality Criteria. The comparison indicated that a linear correlation exists between the two methods. This means that the simple VGA can be used in the evaluation of clinical image quality.
Quantitative image quality evaluation of an order-statistic filter in x-ray fluoroscopic imaging
David L. Wilson, Yogesh Srinivas, Francisco J. Sanchez-Marin, et al.
The use of x-ray fluoroscopy in complex interventional procedures can result in high patient dose leading to severe skin injuries. Simply reducing exposure degrades image quality. One solution is to acquire images at reduced exposures and digitally filter to reduce noise and restore image quality. We quantitatively evaluated image quality improvement from a bi-directional multi-stage (BMS) median spatio-temporal filter. Improvements were assessed using forced-choice measurement of detection and discrimination. Targets were vertical projected cylinders mimicking interventional devices such as catheters. Poisson white noise was added on uniform backgrounds to simulate low-dose x-ray fluoroscopy. The BMS filter improved detection by 20% and discrimination by 31% giving estimated dose savings of 31% and 42%, respectively. Minimal spatial and temporal blurring of targets was observed in filtered sequences.
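Worked example for the dose figures in the abstract above. The abstract does not state how dose savings were derived from the performance gains. The sketch below assumes the common quantum-limited relation in which detection performance scales with the square root of exposure, so a performance gain by a factor f allows the exposure to be cut by a factor of f squared. That relation, and the function name `dose_savings`, are assumptions of this sketch, although the numbers it yields agree with the reported values.

```python
def dose_savings(improvement_fraction):
    """Fractional exposure reduction that would give back the original
    performance, ASSUMING performance scales with sqrt(exposure)."""
    factor = 1.0 + improvement_fraction
    return 1.0 - 1.0 / factor ** 2


# Improvements reported in the abstract: 20% (detection), 31% (discrimination).
print(round(100 * dose_savings(0.20)))  # 31
print(round(100 * dose_savings(0.31)))  # 42
```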
Evaluating image reconstruction methods for tumor detection performance in whole-body PET oncology imaging
Carole Lartizien, Paul E. Kinahan, Claude Comtat, et al.
This work presents initial results from observer detection performance studies using the same volume visualization software tools that are used in clinical PET oncology imaging. Research into the FORE+OSEM and FORE+AWOSEM statistical image reconstruction methods tailored to whole-body 3D PET oncology imaging has indicated potential improvements in image SNR compared to currently used analytic reconstruction methods (FBP). To assess the resulting impact of these reconstruction methods on the performance of human observers in detecting and localizing tumors, we use a non-Monte Carlo technique to generate multiple statistically accurate realizations of 3D whole-body PET data, based on an extended MCAT phantom and with clinically realistic levels of statistical noise. For each realization, we add a fixed number of randomly located 1 cm diameter lesions whose contrast is varied among pre-calibrated values so that the range of true positive fractions is well sampled. The observer is told the number of tumors and, similar to the AFROC method, asked to localize all of them. The true positive fraction for the three algorithms (FBP, FORE+OSEM, FORE+AWOSEM) as a function of lesion contrast is calculated, although other protocols could be compared. A confidence level for each tumor is also recorded for incorporation into later AFROC analysis.
Perceptually standardized imaging of digitized film for comparative ROC measurements
Klaus-Ruediger Peters
Primary diagnostic reading of digitized film, displayed on a CRT, requires a quality assurance (QA) that all of the perceptual information, which is accessible from the film when displayed on a light table, can be reproduced on the CRT display. Some of the CRT display parameters are already defined. The DICOM standard 3.14 introduces a contrast transfer function (grayscale display function standard) for minimum contrasts which are defined as single just-noticeable-differences (JND) in luminance. It establishes, throughout the entire intensity range of the data, contrast recognition of single JND resolution and implies linearity of contrast perception. Conventionally, the CRT QA utilizes a SMPTE contrast test pattern of 5% contrast [13 digital driving levels (DDLs)] resolution. We developed for the QA of the new standard a perceptual contrast pattern with single DDL resolution (perceptual contrast pattern equals P-pattern). It allows the visual assessment and measurement of perceived contrast linearity at a minimum level of 1.2% contrast (3 DDLs). We analyzed, at various room light conditions (0 - 300 cd/m²) and on DICOM 3.14-standardized CRTs of various maximum luminance (110 - 400 cd/m²), the minimum luminance of the CRT that is required for establishing perceptual contrast linearity. We compared the use of the SMPTE pattern and the P-pattern for QA of linearity. The SMPTE pattern was not effective for a statistically significant correlation of contrast linearity with room light and monitor luminance. However, the P-pattern furnished an effective tool for the assessment, measurement or adjustment of contrast linearity. We found that perception of linearity for 3 JND contrasts at low room light conditions (0 - 6 cd/m²) requires a minimum luminance of 10 cd/m², which is not directly derived from the background luminance of the light reflected from the monitor but may indicate a perceptual threshold value for task performance.
Modeling the Visual System and Observer Performance I
Estimates of human-observer templates for a simple detection task in correlated noise
An observer template encapsulates the relative weight assigned to each pixel by an observer performing a detection or discrimination task. In previous work, Abbey et al. (Proc. SPIE 3663, 1999) have developed a methodology for estimating human-observer templates in two-alternative forced-choice detection and classification experiments. The methodology assumes a linear observer, and is designed for use on images with noise that is Gaussian distributed. In this work, the methodology is used to evaluate human-observer templates for tasks in correlated Gaussian noise. We consider detection of a Gaussian 'bump' signal in image noise that has either lowpass or highpass correlation structures, as well as a white-noise task for reference. The estimated templates show that the human observers are sensitive to the correlation structure of the noise (as expected). However, the human-observer templates do not appear to fully adapt to noise correlations as the ideal observer does.
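Illustrative sketch of how a linear observer uses a template in a two-alternative forced-choice trial, as described in the abstract above. It shows only the decision rule (a weighted sum of pixels, then picking the larger response), not the template estimation methodology of Abbey et al. The function names and the five-pixel toy images are hypothetical.

```python
def response(template, image):
    """Linear observer response: weighted sum of pixel values."""
    return sum(w * g for w, g in zip(template, image))


def choose_2afc(template, image_a, image_b):
    """Return 0 if the observer picks image_a as signal-present, else 1."""
    return 0 if response(template, image_a) >= response(template, image_b) else 1


# Hypothetical one-dimensional 'images' of five pixels each.
template = [0.0, 1.0, 2.0, 1.0, 0.0]      # weights peak where the bump is expected
signal = [0.0, 0.5, 1.0, 0.5, 0.0]        # bump-shaped signal
noise_a = [0.2, -0.1, 0.3, 0.0, -0.2]     # fixed noise samples, for repeatability
noise_b = [-0.3, 0.4, 0.1, 0.2, 0.1]

signal_present = [s + n for s, n in zip(signal, noise_a)]
signal_absent = noise_b

print(choose_2afc(template, signal_present, signal_absent))  # 0
```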
Perception of achromatic, monochromatic, pure chromatic, and chromatic noisy images by real human-observer under threshold conditions
Nikolay N. Krasilnikov, Olga I. Krasilnikova, Yury E. Shelepin M.D.
To verify experimentally the applicability of ideal observer theory to the observation of achromatic, monochromatic, pure chromatic and chromatic noisy images by a real human observer under threshold conditions, we used the method of comparative measurements. We measured and compared the probabilities of correct identification of test objects in these noisy images by a real human observer and by a computer model of the ideal observer. For the case in which the parameters of the test objects are not fully known, we developed a modified Zigert-Kotelnikov algorithm and an appropriate model. In particular, when all image parameters are known a priori, this algorithm coincides with that of the ideal observer. We formulated 3 new laws of matched filtering of exactly known color images and concluded that the probabilities of correct identification by the observer and by the computer model are in good agreement over a wide range of noise intensities. The absence of a priori information about the coordinates of the test objects, unlike information about their sizes, greatly influences the probabilities of correct identification. Our results are useful in modeling human vision under threshold conditions. The developed model may be used effectively for estimating picture quality impairment on a monitor screen, for diagnosing the condition of the human visual system, etc.
Estimation of linear observer templates in the presence of multi-peaked Gaussian noise through 2AFC experiments
We extend a method for linear template estimation developed by Abbey et al. which demonstrated that a linear observer template can be estimated effectively through a two-alternative forced choice (2AFC) experiment, assuming the noise in the images is Gaussian, or multivariate normal (MVN). We relax this assumption, allowing the noise in the images to be drawn from a weighted sum of MVN distributions, which we call a multi-peaked MVN (MPMVN) distribution. Our motivation is that more complicated probability density functions might be approximated in general by such MPMVN distributions. Our extension of Abbey et al.'s method requires us to impose the additional constraint that the covariance matrices of the component peaks of the signal-present noise distribution all be equal, and that the covariance matrices of the component peaks of the signal-absent noise distribution all be equal (but different in general from the signal-present covariance matrices). Preliminary research shows that our generalized method is capable of producing unbiased estimates of linear observer templates in the presence of MPMVN noise under the stated assumptions. We believe this extension represents a next step toward the general treatment of arbitrary image noise distributions.
Modeling of the contrast sensitivity saturation depended on the number of cycles of the test sine-wave gratings
Olga I. Krasilnikova, Nikolay N. Krasilnikov, Yury E. Shelepin M.D.
We investigated which mechanism is responsible for the saturation of contrast sensitivity as a function of the number of cycles of sine-wave gratings (SWG). We used mathematical analysis of the spatial frequency transformation of the SWG performed by the receptive fields of the foveal retina, by convolving the SWG with the receptive field weight functions. We started from the matched filtering concept, supposing that contrast sensitivity is inversely proportional to the threshold grating contrast that is sufficient for detecting the grating in the presence of the internal noise of the visual system. The principal feature of the analysis is that it takes the following fact into account. As eccentricity increases, the size of the receptive field increases too, so when the number of cycles of the SWG increases at constant spatial frequency, a larger part of the grating falls in the area of low spatial frequency resolution. Therefore a further increase in the number of cycles of the SWG does not affect the contrast energy of the signal in the neural net. Comparison of the calculated data with numerous published experimental data showed good agreement.
Modeling the Visual System and Observer Performance II
Model observer based optimization of JPEG image compression
Recent work applied model observers to predict the effect of JPEG and wavelet image compression on human visual detection of simulated lesions embedded in real structured backgrounds (x-ray coronary angiograms). We extend the use of model observers to perform parameter optimization of image compression in order to maximize visual detection performance at a given compression ratio. A simulated annealing algorithm was used to find the optimal quantization matrix. In each iteration, the 64 quantization parameters of the JPEG algorithm were randomly perturbed (while preserving a fixed compression ratio). Each set of quantization parameters was used to compress 400 test images. Model observer performance (Pc) was then obtained for the images that had undergone compression. The simulated annealing algorithm converged (as determined by the 'annealing schedule') to an 'optimal' quantization matrix. A follow-up human psychophysical study with two naive observers was conducted to compare the optimized quantization matrix with the default JPEG quantization matrix. For both human observers, visual detection performance improved significantly from the default to the optimized quantization matrix condition. Our results suggest that model observers can be successfully used for task-based performance optimization of image compression algorithms.
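Illustrative sketch of a simulated annealing loop over 64 quantization parameters, in the spirit of the procedure described in the abstract above. Several parts are stand-ins and not the authors' method: `toy_score` is a hypothetical objective replacing model observer performance (Pc) on compressed images, keeping the sum of the parameters constant is a crude proxy for preserving a fixed compression ratio, and the geometric cooling schedule is an arbitrary choice.

```python
import math
import random


def anneal(score, start, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Maximize score(params) with a basic Metropolis acceptance rule."""
    rng = random.Random(seed)
    current = list(start)
    current_score = score(current)
    best, best_score = list(current), current_score
    temperature = t0
    for _ in range(steps):
        candidate = list(current)
        # Move one unit between two entries so the total stays fixed
        # (stand-in for holding the compression ratio constant).
        i, j = rng.sample(range(len(candidate)), 2)
        if candidate[i] > 1 and candidate[j] < 255:
            candidate[i] -= 1
            candidate[j] += 1
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
        temperature *= cooling
    return best, best_score


def toy_score(params):
    """Hypothetical objective: penalizes coarse quantization most heavily
    at low indices. A real study would measure detection performance."""
    return -sum(value / (index + 1) for index, value in enumerate(params))


start = [16] * 64
best, best_score = anneal(toy_score, start)
print(sum(best) == sum(start), best_score >= toy_score(start))  # True True
```

A fixed random seed keeps the run repeatable. In a real setting each call to the objective would involve compressing the test images and scoring a model observer, which dominates the cost of the search.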
Poster Session
Interpretation of the function of the striate cortex
Bernardette M. Garner, Andrew P. Paplinski
Biological neural networks do not require retraining every time objects move in the visual field. Conventional computer neural networks do not share this shift-invariance. The brain compensates for movements of the head, body, eyes and objects by allowing the sensory data to be tracked across the visual field. The neurons in the striate cortex respond to objects moving across the field of vision, as is seen in many experiments. It is proposed that the neurons in the striate cortex allow the continuous angle changes needed to compensate for changes in orientation of the head, eyes and the motion of objects in the field of vision. It is hypothesized that the neurons in the striate cortex form a system that allows for the translation, some rotation and scaling of objects and provides a continuity of objects as they move relative to other objects. The neurons in the striate cortex respond to features which are fundamental to sight, such as orientation of lines, direction of motion, color and contrast. The neurons that respond to these features are arranged on the cortex in a way that depends on the features they are responding to and on the area of the retina from which they receive their inputs.
Modeling the Visual System and Observer Performance II
What visual perception model is optimal in terms of signal-to-noise ratio?
Yury E. Shelepin M.D., Nikolay N. Krasilnikov, Olga I. Krasilnikova, et al.
Choosing between two groups of functional models of human vision, the multi-channel model with its modifications and the models based on the concept of matched filtering in the human visual system, is difficult because the predictions made by each of these models are very close to experimental data. One exception is the dependence of contrast sensitivity on the number (N) of sine-wave grating cycles when N is small and spatial frequency is constant. According to matched filtering models, a reduction of N produces a proportional reduction in contrast sensitivity. According to multi-channel models, a reduction in N produces a stronger reduction in contrast sensitivity than in the first case, since when N decreases, the spatial-frequency spectrum of the test grating expands and therefore the energy corresponding to the appropriate channel decreases because of energy redistribution between all channels. Hence, more energy is needed for test grating detection with a given probability than is obtained in the experiments. The calculations performed, and the comparison of their results with experimental data, lead us to prefer the matched filtering model for the description and modeling of human-display interactions.
ROC Analysis
The problem of ROC analysis without truth: the EM algorithm and the information matrix
Sergey V. Beiden, Gregory Campbell, Kristen L. Meier, et al.
Henkelman, Kay, and Bronskill (HKB) showed that although the problem of ROC analysis without truth is underconstrained and thus not uniquely solvable in one dimension (one diagnostic test), it is in principle solvable in two or more dimensions. However, they gave no analysis of the resulting uncertainties. The present work provides a maximum-likelihood solution using the EM (expectation-maximization) algorithm for the two-dimensional case. We also provide an analysis of uncertainties in terms of Monte Carlo simulations as well as estimates based on Fisher Information Matrices for the complete- and the missing-data problem. We find that the number of patients required for a given precision of estimate for the truth-unknown problem is a very large multiple of that required for the corresponding truth-known case.
Disease prevalence and the index of detectability: a survey of studies of lung cancer detection by chest radiography
Harold L. Kundel
A survey of 12 studies of lung cancer detection with cancer prevalence ranging from 0.9 to 476 cancers per 1000 showed that the unit-slope index of detectability, d', decreased from a high value of 3.9 at low prevalence to 1.4 at high prevalence. A proposed explanation is that the readers are operating on an ROC curve with a slope that is less than unity, approximating 0.6. On such a curve, a shift to more stringent criteria, which occurs with decreasing prevalence in order to minimize false positives, would result in an increased unity-slope d'.
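Worked example of the explanation proposed in the abstract above. The unit-slope index is d' = z(TPF) - z(FPF). If the reader actually operates on a binormal ROC curve z(TPF) = a + b z(FPF) with slope b below one, then d' = a + (b - 1) z(FPF), which grows as the criterion becomes stricter (smaller FPF). In the sketch below only the slope of 0.6 comes from the abstract. The intercept `a = 1.0` and the two false positive fractions are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()


def z(p):
    """Inverse of the standard normal cumulative distribution."""
    return _STANDARD_NORMAL.inv_cdf(p)


def unit_slope_d_prime(tpf, fpf):
    """Index of detectability assuming a unit-slope ROC curve."""
    return z(tpf) - z(fpf)


def binormal_tpf(fpf, a, b):
    """True positive fraction on the binormal ROC z(TPF) = a + b * z(FPF)."""
    return _STANDARD_NORMAL.cdf(a + b * z(fpf))


a, b = 1.0, 0.6  # a is hypothetical; b = 0.6 is the slope named in the abstract

lenient = unit_slope_d_prime(binormal_tpf(0.20, a, b), 0.20)
strict = unit_slope_d_prime(binormal_tpf(0.01, a, b), 0.01)

print(round(lenient, 2), round(strict, 2))  # 1.34 1.93
```

The same underlying curve therefore yields a larger apparent d' at the stricter operating point, which is the direction of the effect the survey reports at low prevalence.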
Using a constrained formulation based on probability summation to fit receiver operating characteristic (ROC) curves
Richard G. Swensson, Jill L. King, Walter F. Good, et al.
A constrained ROC formulation from probability summation is proposed for measuring observer performance in detecting abnormal findings on medical images. This assumes the observer's detection or rating decision on each image is determined by a latent variable that characterizes the specific finding (type and location) considered most likely to be a target abnormality. For positive cases, this 'maximum-suspicion' variable is assumed to be either the value for the actual target or for the most suspicious non-target finding, whichever is the greater (more suspicious). Unlike the usual ROC formulation, this constrained formulation guarantees a 'well-behaved' ROC curve that always equals or exceeds chance-level decisions and cannot exhibit an upward 'hook.' Its estimated parameters specify the accuracy for separating positive from negative cases, and they also predict accuracy in locating or identifying the actual abnormal findings. The present maximum-likelihood procedure (runs on PC with Windows 95 or NT) fits this constrained formulation to rating-ROC data using normal distributions with two free parameters. Fits of the conventional and constrained ROC formulations are compared for continuous and discrete-scale ratings of chest films in a variety of detection problems, both for localized lesions (nodules, rib fractures) and for diffuse abnormalities (interstitial disease, infiltrates or pneumothorax). The two fitted ROC curves are nearly identical unless the conventional ROC has an ill-behaved 'hook' below the constrained ROC.
Observer Performance and Expertise
Do subtle breast cancers attract visual attention during initial impression?
Calvin F. Nodine, Claudia Mello-Thoms, Susan P. Weinstein, et al.
Women who undergo regular mammographic screening afford mammographers a unique opportunity to compare current mammograms with prior exams. This comparison greatly assists mammographers in detecting early breast cancer. A question that commonly arises when a cancer is detected under regular periodic screening conditions is whether the cancer is new or was missed on the prior exam. This is a difficult question to answer by retrospective analysis, because knowledge of the status of the current exam biases the interpretation of the prior exam. To eliminate this bias and provide some degree of objectivity in studying this question, we looked at whether experienced mammographers who had no prior knowledge of a set of test cases fixated on potential cancer-containing regions on mammograms from cases penultimate to cancer detection. The results show that experienced mammographers cannot recognize most malignant cancers selected by retrospective analysis.
Image structure and perceptual errors in mammogram reading: a pilot study
Claudia Mello-Thoms, Stanley M. Dunn, Calvin F. Nodine, et al.
Early detection of breast cancer is very desirable, considering that it can significantly change the prognosis for a woman diagnosed with this disease. Nonetheless 10 - 30% of all breast cancers are missed by the radiologist, albeit they are visible in the mammogram. In this work we have studied the underlying structure of the image in the location of the lesions that were missed and the ones that were found, as well as in the locations of the lesions that did not exist but were reported by the radiologist. We have shown that there is a statistically significant difference in the information content of different frequency bands that results in various decision types. We have also shown that it is possible to use a pattern classifier, based upon the information contents of the spectral decomposition of a local image region, to predict the most likely decision outcome.
Is airport baggage inspection just another medical image?
Alastair G. Gale, Mark D. Mugglestone, Kevin J. Purdy, et al.
A similar inspection situation to medical imaging appears to be that of the airport security screener who examines X-ray images of passenger baggage. There is, however, little research overlap between the two areas. Studies of observer performance in examining medical images have led to a conceptual model which has been used successfully to understand diagnostic errors and develop appropriate training strategies. The model stresses three processes: visual search, detection of potential targets, and interpretation of these areas, with most errors being due to the latter two factors. An initial study is reported on baggage inspection, using several brief image presentations, to examine the applicability of such a medical model to this domain. The task selected was the identification of potential Improvised Explosive Devices (IEDs). Specifically investigated was the visual search behavior of inspectors. It was found that IEDs could be identified in a very brief image presentation, and that this performance improved with increased presentation time. Participants fixated on IEDs very early on and sometimes concentrated wholly on this part of the baggage display. When IEDs were missed this was mainly due to interpretative factors rather than visual search or IED detection. It is argued that the observer model can be applied successfully to this scenario.
Unobtrusive method for monitoring visual attention during mammogram reading
Claudia Mello-Thoms, Calvin F. Nodine, Susan P. Weinstein, et al.
The use of feedback to the observer of the regions of the image that attract prolonged visual dwell (> 1000 ms) has been shown to improve nodule detection performance in reading chest x-rays. The application of such a feedback mechanism in mammography seems appropriate, but it is often discouraged by the inherent difficulties of using an invasive eye-tracking system. In this paper we discuss the use of an alternative method, namely, a digital zoom window, to monitor where the observer's attention is focused on the image. We have shown that the order in which the zooms occur, as well as the duration of certain zooms, is statistically correlated with decision outcome for a given region of the image. Furthermore we show a strong correlation between zooming and prolonged fixation.
Decision-making differences: novices, experts, and a neural network
David Manning, Sam Bunting, John Leach
We investigated the decision making performance of trained radiographers, novice radiographers and a neural network in the detection of fractures. Ground truth was established by the independent agreement of experienced radiologists for 740 single view digitized radiographs of the wrist. The images were categorized into negatives and positives; 520 of these were used to train the back propagation, three layer neural network in a supervised mode, and the remainder were used to create a test bank. The test was presented to 20 novice observers, 12 experienced radiographers trained in the detection of skeletal trauma and then to the trained neural network. ROC Az values for all the decision makers were not significantly different (p > 0.1) but there were significant differences in the values of True Positive and True Negative Fractions. The neural network showed a greater aptitude for distinguishing the normals. By filtering the neural net decisions through the human data we simulated the effect of assisted reporting. Results suggest that if fracture prevalence is very low in a population, a neural network demonstrating high specificity may have utility in reducing the number of images which must be reviewed by human experts.
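Illustrative sketch of why decision makers with similar ROC areas can still differ in True Positive and True Negative Fractions, as the abstract above reports. The area here is the nonparametric (Mann-Whitney) estimate, which is a substitute for the fitted Az used in the study. The five-point ratings and both thresholds are hypothetical.

```python
def empirical_auc(positive_scores, negative_scores):
    """Nonparametric ROC area: probability that a positive case is rated
    higher than a negative case, counting ties as one half."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def fractions(positive_scores, negative_scores, threshold):
    """True positive and true negative fractions when a case is called
    positive if its rating is at or above the threshold."""
    tpf = sum(s >= threshold for s in positive_scores) / len(positive_scores)
    tnf = sum(s < threshold for s in negative_scores) / len(negative_scores)
    return tpf, tnf


# Hypothetical confidence ratings (1 = surely normal, 5 = surely fracture).
fracture = [5, 4, 4, 3, 2]
normal = [1, 2, 2, 3, 1]

print(empirical_auc(fracture, normal))   # 0.9
print(fractions(fracture, normal, 3))    # (0.8, 0.8)
print(fractions(fracture, normal, 4))    # (0.6, 1.0)
```

The ratings, and hence the ROC area, are identical in both calls to `fractions`. Only the decision threshold changes, trading sensitivity for specificity.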
Evaluation of head CT exams: resident and attending diagnoses
Elizabeth A. Krupinski, William Berger, William Erly
The goal of this study was to evaluate the performance of radiology residents in the interpretation of head CT exams ordered by emergency room physicians, and to compare their accuracy with that of attending radiologists. 1324 consecutive CT head exams ordered by the ER were interpreted by radiology residents. They reported whether the case was normal or abnormal, noted the relevant findings, and reported their decision confidence using a 6-point scale. Attending neuroradiologists subsequently interpreted the exams. The exams were grouped into 3 categories based on correlation of readings: agree, disagree-insignificant, disagree-significant. There was 91% agreement between resident and attending diagnoses, 7% disagree-insignificant and 2% disagree-significant. Disagreements occurred more often on abnormal than normal cases. Disagreements occurred more often with 1st and 2nd year residents than with 3rd and 4th. Resident confidence was highest for 3rd years, followed by 4th, 2nd and 1st. The less confident a resident was in their diagnosis, the more likely a disagreement occurred. Cases in which a resident expresses a low level of confidence should be carefully checked by the attending since these cases were more often associated with a disagreement than those with high confidence.
Technology Evaluation and Observer Performance
Low contrast detectability and dose savings with an amorphous silicon detector designed for x-ray radiography
Ping Xue, Scott F. Schubert, Richard Aufrichtig
In an observer study we compare low contrast detectability and dose efficiency of an amorphous silicon x-ray detector versus a standard thoracic screen-film (Kodak InSight HC/InSight IT). Twelve images of a CDRAD contrast-detail phantom were acquired with the screen-film system using an entrance exposure corresponding to a conventional chest x-ray. Using the same x-ray system with an interchanged digital detector, we acquired four digital image sets (12 images each) at dose levels corresponding to 27%, 41%, 63% and 100% of the film dose. Prior to laser printing, the digital images were processed to match the film contrast and optical density level. A 4-alternative forced choice (4-AFC) paradigm with seven observers was used to measure the threshold contrasts of disk sizes from 0.5 to 4.0 mm. Further, we estimated the equivalent perceptual dose (EPD), which is the dose level of digital for which the same contrast detectability as film is obtained. Contrast detectability is significantly improved with the digital detector. On average, all disk shaped objects detected from the digital detector have lower threshold contrasts than those from film at the same dose level. The EPD value averaged over disk size is 44%, which corresponds to a 56% dose savings for the digital detector.
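The relationship between EPD and dose savings can be made concrete with a short sketch. The abstract does not say how the EPD was estimated; the code below assumes log-log interpolation of digital threshold contrast against dose, which is only one plausible approach. The function names and any threshold values passed in are hypothetical.

```python
import math

def equivalent_perceptual_dose(doses, digital_thresholds, film_threshold):
    """Dose at which the digital threshold contrast equals the film value.

    Interpolates linearly in log(dose) vs. log(threshold contrast)
    between neighbouring measured dose levels (an assumed method).
    """
    points = sorted(zip(doses, digital_thresholds))
    for (d0, c0), (d1, c1) in zip(points, points[1:]):
        if min(c0, c1) <= film_threshold <= max(c0, c1):
            if c0 == c1:
                return d0
            t = (math.log(film_threshold) - math.log(c0)) / (math.log(c1) - math.log(c0))
            return math.exp(math.log(d0) + t * (math.log(d1) - math.log(d0)))
    raise ValueError("film threshold lies outside the measured digital range")

def dose_savings(epd_percent):
    """Dose savings in percent of the film dose."""
    return 100.0 - epd_percent
```

The final step is the one the abstract states directly: an EPD of 44% of the film dose gives `dose_savings(44.0)`, that is, 56%.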
Pilot clinical evaluation of prototype system for visually reporting the results of ultrasound examination
Alexander Akimov, Vagan Terziyan, Artem Garmash
A dedicated optic-mechanical device, attachable to an ultrasound scanner, has been developed that allows visual documentation of an ultrasound examination by recording multiple gray-scale images, i.e. ultrasound tomography (UST), to be performed routinely and at low cost. The device is operated by one hand without interrupting the examination. Each page of a UST report is composed by deliberately positioning multiple images within a 2 x 4 framework and is recorded on 35-mm microfilm. If necessary, graphic reconstructions were composed from standard graphic components and interposed between the original images on the same frame. The UST report is communicated to a patient and/or a referring doctor on a compact reflective hardcopy. A trial-and-error search for the optimal number of images constituting an adequate UST document plateaued at an average of 30 +/- 1.8 images per patient visit. The practice of UST is well accepted by the local medical community and by patients, and the first year since its introduction yielded a two-fold growth in referrals to the Ultrasound Studio. A low-cost optic-mechanical UST system could create a local culture of reporting ultrasound examinations in images rather than verbally, and facilitate the further introduction of more advanced digital systems.
Physical and psychophysical evaluation of a new digital specimen radiography system for use in mammography
Elizabeth A. Krupinski, Hans Roehrig, William V. Schempp
The goal of this project was to physically and psychophysically evaluate a new digital detector for surgical breast specimen radiography. It is an optical imaging system with a 1K x 1K CCD detector with 24-micron pixels and a 2:1 fiber reducer. Physical evaluations of linearity, noise and spatial resolution were conducted. To measure observer performance, two contrast-detail phantoms were imaged and displayed in four conditions: plain film, CR printed to film, CR displayed on a monitor, and the new digital images displayed on a monitor. Images were acquired at 25 and 30 kV at high, medium and low exposures. Ten observers participated. The system is linear, noise goes as the square root of exposure, and resolution is in excess of 10 lp/mm. Observer performance was significantly higher with the new digital system (average 74% detections) than all other conditions at all exposure levels, both kVs and for both phantoms (film average 28%, CR film average 61%, CR monitor 50%). The digital specimen radiography system outperformed all other detectors. The system is compact and can easily be used in the operating room. The two versions of the contrast-detail phantoms used in the study are essentially comparable, although some differences were noted.
Comparison of low-contrast detail detectability with five different conventional and digital radiographic imaging systems
Ulrich Neitzel, Albrecht Boehm, Ingo Maack
Five different X-ray imaging systems were evaluated comparatively with respect to low-contrast detail detectability. The systems included in this study were two screen-film systems (speed classes 200 and 400), a computed radiography system, a digital selenium-based system with electrometer scanning and an indirect-type flat-panel detector system. Images of a contrast-detail phantom were acquired with all systems at a set of exactly matched exposures. The digital images were processed in a way to approximate the density and contrast appearance of the conventional film images when printed on laser film. Six observers evaluated a total of 46 films with respect to the threshold contrast for each detail size. Correct observation ratios and threshold contrasts were determined for all sizes and conditions. The overall results show that the low-contrast detectability with all digital imaging systems is equal to or better than that with the conventional film-screen systems. The advantage is more evident for the newer digital systems (selenium detector and flat-panel detector) whereas the CR images are more on a par with the conventional films. The results can be understood assuming that low-contrast detection is limited mainly by quantum noise in the images and taking into account the different levels of detective quantum efficiency of these imaging systems.
Objective evaluation of linear and nonlinear tomosynthetic reconstruction algorithms
Richard L. Webber, Paul F. Hemler, John E. Lavery
This investigation objectively tests five different tomosynthetic reconstruction methods involving three different digital sensors, each used in a different radiologic application: chest, breast, and pelvis, respectively. The common task was to simulate a specific representative projection for each application by summation of appropriately shifted tomosynthetically generated slices produced by using the five algorithms. These algorithms were, respectively, (1) conventional back projection, (2) iteratively deconvoluted back projection, (3) a nonlinear algorithm similar to back projection, except that the minimum value from all of the component projections for each pixel is computed instead of the average value, (4) a similar algorithm wherein the maximum value was computed instead of the minimum value, and (5) the same type of algorithm except that the median value was computed. Using these five algorithms, we obtained data from each sensor-tissue combination, yielding three factorially distributed series of contiguous tomosynthetic slices. The respective slice stacks then were aligned orthogonally and averaged to yield an approximation of a single orthogonal projection radiograph of the complete (unsliced) tissue thickness. Resulting images were histogram equalized, and actual projection control images were subtracted from their tomosynthetically synthesized counterparts. Standard deviations of the resulting histograms were recorded as inverse figures of merit (FOMs). Visual rankings of image differences by five human observers of a subset (breast data only) also were performed to determine whether their subjective observations correlated with homologous FOMs. Nonparametric statistical analysis of these data demonstrated significant differences (P < 0.05) between reconstruction algorithms. The nonlinear minimization reconstruction method nearly always outperformed the other methods tested. Observer rankings were similar to those measured objectively.
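The difference between algorithms (1), (3), (4) and (5) comes down to which statistic is taken across the shifted component projections at each pixel. The toy sketch below shows this for 1-D projections with integer pixel shifts; it is an illustration under simplifying assumptions, not the authors' implementation. It omits the iterative deconvolution of algorithm (2) and the histogram equalization step, and all names are hypothetical.

```python
from statistics import mean, median, pstdev

# Per-pixel statistic used by each reconstruction variant.
REDUCERS = {"mean": mean, "min": min, "max": max, "median": median}

def reconstruct_slice(projections, shifts, reducer="mean"):
    """Combine shifted 1-D projections pixel by pixel.

    projections: equal-length lists of pixel values.
    shifts: one integer pixel shift per projection for the slice of interest.
    Samples that fall outside the detector are skipped.
    """
    combine = REDUCERS[reducer]
    width = len(projections[0])
    out = []
    for x in range(width):
        samples = [p[x + s] for p, s in zip(projections, shifts)
                   if 0 <= x + s < width]
        out.append(combine(samples) if samples else 0.0)
    return out

def inverse_fom(control, synthesized):
    """Standard deviation of the difference image (lower means closer)."""
    return pstdev([s - c for s, c in zip(synthesized, control)])
```

For three projections of a single bright point, `[[0, 0, 5, 0, 0], [0, 5, 0, 0, 0], [0, 0, 0, 5, 0]]`, the in-focus shifts `[0, -1, 1]` recover the point with every reducer. With the out-of-focus shifts `[0, 0, 0]`, the mean and max reducers leave blur from the misaligned point, while the min and median reducers suppress it.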
Poster Session
Computer-aided methods to recover strategies for visual search and navigation
Tatjana P. Belikova, Irina I. Stenina, Nadezsda I. Yashunskaya
A series of methods developed to recover strategies for visual search and navigation to support image analysis and classification in the case of uncertainty is presented. We used optimal filtering for better imaging of informative features, along with expert descriptions of the processed images in terms of observed features. We collected these data in a database. Expert-guided analysis of the data in the database was applied to find discriminative features important for image interpretation. A formal decision rule was worked out for computer-aided image classification. The developed formal decision rule presented an effective strategy for image analysis and interpretation, orienting the user to look for specific features, and also showing how to classify the image on the basis of observed features. The methods were tested in the task of early peripheral lung cancer diagnosis. Experiments with more than 600 lung tomograms showed that applying the methods gives a substantial (10% - 16%) improvement in diagnostic accuracy for physicians of different qualifications.
Evaluation of image quality of a new CCD-based system for chest imaging
C. Herrmann, P. Sund, A. Tingberg, et al.
The Imix radiography system (Oy Imix Ab, Finland) consists of an intensifying screen, optics, and a CCD camera. An upgrade of this system (Imix 2000) with a red-emitting screen and new optics has recently been released. The image quality of Imix (original version), Imix 2000, and two storage-phosphor systems, Fuji FCR 9501 and Agfa ADC70, was evaluated in physical terms (DQE) and with visual grading of the visibility of anatomical structures in clinical images (141 kV). PA chest images of 50 healthy volunteers were evaluated by experienced radiologists. All images were evaluated on Siemens Simomed monitors, using the European Quality Criteria. The maximum DQE values for Imix, Imix 2000, Agfa and Fuji were 11%, 14%, 17% and 19%, respectively (141 kV, 5 µGy). Using the visual grading, the observers rated the systems in the following descending order: Fuji, Imix 2000, Agfa, and Imix. Thus, the upgrade to Imix 2000 resulted in higher DQE values and a significant improvement in clinical image quality. The visual grading agrees reasonably well with the DQE results; however, Imix 2000 received a better score than what could be expected from the DQE measurements.
How does observer training affect imaging performance in digital mammography?
Walter Huda, Guoying Qu, Zhenxue Jing, et al.
Simulated mass lesions, superimposed onto an anthropomorphic breast phantom, were x-rayed using a small field of view digital mammography system. Eight radiologists and four scientists viewed the phantom images on a display monitor in a darkened room. Five readers had prior experience of reading this type of image. Readers assessed the probability of a simulated mass being present in each ROI, with the resultant data used to plot the corresponding Receiver Operating Characteristic (ROC) curves, and determine the corresponding area under the ROC curve (Az). Readers viewed the same set of images five successive times in a single session, and the time taken to read each image was recorded. The average time to complete the study for all twelve observers was 24 minutes (71 seconds/image). Experienced readers were quicker than novices, and radiologists were quicker than the scientists. The average Az value for the twelve readers for this detection task was 0.842 +/- 0.037, with coefficients of variation for individual readers ranging from 2.1% to 7.7%. Differences in imaging performance between the radiologists and scientists were very small. Analysis of the trends in measured imaging performance for each reader viewing successive (repeat) images showed that there was no improvement in imaging performance with increasing experience.
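As a rough illustration of the two summary statistics mentioned above, the sketch below computes an empirical area under the ROC curve from probability ratings and a coefficient of variation across repeated readings. Note the swap: Az conventionally denotes the area under a fitted binormal curve, whereas this sketch uses the nonparametric Mann-Whitney estimate, so it stands in for, and does not reproduce, the analysis in the study. The function names are hypothetical.

```python
from statistics import mean, stdev

def empirical_auc(signal_ratings, noise_ratings):
    """Mann-Whitney estimate of the area under the ROC curve.

    Counts the fraction of (signal, noise) pairs in which the signal ROI
    received the higher rating; ties count as one half.
    """
    wins = 0.0
    for s in signal_ratings:
        for n in noise_ratings:
            if s > n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(signal_ratings) * len(noise_ratings))

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)
```

A reader's five repeat sessions would each yield one area value from `empirical_auc`, and `coefficient_of_variation` applied to those five values gives a per-reader figure comparable in kind to the 2.1% to 7.7% range quoted above.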
Diagnostic accuracy of remote frozen sections compared with paraffin-embedded sections: a telepathology project in Austria
Patrizia Moser, Peter I. Soegner M.D., Sonja Stadlmann, et al.
The purpose of the present study was to evaluate the diagnostic accuracy of remote frozen sections examined by telepathology. The gold standard was the diagnosis made using direct examination of paraffin-embedded sections. A consecutive series of 134 frozen-section cases was examined by six qualified pathologists. We used the Zeiss telepathology system with robot microscopy, which allowed different magnifications and fields of view to be chosen. The wide-area network used the TCP/IP protocol. The diagnosis made on the frozen sections was compared with the final diagnosis in the paraffin-embedded sections. Times were recorded for each telepathology session, as well as the users' comments on usability and software, and on any communication problems which occurred. In addition, we evaluated the importance of the macroscopic sampling of the surgical specimen, applied to each type of tissue. The diagnostic evaluation showed complete agreement in approximately 80% of cases; in 20%, diagnosis was not possible due to insufficient quality of the slides. The median time for the telemedicine diagnosis was 14 min 30 sec.
Improving the visualization of drusen in age-related macular degeneration through maximum entropy digitization and stereo viewing
Peter Soliz, Sheila Coyne Nemeth, Maria Swift, et al.
The purpose of this paper is to report on the effects of stereopsis, resolution, color, and contrast on detecting and evaluating lesions associated with age-related macular degeneration (ARMD) in retinal images. Ten stereo-pairs were scanned from 35-mm color photographs at two different spatial resolutions (33 pixels/mm and 100 pixels/mm). The effects of image quality were established by presenting the digital images to trained analysts with varying degrees of spatial resolution and contrast, viewed in 2-D and stereo. Stereopsis was produced using shuttered goggles to view the stereo pair on the computer monitor. Resolution had less of an effect on feature detection than contrast for the ARMD lesions studied. Detection was reduced by only 12% for a 4x reduction in spatial resolution. Detection of ARMD lesions was reduced by more than 50% when contrast was reduced by a factor of 4. There was a 10 - 20% increase in the ability to detect drusen using full color images compared to red-free. The stereo effect produced by the shutter glasses and computer monitor was similar to that observed in viewing stereo slides. With high digital image quality, the stereo viewing computer-based system demonstrated significant potential for grading and analysis of retinal images for ARMD.
Statistical-based sub-band filtering technique for digital mammogram compression, enhancement, and denoise
Hong-Dun Lin, Kang-Ping Lin, Shyh-Liang Lou
Because breast cancer is a major cause of mortality among middle-aged women, digital mammograms are widely used to diagnose it. However, digital mammography requires high spatial resolution and high gray-level resolution, and these requirements result in very large image files. Image transmission and image quality are therefore important problems in clinical diagnosis. Several lossy image compression techniques have been developed to handle this issue. To address these obstacles, we developed a statistical-based sub-band filtering technique for digital mammograms that increases the compression ratio, enhances the diseased regions, and suppresses noise in the image. In this way, digital mammogram data can be compressed effectively and image quality can also be improved markedly.
Bases of a pre-attentional mechanism by means of presynaptic inhibition in the lateral geniculate nucleus
Roberto Moreno-Diaz Jr., Alexis Quesada Arengbia, Miguel Aleman-Flores
Presynaptic inhibition (PI) basically consists of the strong suppression of a neuron's response, mediated by a second, inhibitory neuron, before the stimulus reaches the synaptic terminals. It has a long-lasting effect, greatly potentiated by the action of anesthetics, that has been observed in motoneurons and in several other parts of nervous systems, mainly in sensory processing. In this paper we focus on several different ways of modeling the effect of PI in the visual pathway, as well as the different artificial counterparts derived from such modeling, mainly in two directions: the possibility of computing representations that are invariant to general changes in illumination of the input image impinging on the retina (which is equivalent to a low-level nonlinear information-processing filter), and the role of PI as a selector of the sets of stimuli that are relayed to higher brain areas, which, in turn, is equivalent to a 'higher-level filter' of information, in the sense of 'filtering' the possible semantic content of the information that is allowed to reach later stages of processing.