Medical Imaging 1997: Image Perception

Volume Details

Date Published: 16 April 1997

Contents: 6 Sessions, 35 Papers, 0 Presentations

Conference: Medical Imaging 1997

Volume Number: 3036

Modeling Visual Signal Detection I
Modeling Visual Signal Detection II
Display Parameters and Performance
Subjective and Objective Image Quality
Perceptual Processes and Performance
Poster Presentations

Modeling Visual Signal Detection I

Nodule detection in two-component noise: toward patient structure

Arthur E. Burgess, Xing Li, Craig K. Abbey

It is common, in discussions of lesion (signal) detection in radiology, to refer to patient anatomy as structured noise. This is, of course, a gross over-simplification -- because it does not take issues of phase coherence/incoherence into account. However, there are benefits from investigating phenomenological issues of signal detection in two component noise -- with one component being broad band (white) noise designed to simulate image noise and the other (background) component filtered to match the power spectrum of some aspect of imaged patient anatomy. The purpose of the experiments described in this paper is to develop an understanding of how the power spectrum of simulated patient structure affects detectability of simulated lesions. We report results of a number of investigations of human and model observer performance. Example tasks are: detection of simulated lung nodules in noise filtered to simulate background lumpy structure at a variety of scales, detection of nodules in fractal-like power law noise, and detection of simulated microcalcification clusters and simulated breast mass lesions in power law noise designed to simulate mammographic parenchymal structure. Human results are compared to three observer models and are fitted very well by a channelized Fisher-Hotelling model. The nonprewhitening model with eye filter does not agree with human results over much of the parameter ranges.

Comparison of the channelized Hotelling and human observers for lesion detection in hepatic SPECT imaging

Michael A. King, Daniel J. de Vries, Edward J. Soares

The relative rankings of the channelized Hotelling model observer were compared to those of human observers for the task of detecting 'hot' tumors in simulated hepatic SPECT slices. The signal-to-noise ratios (SNRs) were determined using eighty images for each of three slice locations. The acquisition and processing strategies investigated were: (1) imaging solely primary photons, (2) imaging primary plus scatter within a 20% symmetric energy window for Tc-99m, (3) imaging with primary plus an elevated amount of scatter, (4) energy-spectrum-based scatter compensation of the primary plus scatter acquisitions, and (5) energy-spectrum-based scatter compensation of the acquisitions with an elevated amount of scatter. Both square non-overlapping channels (SQR) and overlapping difference-of-Gaussian channels (DOG) were incorporated into the Hotelling model observer. When the scatter compensation results were excluded, both channelized Hotelling model observers exhibited a strong correlation with the rankings of the human observers. With the inclusion of the scatter compensation results, the null hypothesis of no correlation was rejected at the p = 0.05 level only for the DOG model observer. It is concluded that further investigation of the channel model used with the Hotelling observer is indicated to determine if better correlation can be obtained.
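
Editorial illustration: agreement between model and human rankings of the five strategies can be quantified with a rank correlation. The sketch below computes Spearman's rho, which is an assumption here because the abstract does not name the statistic used. The SNR values in the usage lines are hypothetical placeholders, not results from the paper.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)

# Hypothetical SNRs for strategies (1) to (5); placeholders only.
model_snr = [2.1, 1.6, 1.2, 1.9, 1.4]
human_snr = [1.8, 1.5, 1.0, 1.3, 1.4]
print(spearman_rho(model_snr, human_snr))
```

With only five strategies, a large rho is needed to reject the null hypothesis of no correlation. The function is undefined when all values in one list are equal, since the rank variance is then zero.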

Modeling human visual detection of low-contrast objects in fluoroscopy image sequences

David L. Wilson, Kadri N. Jabri, Ping Xue

We have performed a large variety of perception experiments aimed at issues in x-ray fluoroscopy. These include effects of image acquisition rate, digital temporal filtering, motion, and x-ray system motion blur. In this report, two model structures were considered. Both were non-prewhitening matched filter models modified to include a spatio-temporal visual system contrast sensitivity function. The first model used a spatial template following temporal integration and the second used a spatio-temporal template. The first model best described all data. However, it did not describe all motion experiments as accurately as the second. Our conclusion was that given an experiment, we should use the simplest model which most accurately describes similar experiments. Models will help plan critical experiments, allow one to concisely describe results, and predict similar experiments. Both models will be used to further our ultimate goal of using quantitative image quality studies to minimize dose and maximize image quality in x-ray fluoroscopy.

Simple model for noise perception in digital hardcopy

Rodney Shaw

A model is developed for the stochastic noise in digital hardcopy in terms of the basic resolution element (pixel size) and the number of distinguishable gray levels printed within this element. This is expressed in Wiener (power) spectrum terms, and is used to establish a link with a perceptual scale based on previous work concerning the perception of noise for photographic (analog) images.

Observer detection performance loss: target-size uncertainty

Philip F. Judy, Marie Foley Kijewski, Richard G. Swensson

The effect of target size and target-size uncertainty on human observers' ability to detect Gaussian and disk targets in spatially uncorrelated and correlated noise was measured. Disk and Gaussian targets were centered in circular areas (diameter, 128 pixels) of uncorrelated noise and of uncorrelated noise filtered to resemble CT noise. Size uncertainty was introduced in the target stimuli by presenting targets with effective areas ranging from 15 to 10,000 pixels. A constant non-prewhitening matched-filter signal-to-noise ratio (NPW-SNR) was maintained for all target sizes within each trial by adjusting target contrast. Stimulus sets were rendered on the gray-scale monitor of the computer workstation used to collect observer responses. For each stimulus, observers rated the likelihood that a target was present. The observer ratings were analyzed using a multiple-distribution extension of the binormal ROC curve-fitting procedure. A control experiment evaluated the influence of the circular noise area on performance with size-specified targets. Observer detection performance loss, the ratio of d' to NPW-SNR, decreased for small and large targets. Variations of target shape and noise correlation had no significant effect on performance loss. Observer performance when target size was uncertain was the same as observer performance when the target size was specified. In the control trials the investigator specifies the target size to the observer, yet the observer cannot exploit that information. The observer apparently uses the same perceptual resources in both the experimental and control trials to render a rating and consequently performs similarly in size-uncertain and size-specified trials. These results suggest a substantial role of higher order mechanisms in the detection of compact targets in noisy backgrounds.
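
Editorial illustration: holding NPW-SNR constant across target sizes means lowering contrast as area grows. The sketch below covers only the simplest case, a uniform disk in white noise, where NPW-SNR reduces to sqrt(signal energy) / sigma. The target SNR of 5 and the per-pixel noise standard deviation of 10 are assumed values, and the function names are hypothetical. Correlated (CT-like) noise would need the noise covariance in the denominator and is not handled here.

```python
import math

def npw_snr_white(signal, sigma):
    """NPW matched-filter SNR for a known signal in white noise.

    signal: iterable of per-pixel signal amplitudes.
    sigma:  per-pixel noise standard deviation.
    """
    energy = sum(v * v for v in signal)
    return math.sqrt(energy) / sigma

def disk_contrast_for_snr(target_snr, area_pixels, sigma):
    """Amplitude of a uniform disk of the given area that yields target_snr."""
    return target_snr * sigma / math.sqrt(area_pixels)

TARGET_SNR = 5.0   # assumed value, not from the abstract
SIGMA = 10.0       # assumed per-pixel noise standard deviation

for area in (15, 100, 1000, 10000):
    contrast = disk_contrast_for_snr(TARGET_SNR, area, SIGMA)
    disk = [contrast] * area  # uniform disk flattened to a pixel list
    print(area, contrast, npw_snr_white(disk, SIGMA))
```

Under these assumptions the required contrast falls as one over the square root of the area, so the 10,000-pixel target needs roughly 1/26 of the contrast of the 15-pixel target.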

Modeling Visual Signal Detection II

What is degrading human visual detection performance in natural medical image backgrounds?

Miguel P. Eckstein, Albert J. Ahumada Jr., Andrew B. Watson, et al.

Experiments on visual detection in computer simulated noise (e.g. white noise) show that random variations from location to location in the image (due to noise) degrade human performance. Psychophysical experiments of visual detection of signals superimposed on a known deterministic background ('mask') show that human performance can be degraded by the presence of a high contrast deterministic background through divisive inhibition. The purpose of this paper is to perform a psychophysical experiment to determine the relative importance of these two sources of performance degradation (random background variations and contrast masking effects) in human visual detection in natural medical image backgrounds. The results show that both contrast masking and random background variations degrade human performance for detecting signals in natural medical image backgrounds. These results suggest that current observer models which do not include a source of degradation due to the deterministic presence of the background might need to model such effects in order to reliably predict human visual detection in natural medical image backgrounds.

Evaluation of human vision models for predicting human observer performance

Warren B. Jackson, Maya R. Said, David A. Jared, et al.

We demonstrate that human-vision-model-based image quality metrics correlate strongly not only with subjective evaluations of image quality but also with human observer performance on visual recognition tasks. Amorphous silicon image system design parameters were varied, the performance of human observers in target identification using the resulting test images was measured, and this performance was compared with the target-weighted just-noticeable-difference produced by a human vision model applied to the same set of images. The detectability of the model observer was highly correlated with that of the human observer for a wide range of image system design parameters. These results demonstrate that the human vision model can be used to produce imaging systems optimized for human observer performance without the need for extensive human trials. The human-vision-based tumor detectors represent a generalization of channelized Hotelling models to non-linear, perceptually based models.

Importance of anatomical noise in mammography

Francois O. Bochud, Francis R. Verdun, Jean-Francois Valley, et al.

Normal tissue structure in a radiological image, called anatomical noise, can prevent radiologists from seeing the pathology they are looking for. The goal of this paper is to show the importance of this noise component compared to the system noise (quantum mottle, screen structure and film noises) in the particular case of mammography. Two normal mammographic images are digitized in order to provide an anatomical structure. On this background, an object simulating a microcalcification (a sphere) or a tumor (a spherical lens) is filtered by the modulation transfer function of an imaging system representative of a mammographic unit and superimposed. A two-alternative forced-choice experiment is performed on these synthesized images, in which the amount of object contrast and system noise is varied. The images are presented on a high-resolution screen to five observers accustomed to this kind of experiment. The experimental results are then compared to the signal-to-noise ratios given by the non-prewhitening matched filter model in which the human visual transfer function is taken into account. When comparing the experimental results with the model in which the anatomical noise is considered as a noise or as a signal, it is observed that the experimental results are in between. This means that the anatomical background has a component that can be considered as noise and another that can be recognized as signal. At the present time, there is no observer model that takes the anatomical noise effect into account. The use of a homogeneous test object without anatomical structure to qualify the system performance or to optimize the radiological procedure then appears questionable.

Circle cue enhances detection of simulated masses on mammogram backgrounds

Harold L. Kundel, Calvin F. Nodine, Lawrence C. Toto, et al.

Tumor detection on chest images is improved when image sites that receive prolonged visual attention are identified by eye-position recording and physically circled on the image. Missed lesions on mammograms can also be identified by eye-position recording, but there are no data about whether site identification and subsequent re-evaluation improve performance. The goal of this work is to determine if a circle cue improves the detectability of a mass on a mammogram independently of indicating high-yield sites. Simulated masses were added to the exact center of one of a pair of 7.7 by 7.7 cm patches of either Gaussian noise or mammogram parenchyma displayed on a 2000 by 2000 pixel Tektronix monitor. The mammogram parenchyma patches were selected from a set of 10 digitized mammograms by first randomly selecting a mammogram and then randomly selecting a point within the mammogram to locate the center of the patch. On about half of the trials a 6.4 cm diameter circle was drawn about the center of both patches. The mass was always centered in the patch and in the circle. The subject had to indicate which side contained a mass on each trial. Five levels of mass contrast were used. In this 2AFC task, each of three subjects saw each contrast level 200 times in the circle and no-circle conditions and with Gaussian noise and mammogram backgrounds, for a total of 4000 trials per subject. The index of detectability, d', was calculated and the results were analyzed using an analysis of variance with pooling of the d' values over subjects. As would be expected, the d' for the detection of the masses increased linearly with SNR under all conditions. The d' for detection increased for all subjects and at all signal levels when the mass was physically circled on the image. The increase was not statistically significant for the noise background (p = .19) but was statistically significant for the mammographic background (p = .001). Regression analysis showed an increase in d' of .41 for the Gaussian noise background and .54 for the mammogram. Cues like arrows, circles, and luminance pedestals are thought to act by reducing uncertainty about the exact location of a target. The experimental situation used here, where the tumor was always in the exact center of a square, suggests that an additional perceptual mechanism is operating. The circle may limit the area over which the retina has to integrate image noise, or it may improve retinal contrast sensitivity by providing a fiducial marker to help stabilize the fine eye movements.
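
Editorial illustration: in a two-alternative forced-choice task, d' is commonly derived from the proportion of correct responses Pc under an equal-variance Gaussian model, d' = sqrt(2) * z(Pc), where z is the inverse standard normal CDF. The sketch below assumes that standard relation, since the abstract does not describe the exact computation. The function name and the trial counts in the usage line are hypothetical.

```python
import math
from statistics import NormalDist

def dprime_2afc(n_correct, n_trials):
    """Estimate d' from 2AFC counts, assuming equal-variance Gaussian noise."""
    pc = n_correct / n_trials
    # Keep Pc away from 0 and 1, where the inverse normal CDF is undefined.
    eps = 1.0 / (2.0 * n_trials)
    pc = min(max(pc, eps), 1.0 - eps)
    return math.sqrt(2.0) * NormalDist().inv_cdf(pc)

# Hypothetical counts for one contrast level in one condition (200 trials).
print(dprime_2afc(152, 200))
```

Chance performance (Pc = 0.5) maps to d' = 0. The clamp is one simple convention for perfect scores; other corrections exist.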

Nodule polarity effects on detection and localization performance in liver CT images

Richard G. Swensson, Philip F. Judy, Christine Wester, et al.

Performance accuracy for detecting and localizing small nodules on liver CT images depends on whether an observer is required to find dark nodules or bright nodules on those images. We investigated these asymmetric polarity effects using simulated nodules of varying sizes placed on spiral CT scans of clinical patients acquired with intravenous contrast material, which made blood vessels appear brighter than liver background on the displayed CT images. A concurrent analysis of each observer's detection-rating and scored-localization data estimated separate perceptual effects for the nodules of different sizes, and for locations of the dark or bright hepatic findings that observers regarded as most suspicious on the CT images. The results were consistent with equal visibility for dark and bright nodules of identical size and CT-contrast, and a linear increase in visibility with nodule signal-to-noise ratio for a non-prewhitening matched-filter calculation (NPW-SNR). The substantially lower accuracy for detecting and localizing the bright nodules, compared to the dark nodules, was a polarity effect apparently produced by the non-stationary liver CT backgrounds -- i.e., the presence of stronger confusing signals from the bright hepatic findings on these (contrast-enhanced) CT images than from the dark hepatic findings.

Display Parameters and Performance

Evaluation of the effect of display luminance on the feature-detection rates of masses in mammograms

Bradley M. Hemminger, Alan W. Dillon, Richard Eugene Johnston

The purpose of this study was to determine the interaction of the luminance range of the display system with the feature detection rate for detecting simulated masses in mammograms. Simulated masses were embedded in cropped 512 by 512 portions of mammograms digitized at 50 micron pixels, 12 bits deep. The masses were embedded in one of four quadrants in the image. An observer experiment was conducted in which the observer's task was to determine in which quadrant the mass was located. The key variables involved in each trial included the position of the mass, the contrast level of the mass, and the luminance of the display. The contrast of the mass with respect to the background was fixed to one of four selected contrast levels. The digital images were printed to film and displayed on a mammography lightbox. The display luminance was controlled by placing neutral density films between the laser-printed films of mammographic backgrounds and the lightbox. The resulting maximum luminances examined in this study ranged from 10 ftL to 600 ftL. Twenty observers viewed 20 different combinations of the 5 neutral density filters with the 4 contrast levels, for a total of 400 observations per observer, and 8000 observations overall. An ANOVA showed that there was no statistically significant correlation between the luminance range of the display and the feature detection rate of the simulated masses in mammograms. None of the luminance display ranges performed better than any of the others.

Gray-scale transform and evaluation for digital x-ray chest images on CRT monitor

Isao Furukawa, Junji Suzuki, Sadayasu Ono, et al.

In this paper, an experimental evaluation of a super high definition (SHD) imaging system for digital x-ray chest images is presented. The SHD imaging system is proposed as a platform for integrating conventional image media. We are involved in the use of SHD images in the total digitizing of medical records that include chest x-rays and pathological microscopic images, both of which demand the highest level of quality among the various types of medical images. SHD images use progressive scanning and have a spatial resolution of 2000 by 2000 pixels or more and a temporal resolution (frame rate) of 60 frames/sec or more. For displaying medical x-ray images on a CRT, we derived gray scale transform characteristics based on radiologists' comments during the experiment, and elucidated the relationship between that gray scale transform and the linearization transform for maintaining the linear relationship with the luminance of film on a light box (luminance linear transform). We then carried out viewing experiments based on a five-stage evaluation. Nine radiologists participated in our experiment, and the ten cases evaluated included pulmonary fibrosis, lung cancer, and pneumonia. The experimental results indicated that conventional film images and those on super high definition CRT monitors have nearly the same quality. They also show that the gray scale transform for CRT images decided according to radiologists' comments agrees with the luminance linear transform in the high luminance region. In the low luminance region, it was found that the gray scale transform had the characteristics of level expansion to increase the number of levels that can be expressed.

Visual optimization of radiographic tone scale

Hsien-Che Lee, Scott J. Daly, Richard L. Van Metter

Radiographic tone scale can be optimized for visual perception by mapping equal log-exposure differences in the transmitted radiation field to equal discriminable brightness differences in the displayed image. The mapping will render objects equally visible independent of their exposure level and background. This visually optimized tone scale can be readily achieved in computed radiography. It can also be applied to the conventional screen/film system to provide an ideal aim curve for screen/film design. One of the key components in the construction of such a tone scale curve is the function that relates the physical luminance to the perceptual brightness. It is well known that the perceived brightness is a function of the viewing condition. By comparing the predicted brightness differences from the various brightness models, we found that the Michaelis-Menten function provides a visual tone scale that significantly improves the visibility of details in the high density regions over that produced by existing screen/film systems. Based on the basic visual tone scale, we parameterize the curve shapes at the toe and the shoulder, and thus obtain a family of tone scale curves for computed radiography. The toe and shoulder shapes can be customized for different examination types to achieve the best rendition under the constraints of the limited dynamic range of the output display and the minimum required image contrast.

Perceptual linearization as display standard: link between psychophysics and contrast discrimination models

Najoua Belaid, Ineke M. C. J. van Overveld, Jean-Bernard Martens

Given the increasing use of soft-copy in the medical field, standardization and fidelity of the display are gaining much interest. For medical imaging, it is of great importance that information in the data is accurately mapped to brightness sensation. The concept of perceptual linearization was introduced by Pizer [Computer Graphics and Image Processing, 17, 262 (1981)] to guarantee the display fidelity. It means that equal steps in the data evoke equal steps in sensation. We investigate the fidelity of achromatic displays prior to and after perceptual linearization. First, we use magnitude estimation of brightness differences between gray square patches embedded in a uniform background, to make equal-interval brightness series. Then, we extend the brightness contrast discrimination models of Whittle [Vision Research, 26, 1677 (1986)], and Kingdom and Moulden [Vision Research, 31, 851 (1991)], for gray patches in uniform background, to our supra-threshold data. A good fit is found. Finally, we apply the look-up tables that provide perceptual linearization for the gray patches to complex images. Brightness matching with a scale of reference gray patches is used to estimate gray levels at specified image locations. The experimental results indicate that the accuracy of this task is not necessarily affected by perceptual linearization.
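
Editorial illustration: perceptual linearization can be implemented as a look-up table (LUT) that maps each data level to the display driving level whose predicted brightness is closest to an equally spaced target. In the sketch below, the gamma-2.2 display with a 0.5 to 100 cd/m² range and the cube-root brightness function are placeholder assumptions. They are not the Whittle or Kingdom and Moulden models cited in the abstract, and all function names are hypothetical.

```python
import bisect

def display_luminance(v, l_min=0.5, l_max=100.0, gamma=2.2, levels=256):
    """Assumed display model: luminance in cd/m^2 for driving level v."""
    return l_min + (l_max - l_min) * (v / (levels - 1)) ** gamma

def brightness(lum):
    """Assumed brightness function (cube-root law), a stand-in only."""
    return lum ** (1.0 / 3.0)

def linearizing_lut(levels=256):
    """Map each data level to the driving level with the nearest brightness
    to an equal-interval brightness scale."""
    b = [brightness(display_luminance(v, levels=levels)) for v in range(levels)]
    b_lo, b_hi = b[0], b[-1]
    lut = []
    for d in range(levels):
        target = b_lo + (b_hi - b_lo) * d / (levels - 1)
        i = bisect.bisect_left(b, target)  # b is increasing in v
        if i == 0:
            best = 0
        elif i >= levels:
            best = levels - 1
        else:
            # Choose whichever neighbor is closer in brightness.
            best = i if (b[i] - target) < (target - b[i - 1]) else i - 1
        lut.append(best)
    return lut

lut = linearizing_lut()
print(lut[0], lut[128], lut[255])
```

Because the LUT is restricted to the available driving levels, some data levels can map to the same output level. That quantization loss is a known cost of linearizing an 8-bit display.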

Evaluation of signal detection performance with pseudocolor display and lumpy backgrounds

Hong Li, Arthur E. Burgess

Historically, gray scale has been the standard method of displaying univariate medical images. With the advent of digital imaging, color scales have been used for display of quantitative nuclear medicine images and for quantitative overlays in ultrasound images. There has been no interest shown in using color for anatomically based imaging such as radiography or CT. The one exception has been attempts to do multi-spectral (T1, T2, ρ) image display in MRI. A few color scales have been proposed and evaluated, but have had little acceptance by radiologists. It is possible that carefully designed scales might give lesion detection performance that equals gray scale and improves performance of other tasks. We investigated 13 display scales including the physically linear gray scale, the popular rainbow scale, and 11 perceptually linearized scales. One was the heated object scale and the other 10 were spiral trajectories in the CIELAB uniform color space. The experiments were performed using signals added to white noise and a statistically defined (lumpy) background. In general, the best performance was obtained using the gray scale and the heated object scale. Performance for the spiral trajectory scales was typically 25% lower. Performance for the rainbow scale was very poor (about 30% of gray scale performance).

Subjective and Objective Image Quality

Relationship of subjective ratings of image quality and observer performance

Howard E. Rockette, Christopher M. Johns, Jane L. Weissman, et al.

The relationship between radiologists' perception of image quality and their actual performance was assessed. If the two variables are strongly correlated, the more easily obtained perceived quality index might be used as a prerequisite test to determine if an ROC study is justified. One hundred seventy cases were evaluated for the presence or absence of interstitial disease and nodules by nine readers using seven display modes. Each reader also assigned each image a mode-specific perceived quality rating using a 5-category ordinal scale. Average perceived quality was highest for conventional film. It was slightly poorer for the subsets of cases with interstitial disease and for cases classified independently as 'subtle.' Trend tests indicated a relationship between area under the ROC curves (A_z) and perceived image quality for nodules. For interstitial disease, the relationship was weaker and of borderline statistical significance. The subjective image quality index was related to the area under the ROC curve, but the average difference between pairs of display modes was not a good predictor of the difference in actual observer performance. A subjective quality index may have limited usefulness in screening differences between modalities prior to the performance of an ROC study.

Comparison of computer analysis of mammography phantom images (CAMPI) with perceived image quality of phantom targets in the ACR phantom

Dev Prasad Chakraborty

Computer analysis of mammography phantom images (CAMPI) is a method for making quantitative measurements of image quality on phantom images. The purpose of this work was to determine the correlation of the CAMPI measures with the perceived image quality of the microcalcification target objects in a phantom. A large existing phantom image database was subjected to CAMPI analysis. Extracted microcalcification regions from some of these images were compared in pairs by three observers. A global maximization technique was used to determine which linear combinations of the CAMPI measures most closely predicted the paired comparison observations. An analysis was also conducted to determine the validity of the linear model. It was found that the signal-to-noise-ratio measure (SNR) and the correlation measure (COR) most strongly correlated with the observed paired comparison results. A linear combination of the measures gave a slightly, but significantly, better correlation with the observations.

Quality control in digital mammography: automatic detection of under- and over-exposed mammograms

Chris Yuzheng Wu, Matthew T. Freedman M.D., Akira Hasegawa, et al.

We developed a quality control system (QCS) for digital mammography that can notify technologists in real time of mammograms of poor image quality due to under- or over-exposure. Mammograms are digitized by a Lumisys Scanner at 100 microns and 12 bits per pixel. An automatic image segmentation technique is employed to extract the area inside the breast in each mammogram. Histograms of the segmented areas are then calculated. By analyzing the composition of the histograms, the computer program determines whether the original films have been properly exposed. Traditional image segmentation techniques are based on histogram analysis of digitized mammograms. However, such methods often fail with mammograms that are of low contrast or under-exposed, because the difference in brightness across the breast skin line is so small that it is difficult to define the boundary by thresholding or region-growing techniques. We proposed a novel method to detect the breast skin line based on statistical changes of gradient. By analyzing the histogram composition of normal, under-exposed, and over-exposed films, we defined an image feature that describes the image intensity content of the underlying mammograms. The criteria for determining the category of a mammogram were established by studying a training database of normal, under-exposed, and over-exposed films. We can then classify the mammograms using the image feature, based on the established criteria. Over 150 real mammograms of different exposure levels were analyzed. The images were classified by the computer system into groups of normal, slightly under-exposed, under-exposed, slightly over-exposed, and over-exposed. We compared the computer's classification results with a radiologist's evaluation. Our QCS was able to correctly classify over 85% of the cases. Receiver operating characteristic (ROC) analysis will be employed to evaluate the performance of the QCS in determining the image quality of digital mammograms. Our QCS program is able to automatically determine whether a mammogram is properly exposed and advise a technologist to take additional exposures. The QCS correctly identified 100% of over- and under-exposed mammograms and 92% of mammograms of normal exposure. The QCS can help reduce the cost of recalling patients and improve the overall quality of mammographic service.

ROC comparison between digital mammography and screen-film using an anthropomorphic breast phantom

Guoying Qu, Walter Huda, Barbara G. Steinbach, et al.

Mass lesion detection performance of a LoRAD Digital Spot Mammography (DSM) system was compared with a Kodak Min R screen-film combination exposed either in front of the DSM, or in the Bucky of a GE 600T mammography unit. Low-contrast objects simulating small masses were superimposed on an RMI 165 anthropomorphic breast phantom, and radiographs were obtained at 28 kVp and an mAs value that resulted in a mean film density of approximately 1.1. DSM images were obtained at the same radiation exposure as used with screen-film. Fully masked radiographs were viewed on a mammography light box, and the DSM images were viewed on the DSM monitor in a darkened room. Of the 64 regions of interest (ROI) in each type of image, 28 (44%) contained the test object. For each imaging modality, six radiologists and six scientists assessed the probability of a simulated mass being present in each ROI. The resultant data were used to plot receiver operating characteristic (ROC) curves of twelve readers for each of the three imaging modalities investigated. There was no significant difference in reader performance between the screen-film combination exposed in front of the DSM system and exposed in the GE 600T system. Both screen-film imaging systems resulted in the same average area under the ROC curve, A_z, of 0.78. At the same level of radiation exposure, the DSM had an average ROC area, A_z, of 0.71, which was significantly inferior to the average performance achieved using screen-film (p < 0.005). For this detection task, there were no significant differences in performance between the radiologists and scientists. Reader performance was found to improve with the number of images read, demonstrating an observer learning curve for this specific detection task.
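
Editorial illustration: the study reports A_z, the area under a fitted binormal ROC curve. As a simpler stand-in, the sketch below computes the nonparametric (Mann-Whitney) area directly from confidence ratings. It is not the binormal fit used in the paper, and the ratings in the usage lines are hypothetical.

```python
def empirical_auc(signal_ratings, noise_ratings):
    """Nonparametric ROC area: probability that a signal-present ROI is rated
    higher than a signal-absent ROI, counting ties as one half."""
    wins = 0.0
    for s in signal_ratings:
        for n in noise_ratings:
            if s > n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(signal_ratings) * len(noise_ratings))

# Hypothetical confidence ratings (0 to 100) from one reader.
with_mass = [80, 65, 90, 40, 70]
without_mass = [30, 55, 40, 20, 60]
print(empirical_auc(with_mass, without_mass))
```

An area of 0.5 corresponds to chance and 1.0 to perfect separation. With 28 signal-present and 36 signal-absent ROIs per modality, each reader's estimate would rest on 28 × 36 pairwise comparisons.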

Perceptual Processes and Performance

Perceptual processes involved in mammographic film interpretation

Mark D. Mugglestone, Alastair G. Gale, A. R. M. Wilson

A series of mammographic cases was presented to breast screening radiologists in two conditions. The cases were of three different types: two mammographic films together but presented in inverted fashion, a single mammographic film, or a close-up of a mammographic feature. In the first condition the subjects were allowed only 200 ms to view the case, and in the second they had an unlimited amount of time. The amount of diagnostic information available in a brief presentation was analyzed and the results are discussed in relation to the perceptual and cognitive factors involved in radiographic image interpretation.

Radiologists' ability to discriminate computer-detected true and false positives from an automated scheme for the detection of clustered microcalcifications on digital mammograms

Robert M. Nishikawa, Dulcy E. Wolverton, Robert A. Schmidt, et al.

There is evidence that computer-aided diagnosis (CAD) can be used to improve radiologists' performance. However, one of the potential drawbacks of CAD is that a computer-detected false positive may induce a false positive by a radiologist. To examine this issue, we performed two experiments to compare radiologists' false positives with those of the computer and to determine radiologists' ability to discriminate between the computer's true- and false-positive detections. In the first experiment, radiologists were shown 50 mammograms and on each film were asked to indicate 3 regions that could contain clustered microcalcifications, and using a 100-point scale, to give their level of confidence that microcalcifications were present in the region. In the second experiment, the radiologists were shown regions-of-interest, printed on film, containing either a computer-detected true cluster or a computer-detected false positive. The radiologists gave their confidence that there were actual clustered microcalcifications present. There was less than 1% overlap between false positives by the computer and radiologists. Furthermore, based on ROC analysis, radiologists were able to discriminate between computer true and false positives.

Role of feedback in learning of screening mammography

Regina Pauli, Paul T. Sowden

The experiment reported here was designed to investigate further the role of feedback in learning complex visual discrimination tasks such as screening mammography. Previous research has not yet established how feedback affects learning in such tasks and whether it is an important contributing factor in the acquisition of target detection skills at all. In this study, observers were required to search a computer-generated display of random background noise for a probabilistically defined target under one of four feedback conditions. The experiment was designed to compare each observer's baseline performance with performance when given feedback to overcome the problem of large individual differences typically observed in such tasks. It was found that feedback which is rich in target information is superior in improving accuracy of performance in this task when compared to feedback which does not give any information about target location and features or no feedback. Secondly, information-rich feedback seems to motivate observers to search the display for longer. The results are discussed in relation to designing training which specifically incorporates information-rich feedback.

Time-of-day effects on mammographic film reading performance

Helen C. Cowley, Alastair G. Gale

In a wide variety of domains it has been found that an individual's performance can vary throughout the day. The time of day at which a radiologist reads mammographic films could therefore have an effect on their breast screening performance. A national self-assessment scheme that provides some insight into radiologists' breast screening skills has been operating for a number of years in the UK. Data from this scheme were assessed in terms of: the time of day that the radiologists read the mammographic film set, the time taken to complete the scheme, any performance variation during the time taken to complete the scheme, and the variation in performance in comparison with the time of day that the radiologist usually diagnoses mammograms. Results suggest that there are no significant circadian variations in performance, but that performance does decrease after 70 or 80 minutes of screening.

Viewing-time differences for film versus monitor viewing of radiographs: what eye position reveals

Elizabeth A. Krupinski, Pamela J. Lund

The goal was to determine why viewing times are generally longer for images displayed on a monitor than on film for reading radiographic images. Eye-position of six readers was recorded as they searched 27 bone images on film or a monitor. Overall viewing time was longer with the monitor. Time to first fixate a lesion after search began and true negative dwell times were significantly longer for the monitor than film. Absolute number of clusters and dwell times were greater on diagnostic image areas on the monitor than on film, suggesting that more perceptual processing was taking place. This contrasts with the finding that the readers spent a greater percentage of their overall viewing time searching diagnostic areas on film than on the monitor. Twenty percent of the clusters for monitor viewing were on the image processing menu. The information that is processed during search of images on a monitor is different than on film. Additionally, the monitor has less spatial resolution and lower brightness than does film. These factors could lead to significant differences in the ways that readers search images and distribute their attentional and perceptual resources during search, which adversely influences monitor viewing time.

Effect of film size on human observer detection of nodules on chest CT images

Philip F. Judy, Steven E. Seltzer, Uri Feldman, et al.

The effect of displayed CT image size and observer viewing protocol on human observer ability to detect nodules was measured. Synthetic nodules (3.0 to 5.0 mm) were added to random locations within the lungs of 80 CT images from spiral CT scans of 13 patients. Each test set consisted of 160 images. Each CT image was presented twice in a trial, once with nodule present and once without nodule present. The images were rendered as film transparencies using 6 pixel sizes (0.074 to 0.259 mm). Four observers read the films using two viewing protocols. In one protocol, the observers could vary their viewing distance and were provided with a magnification lens. In the second protocol, the observers' viewing distance was fixed at 55 cm. Observers rated the likelihood that a nodule was present in the image and indicated the lung most likely to contain a nodule. The ratings were used to estimate an ROC curve for each trial. Detectability was better using the variable distance viewing protocol compared to the fixed viewing distance protocol. The area under the ROC curve was constant as a function of pixel size for the variable viewing distance protocol (0.881 ± 0.007) and decreased (0.894 ± 0.013 to 0.668 ± 0.053) as a function of pixel size for the fixed viewing distance protocol. Radiologists should be encouraged to vary their viewing distance when reading CT images rendered as films. In order to reduce costs, some radiology departments may be tempted to reduce the CT image size on the film and thereby increase the number of CT images on each film. Our study suggests that this manipulation could impair the radiologist's ability to detect lung nodules on CT images of the chest.

Poster Presentations

Performance evaluation of a high-strip-density grid using a contrast-detail phantom

Terence B. Terilli, Maxine Barnes, Ajoy Dutta, et al.

A contrast-detail phantom was used to evaluate grid performance. The phantom is constructed of 1 cm of plastic. Holes of varying diameter (detail) and varying depth (contrast) were drilled into the contrast-detail phantom. The phantom was placed next to the bucky assembly. A 15 cm block of lucite was placed between the x-ray tube and the phantom. A set of radiographs was taken of the phantom at different kVps and different phantom thicknesses. This was done both with and without the grid in place. An ion chamber was used so that the bucky factor could be determined. This entire procedure was repeated for the conventional, reciprocating grid. Contrast-detail curves were generated from the data. As would be expected, the reciprocating grid had a lower bucky factor. The contrast improvement factor (contrast with grid/contrast without grid) was higher for the reciprocating grid. The contrast improvement factor for the high-strip-density grid was comparable to that for the reciprocating grid at high kVps and also when a thinner block of lucite was used. Grid lines were seen on the radiographs of the high-strip-density grid.
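
The two figures of merit named in this abstract are simple ratios, sketched below. The contrast improvement factor follows the definition given in the abstract. The Bucky factor is assumed here to follow its usual definition, the exposure required with the grid divided by the exposure required without it at matched film density; the abstract itself does not spell this out. The function names and the numbers are hypothetical and are not measurements from the paper.

```python
def bucky_factor(exposure_with_grid, exposure_without_grid):
    """Ratio of exposures needed to reach the same film density
    with and without the grid (assumed standard definition)."""
    return exposure_with_grid / exposure_without_grid


def contrast_improvement_factor(contrast_with_grid, contrast_without_grid):
    """Contrast with grid divided by contrast without grid, as in the abstract."""
    return contrast_with_grid / contrast_without_grid


# Hypothetical ion-chamber readings and measured contrasts (illustrative only).
bf = bucky_factor(exposure_with_grid=4.0, exposure_without_grid=1.0)
cif = contrast_improvement_factor(contrast_with_grid=0.3, contrast_without_grid=0.2)
```

A grid is more attractive when it delivers a high contrast improvement factor for a low Bucky factor, since the Bucky factor reflects the extra exposure the grid costs.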

Comparative evaluation of conventional and zero-crossover rare-earth screen-film systems in pediatric chest radiography

Thomas C. Fearon, Daniela Dumitru-Buna, David C. Kushner, et al.

This study compares the image quality of conventional and zero-crossover symmetric and asymmetric rare-earth screen-film systems for the specific imaging task of chest radiography in pediatric patients by means of evaluation of the visualization of anatomic landmarks. Advances in anticrossover technology in double-emulsion film have made it possible to design each emulsion layer in a double-emulsion film as if it were a distinct system. Chest radiographs of 120 pediatric patients performed with one of six screen-film systems (two conventional rare-earth, two zero-crossover symmetric and two zero-crossover asymmetric screen-film systems) were evaluated for image quality by three radiologist observers. The evaluation was based on the visualization of anatomic landmarks and the subjective evaluation of the physical properties of each film. InSight VHC Thoracic Imaging system with IPC film and Pediatric Insight screens with Insight Pediatric film were ranked highest by the radiologist observers both with respect to the visualization of anatomic landmarks and the subjective evaluation of the physical properties of the systems. The difference between these two systems was not statistically significant (p < 0.05); however, the ranking of these systems is statistically different (p < 0.05) relative to the remaining four screen-film systems. Zero-crossover screen-film systems have been shown to provide improved image performance for the specific task of pediatric chest radiography based upon the evaluation of the visualization of anatomic landmarks relative to conventional systems evaluated in this study.

Visual detectability of elastic contrast in real-time ultrasound images

Naomi R. Miller, Jeffery C. Bamber, Marvin M. Doyley, et al.

Elasticity imaging (EI) has recently been proposed as a technique for imaging the mechanical properties of soft tissue. However, dynamic features, known as compressibility and mobility, are already employed to distinguish between different tissue types in ultrasound breast examination. This method, which involves the subjective interpretation of tissue motion seen in real-time B-mode images during palpation, is hereafter referred to as differential motion imaging (DMI). The purpose of this study was to develop the methodology required to perform a series of perception experiments to measure elastic lesion detectability by means of DMI and to obtain preliminary results for elastic contrast thresholds for different lesion sizes. Simulated sequences of real-time B-scans of tissue moving in response to an applied force were generated. A two-alternative forced choice (2-AFC) experiment was conducted and the measured contrast thresholds were compared with published results for lesions detected by EI. Although the trained observer was found to be quite skilled at the task of differential motion perception, it would appear that lesion detectability is improved when motion information is detected by computer processing and converted to gray scale before presentation to the observer. In particular, for lesions containing fewer than eight speckle cells, a signal detection rate of 100% could not be achieved even when the elastic contrast was very high.
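
For readers unfamiliar with 2-AFC experiments, the sketch below shows the standard signal-detection relation between proportion correct and the detectability index, d' = sqrt(2) * z(Pc), which holds under the usual equal-variance Gaussian assumptions. The abstract does not state how thresholds were derived from the responses, so this is background illustration only; the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def dprime_from_2afc(proportion_correct):
    """Detectability index from 2-AFC proportion correct (0 < Pc < 1),
    assuming equal-variance Gaussian decision variables."""
    return sqrt(2.0) * _STANDARD_NORMAL.inv_cdf(proportion_correct)


def proportion_correct_from_dprime(dprime):
    """Inverse relation: expected 2-AFC proportion correct for a given d'."""
    return _STANDARD_NORMAL.cdf(dprime / sqrt(2.0))


# Chance performance (50% correct) corresponds to zero detectability.
d_chance = dprime_from_2afc(0.5)
```

A contrast threshold is then typically read off as the contrast at which the proportion correct, or equivalently d', reaches a chosen criterion.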

Acquiring skill at medical image inspection: learning localized in early visual processes

Paul T. Sowden, Ian R. L. Davies, Penny Roling, et al.

Acquisition of the skill of medical image inspection could be due to changes in visual search processes, 'low-level' sensory learning, and higher level 'conceptual learning.' Here, we report two studies that investigate the extent to which learning in medical image inspection involves low-level learning. Early in the visual processing pathway cells are selective for direction of luminance contrast. We exploit this in the present studies by using transfer across direction of contrast as a 'marker' to indicate the level of processing at which learning occurs. In both studies twelve observers trained for four days at detecting features in x-ray images (experiment one: discs in the Nijmegen phantom; experiment two: micro-calcification clusters in digitized mammograms). Half the observers examined negative luminance contrast versions of the images and the remainder examined positive contrast versions. On the fifth day, observers swapped to inspect their respective opposite contrast images. In both experiments learning occurred across sessions. In experiment one, learning did not transfer across direction of luminance contrast, while in experiment two there was only partial transfer. These findings are consistent with the contention that some of the learning was localized early in the visual processing pathway. The implications of these results for current medical image inspection training schedules are discussed.

ROC study of screen-film mammography and storage phosphor digital mammography: analysis of nonconcordant classifications and implications for the approval of digital mammography systems

Matthew T. Freedman M.D., Dorothy Steller Artz, Jacquelyn Hogge M.D., et al.

A recently completed ROC study of digital mammography using a 100 micron pixel storage phosphor receptor showed that digital mammography and conventional screen-film mammography were essentially equivalent in areas under the ROC curve. In this study, there were 24 biopsy-proven breast cancer cases, 25 benign biopsy cases and 48 clinically normal breast images, each with matched screen-film and storage phosphor images. Fifteen of the 24 cancer cases were 10 mm or less in size. Of these, 10 presented with microcalcifications as the sign of disease. Six radiologists who were not involved with the research program, had no prior experience with digital mammography, and met qualification criteria under the Mammography Quality Standards Act of 1992 served as readers. This poster looks at the cases in which there was variance between the radiologists' ROC classifications for the digital and screen-film systems in order to analyze case-specific discrepancies that may indicate benefits or deficits of the digital system. Aspects of the ROC ratings are also analyzed, including an evaluation of the different thresholds used by radiologists on the digital and screen-film systems, the distribution of ROC ratings in normal and abnormal cases, the effect of using different gold standards of proof on the results, and the effect of substituting an ACR BIRADS category agreement study as proposed by the FDA compared to the ROC study outcome.

Single-image hard-copy display of the spine utilizing digital radiography

Dorothy Steller Artz, Timothy Janchar, David Milzman, et al.

Regions of the entire spine contain a wide latitude of tissue densities within the imaged field of view, presenting a problem for adequate radiological evaluation. With screen/film technology, the optimal technique for one area of the radiograph is sub-optimal for another area. Computed radiography (CR), with its inherent wide dynamic range, has been shown to be better than screen/film for lateral cervical spine imaging, but limitations are still present with standard image processing. By utilizing a dynamic range control (DRC) algorithm based on unsharp masking and signal transformation prior to gradation and frequency processing within the CR system, more vertebral bodies can be seen on a single hard copy display of the lateral cervical, thoracic, and thoracolumbar examinations. Examinations of the trauma cross-table lateral cervical spine, lateral thoracic spine, and lateral thoracolumbar spine were collected on live patients using photostimulable storage phosphor plates, the Fuji FCR 9000 reader, and the Fuji AC-3 computed radiography reader. Two images were produced from a single exposure: one with standard image processing and the second with the standard process and the additional DRC algorithm. Both sets were printed from a Fuji LP 414 laser printer. Two different DRC algorithms were applied depending on which portion of the spine was not well visualized. One algorithm increased optical density and the second algorithm decreased optical density. The resultant image pairs were then reviewed by a panel of radiologists. Images produced with the additional DRC algorithm demonstrated improved visualization of previously 'under exposed' and 'over exposed' regions within the same image. Where lung field had previously obscured bony detail of the lateral thoracolumbar spine due to 'over exposure,' the image with the DRC applied to decrease the optical density allowed for easy visualization of the entire area of interest. For areas of the lateral cervical spine and lateral thoracic spine that typically have a low optical density value, the DRC algorithm used increased the optical density over that region, improving visualization of the C7-T2 and T11-L2 vertebral bodies, which is critical in trauma radiography. Emergency medicine physicians also reviewing the lateral cervical spine images were able to clear 37% of the DRC images compared to 30% of the non-DRC images for removal of the cervical collar. The DRC processed images reviewed by the physicians do not have a typical screen/film appearance; however, these different images were preferred for the three examinations in this study. This method of image processing, after being tested and accepted, is in use clinically at Georgetown University Medical Center Department of Radiology for the following examinations: cervical spine, lateral thoracic spine, lateral thoracolumbar examinations, facial bones, shoulder, sternum, feet and portable chest. Computed radiography imaging of the spine is improved with the addition of histogram equalization known as dynamic range control (DRC). More anatomical structures are visualized on a single hard copy display.
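
The general idea of dynamic range control by unsharp masking can be sketched in one dimension as follows. This is not the Fuji DRC implementation, whose details the abstract does not give; it is a generic illustration under stated assumptions: the unsharp mask is a simple moving average, and a correction proportional to how far the local mean falls below a threshold is added to the original signal. All names, the threshold, and the gain are hypothetical.

```python
def box_blur(signal, radius):
    """Moving-average low-pass filter (the 'unsharp mask'); windows shrink at the edges."""
    n = len(signal)
    blurred = []
    for i in range(n):
        window = signal[max(0, i - radius):min(n, i + radius + 1)]
        blurred.append(sum(window) / len(window))
    return blurred


def compress_dynamic_range(signal, radius=2, threshold=1.0, gain=0.5):
    """Raise values in regions whose local mean is below `threshold`.
    Fine detail is preserved because the correction depends only on the
    blurred signal, not on the pixel value itself."""
    blurred = box_blur(signal, radius)
    result = []
    for value, local_mean in zip(signal, blurred):
        boost = gain * (threshold - local_mean) if local_mean < threshold else 0.0
        result.append(value + boost)
    return result


# Hypothetical density profile: a low-density region followed by a well-exposed one.
profile = [0.4, 0.45, 0.4, 0.42, 0.4, 1.8, 1.9, 1.85, 1.9, 1.8]
processed = compress_dynamic_range(profile)
```

Mirroring the sign of the correction (lowering values where the local mean is above a threshold) would give the complementary variant that decreases optical density, corresponding to the second of the two algorithms mentioned in the abstract.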

Adaptive reference/test forced-choice method with application to fluoroscopy perception

Ping Xue, Kadri N. Jabri, David L. Wilson

We developed a new, interspersed, adaptive forced-choice method of general applicability, and used it to study perception in x-ray fluoroscopy. We measured detectability of low-contrast objects in noisy image sequences and determined x-ray dose levels for equivalent detectability of test (typically pulsed fluoroscopy at 15 acq/sec, hereafter called pulsed-15) as compared to reference (conventional fluoroscopy at 30 acq/sec, hereafter called pulsed-30). We interspersed reference and test in order to reduce effects of subject effort and attention. After 200 total presentations, we obtained absolute detectability of reference and test and an equivalent perception dose ratio (EPDR) for test as compared to reference. For this technically demanding application, implementation features such as real-time creation of noisy image frames and fast maximum-likelihood estimation of detectability were critical. We derived parameter uncertainties and proved applicability with Monte Carlo simulations and experiments. The interspersed, reference/test method lowered experimental standard deviations due to the removal of day-to-day variations in absolute detectability. Reliability of comparisons of subject response times was also improved. A variety of results in x-ray fluoroscopy has been obtained with this new method. Examples are a dose savings of pulsed-15 as compared to pulsed-30, a saturation of the detectability response as one increases the number of frames in the display loop, effects of temporal filtering, and effects of motion.

Al Hirschfeld's NINA as a prototype search task for studying perceptual error in radiology

Calvin F. Nodine, Harold L. Kundel

Artist Al Hirschfeld has been hiding the word NINA (his daughter's name) in line drawings of theatrical scenes that have appeared in the New York Times for over 50 years. This paper shows how Hirschfeld's search task of finding the name NINA in his drawings illustrates basic perceptual principles of detection, discrimination and decision-making commonly encountered in radiology search tasks. Hirschfeld's hiding of NINA is typically accomplished by camouflaging the letters of the name and blending them into scenic background details such as wisps of hair and folds of clothing. In a similar way, pulmonary nodules and breast lesions are camouflaged by anatomic features of the chest or breast image. Hirschfeld's hidden NINAs are sometimes missed because they are integrated into a Gestalt overview rather than differentiated from background features during focal scanning. This may be similar to overlooking an obvious nodule behind the heart in a chest x-ray image. Because it is a search game, Hirschfeld assigns a number to each drawing to indicate how many NINAs he has hidden so as not to frustrate his viewers. In the radiologists' task, the number of targets detected in a medical image is determined by combining perceptual input with probabilities generated from clinical history and viewing experience. Thus, in the absence of truth, searching for abnormalities in x-ray images creates opportunities for recognition and decision errors (e.g. false positives and false negatives). We illustrate how camouflage decreases the conspicuity of both artistic and radiographic targets, compare detection performance of radiologists with lay persons searching for NINAs, and show similarities and differences between scanning strategies of the two groups based on eye-position data.

Visual study of perceptually optimized displays

Richard L. Van Metter, Thomas E. Kocher

Perceptually linear displays have been proposed as a standard for medical imaging. Current displays (display driver/monitor) have intrinsic display characteristics that differ from this proposal. Visual comparisons of the proposed perceptually linear displays and current technology have not been made to date. The subjective assessment presented in this paper is the first such comparison. Clinical images were printed on a 12-bit laser printer to simulate the display characteristics of perceptually linear and currently available 8-bit medium-resolution gray-scale displays. Images were compared subjectively and by means of a 4-alternative forced choice (4-AFC) protocol. In addition, predictions of visible differences were made with Daly's Visible Differences Predictor model. We find that currently available displays can produce clinical images that are visually indistinguishable from those that would be displayed on a perceptually linear display when viewed at currently available monitor luminance levels (200 nits). Therefore, intrinsic display functions may be sufficiently close to perceptually optimized performance that the expense associated with the design and fabrication of special perceptually linear display cards and/or monitors would not be justified. In any case, substantial deviation from perceptual linearity may be tolerable before visible differences will be discerned as long as the image is correctly mapped to the appropriate display function. Further study of the diagnostic benefits claimed for perceptually linear displays would be prudent before human visual models are adopted as the basis for display standardization.

Multiscale adaptive method for blood vessel enhancement in x-ray angiography

Zhenyu Wu, Ming Fang, JianZhong Qian, et al.

The goal of this work is to provide a powerful computer-aided-perception tool for physicians to visualize low-contrast blood vessel structures with exquisite details and hence to facilitate the extraction of valuable diagnostic information from angiographic images. In x-ray angiography, blood vessels often exhibit low intensity contrast with respect to their surrounding soft tissues. The problem is particularly severe for fine vessel structures. A major challenge for enhancement is the ability to emphasize vessel structures without creating artifacts such as edge overshoot and noise magnification. In this work, a multi-scale adaptive contrast enhancement algorithm is developed. A pyramid of intensity images is generated using wavelet decomposition. At each pyramid level, an enhancement mask is computed which captures the fine vessel structures in the image at that scale. To generate this mask, we first compute a directionally sensitive Laplacian, which is capable of extracting fine lines with very low contrast to their surroundings. An adaptive non-linear weighting function is then applied to the Laplacian to form an enhancement mask. The non-linearity is crucial for virtually eliminating edge overshoots. These masks are then combined recursively to form a single composite mask of full resolution. Finally, the enhanced image is obtained by adding this composite mask to the original image. Extensive testing demonstrates remarkable contrast improvement in blood vessels without noticeable artifacts.
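
The overall structure described in this abstract (a multiscale decomposition, a Laplacian-type detail signal per scale, a non-linear weighting, and a composite mask added to the original) can be illustrated with a heavily simplified one-dimensional sketch. It is not the authors' algorithm. The assumptions are: repeated [1, 2, 1]/4 smoothing with widening tap spacing stands in for the wavelet pyramid, the detail signal is a plain (non-directional) negative Laplacian, and a fixed saturating function stands in for the adaptive non-linear weighting. All names and parameter values are hypothetical.

```python
def smooth(signal, step):
    """[1, 2, 1]/4 smoothing with taps `step` samples apart; indices are clamped at the edges."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(0, i - step)]
        right = signal[min(n - 1, i + step)]
        out.append((left + 2.0 * signal[i] + right) / 4.0)
    return out


def soft_limit(value, limit):
    """Saturating non-linearity: roughly linear for small inputs, bounded by +/- limit.
    Bounding the response at strong edges is what curbs overshoot."""
    return value / (1.0 + abs(value) / limit)


def enhance(signal, levels=3, gain=1.0, limit=0.2):
    """Accumulate a composite mask over scales and add it to the original signal."""
    mask = [0.0] * len(signal)
    current = list(signal)
    for level in range(levels):
        coarser = smooth(current, 2 ** level)
        for i in range(len(signal)):
            # current - coarser equals minus one quarter of the discrete
            # Laplacian of `current` at this scale.
            mask[i] += gain * soft_limit(current[i] - coarser[i], limit)
        current = coarser
    return [s + m for s, m in zip(signal, mask)]


# Hypothetical profile: a faint, thin "vessel" on a flat background.
vessel_profile = [0.0, 0.0, 0.0, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0]
enhanced_profile = enhance(vessel_profile)
```

In this sketch a faint thin structure is amplified almost linearly, while the contribution from any single scale can never exceed `gain * limit`, so strong edges receive a bounded correction.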
