Proceedings Volume 2179

Human Vision, Visual Processing, and Digital Display V

View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 1 May 1994
Contents: 8 Sessions, 42 Papers, 0 Presentations
Conference: IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology 1994
Volume Number: 2179

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Spatial Vision and Image Quality
  • Perceptually Based Image Quality and Compression
  • Model-Based Halftoning
  • 3-D Image Analysis and Perception
  • Perception and Data Visualization
  • Color Image Rendering and Enhancement
  • Color Constancy and the Interaction of Lights with Surfaces
  • Perceiving and Using Color
Spatial Vision and Image Quality
Inverting the perceptual transform
Images are filtered by the various stages of visual processing and undergo a perceptual transformation. To begin to quantify this process, the appearance of sinewave gratings, images with a sinusoidal luminance profile, was studied. The most salient feature of these images is that at high contrast they do not in general appear sinusoidal. A logarithmic transform, based on Fechner's law, is suggested as a first approximation to the perceptual transform in the luminance domain, and shown to characterize the apparent luminance profiles of sinewave gratings. Furthermore, the sinusoidal image profile can be pre-processed (inverse transformed) to create a new image which, when viewed, reverses to a large extent the perceptual distortions discussed above. This 'corrected' image would appear as if the combined effects of the perceptual transform had not acted, making the now non-sinusoidal profile conform much more closely to our symmetrical expectations of a sinusoidal appearance. The principle of inversion used here is suggested as a general way of testing the validity of some human vision models proposed to achieve better image processing algorithms based on visual perception.
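The central idea lends itself to a short numerical illustration. The sketch below is not the authors' code; it assumes a Fechner-style log transform as the perceptual transform, with invented luminance and contrast values, and shows how pre-processing with the inverse transform makes the transformed profile sinusoidal again:

```python
# A minimal numerical sketch (assumptions: Fechner-style log transform,
# invented luminance and contrast values; not the authors' code).
import numpy as np

L0 = 50.0                                    # assumed mean luminance (cd/m^2)
contrast = 0.8                               # assumed Michelson contrast
x = np.linspace(0.0, 2.0 * np.pi, 256)
grating = L0 * (1.0 + contrast * np.sin(x))  # physical sinewave luminance profile

def perceptual(L, L_ref=L0):
    """Fechner-style approximation: perceived magnitude ~ log luminance."""
    return np.log(L / L_ref)

def inverse_perceptual(p, L_ref=L0):
    """Inverse of the log transform."""
    return L_ref * np.exp(p)

# Under the log transform the sinewave's apparent profile is skewed (bright
# half-cycles are compressed relative to dark ones):
apparent = perceptual(grating)

# Pre-processing: inverse-transform the *desired* sinusoidal appearance so the
# assumed perceptual transform maps the corrected image back to a sinusoid.
desired_appearance = contrast * np.sin(x)
corrected_image = inverse_perceptual(desired_appearance)
assert np.allclose(perceptual(corrected_image), desired_appearance)
```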
Hierarchical model for early visual processing
In this paper a hierarchical model for early visual processing is presented and its applicability to image processing problems is demonstrated. The model consists of arrays of processing units, or neurons, arranged in layers that are retinotopically connected. The principal feature of this architecture is that neighboring neurons exert mutually inhibitory interactions only. It is shown that the proposed model can generate the center-surround (CS) and the orientation-selective (OS) receptive field profiles observed in the early parts of the mammalian visual system. Our study reveals that the receptive field sensitivity profile depends on the lateral extent of the inhibitory interactions. The OS receptive field requires a larger number of lateral inhibitory interactions than does a CS receptive field. Furthermore, it is found that the lateral extent of inhibition can be reduced by cascading CS layers with OS layers. Finally, the applicability of the model to feature detection and image enhancement is also investigated.
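As a rough illustration of how purely inhibitory neighbor interactions can yield a center-surround-like sensitivity profile, here is a minimal one-dimensional sketch; the layer size, inhibitory extent, and weight are assumptions, not values from the model:

```python
# Illustrative sketch (assumptions, not the authors' model): a single layer of
# units in which each unit receives its retinotopic input minus inhibition
# from its neighbors.  The effective receptive field of a unit is then
# excitatory at the center and inhibitory over the surround.
import numpy as np

n = 51                      # number of units in a 1-D layer
extent = 5                  # lateral extent of inhibition (in units)
w_inhib = 0.08              # assumed inhibitory weight

# Weight matrix: identity (feedforward drive) minus inhibition from neighbors
# within `extent` positions.
W = np.eye(n)
for i in range(n):
    for j in range(max(0, i - extent), min(n, i + extent + 1)):
        if j != i:
            W[i, j] -= w_inhib

# The receptive field of the central unit is the corresponding row of W:
# positive at the center, negative over the inhibitory surround.
rf_center = W[n // 2]
print(rf_center[n // 2 - extent - 2 : n // 2 + extent + 3])
```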
Spatially congruent model for the striate visual cortex
A new, spatially congruent model for the striate visual cortex (SVC) is proposed which accounts for some of the known functional and organizational properties of the superior mammalian SVC. Even though there is a broad consensus that the topographical representation of the visual field is one of the principal structuring principles underlying the SVC organization, the orientation maps in the SVC have often been described as non-topographical maps. In the present model, the adopted foot-of-normal representation of straight lines allows full congruency between the topographic map of the visual field and the orientation maps in the SVC. The proposed computational model includes three neural layers and assumes that the ocular dominance columns are already established at birth; three possible neural mechanisms leading to orientation encoding are outlined and discussed. The model provides a reasonable explanation for some of the most intriguing recently verified properties of the SVC, such as the increased neural activity at the cytochrome oxidase blobs, the reduced orientation selectivity at these same places, and the pinwheel-like organization of orientation selectivity in the SVC.
Study of transparency and coherence of motion in image sequences
Paulo Peixoto, Helder Araujo
As a consequence of the aperture problem, when a moving sine-grating is viewed through an aperture, the perceived motion is ambiguous. However, when we superimpose two of these gratings, the resulting pattern usually moves in a coherent way. It has been shown that the orientation, speed, contrast, and spatial frequency of each plaid are parameters that influence the perception of coherence. We have studied the stimulus conditions under which coherence does and does not occur, using human observers who assigned a probability of coherence to each sequence of images. We then tried to find a computational method whose result would correlate with the results from the human observers. The proposed method consists of computing the optical flow field of the image sequence and then using the direction of the optical flow at each point in the image to build a histogram. This histogram can provide information about the probability of coherence of the image sequences. We show that the proposed method provides results very similar to those obtained with the human observers in terms of probability of coherence.
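A minimal sketch of the direction-histogram idea (assuming the optical flow field has already been computed by some standard estimator; the coherence statistic below is an illustrative stand-in, not the authors' exact measure):

```python
# Illustrative sketch (not the authors' implementation): given a dense optical
# flow field, build a magnitude-weighted histogram of flow directions and use
# its concentration as a rough indicator of coherent versus transparent motion.
import numpy as np

def direction_histogram(flow_u, flow_v, n_bins=36):
    """Histogram of flow directions (degrees), weighted by flow magnitude."""
    angles = np.degrees(np.arctan2(flow_v, flow_u)) % 360.0
    magnitudes = np.hypot(flow_u, flow_v)
    hist, _ = np.histogram(angles, bins=n_bins, range=(0.0, 360.0),
                           weights=magnitudes)
    return hist / max(hist.sum(), 1e-12)

def coherence_score(hist):
    """A single dominant peak suggests coherent motion; two well-separated
    peaks suggest transparency.  Here we simply report the fraction of flow
    energy in the strongest bin and its two neighbors (an assumption, not the
    paper's statistic)."""
    k = int(np.argmax(hist))
    neighborhood = hist[[k - 1, k, (k + 1) % len(hist)]]
    return float(neighborhood.sum())

# Toy example: two superimposed motions at 0 and 90 degrees (transparent-like).
u = np.concatenate([np.ones(500), np.zeros(500)])
v = np.concatenate([np.zeros(500), np.ones(500)])
print(coherence_score(direction_histogram(u, v)))   # ~0.5: low coherence
```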
Correcting the adjacent pixel nonlinearity on video monitors
Q. James Hu, Stanley A. Klein
In raster-scan CRT display systems, the luminous flux of a given pixel is affected by the preceding pixel along the raster direction. This spatial or adjacent-pixel nonlinearity can adversely affect image quality. High-contrast, high-spatial-frequency regions of an image will have wrong luminances. A simple lookup table (standard gamma correction) cannot correct this nonlinearity. We measured the spatial nonlinearity under a variety of luminance conditions in two CRT displays. A model proposed by Mulligan and Stone was used in a five-parameter nonlinear regression to fit the data. Results show that the model fit our data very well. We employed a 2-D lookup table to compensate for the spatial nonlinearity. The new lookup table has two entries: the intended luminance of the current pixel and the actual voltage of the previous pixel. The output of the new lookup table is the adjusted voltage which compensates for the pixel interaction and gives the correct average luminance for that pixel. Psychophysical experiments show that at small pixel sizes (less than 0.8 min), the compensation results in a sharp, accurate image.
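The general scheme of a two-entry lookup table can be sketched as follows; the interaction model below is a toy stand-in for the fitted Mulligan and Stone model, and the parameter values are assumptions:

```python
# Illustrative sketch (not the authors' calibration code): a 2-D lookup table
# indexed by the desired luminance of the current pixel and the drive level
# already chosen for the preceding pixel.
import numpy as np

N_LEVELS = 256

def toy_pixel_model(v_current, v_previous):
    """Stand-in for the fitted interaction model: the luminance produced by
    drive level v_current is slightly depressed when the previous pixel was
    bright (all values normalized to [0, 1])."""
    gamma = 2.2
    interaction = 0.1                      # assumed interaction strength
    return (v_current ** gamma) * (1.0 - interaction * v_previous)

# Build the 2-D LUT: for each (target luminance, previous drive) pair, find
# the drive level whose modeled output is closest to the target.
drive = np.linspace(0.0, 1.0, N_LEVELS)
lut = np.zeros((N_LEVELS, N_LEVELS), dtype=np.uint8)
for j, v_prev in enumerate(drive):
    produced = toy_pixel_model(drive, v_prev)
    for i, target in enumerate(np.linspace(0.0, 1.0, N_LEVELS)):
        lut[i, j] = np.argmin(np.abs(produced - target))

def correct_row(target_luminances):
    """Scan a row left to right, correcting each pixel given the previous drive."""
    out = np.zeros(len(target_luminances), dtype=np.uint8)
    prev = 0
    for k, t in enumerate(target_luminances):
        i = int(round(t * (N_LEVELS - 1)))
        out[k] = lut[i, prev]
        prev = int(out[k])
    return out
```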
Gray-scale/resolution tradeoff
Spatial resolution and grayscale resolution are two image parameters that determine image quality. In this study we investigate the trade-off between spatial resolution and grayscale in terms of the discriminability of steps, measured in bits, away from a standard image. A CRT display was used to simulate black-and-white images with a square-pixel geometry. Natural images and a test pattern consisting of a radially symmetric spatial frequency chirp of increasing radial frequency (called a zone plate) were studied. Multiple versions of each image were produced by varying the simulated pixel size and the number of gray levels and by filtering. Discrimination thresholds for pixel size and number of gray levels were measured for several locations in the parameter space of spatial resolution and grayscale resolution for each image. Unfiltered, low-contrast, Nyquist-filtered, and Gaussian-filtered versions of the images were studied. Resolution levels were always integer divisors of the CRT display resolution, produced by subsampling and pixel-replication. Gray levels were steps that were linear in luminance and that spanned the entire CRT luminance range. Discrimination thresholds were measured using a three-alternative forced-choice one-up-two-down double-random-staircase procedure. Simulation device limitations caused some measurements to be less precise than was desired.
Uncertainty principle in human visual perception
Mikhael I. Trifonov, Dmitry A. Ugolev
The orthodox data concerning contrast sensitivity estimation for sine-wave gratings were formally analyzed. Our analysis yields a threshold energy value ΔE, an energetic equivalent of a quantum of perception: ΔE = αΔLΔX², where α is a proportionality coefficient, ΔL is the threshold luminance, and ΔX is the half-period of the grating. The value of ΔE is constant for a given mean luminance L of the grating and for the middle spatial frequency region. Thus an 'exchange' between the luminance threshold ΔL and the spatial resolution ΔX² takes place: an increase in one is accompanied by a decrease in the other. We treated this phenomenon as a principle of uncertainty in human visual perception and verified it for other spatial frequencies. Taking into account the threshold wavelength (Δλ) and time (Δt), the uncertainty principle may be extended to a wider class of visual perception problems, including the recognition of colored and flickering objects. We therefore suggest that the uncertainty principle proposed above is one of the cornerstones of the evolution of cognitive systems.
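A tiny worked check of the stated reciprocity (all numbers below are invented for the sketch, not taken from the paper):

```python
# If ΔE = α · ΔL · ΔX² is constant at a given mean luminance, then the
# threshold luminance ΔL and the squared half-period ΔX² trade off
# reciprocally.  Values here are arbitrary illustrative units.
alpha = 1.0                      # proportionality coefficient
delta_E = 2.0                    # assumed constant "quantum of perception"

for delta_X in (0.5, 1.0, 2.0, 4.0):            # half-period of the grating
    delta_L = delta_E / (alpha * delta_X ** 2)  # threshold luminance implied by ΔE
    print(f"ΔX = {delta_X:>4}  ->  ΔL = {delta_L:.4f}")
```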
Gray scale enhances display readability of bitmapped documents
Olov Ostberg, Dennis Disfors, Yingduo Feng
Bitmapped images of high resolution, say 300 dpi rastered documents, stored in the memory of a PC are at best only borderline readable on the PC's display screen (say a 72 dpi VGA monitor). Results from a series of exploratory psychophysical experiments, using the Adobe Photoshop software, show that readability can be significantly enhanced by making use of the monitor's capability to display shades of gray. It is suggested that such a gray-scale adaptation module should be bundled with all software products for electronic document management. In fact, fax modems are already available in which this principle is employed, thereby making it possible to read incoming fax documents directly on the screen.
Perceptually Based Image Quality and Compression
Global brightness contrast and the effect on perceptual image quality
Jacques A. J. Roufs, Victor J.F. Koselka, Arthur A.A.M. van Tongeren
The perceptual image quality of natural scenes as a function of the physical system parameter gamma has a definite optimum. This optimum is subject-independent and greater than 1, but was found to vary from one scene to another. If gamma is varied, brightness contrast is the most obviously changing perceptual attribute. Subjects appear to be able to make consistent, global judgments of brightness contrast in natural scenes, despite the fact that local brightness contrast may vary considerably. If scaled perceptual quality is plotted against scaled (perceived) brightness contrast, all curves coincide, suggesting that under the given conditions brightness contrast is the dominant psychological dimension of the perceptual image quality. Taking into account the grey-level distribution of the scene in combination with the luminance-reproduction function of the imaging chain, an effective gamma value can be defined. If scaled perceptual quality and global brightness contrast are plotted against this effective gamma, the differences between scenes disappear, although there are clear differences in the relative sizes of the light and dark parts of the various test scenes. An analysis by scaling global brightness of Gaussian blobs of randomly distributed sizes, modulation depths, and polarities shows that skewness of this distribution does indeed have only a weak effect over a considerable range. The data suggest that the ratio of maximum and minimum luminance determines global brightness contrast for complex scenes under these conditions.
Processing image sequences based on eye movements
Lew B. Stelmach, Wa James Tam
Subjects rated the subjective image quality of video sequences that were processed using gaze-contingent techniques. Gaze-contingent processing was implemented by adaptively varying image quality within each video field such that image quality was maximal in the region most likely to be viewed and was reduced in the periphery. This was accomplished by blurring the image or by introducing quantization artifacts. Results showed that provision of a gaze-contingent, high-resolution region had a modest beneficial effect on perceived image quality, compared to having a high-resolution region that was not gaze-contingent. Given the modest benefits and high cost of implementation, we conclude that gaze-contingent processing is not suitable for general purpose image processing.
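The blurring variant of this approach can be sketched as follows (an assumption about the general technique, not the authors' system; radius and blur strength are illustrative values):

```python
# Illustrative sketch: reduce image quality away from the point of gaze by
# blending a sharp and a blurred copy according to eccentricity.
import numpy as np
from scipy.ndimage import gaussian_filter

def gaze_contingent_blur(image, gaze_xy, sharp_radius=64.0, sigma=4.0):
    """image: 2-D float array; gaze_xy: (row, col) of the gaze point.
    Pixels within `sharp_radius` of the gaze point stay sharp; beyond it the
    blurred copy gradually takes over."""
    rows, cols = np.indices(image.shape)
    eccentricity = np.hypot(rows - gaze_xy[0], cols - gaze_xy[1])
    weight = np.clip((eccentricity - sharp_radius) / sharp_radius, 0.0, 1.0)
    blurred = gaussian_filter(image, sigma=sigma)
    return (1.0 - weight) * image + weight * blurred

# Toy usage on a random "frame".
frame = np.random.rand(240, 320)
processed = gaze_contingent_blur(frame, gaze_xy=(120, 160))
```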
Discrete cosine transform (DCT) basis function visibility: effects of viewing distance and contrast masking
Andrew B. Watson, Joshua A. Solomon, Albert J. Ahumada Jr., et al.
Several recent image compression standards rely upon the discrete cosine transform (DCT). Models of DCT basis function visibility can be used to design quantization matrices for arbitrary viewing conditions and images. Here we report new results on the effects of viewing distance and contrast masking on basis function visibility. We measured contrast detection thresholds for DCT basis functions at viewing distances yielding 16, 32, and 64 pixels/degree. Our detection model has been elaborated to incorporate the observed effects. We have also measured detection thresholds for individual basis functions when superimposed upon another basis function of the same or a different frequency. We find considerable masking between nearby DCT frequencies. A model for these masking effects also is presented.
Deblocking DCT compressed images
Albert J. Ahumada Jr., Rensheng Horng
Image compression based on quantizing the image in the discrete cosine transform (DCT) domain can generate blocky artifacts in the output image. It is possible to reduce these artifacts and rms error by correcting DCT domain measures of block edginess and image roughness, while restricting the DCT coefficient values to values that would have been quantized to those of the compressed image.
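A minimal sketch of one way to realize this idea (assumptions: a grayscale image, 8×8 blocks, and a known quantization table; the smoothing step and the half-step constraint are an illustrative stand-in, not the authors' exact measures of block edginess and roughness):

```python
# Illustrative sketch: smooth the decoded image to soften block edges, then
# project each 8x8 block's DCT coefficients back into the quantization
# intervals implied by the decoded image, so the result stays consistent with
# the transmitted data.
import numpy as np
from scipy.fftpack import dctn, idctn
from scipy.ndimage import gaussian_filter

B = 8  # JPEG-style block size

def deblock(decoded, q_table, sigma=1.0):
    """decoded: decoded grayscale image (height/width multiples of 8);
    q_table: 8x8 array of quantization step sizes; sigma: smoothing strength."""
    decoded = decoded.astype(float)
    smoothed = gaussian_filter(decoded, sigma=sigma)
    out = np.empty_like(smoothed)
    for r in range(0, decoded.shape[0], B):
        for c in range(0, decoded.shape[1], B):
            ref = dctn(decoded[r:r + B, c:c + B], norm='ortho')
            coeffs = dctn(smoothed[r:r + B, c:c + B], norm='ortho')
            # A coefficient may move at most half a quantization step away
            # from the decoded value, or it would decode to a different level.
            clipped = np.clip(coeffs, ref - q_table / 2.0, ref + q_table / 2.0)
            out[r:r + B, c:c + B] = idctn(clipped, norm='ortho')
    return out
```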
JPEG compliant encoder utilizing perceptually based quantization
The international JPEG (Joint Photographic Experts Group) standard for image compression deals with the compression of still images. It specifies the information contained in the compressed bit stream, and a decoder architecture that can reconstruct an image from the data in the bit stream. However, the exact implementation of the encoder is not standardized. The only requirement on the encoder is that it generate a compliant bit stream. This provides an opportunity to introduce new research results. The challenge in improving these standards-based codecs is to generate a compliant bit stream that produces an image perceptually equivalent to that of the baseline system at a higher compression ratio. This results in a lower encoded bit rate without perceptual loss in quality. The proposed encoder uses the perceptual model developed by Johnston and Safranek to determine, based on the input data, which coefficients are perceptually irrelevant. This information is used to remove (zero out) some coefficients before they are input to the quantizer block. This results in a larger percentage of zero codewords at the output of the quantizer, which reduces the entropy of the resulting codewords.
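The coefficient-zeroing step can be sketched as follows; the Johnston and Safranek perceptual model is replaced here by a simple stand-in threshold matrix, and the quantization table values are invented:

```python
# Illustrative sketch: coefficients whose magnitude falls below a
# per-frequency visibility threshold are zeroed before quantization, which
# increases the number of zero codewords without changing the bit-stream syntax.
import numpy as np
from scipy.fftpack import dctn

B = 8

def encode_block(block, q_table, threshold_table):
    """block: 8x8 pixel block; q_table: quantization steps;
    threshold_table: per-coefficient visibility thresholds (stand-in values)."""
    coeffs = dctn(block.astype(float), norm='ortho')
    coeffs[np.abs(coeffs) < threshold_table] = 0.0   # drop "invisible" coefficients
    return np.round(coeffs / q_table).astype(int)    # standard JPEG-style quantization

# Toy usage with invented tables.
rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(B, B))
q_table = np.full((B, B), 16.0)
threshold_table = 4.0 + 2.0 * (np.arange(B)[:, None] + np.arange(B)[None, :])
print(encode_block(block, q_table, threshold_table))
```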
Perceptual image distortion
Patrick C. Teo, David J. Heeger
In this paper, we present a perceptual distortion measure that predicts image integrity far better than mean-squared error. This perceptual distortion measure is based on a model of human visual processing that fits empirical measurements of: (1) the response properties of neurons in the primary visual cortex, and (2) the psychophysics of spatial pattern detection. We also illustrate the usefulness of the model in measuring perceptual distortion in real images.
Model-Based Halftoning
Graphic arts perspective on digital halftoning
Michael A. Rodriguez
Advances in the halftone process over the last century have served to continually improve color image reproduction. This development is approaching a mature phase where, with the help of digital electronics, we are formulating algorithms for optimal quality, consistency, and productivity. The problem definition is complex when all issues are considered so that, rather than delving into specific algorithms, this paper discusses the general technical considerations that have guided the development of electronic halftoning in the commercial graphic arts industry. A look at the advantages of a new strategy, stochastic screening, also is included.
Error diffusion with a more symmetric error distribution
In this paper a new error diffusion algorithm is presented that effectively eliminates the 'worm' artifacts appearing in the standard methods. The new algorithm processes each scanline of the image in two passes, a forward pass followed by a backward one. This enables the error made at one pixel to be propagated to all the 'future' pixels. A much more symmetric error distribution is achieved than that of the standard methods. The frequency response of the noise shaping filter associated with the new algorithm is mirror-symmetric in magnitude.
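The two-pass idea can be illustrated with the sketch below; the error weights and the way residual error is split between passes are invented for illustration and are not the weights proposed in the paper:

```python
# Illustrative sketch of bidirectional (forward-then-backward) scanline error
# diffusion, so that part of each pixel's quantization error can also reach
# pixels to its left, giving a more symmetric error distribution.
import numpy as np

def two_pass_error_diffusion(image):
    """image: 2-D array of gray values in [0, 1]; returns a binary halftone."""
    work = image.astype(float).copy()
    out = np.zeros_like(work)
    h, w = work.shape
    for y in range(h):
        # Forward pass: quantize, send half of the error right and a quarter
        # down, and keep the remaining quarter at this pixel.
        for x in range(w):
            out[y, x] = 1.0 if work[y, x] >= 0.5 else 0.0
            err = work[y, x] - out[y, x]
            if x + 1 < w:
                work[y, x + 1] += 0.5 * err
            if y + 1 < h:
                work[y + 1, x] += 0.25 * err
            work[y, x] -= 0.75 * err
        # Backward pass: re-threshold with the leftover error and push it
        # left and down.
        for x in range(w - 1, -1, -1):
            out[y, x] = 1.0 if work[y, x] >= 0.5 else 0.0
            err = work[y, x] - out[y, x]
            if x - 1 >= 0:
                work[y, x - 1] += 0.5 * err
            if y + 1 < h:
                work[y + 1, x] += 0.5 * err
    return out
```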
Threshold modulation in error diffusion on nonstandard rasters
Threshold modulation has been shown to produce strong effects in error diffusion images. In particular, a threshold modulation that is proportional to the input image induces edge enhancement in the output error-diffused image. In this paper, examples are shown for image-dependent threshold modulation incorporated into error diffusion implemented on a nonstandard raster, such as a serpentine raster or a space-filling curve. The amount and type of edge enhancement induced is shown to vary for different raster directions.
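A minimal sketch of input-proportional threshold modulation on a serpentine raster (the diffusion weights and modulation gain are assumptions, not the paper's):

```python
# Illustrative sketch: error diffusion on a serpentine raster where the
# threshold at each pixel is modulated by an amount proportional to the input
# value, which tends to enhance edges in the halftone.
import numpy as np

def serpentine_threshold_modulated_ed(image, k=0.3):
    """image: gray values in [0, 1]; k: modulation gain (illustrative)."""
    work = image.astype(float).copy()
    out = np.zeros_like(work)
    h, w = work.shape
    for y in range(h):
        xs = range(w) if y % 2 == 0 else range(w - 1, -1, -1)   # serpentine scan
        step = 1 if y % 2 == 0 else -1
        for x in xs:
            threshold = 0.5 + k * (image[y, x] - 0.5)           # input-dependent threshold
            out[y, x] = 1.0 if work[y, x] >= threshold else 0.0
            err = work[y, x] - out[y, x]
            if 0 <= x + step < w:
                work[y, x + step] += 0.5 * err                  # ahead on this row
            if y + 1 < h:
                work[y + 1, x] += 0.5 * err                     # directly below
    return out
```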
Nonlinear detail enhancement of error-diffused images
Stefan Thurnhofer, Sanjit K. Mitra
Error diffusion distributes the quantization error over the local neighborhood in such a way that the local mean of the halftoned image is adjusted to the local mean of the original image. This works fairly well in homogeneous portions of the image. But along edges and in areas with fine texture and details, spreading the error out over some pixels inevitably blurs the image locally and obscures details. In this paper, a variation of the standard error diffusion algorithm is proposed using a nonlinear filter for feature extraction. With this filter the error diffusion process can be adapted to the local image characteristics in such a way that the quantization error is not diffused across edges. This leads to sharper halftoned images with more detail information. It also increases the local image contrast which results in an overall perceptually more pleasing rendering of the halftoned image.
Blue noise and model-based halftoning
Mark A. Schulze, Thrasyvoulos N. Pappas
'Model-based' halftoning techniques use models of visual perception and printing to produce high quality images using standard laser printers. Blue-noise screening is a dispersed-dot ordered dither technique that attempts to approximate the performance of error diffusion with much faster execution time. We use printer and visual system models to improve the design of blue-noise screens using the 'void-and-cluster' method. We show that, even with these improvements, the performance of blue-noise screens does not match that of the model-based techniques. We show that using blue-noise screened images as the starting point of the least-squares model-based (LSMB) algorithm results in halftones inferior to those obtained with modified error diffusion (MED) starting points. We also use simulated annealing to try to find the global optimum of the least-squares problem. Images found this way do not have significantly lower error than those resulting from the simple iterative LSMB technique starting with MED. This result indicates that the simple iterative LSMB algorithm with a MED starting point yields a solution close to the globally optimal solution of the least-squares problem.
Applications of fractal analysis in the evaluation of halftoning algorithms and a fractal-based halftoning scheme
Theophano Mitsa, Jennifer R. Alford
Fractals are mathematical sets that model many natural phenomena and physical objects such as clouds, mountains, trees, and coastlines. The principal features of fractal objects are: (1) a large degree of heterogeneity, (2) scaling similarity over many scales of observation, and (3) the lack of a well-defined (or characteristic) scale. In this paper, we investigate the applications of fractal analysis in halftoning. Specifically, we first investigate and compare the fractal properties of aperiodic constant-gray-level halftone patterns produced by error diffusion, the blue-noise mask, and white noise. Then, given that the fractal dimension of an image area can predict the perceived smoothness or roughness of this area's texture, we describe an error diffusion scheme where the error weights depend on the local fractal dimension of the gray scale image prior to halftoning. The resulting halftones have less grainy flat areas and sharper edges than standard error diffusion with perturbed weights. This can be attributed to the incorporation of texture information in the halftoning scheme, which distributes the halftoning error in proportion to local texture roughness and therefore diffuses it in the areas where it is least visible.
Least-squares model-based video halftoning
Dennis P. Hilgenberg, Thomas J. Flohr, Clayton Brian Atkins, et al.
A technique for halftoning video sequences is presented. First, an error metric is designed, incorporating psychophysical spatio-temporal contrast sensitivity data for the human visual system. An efficient strategy for producing halftone sequences that minimize the error is then developed. The technique is compared with video halftoning algorithms that do not consider the temporal aspect of the human visual system.
3-D Image Analysis and Perception
Shape from texture: an evaluation of visual cues
Wolfgang Mueller, Axel Hildebrand
In this paper an integrated approach is presented to understand and control the influence of texture on shape perception. Following Gibson's hypothesis, which states that texture is a mathematically and psychologically sufficient stimulus for surface perception, we evaluate different perceptual cues. Starting from a perception-based texture classification introduced by Tamura et al., we build up a uniformly sampled parameter space. For the synthesis of some of our textures we use the texture description language HiLDTe. To acquire the desired texture specification we take advantage of a genetic algorithm. Employing these textures, we conduct a number of psychological tests to evaluate the significance of the different texture features. The results derived from the psychological tests are then used to constitute new shape-analysis techniques. Since the vanishing point appears to be an important visual cue, we introduce the Hough transform. An outlook on future work within the field of visual computing is provided in the final section.
Investigations of three-dimensional shape perception for telepresence using superquadric primitives
Roger A. Browse, James C. Rodger
Graphic displays for virtual environments and telerobotics require effective communication of the details of 3-D object shape. This paper presents empirical evidence on the relation between human perception and several properties of graphic shape depiction. A series of experiments examined a 3-D shape discrimination task requiring judgments of superquadric volume primitives varying in shape within different rendering and display conditions. The displays were dynamic, with constant rotational motion. Over the series of experiments, the contributions of diffuse and specular shading, occluding contour, aspect ratio, and covarying size were evaluated. The results revealed a consistent sensitivity to differences in superquadric shape parameters that was surprisingly robust over rendering variations. One major finding was that the presence of specular highlights did not enhance shape discrimination performance beyond that observed for purely diffuse reflectance. The results suggest strategies for optimizing interface properties where 3-D shape is a primary component of the display. They also support the use of superquadric primitives in situations where humans interact with shape display systems.
CUBICORT: simulation of the visual cortical system for 3D image analysis, synthesis, and hypercompression for digital TV, HDTV, and multimedia
Pascal Leray, F. Guyot, Patrick Marchal, et al.
We describe simulation elements of a new kind of 3D vision simulator for preprocessing objects and movement analysis in 3D, using the biological concept of the cortical column paradigm in the visual area. The target simulator is primarily dedicated to ultra-high image compression for the telecommunication of digital TV images (MPEG4), HDTV, and 3D TV, but can also be used for automatic modeling, digitizing, robotics, and image synthesis. This simulator extracts 3D objects and movements by using the properties of hypercolumns within the visual cortex for spatio-temporal pyramidal filtering and learning, and performs inter- and intra-cooperation between these simulated hypercolumns. The simulation process has four levels for analysis-synthesis: pixels, zones, objects, and labels. Final synthesis (reconstruction) is processed by reverse filtering, using non-orthogonal basis filters. Substantial gains in compression ratio have been estimated using this algorithm as a whole, or partially, with integrated VLSI.
Subjective image position in stereoscopic TV systems: considerations on comfortable stereoscopic images
Tetsuo Mitsuhashi
In stereoscopic picture systems, the image should be reproduced in a psychologically proper position that allows comfortable stereoscopic perception. Position matching in observer space between the stereoscopic image and a small marker was successfully adopted to measure the subjective image position. The subjective image position was measured for geometric figures and actual scenes while varying the resolution, contrast, brightness, and color between the L- and R-pictures. The image moves closer to the screen in proportion to the increase in resolution difference. A pair of color and darker monochrome pictures can reproduce acceptable stereoscopic images without conspicuous color rivalry. Mental fatigue is also an important factor in comfortable stereoscopic image viewing. The CFF is discussed together with other factors such as accommodation characteristics, picture quality, and viewing conditions that relate to future stereoscopic TV systems that allow comfortable viewing.
Perception and Data Visualization
Visual inspection of data: does the eyeball fit the trend?
Gene S. Fisch, Anthony F. Porto
Graphs are important tools for conveying quantitative information. However, studies have shown that visual inspection of data may not be very reliable. Therefore, it is essential to identify factors that affect visual inspection. We sought to determine how subjects' accuracy in visual inspection was affected when they constructed best-fitting lines to point-to-point functions generated by lag-one autoregressive equations. Subjects were tested, then retested after completing a course on experimental methodology. When asked to identify treatment effects and/or trends, subjects' mean correct responses were low (36%). On retest, their correct response rates showed little change. Mean correct responses for best-fitting lines were much higher (60%), and increased on retest (71%). The disparity between correct responses for identifying treatments and/or trends and for constructing best-fitting lines was in part due to parameters in the autoregression equation. Test-retest reliability was also better for best-fitting functions than for descriptive, multiple-choice responses. Subjects continue to demonstrate difficulty in separating trends from treatment effects.
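For concreteness, the stimuli and the reference fit can be sketched as follows; the parameterization (a lag-one autoregressive series around a linear trend) and all parameter values are assumptions, not those used in the study:

```python
# Illustrative sketch: generate an AR(1)-style point-to-point series and
# compute the least-squares best-fitting line against which an "eyeballed"
# trend line could be compared.
import numpy as np

def ar1_series(n=30, phi=0.6, trend=0.05, noise_sd=1.0, seed=0):
    """y[t] = trend*t + phi*(y[t-1] - trend*(t-1)) + noise."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = trend * t + phi * (y[t - 1] - trend * (t - 1)) + rng.normal(0, noise_sd)
    return y

y = ar1_series()
t = np.arange(len(y))
slope, intercept = np.polyfit(t, y, deg=1)       # least-squares best-fitting line
print(f"fitted slope = {slope:.3f} (true trend used in simulation: 0.05)")
```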
Interactive exploration of multidimensional data
David A. Rabenhorst
This paper describes a system called Diamond for interactive exploration of multidimensional data. Diamond takes advantage of human pattern recognition and processing capacity, and puts major emphasis on performance and responsiveness. This creates a highly productive symbiosis between the human and the system. The basic philosophy of Diamond is to depict the data with pictures, and to help the user manipulate the pictures to rapidly gain insight. Examples of some of the data visualizations the system provides are histograms, 2- and 3-dimensional scatter plots, parametric snake plots, parallel coordinate plots, and novel fractal foam plots. Numerous data presentations can be simultaneously displayed, and easily compared and contrasted. New data visualizations can be dynamically created at any time by making selections among the currently existing ones. The formulation and testing of hypotheses by the human user is expedited by permitting classifications and transformations upon the data from whichever perspective is convenient or interesting.
Using perceptual rules in interactive visualization
Bernice E. Rogowitz, Lloyd A. Treinish
In visualization, data are represented as variations in grayscale, hue, shape, and texture. They can be mapped to lines, surfaces, and glyphs, and can be represented statically or in animation. In modern visualization systems, the choices for representing data seem unlimited. This is both a blessing and a curse, however, since the visual impression created by the visualization depends critically on which dimensions are selected for representing the data (Bertin, 1967; Tufte, 1983; Cleveland, 1991). In modern visualization systems, the user can interactively select many different mapping and representation operations, and can interactively select processing operations (e.g., applying a color map), realization operations (e.g., generating geometric structures such as contours or streamlines), and rendering operations (e.g., shading or ray-tracing). The user can, for example, map data to a color map, then apply contour lines, then shift the viewing angle, then change the color map again, etc. In many systems, the user can vary the choices for each operation, selecting, for example, particular color maps, contour characteristics, and shading techniques. The hope is that this process will eventually converge on a visual representation which expresses the structure of the data and effectively communicates its message in a way that meets the user's goals. Sometimes, however, it results in visual representations which are confusing, misleading, and garish.
Color Image Rendering and Enhancement
Empirical assessment of selected color-quantizing algorithms
Susy S. Chan, Rosalee Nerheim-Wolfe
We develop an empirical approach for evaluating the performance of color-quantizing algorithms in controlled laboratory experiments. We chose a 4 × 4 × 2 factorial experimental design to examine the performance of four algorithms at four palette sizes in two arrangements. The independent variables are algorithm, palette size, and arrangement, and the dependent variables are perceived color fidelity, contrast, and overall quality. Subjects observed 128 quantized images on slides. We find significant between-subject effects of palette size and algorithms on all dependent variables, and observe consistent patterns of significant palette size effects. The 16-color palette results in the poorest perceived image quality, and the 64-color palette is poorer than the 256-color palette. However, there is homogeneity in performance between 64 and 128 colors, and between 128 colors and 256 colors. The effect of algorithm is less consistent.
Model-based color image sequence quantization
Clayton Brian Atkins, Thomas J. Flohr, Dennis P. Hilgenberg, et al.
We investigate the display of color image sequences using a model-based approach to multilevel error diffusion. We extend Bouman and Kolpatzik's technique for design of optimal filters to the temporal dimension. Our model for the human visual system accounts for the spatial and temporal frequency dependence of the contrast sensitivity of the luminance and chrominance channels. We observe an improvement in image quality over that yielded by frame-independent quantization, when the frame rate is sufficiently high to support temporal averaging by the human visual system.
Modeling of screened color prints through singular value decomposition
Mikael Wedin, Bjoern Kruse
To model the complex printing process and to be able to study the behavior of the ink together with the paper is of vital importance for improving printing quality. The properties of the total printing transfer function are studied using test prints and signal analysis of the scanned data. Analyzing the color distribution of the different screen cell densities makes it possible to characterize the printing configuration. Mechanical dotgain is studied with a normalized screen cell color distribution. The normalization is based on the colorimetrical center of gravity for the tint and the paper of each sampled density. The dot area of each screen cell is determined by assigning a transition area between the tint and the paper background. The transition area is dependent on ink type, properties of the paper, and the printing process. The model of the colorimetric variation of the screen cell is made through dyadic decomposition of the calculated singular value decomposition (SVD) and the model is used to determine the mechanical dotgain for densities not available in the test ensemble. The results are then used to take the mechanical dotgain into account prior to printing.
Color palette restoration
Barbara E. Schmitz, Robert L. Stevenson
When designing hardware, it is often desirable to represent images as economically as possible. Due to this, algorithms have been developed to create reduced palette images. Much better viewing results can be obtained by first reconstructing a full color image from the reduced palette image. This creates a need for a palette restoration algorithm. This paper develops an algorithm to reconstruct high resolution color image data from reduced color palette images. The algorithm is based on stochastic regularization using a non-Gaussian Markov random field model for the image data. This results in a constrained optimization algorithm that is solved using an iterative constrained gradient descent computational algorithm. During each iteration the potential update must be projected onto the constraint space. In this paper a projection operator that maps a vector onto a quantized constraint space is developed. Results of the proposed palette restoration algorithm have indicated that it is effective for the reconstruction of palettized images. Quantitative as well as visual results of the experiments are presented.
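The projection step can be sketched as follows; this is a simplified stand-in for the paper's projection operator onto the quantized constraint space (violating pixels are simply pulled back toward their assigned palette color until re-quantization agrees), with all details assumed:

```python
# Illustrative sketch: during constrained gradient descent, each updated pixel
# must still quantize to its original palette entry.  Pixels that would now
# quantize differently are moved back toward their assigned palette color.
import numpy as np

def project_to_constraint(candidate, assigned_idx, palette, steps=8):
    """candidate: (H, W, 3) float image; assigned_idx: (H, W) palette indices
    from the reduced-palette image; palette: (K, 3) palette colors."""
    out = candidate.copy()
    assigned = palette[assigned_idx]                         # (H, W, 3) target colors
    for _ in range(steps):
        # Which palette entry is currently nearest to each pixel?
        d = np.linalg.norm(out[..., None, :] - palette[None, None, :, :], axis=-1)
        nearest = np.argmin(d, axis=-1)
        violating = nearest != assigned_idx
        if not violating.any():
            break
        # Move violating pixels halfway toward their assigned palette color.
        out[violating] = 0.5 * (out[violating] + assigned[violating])
    # Any stragglers are snapped exactly to the assigned color.
    d = np.linalg.norm(out[..., None, :] - palette[None, None, :, :], axis=-1)
    still_bad = np.argmin(d, axis=-1) != assigned_idx
    out[still_bad] = assigned[still_bad]
    return out
```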
Color Constancy and the Interaction of Lights with Surfaces
Color perception under various light sources using spectral numerical models
Richard Mitanchey, Marc R. Fontoynont
A numerical model with high spectral sampling density is proposed for accurate colorimetric calculations in geometrically complex architectural spaces. The main applications are oriented toward lighting engineering and computer graphics, solving the color appearance match between the rendering and the actual space with a color perception model derived from brightness-to-luminance relations. Other color information (correlated color temperatures and color rendering indices) available on all surfaces helps the lighting designer appreciate complex color radiation problems in architectural spaces.
Color constancy and a changing illumination
Graham D. Finlayson
The color constancy problem has proven to be very hard to solve. This is even true in the simple Mondriaan world where a planar patchwork of matte surfaces is viewed under a single illuminant. In this paper we consider the color constancy problem given two images of a Mondriaan viewed under different illuminants. We show that if surface reflectances are well modeled by 3 basis functions and illuminants by up to 5 basis functions then we can, theoretically, solve for color constancy given 3 surfaces viewed under 2 illuminants. The number of recoverable dimensions in the illuminant depends on the spectral characteristics of the sensors. Specifically, if for a given sensor set a von Kries type, diagonal model of color constancy is sufficient, then we show that at most 2 illuminant parameters can be retrieved. Recent work has demonstrated that for the human visual system a diagonal matrix is a good model of color constancy given an appropriate choice of sensor basis. We might predict, therefore, that we can recover at most 2 illuminant parameters. We present simulations which indicate that this is in fact the case.
Bayesian method for recovering surface and illuminant properties from photosensor responses
David H. Brainard, William T. Freeman
The goal of computational color constancy is to recover the physical properties of illuminants and surfaces from photosensor responses. We formulate computational color constancy as a statistical estimation problem. We assume that the likelihood that any particular illuminant or surface will occur in a scene is governed by a prior probability distribution. In particular, we assume that illuminant spectral power distributions are drawn according to a multivariate normal distribution over the weights of a finite-dimensional linear model, and similarly for surface reflectance functions. Given a set of photosensor responses, Bayes' rule may be applied to derive the posterior distribution for illuminants and surfaces. We discuss how to use the posterior to estimate the illuminant. We use simulation to compare the performance of a Bayesian algorithm to that of two previously reported color constancy algorithms. For our simulation conditions, the Bayesian algorithm results in the smallest expected estimation error.
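The flavor of the Bayesian estimate can be shown in a deliberately simplified setting (known surface reflectance, Gaussian prior on illuminant weights, Gaussian sensor noise); the matrices below are random stand-ins, not the paper's linear models, and the closed-form posterior applies only under these simplifying assumptions:

```python
# Illustrative sketch: with a linear map from illuminant weights to sensor
# responses and Gaussian prior/noise, the posterior over illuminant weights is
# Gaussian and its mean has a standard closed form.
import numpy as np

rng = np.random.default_rng(1)
n_sensors, n_illum_basis = 3, 3

A = rng.normal(size=(n_sensors, n_illum_basis))   # weights -> sensor responses (stand-in)
prior_mean = np.zeros(n_illum_basis)
prior_cov = np.eye(n_illum_basis)
noise_cov = 0.01 * np.eye(n_sensors)

true_w = rng.normal(size=n_illum_basis)
r = A @ true_w + rng.multivariate_normal(np.zeros(n_sensors), noise_cov)

# Posterior mean of the illuminant weights given responses r
# (standard linear-Gaussian update).
post_prec = np.linalg.inv(prior_cov) + A.T @ np.linalg.inv(noise_cov) @ A
post_mean = np.linalg.solve(post_prec,
                            np.linalg.inv(prior_cov) @ prior_mean
                            + A.T @ np.linalg.inv(noise_cov) @ r)
print("true weights:", np.round(true_w, 3))
print("posterior mean estimate:", np.round(post_mean, 3))
```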
Perceiving and Using Color
Use of a model of the human visual system to determine the attributes of aperture colors: hue, brightness, and saturation
Robert Siminoff
A previously developed model of the human fovea is modified for analysis of colored stimuli. Psychophysical data on perception of aperture colors are used as a guide for the development of color perception by the model. The channels carrying information about color are represented by the summed responses of midget C- and L-type cells. The spectral energy distribution of any colored stimulus produces within the cell types a unique pattern of activities from which the amounts of each of the hues, whiteness, brightness, and saturation can be determined. The present analysis is restricted to the blue-cone region of the parafovea, which surrounds the blue-cone-free central fovea. For aberration-free dispersion and for no or little self-adaptation, the psychophysical and model data are in good agreement as to the 3 attributes hue, brightness, and saturation. As a result of univariance, information as to the spectral distribution of the stimulus cannot be used, and this creates a problem as to how to normalize colored stimuli to a common set of standards. A method based on normalization of the outputs of the retinal cells to a common white is presented.
Separation of luminance and chromatic information by Hebb-Stent rules
William McIlhagga, Graeme Cole
The P cell pathway in primates is involved in both luminance and chromatic perception. This correlates well with P cell receptive fields, which are both spatially and chromatically opponent. Since, however, the luminance and chromatic channels found in psychophysics are independent, the mixed luminance and chromatic information in P cell signals must be demultiplexed in cortex. We have examined the ability of an unsupervised neural network to demultiplex P cell signals, using realistic visual inputs. Digitized images, corrected to be statistically similar to retinal images, were sampled by a simulated retinal mosaic, and filtered by difference-of-Gaussians P cell receptive fields. The simulated P cell signals were used as inputs to a network designed to maximize unit responses while minimizing the correlation between units. After a period of training, we evaluated the receptive fields formed in the network. The neurons clearly fell into two categories. The first were those sensitive to changes in intensity in the retinal image; that is, luminance selective units. The second were those sensitive to a color difference in the retinal image; that is, chromatically selective units.
Compression of digital color images based on the model of spatiochromatic multiplexing of human vision
Eugenio Martinez-Uriegas, John D. Peters, Hewitt D. Crane
SRI International has developed a new technique for compression of digital color images on the basis of its research of multiplexing processes in human color vision. The technique can be used independently, or in combination with standard JPEG or any other monochrome procedure, to produce color image compression systems that are simpler than conventional implementations. Specific applications are currently being developed within four areas: (1) simplification of processing in systems that compress RGB digital images, (2) economic upgrading of black and white image capturing systems to full color, (3) triplication of spatial resolution of high-end image capturing systems currently designed for 3-plane color capture, and (4) even greater simplification of processing in systems for dynamic images.
Color scene analysis
This paper describes a color scene analysis method for the object surfaces appearing in the noisy and imperfect images of natural scenes. It is developed based on the spatial and spectral grouping property of the human visual system. The uniformly colored surfaces are recognized by their monomodal 3-D color distributions and extracted in the spatial domain using the lightness and chromaticity network of the Munsell system. The textured image regions are identified by their irregular histogram distributions and isolated in the image plane using the Julesz connectivity detection rules. The method is applied to various color images corrupted by noise and degraded heavily by under-sampling and low color-contrast imperfections. The method was able to detect all the uniformly colored and heavily textured object areas in these images.
Adaptive quality improvement method for color images
Akira Inoue, Johji Tajima
This paper proposes a new method for making automatic improvements in digital color image quality. With this method, images obtained either with a video camera or by scanning photographs are automatically enhanced on a display, and no prior knowledge is required with regard to the image data distortions introduced by optical scanners, cameras, or transmission systems. In this paper, the feature values of three important factors for color image quality -- image sharpness, contrast, and saturation -- are defined based on human visual perception, and algorithms used for image enhancement are introduced. The method has been experimentally applied with excellent results to the enhancement of images obtained by scanning photographs: quality improvements achieved automatically were 70% of those achieved manually by human subjects. This simple, effective method appears to hold great promise for application to a variety of imaging devices.
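To make the notion of feature values concrete, here is a minimal sketch; the definitions below are simple stand-ins chosen for illustration, not the feature values defined in the paper:

```python
# Illustrative sketch: scalar feature values for sharpness, contrast, and
# saturation that an automatic enhancement step could drive toward targets.
import numpy as np

def feature_values(rgb):
    """rgb: (H, W, 3) float image with values in [0, 1]."""
    luminance = rgb @ np.array([0.299, 0.587, 0.114])
    # Sharpness: mean gradient magnitude of the luminance channel.
    gy, gx = np.gradient(luminance)
    sharpness = float(np.mean(np.hypot(gx, gy)))
    # Contrast: standard deviation of luminance.
    contrast = float(np.std(luminance))
    # Saturation: mean of (max - min) over the color channels.
    saturation = float(np.mean(rgb.max(axis=-1) - rgb.min(axis=-1)))
    return {"sharpness": sharpness, "contrast": contrast, "saturation": saturation}

# Toy usage.
print(feature_values(np.random.rand(64, 64, 3)))
```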
Adaptive learning systems and qualitative manipulation of digital imagery
Recent developments in adaptive learning systems allow quantifying of a user's qualitative aesthetics and provide an alternative to more traditional approaches to image manipulation. Image enhancement or other desired manipulations can be thought of as nonlinear transformations from an input space of arbitrary images into an output space of desired aesthetic images. Derivation of imaging manipulations of this type can be cast as supervised learning problems. Approaches to reduce the dimensionality of the transformations described above are highly desirable. One approach is to define transformations through more structured descriptors than raw image pixels. Transformations are then learned between sets of image metrics as opposed to sets of image pixels. Adaptive neural networks can be used to learn arbitrary imaging transformations from example images. An alternative approach that is functionally equivalent is to use an adaptive fuzzy logic controller. Fuzzy logic can be thought of as a linguistically understandable meta-representation of an underlying functional transformation. Fuzzy logic also provides a possible link between semantic labeling of qualitative image characteristics and the underlying raw image data.
How to identify up to 30 colors without training: color concept retrieval by free color naming
Gunilla A. M. Derefeldt, Tiina Swartling
Used as a redundant code, color is shown to be advantageous in visual search tasks. It enhances attention, detection, and recall of information. Neuropsychological and neurophysiological findings have shown color and spatial perception to be interrelated functions. Studies on eye movements show that colored symbols are easier to detect and that eye fixations are more correctly directed to color-coded symbols. Usually between 5 and 15 colors have been found useful in classification tasks, but this umber can be increased to between 20 to 30 by careful selection of colors, and by a subject's practice with the identification task and familiarity with the particular colors. Recent neurophysiological findings concerning the language-concept connection in color suggest that color concept retrieval would be enhanced by free color naming or by the use of natural associations between color concepts and color words. To test this hypothesis, we had subjects give their own free associations to a set of 35 colors presented on a display. They were able to identify as many as 30 colors without training.