Human spatial processing accounts for dynamic range and color
Silver halide photography has a fixed response to scenes. Film counts photons, and the accumulated count determines its optical density. Both the range of the film's sensitivity to light and its optical density are set in the factory where the film is made. Digital imaging, however, makes it possible to render wider ranges of light intensity, and thereby to reproduce more closely the variations in illumination found within scenes. Better imaging hardware can extend the range of light sensors and displays and increase the number of digital quantization levels, so it should also be possible to make more accurate reproductions over a greater range of light intensities. Raw formats in today's commercial cameras already provide more bits per pixel.1 A sensor's dynamic range is the ratio of the light that causes its maximum response to the light that causes its minimum response. Active pixel processing has led to techniques that can increase a sensor's dynamic range to a ratio of 10¹⁰ to 1.2–4 For displays, Seetzen et al. used an ‘unsharp mask’ technique to illuminate a high-resolution liquid-crystal display with an out-of-focus, modulated array of LED illuminators,5 making digital high-dynamic-range (HDR) displays with a ratio of maximum to minimum light emission of 10⁵ to 1 and a 1500 cd/m² maximum screen luminance.6 As we start to incorporate these techniques in imaging, the question becomes: how much of the sensor's 10¹⁰ range and the display's 10⁵ range can we actually use?
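The ratios quoted in the text can be restated in log units or in stops (bits), which is how the rest of the article counts range. A minimal sketch of that conversion (the function name is mine, not from the article):

```python
import math

def dynamic_range(max_response, min_response):
    """Return a sensor's dynamic range as a ratio, in log10 units, and in stops (log2)."""
    ratio = max_response / min_response
    return ratio, math.log10(ratio), math.log2(ratio)

# The figures quoted in the text: an active-pixel sensor at 10^10 to 1,
# and an HDR display at 10^5 to 1.
sensor_ratio, sensor_log, sensor_stops = dynamic_range(1e10, 1.0)
display_ratio, display_log, display_stops = dynamic_range(1e5, 1.0)
print(sensor_log, display_log)  # 10.0 5.0 — i.e., 10 and 5 log units
```

The same conversion shows why raw formats matter: a 10⁵-to-1 display spans about 16.6 stops, beyond what 8-bit quantization can represent directly.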
In 1941, Jones and Condit measured the range of light in 128 outdoor scenes and the camera responses to them.7 They also measured the range on each camera's film plane, and showed that optical veiling glare (scattered light), not the sensor's signal-to-noise ratio, sets the usable dynamic range of cameras. They designed negatives that captured all the information in any scene above the glare limit (roughly 4 log units). However, not all of this information was accessible in a print: high-contrast print paper caused loss of scene detail. Recent measurements of the veiling glare limits in cameras and in human vision also show strong scene dependency. Camera optics add an unwanted fraction of the scene's light to the image of the scene on the sensor.8 This unwanted glare is present in all multiple exposures, and makes calculations of scene radiance unreliable.9
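The effect of that unwanted fraction of scene light can be sketched with a deliberately crude toy model. Here I assume (this is my simplification, not Jones and Condit's measurement) that the optics add a uniform veil equal to a fixed fraction of the scene's mean luminance to every image point:

```python
import math

def add_glare(luminances, glare_fraction=0.01):
    # Assumed toy model: a uniform veil, proportional to the scene's
    # mean luminance, is added to every point of the image.
    veil = glare_fraction * sum(luminances) / len(luminances)
    return [lum + veil for lum in luminances]

scene = [1e-4, 1e-2, 1.0]                 # 4 log units of scene range
image = add_glare(scene)
usable = math.log10(max(image) / min(image))
print(round(usable, 2))                   # ≈ 2.46 — well below the scene's 4 log units
```

Even a 1% veil floods the deepest shadows, so the recorded range collapses from 4 to roughly 2.5 log units; no increase in sensor bit depth recovers what the optics have already mixed away.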
Further, intraocular glare limits the range of light falling on the retina.10–12 As with cameras, the range of usable light for humans varies with the scene content: with a maximum white background it is about 1.5 log units; with a 50% white background around 2 log units; and with an unlit background slightly over 4 log units. Optical veiling glare limits the range of scene radiances cameras can capture accurately, whereas intraocular glare limits the range of light from the scene that falls on the retina.
Nevertheless, HDR imaging dramatically improves scene rendering. There must be reasons, other than accurate scene rendition, that account for the success of HDR imaging. Two important but paradoxical observations help to explain this. First, artists since the Renaissance have rendered HDR scenes within the low dynamic range of paintings. Outdoor scenes with sun and shade have 1000 to 1 ranges of light, while oil paints have a range of only 30 to 1. Second, humans see over 4 log units of detail in shadows using optic nerves that can transmit only 2 log units of range.13
We have found that these paradoxes can be resolved by considering how human spatial image processing counteracts the effects of glare. Humans have two independent and opposing spatial mechanisms: intraocular veiling glare reduces the luminance range on the retina, while neural ‘simultaneous contrast’ effects increase the apparent differences. Figure 1 shows the classic simultaneous-contrast display: two gray patches of identical scene luminance, one surrounded by white and one by black. The gray-in-white receives more glare than the gray-in-black. If retinal luminance predicted appearance, the gray-in-white would appear lighter than the gray-in-black. However, through human spatial image processing, simultaneous contrast makes the lower-contrast gray-in-white look darker. In this case, contrast overcompensates for glare.
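The opposition of the two mechanisms can be made concrete with a toy calculation (the numbers and the `glare_fraction` parameter are my illustrative assumptions, not measured values):

```python
# Assumed toy model: veiling glare adds light proportional to the
# luminance of each patch's surround.
def retinal_luminance(patch, surround, glare_fraction=0.05):
    return patch + glare_fraction * surround

gray = 0.2
gray_in_white = retinal_luminance(gray, 1.0)  # 0.25: extra veiling light
gray_in_black = retinal_luminance(gray, 0.0)  # 0.20: essentially no veil
print(gray_in_white, gray_in_black)
```

On the retina the gray-in-white is the *brighter* of the two identical patches, yet it is the one that appears darker: neural simultaneous contrast does not merely cancel the glare, it overcompensates for it.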
Glare distorts the luminances of the scene in the image on the retina by lowering retinal contrast. However, simultaneous neural spatial processing works to counteract glare. The image with lower actual contrast appears to have a higher apparent contrast. Both glare and simultaneous contrast are spatial processes and depend on the content of the entire scene.
Color constancy is usually considered to be unrelated to HDR scene rendition. Unlike films that have fixed spectral sensitivities, humans see objects with constant color appearances from variable spectral stimuli. This has traditionally been attributed to pixel-based normalization, similar to white balancing a camera. Many assume that retinal cones adapt to changes in illumination to discount it, so that appearance correlates with reflectance. Such hypothetical mechanisms can account for flat 2D color test targets, but fail to predict the variable appearance of constant paints in complex illumination on 3D colored blocks.14 Figure 2 shows photographs of two identical arrangements of 3D colored blocks called Mondrians: one in nearly uniform illumination, and the other in directional (HDR) illumination. Adaptation, or any other single-pixel process, cannot predict colors in 3D Mondrians.15 The small departures from perfect color constancy observed are predicted by spatial comparisons with colors in the rest of the view.16 Color constancy depends on spatial comparisons that synthesize how colors appear in relation to all the segments of the scene.17
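A one-dimensional caricature of this point (with invented numbers; this is not the 3D Mondrian experiment) contrasts a single global normalization, as in camera white balancing, with a comparison of each patch to its own local context:

```python
# Same gray paint (reflectance 0.5) beside a white paper (reflectance 1.0)
# in two parts of a scene: one region lit at 100 units, the other at 10.
bright = {"white": 1.0 * 100, "gray": 0.5 * 100}
dim    = {"white": 1.0 * 10,  "gray": 0.5 * 10}

# Single-pixel normalization by one global white point: the identical
# paints come out different (0.5 vs. 0.05).
global_white = max(bright["white"], dim["white"])
print(bright["gray"] / global_white, dim["gray"] / global_white)

# Spatial comparison, each patch relative to the white in its own region:
# the constant reflectance (0.5) is recovered in both.
print(bright["gray"] / bright["white"], dim["gray"] / dim["white"])
```

Any per-pixel rule, however it is scaled, applies one correction to the whole image and so cannot handle illumination that varies across the scene; only comparisons across scene segments can.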
At first, HDR imaging may have seemed best suited for improved recordings of scene radiances. However, glare limits the range of light that can be detected by cameras or the retina. All scene regions below middle gray are influenced, more or less, by the glare from the bright scene segments. Instead of accurate radiance reproduction, HDR imaging works well because it preserves the details in the scene's shadows. Spatial image processing preserves this information, but distorts accurate reproduction. Similarly, color constancy is the result of color comparisons of the entire scene.
My future research will focus on studying the correlations between HDR image-processing algorithms and models of human vision.
The author is grateful for the collaboration of Alessandro Rizzi, Carinna Parraman, Vassilios Vonikakis, Bob Sobel, and Mary McCann.
John McCann graduated from Harvard in 1964. He managed Polaroid's Vision Research Laboratory from 1961 to 1996. He studied human color vision, digital-image processing, large-format instant photography, and the reproduction of fine art. His 120 publications have studied Retinex theory, rod/long-wave cone colors at low-light levels, and appearance with scattered light/HDR imaging. He currently consults on color imaging.