The image quality of general-purpose video cameras has improved dramatically, due largely to higher pixel resolutions and to increases in processing power via Moore's law. One problem remains, however: such cameras perform relatively poorly when a scene is unevenly illuminated. The dynamic range (DR) of video cameras is still well below the effective DR of the human eye, and as a result most cameras can properly expose only part of an unevenly lit scene. The remainder of the output is either under- or overexposed, with poor definition of detail.
To address this problem, some manufacturers have introduced cameras that take a number of pictures in quick succession, a technique called multiple image capture [1], and combine the resulting images according to proprietary algorithms [2]. Others change the parameters of the detection circuitry according to predetermined schemes [3, 4] such as well capacity adjustment, or use nonlinear sensors [5] such as logarithmic detectors. However, although these efforts certainly enhance image quality, there is still room for improvement.
Applying a technique similar to image-compression methods such as MPEG-2/4, we can use the brightness patterns of recent frames to predict the likely brightness of upcoming frames [6]: areas that were dark in the recent past are likely to remain dark in the near future, whereas bright areas are likely to remain bright. Even in moderately dynamic scenes, the differences between consecutive images are rather small, with most pixel levels remaining the same or changing within the range of the system's A/D converter.
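The prediction step can be sketched in a few lines of code. The following is a hypothetical software model, assuming 8-bit grayscale frames stored as nested lists; `predict_and_flag` and its threshold are illustrative names and values, not part of any camera API.

```python
# Minimal sketch of frame-based brightness prediction (illustrative only):
# the predicted brightness of each pixel in the next frame is simply its
# value in the current frame; pixels whose level changes by more than a
# threshold are flagged for separate handling, as described in the text.

def predict_and_flag(prev_frame, new_frame, threshold=32):
    """Return (prediction for the next frame, list of (row, col) outliers)."""
    outliers = []
    for r, (prev_row, new_row) in enumerate(zip(prev_frame, new_frame)):
        for c, (p, n) in enumerate(zip(prev_row, new_row)):
            if abs(n - p) > threshold:
                outliers.append((r, c))
    # The newly captured frame becomes the prediction for the next frame.
    return new_frame, outliers
```

For two consecutive 2x2 frames where only one pixel darkens sharply, that single pixel is returned as an outlier while the rest of the frame serves as the next prediction.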
The predicted brightness pattern can then be used to program the local operating parameters of the image sensor, such as integration time and reverse bias voltage, such that it captures a maximum amount of information from the image. The relatively few pixels that undergo drastic brightness transitions are then processed separately as the camera revises its predicted brightness pattern at the end of each frame.
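To make this concrete, here is a minimal sketch of how a predicted brightness level might be mapped to a per-pixel integration time, with darker predictions receiving proportionally longer exposure. The constants and the function name are assumptions for illustration; a real sensor would clamp the result to its hardware limits.

```python
# Hypothetical mapping from predicted brightness to per-pixel integration
# time: darker predicted pixels get longer exposure so the sensor captures
# more information there. All constants are illustrative.

MAX_LEVEL = 255      # full-scale pixel value (assumed 8-bit readout)
BASE_TIME_US = 100   # integration time for a full-brightness pixel

def integration_time_us(predicted_level):
    """Longer integration for darker predictions; the level is clamped to
    at least 1 to avoid division by zero for fully dark pixels."""
    level = max(predicted_level, 1)
    return BASE_TIME_US * MAX_LEVEL // level
```

A pixel predicted at full brightness keeps the base integration time, while progressively darker predictions receive progressively longer times.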
The structure of a high DR camera using brightness prediction is shown in Figure 1. It consists of a random access sensor array and a processor/memory block. Within a given frame, the dynamic range of each sensor cell in the array is assumed to depend on one or more key parameters, such as the previously mentioned integration time or bias voltage. The storage elements or activation nodes for the relevant key parameter(s) of each cell must be randomly accessible.
Figure 1. A video camera using brightness prediction includes a sensor array and a processor/memory block. The sensor passes a video stream to the processor/memory block, which can in turn change the sensor's parameters based on brightness patterns within the stream.
In normal operation, the array captures an input image and converts it into a ‘video out’ stream of serial information, including analog voltages and digital sequences. Then, in addition to being delivered to the client application, the stream is also fed to the processor/memory block. There, the areas of the image outside a certain acceptable range are targeted for key parameter adjustment. The nature of the adjustment may depend on interaction with an operator (through user input) or on stored algorithms.
Figure 2 shows the block diagram of an adaptive system, where the key dynamic range parameter is integration time. The system can easily be separated into a set of two integrated circuits: a camera sensor and a camera processor.
Figure 2. An adaptive system using integration time as the key dynamic range parameter is composed of a camera sensor and a camera processor.
The operation of the camera sensor is straightforward: it checks whether the pixel located at the applied address has reached its saturation level. The novelty of the system is in the way the pixel addresses themselves are generated. Instead of producing them linearly in a predetermined order, the camera processor generates the addresses of the brightest pixels first, then those of the next-brightest pixel group, and continues until all pixels are read. The addresses (not values) of the brightest pixels are stored in the memory block Map 1, the next brightest in the block Map 2, and so on (see Figure 3). The maps have fixed locations, and each is read a fixed number of times, or scans, before the processor moves on to the next one.
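The grouping of addresses into maps can be sketched as a simplified software model. The version below assumes an 8-bit frame and equal-width brightness bins; the bin boundaries and map count are illustrative, and the real hardware groups pixels by measured saturation behavior rather than by stored values.

```python
# Sketch of the address-map idea: pixel *addresses* (not values) are
# grouped into a fixed number of maps by brightness, brightest first.
# maps[0] corresponds to Map 1 in the text, maps[1] to Map 2, and so on.

def build_maps(frame, n_maps=4, max_level=255):
    maps = [[] for _ in range(n_maps)]
    step = (max_level + 1) // n_maps
    for r, row in enumerate(frame):
        for c, level in enumerate(row):
            # Brighter pixels land in earlier (lower-index) maps.
            idx = min((max_level - level) // step, n_maps - 1)
            maps[idx].append((r, c))
    return maps
```

Reading the maps in order then visits the brightest pixels first, as the processor requires.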
Figure 3. The camera processor stores the addresses of pixels based on their brightness values. The brightest pixels are stored in the Map 1 block, the next brightest in Map 2, and so on, until those with the lowest brightness levels are stored in the last block.
Before a new image is read, all of the pixels in the array are reset and the data counter is loaded with its initial, maximum value. Then the addresses stored in Map 1 (the brightest pixels) are sequentially read out to the sensor array, and the corresponding pixels are checked for saturation. Those that have reached saturation trip the comparator and generate a 'write pixel' signal. The relevant pixel data is then stored in a dual-ported image buffer at the correct raster address.
The pixel data itself is the current value of the data counter, corrected for various eye or system nonlinearities. A saturated pixel is marked by setting its associated 'pixel read' flag. Marked pixels are skipped in subsequent scans, reducing the processor bandwidth required. After each scan, the data counter is decremented. Once a preset number of scans has been performed for a given map, the remaining addresses (still-unsaturated pixels) are moved to the pixel address buffer, and the processor moves on to the next map.
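The per-map read cycle described above can be modeled roughly as follows. This is a sketch under stated assumptions: `is_saturated` stands in for the sensor's comparator output, the nonlinearity correction is omitted for brevity, and all names are illustrative.

```python
# Sketch of the read cycle for one map: the data counter starts at its
# maximum and is decremented after each scan. A pixel that saturates on a
# given scan is recorded with the current counter value ('write pixel')
# and skipped afterwards, as if its 'pixel read' flag were set.

def read_map(addresses, is_saturated, n_scans, counter_start):
    image = {}                 # raster address -> recorded pixel value
    pending = list(addresses)
    counter = counter_start
    for _ in range(n_scans):
        still_unread = []
        for addr in pending:
            if is_saturated(addr, counter):
                image[addr] = counter      # record counter as pixel data
            else:
                still_unread.append(addr)  # re-check on the next scan
        pending = still_unread
        counter -= 1                       # decrement after each scan
    # `pending` holds the still-unsaturated addresses destined for the
    # pixel address buffer.
    return image, pending
```

Brighter pixels saturate while the counter is still high and so receive high recorded values; pixels that never saturate within the allotted scans remain in the buffer for a darker map.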
As a result, pixels that are too dark for their current map are moved via the buffer to a ‘darker’ map. The addresses of pixels already found to be saturated during the first scan of a new map are also moved to the buffer temporarily, but are moved up to a ‘brighter’ map when the current image read cycle ends. A successive approximation algorithm is used to select the proper relocation map.
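The successive-approximation selection of the relocation map can be sketched as a binary search over map brightness ranges. The uniform partition below is an assumed example (brightest map first), not the actual hardware ranges.

```python
# Sketch of successive-approximation map selection: given a pixel level
# that did not fit its current map, binary-search for the map whose
# brightness range contains it. `ranges` lists (low, high) per map,
# ordered brightest first, matching Map 1, Map 2, ... in the text.

def relocate(level, ranges):
    lo, hi = 0, len(ranges) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        low, high = ranges[mid]
        if level > high:        # brighter than this map -> earlier map
            hi = mid - 1
        elif level < low:       # darker than this map -> later map
            lo = mid + 1
        else:
            return mid
    return lo
```

Each comparison halves the number of candidate maps, so the proper map is found in a logarithmic number of steps rather than by scanning every map.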
The process is repeated until all pixels are read or until the data counter reaches its minimum value, at which point the image read cycle is complete. The limits of the data counter may be adjusted (they are user programmable, or may change in accordance with the dynamic range of the image), and the addresses in the pixel address buffer are distributed among the different maps. A faster sorting process can be achieved by replacing the comparator at the output of the camera sensor with a coarse A/D converter.
Predicting the brightness patterns of future images can enhance the dynamic range of general purpose video cameras without multiple image capture techniques or the use of different sensors. Although the end result is still far from the 180dB effective dynamic range of the human eye, this technique has the potential to raise the performance of standard cameras from their current 50–80dB range to a more respectable 120dB.