Beyond the resolution limit: hyperacuity in animals and now in silicon

Photodetectors with nonlinear angular sensitivity provide subpixel resolution for feature extraction in real time.
19 June 2007
Mike Wilcox

The military has a long-standing interest in biomimetic vision systems. Animals provide an existence proof that high-resolution information processing can be performed at high speed without the burden of powerful computational resources. Over a century ago, Santiago Ramón y Cajal showed that the cellular architecture in retinas from animals as diverse as mollusks, insects, and vertebrates is identical and concluded that the information-processing principles must also be similar.1 All those retinas are capable of subpixel resolution, a phenomenon known as hyperacuity.2 The center-surround profile of retinal ganglion cells has a Gaussian (i.e., bell) shape. Current models propose that retinas send luminance changes to the brain and that high-level cortical processing uses this shape to compute the best guess for motion displacement and image features, making primates very good guessers.



Figure 1. Poles and power lines. When the size of an image feature (for example, these overhanging electrical wires) approaches the pixel size in the camera and is shared by two or four pixels, the feature virtually disappears against the bright background, as shown by the 8-bit gray-scale values of the wires in the inset.


Figure 2. A monolithic pixel has no sensitivity to motion of a point source within its boundary. Motion sensitivity occurs only when a target or feature crosses from one pixel to a neighbor, a zero crossing (ZC). For a Cartesian array of n pixels there are n + 1 ZCs, or 7 ZCs for six pixels. In the overlapped (Venn-diagram-like) array, with sensors the same size as the pixels above, there are 2ⁿ ZCs, or 64 ZCs for six photodetectors, even when the detectors are monolithic. The depicted trajectory cannot be resolved by the Cartesian array, but it can be resolved from the Venn diagram. The circles represent only the half-maximum of the Gaussian angular sensitivity of the photoreceptors. Resolution is limited only by the signal-to-noise ratio.


Figure 3. (a) The convolution of two circles along one dimension shows that the superposition of a round target and a round detector the same size can provide position information with more accuracy than dictated by the physical size or spacing of the detector. (b) The convolution of an Airy disk (i.e., a Bessel function) and a top-hat function is virtually indistinguishable from a Gaussian function.

Insects and mollusks do not have a cortex but do have hyperacuity that outperforms the human visual system. Therefore, the processing must occur at a stage prior to the cortical level. Photoreceptors in all these animals operate at the diffraction limit (i.e., the photodetector's physical size is only twice the wavelength of light), yet they are capable of subpixel resolution often better than one-tenth the pixel spacing, something current digital cameras cannot do. Perhaps that inability is due to our own misinterpretation of how a vision system works.



Figure 4. The response of three of the seven photodetectors in our prototype as a black edge moves across their receptive fields, producing a smooth output proportional to the position of the edge with resolution much higher than detector size or spacing.

Harry Nyquist, the father of modern digital communication, needed immediate solutions for analog television transmission. He interpreted video imaging through a cinematography model in which sequential frames produce, for a human observer, the illusion of smooth motion. Slow frame rates and a finite number of pixels per frame limit image resolution. The brute-force remedy is a larger pixel array, but transmitting the resulting volume of data caps our ability to exploit the resolution that current displays can show. However, James Bucklew and Bahaa Saleh3 showed that resolution is simply a matter of contrast. The concept is illustrated in Figure 1. Overhead electrical wires with a gray-scale value of 5 against an open sky (254) apparently disappear as the black feature is shared by two or four pixels and the contrast ratio degrades. This is similar to image blur degrading by the square root and fourth root as the object size doubles and quadruples, shown in Figure 1 as 8-bit pixel gray-scale values (from 5 to 211 for the wires).
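The pixel-sharing effect can be sketched numerically. The following is a minimal model, using the gray values quoted above (wire 5, sky 254) and assuming simple area-weighted averaging within each pixel; the function name and fractions are illustrative, not from the original measurement:

```python
# A 1-pixel-wide dark wire (gray value 5) against sky (gray value 254).
# When the wire straddles pixel boundaries, each pixel averages wire and
# sky over its area, and the recorded contrast collapses.

WIRE, SKY = 5, 254

def pixel_value(wire_fraction):
    """Gray value of a pixel whose area is partly covered by the wire."""
    return wire_fraction * WIRE + (1 - wire_fraction) * SKY

aligned = pixel_value(1.0)        # wire fills one pixel: full contrast, 5
two_pixels = pixel_value(0.5)     # wire shared by two pixels: 129.5
four_pixels = pixel_value(0.25)   # wire shared by four pixels: 191.75

print(aligned, two_pixels, four_pixels)
```

The recorded value climbs toward the sky value as the wire is divided among more pixels, which is the washout visible in the Figure 1 inset.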


Current electro-optical sensors are monolithic transducers: their energy-collection area has uniform sensitivity across its surface, producing a ‘top-hat’ sensitivity profile. Consequently, the only response to motion occurs at a boundary between adjacent detectors, a concept known as a zero crossing (ZC).4 Six photoreceptors in a Cartesian array have n + 1 = 7 ZCs (see Figure 2). An overlapped sensor array resembles a Venn diagram with 2ⁿ = 64 ZCs. The depicted point-source trajectory is indistinguishable from a straight line across the detectors in a Cartesian array, but the same trajectory across the Venn-diagram sensors, with their 64 ZCs, can be reconstructed. By contrast, all electromagnetic antennas have Gaussian input sensitivity profiles, yet that geometry is not always exploited: the shoulder region of the profile is often rejected in order to improve signal-to-noise ratios.
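The counting argument can be reproduced in a few lines. This sketch simply evaluates the two formulas from the text, under the interpretation that each point in the overlapped array is classified by which subset of the n detectors covers it (2ⁿ possible subsets); the function names are illustrative:

```python
def cartesian_zcs(n):
    """Pixel boundaries in a 1D row of n abutting monolithic pixels."""
    return n + 1

def overlapped_zcs(n):
    """Distinct coverage states for n mutually overlapping detectors:
    each detector either sees a point or does not, giving 2**n states."""
    return 2 ** n

for n in range(1, 7):
    print(n, cartesian_zcs(n), overlapped_zcs(n))
```

For six sensors this gives 7 versus 64 distinguishable states, the disparity Figure 2 illustrates, and the gap widens exponentially as detectors are added.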


We and others have documented that individual photoreceptors have Gaussian acceptance profiles. The circles in Figure 2 represent the half-maximum of the Gaussian receptor profile, so we can exploit these nonlinear characteristics to improve spatial resolution and extract more specific information than detector size or spacing alone dictates, effectively providing the basis of hyperacuity. The result is illustrated with a simple example. If the feature size matches that of the detector, its exact position can be extracted from the output. The overlapping area is the convolution of two circles, as depicted in Figure 3(a). The maximum response occurs when the two circles are superimposed, a singularity (unique point) that drops off as the centers of the target and detector separate. In a more realistic scenario, the smallest image of a point source is twice its wavelength (the diffraction limit), described by an Airy pattern (a first-order Bessel function). If the detector has a top-hat sensitivity profile whose dimension matches the diffraction limit, convolving the two functions produces the blue line in Figure 3(b), while the dotted red line is a Gaussian. They are virtually indistinguishable. We have shown at least three ways for a photodetector to reproduce this profile.5 A dark edge moving across three such receptors yields the output shown in Figure 4, computed using only sums.
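The circle-on-circle response of Figure 3(a) has a closed form: the intersection area of two equal circles as a function of the separation of their centers. This is a standard geometry sketch, not the authors' circuit:

```python
import math

def overlap_area(d, r=1.0):
    """Intersection area of two equal circles of radius r whose centers
    are a distance d apart -- the response of a round detector to a
    round target of the same size, as in Figure 3(a)."""
    if d >= 2 * r:
        return 0.0
    return (2 * r**2 * math.acos(d / (2 * r))
            - (d / 2) * math.sqrt(4 * r**2 - d**2))

# The response is maximal at d = 0 (full circle area, pi*r**2) and falls
# strictly monotonically to zero at d = 2r, so the target's position can
# be read off the output with precision limited by the signal-to-noise
# ratio rather than by the detector's physical size.
responses = [overlap_area(d / 10) for d in range(21)]
```

Because the curve is strictly monotonic in the separation, inverting the measured response recovers the target's offset to a fraction of the detector diameter, which is the hyperacuity mechanism the text describes.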


It is thus possible to extend the spatial resolution limit without loss of photosensitivity. Since we have rejected the brute-force approach of multiplying the number of pixels in the array, we gain the advantages of data reduction, high-spatial-frequency detection, and high temporal and subpixel resolution (hyperacuity). The circuitry in our prototype chip is entirely analog. We extract a vector without memory, a CPU, or numerical computation, yet the output of an array lends itself to a digital communication format, including all forms of compression and numerical analysis. In the temporal domain, we obtain directionally selective motion detection using simple arithmetic consisting only of sums, whereas all other algorithms require a nonlinear process, in the form of multiplication, to compute motion vectors. Our prototype exploits an optical nonlinearity to extract position information with higher resolution than that dictated by the pixel spacing. There is still a way to go to duplicate animal performance, but there is no reason we cannot accomplish that goal, and even operate at the diffraction limit.
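To illustrate how overlapping Gaussian profiles yield subpixel position, and how direction falls out of successive estimates, here is a hypothetical sketch. The detector spacing, the σ value, and the centroid read-out are all assumptions for illustration; the centroid below uses multiplies and a divide for clarity, whereas the sums-only analog circuit described above avoids even those:

```python
import math

CENTERS = [-1.0, 0.0, 1.0]   # three overlapping detectors, unit spacing
SIGMA = 0.6                  # assumed width of each Gaussian profile

def response(center, x):
    """Gaussian angular sensitivity of one detector to a feature at x."""
    return math.exp(-((x - center) ** 2) / (2 * SIGMA ** 2))

def estimate_position(x):
    """Centroid of the three detector outputs: a subpixel estimate of x,
    far finer than the unit detector spacing."""
    r = [response(c, x) for c in CENTERS]
    return sum(c * ri for c, ri in zip(CENTERS, r)) / sum(r)

# Direction of motion from the sign of two successive estimates.
p1, p2 = estimate_position(-0.20), estimate_position(0.15)
direction = 1 if p2 > p1 else -1
```

Feature displacements of a fraction of the detector spacing shift the centroid smoothly, mirroring the smooth edge-position output of Figure 4.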


