Beyond the resolution limit: hyperacuity in animals and now in silicon
The military has a long-standing interest in biomimetic vision systems. Animals provide an existence proof that high-resolution information processing can be performed at high speed without the burden of powerful computational resources. Over a century ago, Santiago Ramón y Cajal showed that the cellular architecture in retinas from animals as diverse as mollusks, insects, and vertebrates is identical and concluded that the information-processing principles must also be similar.1 All those retinas are capable of subpixel resolution, a phenomenon known as hyperacuity.2 The center-surround profile of retinal ganglion cells has a Gaussian (i.e., bell) shape. Current models propose that retinas send luminance changes to the brain and that high-level cortical processing uses this shape to compute the best guess for motion displacement and image features, making primates very good guessers.
Insects and mollusks do not have a cortex but do have hyperacuity that outperforms human vision systems. Therefore, the processing must occur at a stage prior to the cortical level. Photoreceptors in all these animals operate at the diffraction limit (i.e., the photodetector's physical size is only twice the wavelength of light), yet they are capable of subpixel resolution often better than one-tenth the pixel spacing, something current digital cameras cannot do. Perhaps that inability is due to our own misinterpretation of how a vision system works.
Harry Nyquist, the father of modern digital communication, needed immediate solutions to analog television transmission. He interpreted video imaging in a cinematography model where sequential frame rates produce, for a human observer, the illusion of smooth motion. Slow frame rates and a finite number of pixels per frame limit image resolution. The brute force approach is larger pixel arrays, but the need to transmit large amounts of data caps our ability to use the resolution that current displays can show. However, James Bucklew and Bahaa Saleh3 showed that resolution is simply a matter of contrast. The concept is illustrated in Figure 1. Overhead electrical wires having the same contrast value (5) against an open sky (254) apparently disappear as the black feature is shared by two or four pixels and the contrast ratio degrades. This is similar to image blur degrading by the square root and fourth root as the object size doubles and quadruples, shown there as 8-bit pixel gray-scale values (from 5 to 211 for the wires).
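The way contrast falls as a dark feature is shared among more pixels can be sketched numerically. The snippet below is a minimal illustration that assumes plain linear area averaging inside each pixel. That mixing rule and the function names are assumptions of this sketch, not the model behind Figure 1, so the numbers it produces need not match the gray-scale values in the figure.

```python
WIRE = 5    # gray level of the wire, from the text
SKY = 254   # gray level of the open sky, from the text

def mixed_gray(n_pixels, wire=WIRE, sky=SKY):
    """Gray level of each pixel when a wire that would exactly fill one
    pixel is instead spread evenly over n_pixels (linear area averaging,
    an assumption of this sketch)."""
    fraction = 1.0 / n_pixels          # share of each pixel covered by the wire
    return fraction * wire + (1.0 - fraction) * sky

def contrast_against_sky(gray, sky=SKY):
    """Relative contrast of a pixel against the sky background."""
    return (sky - gray) / sky

for n in (1, 2, 4):
    g = mixed_gray(n)
    print(n, g, round(contrast_against_sky(g), 3))
```

Under this assumption the contrast of the wire against the sky halves each time the number of pixels sharing it doubles, which is why a thin feature fades into the background long before it becomes geometrically invisible.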
Current electro-optical sensors are monolithic transducers, that is, their energy collection area has uniform sensitivity across its surface, producing a ‘top-hat’ sensitivity profile. Consequently, the only time there is any response to motion is at a boundary between adjacent detectors, a concept known as zero crossing (ZC).4 Six photoreceptors (n = 6) in a Cartesian array have n + 1 = 7 ZCs (see Figure 2). An overlapped sensor array resembles a Venn diagram with 2^n = 64 ZCs. The depicted point-source trajectory is indistinguishable from a straight line across the detectors in a Cartesian array. However, the same motion trajectory across sensors in the Venn diagram with 64 ZCs can be reconstructed. In contrast to these top-hat sensors, all electromagnetic antennas have Gaussian input sensitivity profiles. That geometry is not always exploited. Signal rejection often replaces the shoulder region of the profile in order to augment signal-to-noise ratios.
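The zero-crossing counts can be checked with a short sketch. It models the Cartesian case as a one-dimensional row of abutting unit-width top-hat detectors, which is a simplification assumed here and not the layout of Figure 2, and the function name is hypothetical. The overlapped count uses the fact that a full Venn diagram of n sets has 2^n regions.

```python
def top_hat_response(x, n):
    """One-hot output of n abutting, unit-width top-hat detectors for a
    point source at position x (1-D simplification assumed here)."""
    return tuple(1 if i <= x < i + 1 else 0 for i in range(n))

n = 6

# Sweep a point source across the row in steps of 1/20 of a detector width.
positions = [k / 20 for k in range(20 * n)]
responses = [top_hat_response(x, n) for x in positions]

# The output changes only when the source crosses an interior boundary.
interior_changes = sum(a != b for a, b in zip(responses, responses[1:]))

# Adding the two outer edges gives the n + 1 boundaries of the row.
cartesian_zcs = interior_changes + 2

# A full Venn diagram of n overlapping sets has 2**n distinct regions.
overlapped_regions = 2 ** n

print(cartesian_zcs, overlapped_regions)
```

Between boundaries the top-hat output does not change at all, so every position inside one detector looks the same to the array. That is the sense in which a curved trajectory and a straight one become indistinguishable.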
We and others have documented that individual photoreceptors have Gaussian acceptance profiles. The circles in Figure 2 represent the half-maximum of the Gaussian function for the receptor profile. We can exploit these nonlinear characteristics to improve spatial resolution and extract more specific information than that dictated by either detector size or spacing, effectively providing the basis of hyperacuity. The result is illustrated with a simple example. If the feature size matches that of the detector, its exact position can be extracted from the output. The overlapping area is the convolution of two circles, as depicted in Figure 3(a). The maximum response occurs when the two circles are superimposed, a singularity (unique point), and the response drops off as the centers of the target and detector separate. In a more realistic scenario, the smallest image of a point source is twice its wavelength (the diffraction limit), described by a second-order Bessel (i.e., recurring) function. If the detector has a top-hat sensitivity profile and its dimension matches the diffraction limit, convolution of the two functions produces the blue line in Figure 3(b), while the dotted red line is Gaussian. They are virtually indistinguishable. We showed at least three ways for a photodetector to reproduce this profile.5 A dark edge moving across three receptors yields the outcome shown in Figure 4, using only sums.
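To see why overlapping Gaussian profiles carry subpixel information, consider a minimal sketch with two adjacent receptors and a point source between them. It uses a standard log-ratio interpolation for Gaussian profiles, which is a swapped-in numerical technique chosen for clarity and not the sums-only analog circuit described here. The receptor spacing, the width SIGMA, the 8-bit quantization, and all function names are assumptions of this sketch.

```python
import math

SIGMA = 0.6        # assumed receptor width, in units of receptor spacing
C0, C1 = 0.0, 1.0  # assumed centers of two adjacent receptors

def gaussian_response(x, center, sigma=SIGMA):
    """Response of a receptor with a Gaussian acceptance profile to a
    point source at position x."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def estimate_position(r0, r1, c0=C0, c1=C1, sigma=SIGMA):
    """Recover the source position from two overlapping responses.
    For Gaussian profiles, ln(r1 / r0) is linear in x, so the position
    follows directly from the ratio of the two outputs."""
    return 0.5 * (c0 + c1) + sigma ** 2 * math.log(r1 / r0) / (c1 - c0)

def quantize(r, levels=255):
    """Round a response in (0, 1] to 8-bit resolution."""
    return round(r * levels) / levels

def worst_case_error(steps=100):
    """Largest position error over a sweep between the two centers,
    using 8-bit quantized responses."""
    worst = 0.0
    for k in range(steps + 1):
        x_true = C0 + (C1 - C0) * k / steps
        r0 = quantize(gaussian_response(x_true, C0))
        r1 = quantize(gaussian_response(x_true, C1))
        worst = max(worst, abs(estimate_position(r0, r1) - x_true))
    return worst

print(worst_case_error())
```

With a top-hat profile the same pair of detectors could only report which detector the source is in. With overlapping Gaussian profiles the ratio of the two outputs varies continuously with position, so under these assumptions the error stays well below one-tenth of the receptor spacing even after 8-bit quantization.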
It is thus possible to extend the spatial resolution limit without loss of photosensitivity. Since we have rejected the brute force approach of multiplying the number of pixels in the array, we realize advantages of data reduction, high spatial frequency detection, and high temporal and subpixel resolution (hyperacuity). The circuitry in our prototype chip is all analog. We extract a vector without memory, a CPU, or numerical computation, and yet the output of an array lends itself to a digital communication format, including all forms of compression and numerical analysis. In the temporal domain, we can obtain directionally selective motion detection using simple arithmetic consisting only of sums. All other algorithms require a nonlinear process in the form of multiplication to compute motion vectors. Our prototype exploits an optical nonlinearity to extract position information with higher resolution than that dictated by the pixel spacing. There is still a way to go to duplicate animal performance, but there is no reason we cannot accomplish that goal and even operate at the diffraction limit.