Selecting the most effective visual information for a retinal prosthesis
Each time we open our eyes, millions of photoreceptors convert light into neural signals and pass them to our inner nuclear layers, and ultimately our optic nerves, via ganglion cells.1 In visually impaired patients with age-related macular degeneration (AMD) or retinitis pigmentosa (RP), however, the photoreceptor cells cease to function even though the inner layers of the retina remain intact. Over the last decade, in their quest to partially restore visual perception in patients suffering from AMD or RP, researchers have been exploring the possibility of simulating the function of photoreceptors by implanting electrodes inside the retina to generate neural signals.
Broadly, there are two hardware configurations proposed for a visual prosthesis. The system can be totally intraocular or can combine intraocular and extraocular hardware.2,3 The latter approach incorporates an extraocular camera and processing unit in addition to a retinal stimulation device4 (see Figure 1). This topology increases the image-processing capability while reducing the amount of hardware contained within the eye. This is important because the most challenging aspect of a retinal prosthesis is the limit on the density of the electrode array that arises from practical considerations of power, heat generation, and space.
To put this in context, an image generated by a charge-coupled device (CCD) camera could have a resolution anywhere from 336 × 244 to 3060 × 2060 pixels, while the electrode array might be limited to only 25 × 25 electrodes. The image-processing demand is therefore to reduce the resolution from at least 336 × 244 to 25 × 25 while maintaining the effective visual information.5 The number of activation levels of each electrode is also limited. For example, a CCD camera denotes light intensity in eight bits, representing 256 distinct levels, whereas the number of activation levels of an electrode could be as low as four. The retinal prosthesis therefore requires a reduction in the grey-scale resolution of the image from 256 levels to four. Such a reduction is achieved by progressively selecting pixels in the image between an upper and lower brightness threshold and setting them to a single grey level.
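The thresholding step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the uniform band widths and the choice of the band midpoint as the representative grey level are assumptions.

```python
import numpy as np

def quantize_grey_levels(image, levels=4):
    """Reduce an 8-bit grey-scale image (0-255) to a few grey levels by
    setting all pixels between each pair of brightness thresholds to a
    single representative level (the band midpoint; an assumed choice)."""
    band = 256 // levels                          # width of each brightness band
    indices = image.astype(np.int32) // band      # which band each pixel falls in
    return indices * band + band // 2             # one grey level per band

img = np.array([[0, 60, 130, 255]], dtype=np.uint8)
print(quantize_grey_levels(img))                  # [[ 32  32 160 224]]
```

With `levels=4` every output pixel takes one of at most four values, matching the 256-to-four reduction described in the text.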
There are many options for selecting the most useful visual information at the least resolution.6 The first step in this process is to find the region of interest (ROI) in an image, ignore the rest of the image, and then reduce the resolution of that region. Many resolution-reduction algorithms treat the entire image uniformly and reduce the resolution through averaging, sub-sampling, or transform operations such as the wavelet transform (Figures 2–4).
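As a concrete illustration of uniform reduction by averaging, the following sketch block-averages an image down to a coarse grid. The 200 × 200 input size is chosen purely for convenience (so that it divides evenly into a 25 × 25 grid); it is not a figure from the article.

```python
import numpy as np

def block_average(image, out_rows, out_cols):
    """Uniformly reduce resolution by averaging non-overlapping blocks.
    Assumes the image dimensions are exact multiples of the target grid."""
    rows, cols = image.shape
    br, bc = rows // out_rows, cols // out_cols
    # Reshape into (out_rows, block_rows, out_cols, block_cols) and
    # average over each block's rows and columns.
    return image.reshape(out_rows, br, out_cols, bc).mean(axis=(1, 3))

img = np.arange(200 * 200, dtype=float).reshape(200, 200)
small = block_average(img, 25, 25)
print(small.shape)  # (25, 25)
```

Sub-sampling (e.g. `img[::8, ::8]`) is even cheaper but discards information within each block rather than pooling it.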
To effectively exploit the precious few electrodes available to represent an image, we propose a non-uniform resolution reduction. The multiresolution nature of the eye is such that the center of vision, in the fovea, has higher resolution than the surrounding regions. Consistent with this, we simulated a method of multiple resolution reduction. The sampling matrix chosen is exponential in nature, with each pixel four times larger in area than its neighbor closer to the center.
Each of the outer pixels in the final image combines 32 × 32 pixels from the input image. Closer to the center, each pixel combines 16 × 16 input-image pixels. In the center, each pixel combines 8 × 8 pixels from the input image. Thus, the image can be represented by 640 stimulation points: a dramatic decrease from the number of pixels that would be required to represent the entire image at the finest 8 × 8 resolution (over 4000 in this case). For comparison, a 25 × 25 matrix requires 625 stimulation points. As such, the acuity of central vision is enhanced without increasing the overall number of stimulation points required. A comparison of multiresolution reduction and homogeneous reduction is shown in Figure 5.
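The point counts above can be checked with a short sketch. The article does not give the exact region sizes, so the following assumes a 512 × 512 input with a 128 × 128 central region sampled at 8 × 8, a surrounding 256 × 256 region at 16 × 16, and 32 × 32 blocks elsewhere; these assumed sizes reproduce the 640-point and over-4000-point figures quoted in the text.

```python
def count_points(regions):
    """Count stimulation points for nested concentric square regions,
    listed innermost first as (region_side, block_size) pairs. Each
    region is sampled at its own block size, excluding the area
    already covered by the finer region inside it."""
    total, inner_side = 0, 0
    for side, block in regions:
        total += (side // block) ** 2 - (inner_side // block) ** 2
        inner_side = side
    return total

# Assumed geometry: 128x128 centre at 8x8, 256x256 at 16x16, 512x512 at 32x32.
regions = [(128, 8), (256, 16), (512, 32)]
print(count_points(regions))   # 640 stimulation points
print((512 // 8) ** 2)         # 4096 points at uniform 8x8 resolution
```

Each ring contributes its own grid minus the cells occupied by the finer region inside it (256 + 192 + 192 = 640), versus 4096 for a uniform 8 × 8 sampling of the whole frame.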
Although a retinal prosthesis patient sees only a reduced field of vision (FoV), it is possible for the image-processing unit to undertake context-based processing on the full camera FoV. This allows us to incorporate some intelligent control into the device that can improve the level of assistance provided by the prosthesis and compensate for the crude level of vision available.
Context-based processing can also be used to analyze the scene for key visual information, determine the ROI, and select which data from that region is presented to the patient. This would, for example, allow paths and steps to be shown while surplus information, such as textured surfaces and surrounding objects, is removed from the field of view.
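A deliberately simple stand-in for this kind of ROI selection is sketched below. Choosing the brightest window is a hypothetical criterion used only for illustration; the actual context-based analysis in the article is more sophisticated.

```python
import numpy as np

def brightest_roi(image, roi_size):
    """Pick the region of interest as the roi_size x roi_size window
    with the highest mean brightness (an illustrative criterion, not
    the article's context-based method)."""
    best, best_pos = -1.0, (0, 0)
    rows, cols = image.shape
    for r in range(0, rows - roi_size + 1, roi_size):
        for c in range(0, cols - roi_size + 1, roi_size):
            m = image[r:r + roi_size, c:c + roi_size].mean()
            if m > best:
                best, best_pos = m, (r, c)
    r, c = best_pos
    return image[r:r + roi_size, c:c + roi_size]

img = np.zeros((64, 64))
img[16:32, 32:48] = 255          # a bright region standing in for a path
roi = brightest_roi(img, 16)
print(roi.mean())                # 255.0
```

Only the selected window would then go through the resolution and grey-level reduction stages, while the rest of the camera field of view is discarded.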
In our work, we have explored a wide range of issues associated with the reduction of a camera image to a stimulation pattern for an artificial retina. We have simulated a series of resolution-reduction methods and compared them, as well as considering implications of FoV and eye movement. We are continuing our research into approaches for expanding the functionality of the prosthesis through higher-level image processing.
Golshah Naghdy is an associate professor at the School of Electrical, Computer and Telecommunication Engineering, University of Wollongong. She received her BSc in Electrical Engineering and Electronic Engineering from Aryamehr University, Tehran, in 1977. She did her post-graduate studies in England, where she received an MPhil in Control Engineering and a PhD in Electrical and Electronic Engineering in 1982 and 1986, respectively.