High-resolution 3D light-field display
Well-known light-field (LF) optics are currently exploited in many so-called LF (or plenoptic) cameras, i.e., a special kind of camera with which the focal conditions of a picture can be altered after it has been captured. The 2D images are then generated by sectioning the acquired data at a specific selected point [1]. Given this capability, it should also be possible, in theory, to reconstruct the original 3D scene from the 2D data. Such 3D reconstructions, however, have not previously been realized. At present, the most common methods for 3D display are stereoscopic or multiview displays. These devices do not reconstruct 3D optical images; rather, they make the user feel ‘solidity’ through visual congestion. In addition, it has been reported that viewing these types of displays often causes users to experience ‘3D sickness’ because of the visual contradictions.
LF cameras can record 3D scenes realistically because they record the state of the light rays (each ray being expressed as a combination of its position and direction) rather than actual images [2]. The rays are thus described in four dimensions (in contrast to an ordinary 2D image). At first glance, LF data seem no different from an ordinary 2D image. When the LF data are enlarged, however, small circular cells can be seen tightly packed all over the image (see Figure 1). Each cell is an image of the camera-lens aperture formed by one microlens. The coordinates of each cell indicate the position of the ray, whereas the position within each cell gives the direction of the ray (because each cell is optically conjugated with the aperture).
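The position/direction indexing described above can be sketched in a few lines of NumPy. This is only an illustration under simplifying assumptions that are not in the original work: the cells are taken to be square, `cell` pixels on a side, and aligned with the pixel grid (real cells are circular and may be packed differently), and the names `raw_to_4d` and `sub_aperture_view` are hypothetical.

```python
import numpy as np

def raw_to_4d(raw: np.ndarray, cell: int) -> np.ndarray:
    """Split a raw lenslet image into a 4D array lf[s, t, u, v].

    (s, t) is the index of a cell, i.e. the ray position;
    (u, v) is the pixel inside that cell, i.e. the ray direction.
    Assumption: square cells of `cell` x `cell` pixels on an aligned grid.
    """
    rows, cols = raw.shape
    if rows % cell or cols % cell:
        raise ValueError("image size must be a multiple of the cell size")
    s, t = rows // cell, cols // cell
    # (rows, cols) -> (s, u, t, v) -> (s, t, u, v)
    return raw.reshape(s, cell, t, cell).transpose(0, 2, 1, 3)

def sub_aperture_view(lf: np.ndarray, u: int, v: int) -> np.ndarray:
    """Collect the same in-cell pixel from every cell: one ray direction."""
    return lf[:, :, u, v]

# Toy data: a 6 x 8 "sensor" with 2 x 2 pixel cells.
raw = np.arange(6 * 8).reshape(6, 8)
lf = raw_to_4d(raw, cell=2)
print(lf.shape)
print(sub_aperture_view(lf, 0, 0))
```

Fixing `(u, v)` and varying `(s, t)` gives an ordinary 2D view of the scene seen from one direction, which is one way to see why the 2D sensor image carries four-dimensional information.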
The schematic ray-tracing diagrams in Figure 2 illustrate how an LF camera records a point source in 3D space on a 2D plane (i.e., a detector). Rays from the point source are split by the microlens array, and a discrete light distribution is acquired on the detector. In the two illustrated examples, the incident light is divided into five and three parts (see Figure 2(a) and (b), respectively) according to the distance between the detector and the lens array. Examples of the 2D distributions that are thus produced on the detector, known as ‘patterns,’ are shown in Figure 3. Each pattern is therefore a 2D representation of the 3D position of a point source. The total area of a pattern is independent of its shape and equals the area of a single lenslet in the array. Furthermore, the ability to record depth information depends on the area of the pattern. LF data can thus be thought of as a geometric hologram, although its capability is limited by the size of the individual lenslets. The transformation that creates the patterns is reversible, and a point source can therefore be reconstructed from a single pattern. Moreover, because a 3D optical image is a collection of point sources, a whole 3D image can be transformed into 2D LF data by superimposing countless patterns.
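The statement that a pattern always occupies the area of one lenslet, however many lenslets it is split across, can be checked with a small 1D paraxial model. This is a sketch under assumptions that are not taken from the original work: the point is a focused cone of rays with slopes in [-a, a], where a = 1/(2 × f-number); the lenslet focal length is set to f-number × pitch so that a full cone images onto exactly one cell; and the number of parts is varied through the distance `z` between the focused point and the array, which is not necessarily the configuration drawn in Figure 2. The function name and all numerical values are hypothetical.

```python
import math

def pattern_1d(x0: float, z: float, pitch: float, f_number: float) -> dict:
    """Return {lenslet index: length of its detector segment} for one point.

    x0: lateral position of the focused point.
    z:  non-zero distance between the focused point and the lens array.
    A ray with slope u reaches the array at x = x0 + z * u; the lenslet it
    hits maps the slope u to a detector offset of magnitude f * |u|.
    """
    if z == 0:
        raise ValueError("z must be non-zero in this model")
    a = 1.0 / (2.0 * f_number)   # half-angle of the cone (as a slope)
    f = f_number * pitch         # assumed lenslet focal length
    half_width = a * abs(z)      # half-width of the cone at the array
    k_min = math.floor((x0 - half_width) / pitch) - 1
    k_max = math.ceil((x0 + half_width) / pitch) + 1
    segments = {}
    for k in range(k_min, k_max + 1):
        # Slopes whose rays land on lenslet k, whose edges are (k +/- 0.5) * pitch.
        u1 = ((k - 0.5) * pitch - x0) / z
        u2 = ((k + 0.5) * pitch - x0) / z
        lo = max(min(u1, u2), -a)
        hi = min(max(u1, u2), a)
        if hi - lo > 1e-9:       # ignore lenslets touched only at an edge
            segments[k] = f * (hi - lo)
    return segments

far = pattern_1d(x0=0.0, z=2.0, pitch=0.1, f_number=4.0)
near = pattern_1d(x0=0.0, z=1.2, pitch=0.1, f_number=4.0)
print(len(far), sum(far.values()))
print(len(near), sum(near.values()))
```

With these illustrative values the cone is split across five lenslets in the first case and three in the second, while the summed segment length stays equal to one lenslet pitch (up to rounding), the 1D counterpart of the constant pattern area.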
In our work [3], we have exploited this LF technique to develop an LF display that reconstructs real 3D images without producing visual contradictions (and without inducing 3D sickness). Our LF display is the reverse of an LF camera, and can be thought of as a decoding device that transforms 2D LF data back into real 3D images. The structure of our LF display is quite simple: it consists of a lens-array plate and a flat display. The lens-array plate can simply be set directly on the flat display, and we find that no display adjustments are necessary. An off-axis lens array can be regarded simply as giving a tilted optical axis (along which the image appears) in our LF display. When the display's screen shows LF data, a 3D volume image therefore appears near the lens array (because of the reverse transformation process).
To project 3D images from LF data with our display, we require a simple data processing method. Because the display is viewed from the side opposite to that on which the detector sits in an LF camera, the perspective of the reconstructed images is reversed. To correct this reversed perspective, we use our data processing method to flip the LF image from every microlens. Through this procedure (which we perform in the Fourier domain, rather than the real domain), we map each point source to the other side of the lens array and generate an image that is symmetrical to the original. We also normalize the perspective of each image [4].
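The per-microlens flip can be illustrated with a short NumPy sketch. Note that this version works directly on pixels in the real domain, whereas the procedure described above is performed in the Fourier domain, so it should be read as an illustration of the idea rather than as the method itself. It also assumes square, grid-aligned cells of `cell` × `cell` pixels and omits the perspective normalization; the function name `flip_cells` is hypothetical.

```python
import numpy as np

def flip_cells(raw: np.ndarray, cell: int) -> np.ndarray:
    """Rotate every cell of a lenslet image by 180 degrees about its centre.

    The arrangement of the cells (ray positions) is left untouched; only the
    pixels inside each cell (ray directions) are mirrored in both axes.
    Assumption: square cells of `cell` x `cell` pixels on an aligned grid.
    """
    rows, cols = raw.shape
    if rows % cell or cols % cell:
        raise ValueError("image size must be a multiple of the cell size")
    # View the image as (cell row, pixel row in cell, cell column, pixel column in cell).
    cells = raw.reshape(rows // cell, cell, cols // cell, cell)
    # Reverse only the two in-cell axes, then restore the 2D layout.
    return cells[:, ::-1, :, ::-1].reshape(rows, cols)

# Toy data: a 4 x 4 "sensor" made of four 2 x 2 pixel cells.
raw = np.arange(16).reshape(4, 4)
flipped = flip_cells(raw, cell=2)
print(flipped)
```

Applying the flip twice returns the original data, which is a quick sanity check that only the direction of each ray has been mirrored and no cell has moved.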
To obtain a display resolution that is even finer than the microlens density, we can shift the focal point of the microlens array by a specific distance. The amount of blur is thus equal to the resolution of a single lenslet. We show our LF display in use with a common cellular phone in Figure 4 [5]. In this example, a cat can be seen as the 3D volume image. We used about 120 lenslets (in the horizontal direction) for this display and found that the resolution of the reconstructed image was higher than the original data's resolution.
In summary, we have developed a new display system for 3D reconstructions from 2D LF data, and we have used a real 3D display example to illustrate the success of our technique. In our 3D decoding approach we do not assign any plane to the 3D images. The display and image are therefore not necessarily within the same optical plane (i.e., the display plane does not need to be optically conjugated with the image plane). In future work we plan to improve the design of our LF displays [6]. We are also planning to develop a tablet-type display with which images on other planes can be shown, and we hope to realize thin, head-mounted, flat displays that can be set directly in front of the eyes [7].
Toru Iwane originally studied astrophysics at Kyoto University, Japan. He then joined Nikon and changed focus, to study optics and photonics. As part of this work, he has been investigating and developing a focus-sensing system for cameras for several years, and is now investigating light-field optics and its applications.