Systems with 3D imaging technologies have entered the consumer market, providing effects not previously available. They use many different techniques, such as depth recovery from defocused images [1, 2], stereoscopic cameras [3], and structured light [4]. Stereoscopic imagers are generally not optimal for consumer applications because of their cost and size, and depth recovery from defocus is difficult to implement. Other forms of 3D imaging, such as structured light, require angle and projector diversity that also complicates consumer systems.
We have developed a technique that uses an astigmatic projector and a simple commercial camera to retrieve 3D images. The advantages of such a system are that it is inexpensive, that the projector and camera are aligned to the same axis, and that it maintains the full resolution one would expect from a commercial camera. The push for 3D imaging has continued throughout the last decade. We believe two key aspects are necessary for a commercial 3D system. First, the system should not be more complicated to use than a traditional camera. Second, the system must maintain the ability to take full-resolution 2D images.
Our system encodes depth in the projected pattern itself. We used a projector and a cylinder lens to create a region of differential focus: because the pattern is astigmatic, its vertical lines focus closer to our camera than its horizontal ones. This differential focus region allows us to take contrast ratios at different image points and to use calibration to convert them to distances from the camera [5]. Because the method relies only on intensity ratios, it is fairly insensitive to illumination level offsets.
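The conversion from contrast ratio to distance can be pictured as a lookup against a calibration curve. The Python sketch below assumes a small calibration table and linear interpolation between its entries. The table values, the function name `ratio_to_distance`, and the interpolation scheme are all hypothetical choices made for illustration, not details of the system described here.

```python
from bisect import bisect_left

# Hypothetical calibration table: (contrast ratio, distance in feet).
# These numbers are invented for illustration; a real table would come
# from imaging a flat target at known distances.
CALIBRATION = [(0.25, 2.0), (0.5, 2.5), (1.0, 3.0), (2.0, 3.5), (4.0, 4.0)]


def ratio_to_distance(ratio, table=CALIBRATION):
    """Linearly interpolate a distance from a measured contrast ratio.

    Assumes the table is sorted by ratio and that the ratio varies
    monotonically with distance across the differential focus region.
    Ratios outside the table are clamped to the nearest entry.
    """
    ratios = [r for r, _ in table]
    if ratio <= ratios[0]:
        return table[0][1]
    if ratio >= ratios[-1]:
        return table[-1][1]
    i = bisect_left(ratios, ratio)
    (r0, d0), (r1, d1) = table[i - 1], table[i]
    return d0 + (d1 - d0) * (ratio - r0) / (r1 - r0)
```

Because only the ratio enters the lookup, scaling both line contrasts by the same factor (for example, a brighter projector) leaves the recovered distance unchanged.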
Figure 1. (a) The projected pattern after subtracting the background image. Note that the horizontal lines are much brighter than the vertical ones. (b) The vertical component of the fifth-level wavelet transform. (c) The horizontal component of the fifth-level wavelet transform.
To gather 3D image data, we took two images with the camera: one with the projected pattern and one without it. These images were then subtracted, and the remaining pattern was post-processed by taking the fifth-level wavelet transform using a Haar wavelet. We chose a Haar wavelet due to similarities with the structure of our projected pattern (see Figure 1).
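As a rough illustration of this step, the following Python sketch subtracts the background frame and computes the horizontal and vertical detail images of a multi-level 2D Haar transform. It works on plain nested lists to stay dependency-free, and it makes several assumptions: the function names are hypothetical, the coefficients use an orthonormal scaling, and the image must be at least 2^level pixels on each side. It is not the authors' implementation.

```python
def subtract_background(with_pattern, without_pattern):
    """Pixel-wise difference of two equally sized grayscale images."""
    return [[p - b for p, b in zip(row_p, row_b)]
            for row_p, row_b in zip(with_pattern, without_pattern)]


def haar_step(image):
    """One level of a 2D Haar transform on non-overlapping 2x2 blocks.

    Returns (approximation, horizontal detail, vertical detail).
    The horizontal detail responds to horizontal lines (row differences)
    and the vertical detail to vertical lines (column differences).
    """
    approx, horizontal, vertical = [], [], []
    for r in range(0, len(image) - 1, 2):
        top, bottom = image[r], image[r + 1]
        a_row, h_row, v_row = [], [], []
        for c in range(0, len(top) - 1, 2):
            a, b = top[c], top[c + 1]
            d, e = bottom[c], bottom[c + 1]
            a_row.append((a + b + d + e) / 2)
            h_row.append((a + b - d - e) / 2)
            v_row.append((a - b + d - e) / 2)
        approx.append(a_row)
        horizontal.append(h_row)
        vertical.append(v_row)
    return approx, horizontal, vertical


def haar_details(image, level=5):
    """Horizontal and vertical detail images at the requested level."""
    if level < 1:
        raise ValueError("level must be at least 1")
    if len(image) < 2 ** level or len(image[0]) < 2 ** level:
        raise ValueError("image is too small for the requested level")
    approx = image
    for _ in range(level):
        approx, horizontal, vertical = haar_step(approx)
    return horizontal, vertical
```

At level 5 each output coefficient summarizes a 32 × 32 pixel block, so the two detail images are much smaller than the captured frame.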
Figure 2. (a) The original 2D image captured by our camera. (b) The measured depth mask. Darker blue represents distances close to the camera, while red represents distances farther away. This technique also captures cylindrical objects. (c) A zoomed view of the projected pattern incident upon nearer targets. (d) A zoomed view of the pattern incident upon farther targets, showing that the contrast of the vertical and horizontal lines flips.
After taking the wavelet transform, we retrieved two images, one vertical and one horizontal. Since the projected pattern is astigmatic, the wavelet transform values in both directions can be used to retrieve a unique ratio across the differential focus region that relates directly to a target's distance to the camera. The two wavelet transform images are then thresholded (a segmentation method) and divided, creating a depth map. This is the final system output that essentially provides the user with all the 3D scene data.
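The threshold-and-divide step might look like the Python sketch below. The source does not specify the thresholding rule or which image is the numerator, so two assumptions are made here: a coefficient is kept only when both the vertical and horizontal responses reach the threshold, and the ratio is taken as vertical over horizontal. The function name `ratio_map` and the use of `None` to mark rejected coefficients are likewise hypothetical.

```python
def ratio_map(vertical, horizontal, threshold):
    """Per-coefficient ratio |vertical| / |horizontal|.

    Coefficients where either response is below the threshold are
    marked None, on the assumption that the projected pattern is too
    weak there to give a reliable ratio.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    result = []
    for v_row, h_row in zip(vertical, horizontal):
        row = []
        for v, h in zip(v_row, h_row):
            v, h = abs(v), abs(h)
            if v >= threshold and h >= threshold:
                row.append(v / h)
            else:
                row.append(None)
        result.append(row)
    return result
```

Each surviving ratio would then be converted to a distance through the calibration, giving the depth map.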
Our results on a random collection of objects have been very promising (see Figure 2). The objects vary in size, flatness, texture, and reflectivity. After capturing and processing an image, the depth information produced correctly shows surfaces that are tilted, cylindrical, and of different reflectivity. The contrast of the vertical and horizontal lines changes according to how far the object is from the camera, giving us a fundamental depth metric. After rough calibrations, we can resolve targets separated by less than 1 inch in depth at approximately 3 feet from the camera.
In summary, we have shown a new and inexpensive technique for gathering 3D image data from a traditional 2D camera and an astigmatic projector. The system integrates projected patterns and software post-processing such that the wavelet transform of the images can produce a 3D scene data map. The 3D depth mask produced by this system allows for many unique post-processing options, such as selecting planes of focus, changing depth of focus, and other effects that are not available to consumers today and that enable greater creative freedom. In the future, we plan to integrate the camera and projector system more closely as well as to use a projected pattern in the near IR.
The authors are grateful to the College of Optical Sciences and their colleagues for their support. This work is patent pending.
Gabriel C. Birch, J. Scott Tyo, Jim Schwiegerling
College of Optical Sciences
University of Arizona
1. A. N. Rajagopalan, S. Chaudhuri, A variational approach to recovering depth from defocused images, IEEE Trans. Pattern Anal. Machine Intell. 19, no. 10, pp. 1158-1164, 1997. doi:10.1109/34.625126
2. V. Aslantas, D. T. Pham, Depth from automatic defocusing, Opt. Express 15, no. 3, pp. 1011-1023, 2007. doi:10.1364/OE.15.001011
3. K. Atanassov, V. Ramachandra, S. R. Goma, M. Aleksic, 3D image processing architecture for camera phones, Proc. SPIE 7864, pp. 786414, 2011. doi:10.1117/12.872617
4. J. Geng, Structured-light 3D surface imaging: a tutorial, Adv. Opt. Photon. 3, no. 2, pp. 128-160, 2011. doi:10.1364/AOP.3.000128
5. G. Birch, J. S. Tyo, J. Schwiegerling, 3D image capture through the use of an astigmatic projected pattern, Proc. SPIE 8129, 2011. Paper accepted at SPIE Opt. Photon. in San Diego, CA, 21–25 August 2011.