Analyzing computational imaging systems

A novel framework that accounts for optical multiplexing, sensor noise characteristics, and signal priors can analyze any linear computational imaging camera.
19 November 2013
Oliver Cossairt, Kaushik Mitra and Ashok Veeraraghavan

Over the last decade, a number of computational imaging (CI) systems have been proposed for tasks such as motion deblurring, defocus deblurring, and multispectral imaging. In conventional imaging, a lens produces an image of a scene on a sensor. In CI, additional optics are used to produce a multiplexed measurement of the scene, increasing the amount of light reaching the sensor. However, the image recorded at the sensor is not directly perceptually meaningful, and the deleterious effects of multiplexing, such as blurred edges and loss of image detail, are removed via an appropriate reconstruction algorithm. Given the widespread appeal of CI techniques in low-light imaging applications such as photography with consumer camera phones, microscopy, and military night vision, a detailed performance analysis of the benefits conferred by this approach is important. A comparison between conventional and computational imaging is shown in Figure 1.


Figure 1. In conventional imaging, a lens is used to create an image of a scene on a sensor. The image requires no further processing before viewing. In computational imaging (CI), a lens is used in conjunction with optical coding methods, producing a multiplexed measurement of the scene. A perceptually meaningful image is recovered from captured data via digital decoding.

The question of exactly how much performance improves by multiplexing has received a fair amount of attention.1–9 It is well understood that multiplexing gives the greatest advantage at low light levels, where signal-independent read noise dominates, and that this advantage diminishes with increasing light, where signal-dependent photon noise dominates.1 Previous work has analyzed the properties of optical coding schemes but largely ignored the effects of signal priors,1–7 which are statistical models used in image restoration algorithms to recover images from noisy and blurry measurements with high fidelity. Signal priors are at the heart of every state-of-the-art reconstruction algorithm, including dictionary learning,10 block-matching and 3D filtering (BM3D),11 and Gaussian mixture model (GMM) algorithms,12,13 so it is unrealistic to study the effects of multiplexing in isolation. Signal priors can dramatically increase performance in deblurring and denoising problems, typically with greater improvement as noise increases and the light level decreases. Although both signal priors and multiplexing improve performance at low light levels, the former requires only algorithmic changes, whereas the latter often requires hardware modifications. It is therefore imperative to understand the improvement due to multiplexing above and beyond that due to signal priors. However, a comprehensive analysis of the effect of signal priors on CI systems has remained elusive because state-of-the-art priors often use signal models that are unfavorable to analysis.

We have developed an analysis framework that jointly takes into account multiplexing, noise, and signal priors.14 We use a GMM as the signal prior for two reasons. First, the GMM is universal: given an appropriate number of mixture components, it can approximate any probability density function.15,16 Second, the GMM allows us to derive simple expressions for the minimum mean square error (MMSE), which we use as the metric to characterize the performance of CI systems. Our analysis shows that signal priors and new CI optical designs both lead to performance improvements, and that multiplexing provides significant gains above and beyond those from signal priors.
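To make this concrete, the sketch below (a minimal Python illustration, not the authors' code) implements the closed-form MMSE estimator for a linear measurement y = Hx + n with Gaussian noise when x follows a GMM prior: each mixture component contributes a Wiener-style estimate, weighted by the posterior probability that the component generated the observation.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_mmse_estimate(y, H, sigma2, weights, means, covs):
    """MMSE estimate E[x | y] for y = H x + n, n ~ N(0, sigma2 * I), with a
    GMM prior on x. A minimal sketch for clarity (a practical version would
    work with log-domain weights to avoid underflow); not the authors' code."""
    m = len(y)
    post_w, comp_means = [], []
    for pi_k, mu_k, C_k in zip(weights, means, covs):
        S_k = H @ C_k @ H.T + sigma2 * np.eye(m)        # evidence covariance of component k
        G_k = C_k @ H.T @ np.linalg.inv(S_k)            # per-component Wiener gain
        comp_means.append(mu_k + G_k @ (y - H @ mu_k))  # posterior mean given component k
        post_w.append(pi_k * multivariate_normal.pdf(y, mean=H @ mu_k, cov=S_k))
    post_w = np.asarray(post_w) / np.sum(post_w)        # posterior component weights
    return sum(w * mk for w, mk in zip(post_w, comp_means))
```

The MMSE value itself can then be estimated by averaging the squared error of this estimator over draws from the prior, which is how the Monte Carlo sketch near the end of this article uses it.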

Most CI systems can be modeled as linear multiplexing systems because light is linear in its interaction with traditional optics. As a consequence, CI systems developed for a large range of problems can be uniformly described and analyzed using linear operators. While our framework can be used to analyze any linear CI system, we have focused on CI systems designed to handle image blur due to defocus or motion. The conventional method of eliminating blur is to reduce the aperture size and shutter exposure time, which reduces the amount of incident light and hence the signal-to-noise ratio (SNR) of captured images. Following the convention of Cossairt et al.,3 we refer to this method of blur removal as impulse imaging. Several CI techniques have been introduced that produce well-conditioned blur without sacrificing light. For defocus deblurring, extended depth-of-field (EDOF) systems have been proposed that encode defocus blur using attenuation masks,17–19 refractive masks,20 or motion.21,22 Motion deblurring cameras have been proposed that encode motion blur using either a fluttered shutter23 or camera motion.24,25
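As an illustration of this linear model, the following sketch builds 1D multiplexing matrices for impulse imaging and a flutter shutter camera. The 33-chip binary code and the circular boundary handling are placeholder assumptions for illustration, not the published design.

```python
import numpy as np
from scipy.linalg import circulant

n = 64  # length of a 1D scene patch (illustrative)

# Impulse imaging: a short exposure with no blur is the identity operator.
H_impulse = np.eye(n)

# Flutter shutter: the shutter opens and closes during a 33x longer exposure
# according to a binary code, so the measurement is the scene convolved with
# that code (modeled here as a circulant matrix, i.e., circular boundaries).
code = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1,
                 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1], dtype=float)
kernel = np.zeros(n)
kernel[:len(code)] = code
H_flutter = circulant(kernel)  # y = H_flutter @ x models the blurred capture
```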

Signal priors improve image quality for both motion and defocus deblurring cameras, as well as their impulse imaging counterparts. For impulse imaging, applying a signal prior is equivalent to simply denoising the image. The key question is how much improvement can be achieved using a CI technique above and beyond the use of signal priors. As an example, consider the simulations in Figure 2, which show captured and reconstructed images for flutter shutter, motion invariant, and impulse imaging using a shorter exposure time. The captured impulse image has a very meager SNR of −1.7dB. After deblurring with the GMM prior, both flutter shutter (bottom middle of Figure 2) and motion-invariant (bottom right) cameras produce significantly greater SNR. This would seem to indicate that CI improves performance more than impulse imaging, but the signal prior has not yet been applied to the impulse image, so this is not a fair comparison. After denoising the impulse image using the GMM prior (bottom left image), the motion-invariant camera still produces a 7.4dB SNR gain over impulse imaging, indicating that, for this experiment, CI can be used to improve performance relative to simply using a shorter exposure time.
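For reference, the SNR figures quoted here follow the usual energy-ratio definition, sketched below (the article's exact normalization may differ slightly):

```python
import numpy as np

def snr_db(x_true, x_est):
    """SNR in decibels: signal energy relative to residual error energy."""
    return 10.0 * np.log10(np.sum(x_true**2) / np.sum((x_true - x_est)**2))

# SNR gain of a CI camera over impulse imaging (both after GMM reconstruction):
# gain_db = snr_db(x, x_ci_deblurred) - snr_db(x, x_impulse_denoised)
```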


Figure 2. The first row shows the images captured by the impulse, flutter shutter,23 and motion-invariant24 imaging systems. The read noise is 4e and the photon noise is 3.2e. Note that the image captured by the impulse imaging system is noisy but free of motion blur thanks to the short exposure time. For the flutter shutter and motion-invariant systems, we set the exposure time to 33 times that of the impulse imaging system; hence, the captured images suffer from motion blur but are much less noisy. We then denoise the impulse image and deblur the flutter shutter and motion-invariant images using the Gaussian mixture model (GMM) prior. The results are shown in the second row. The motion-invariant camera produces a 7.4dB signal-to-noise ratio (SNR) gain over impulse imaging, indicating that, for this experiment, CI improves performance relative to simply using a shorter exposure time.

Our goal is to characterize the SNR gain produced by CI techniques. However, the SNR of all the cameras we considered depends on the scene illumination level. We considered two noise types: photon noise (signal dependent) and read noise (signal independent). Photon noise is typically measured in terms of the number of photo-generated electrons (e) freed during the photoelectric conversion process. Because photon arrivals follow Poisson statistics, the photon noise variance equals the mean signal level: for example, a signal level of J = 4e corresponds to an average of four photo-generated electrons and hence a photon noise variance of 4e. Likewise, it is typical to measure read noise in terms of the number of randomly generated free electrons that corrupt the photo-generated electrical signal. We calculated the average signal level (also equal to the photon noise variance) in photo-electrons J of the impulse camera using the expression:3

J ≈ 10^15 × q t δ² R I_src / (F/#)²,

where I_src is the illumination level (in lux), R is the average scene reflectivity, and the camera parameters are the aperture setting described by the F-number (F/#), the exposure time t (in seconds), the sensor quantum efficiency q, and the pixel size δ (in meters). In our experiments, we assumed R = 0.5, q = 0.5, F/# = 11, and t = 6ms. We used a pixel size of δ = 2.5μm, which is typical of a machine vision camera. We assumed a sensor read noise of 4e, which is typical of today's CMOS sensors. We modeled photon noise as a zero-mean Gaussian with variance equal to J.
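A minimal sketch of this signal and noise model follows. The helper names are ours; the 4×10^15 factor is the approximate number of photons per second per square meter carried by one lux of visible light, which yields the constant in the expression above.

```python
import numpy as np

def mean_photoelectrons(I_src, R=0.5, q=0.5, f_number=11.0, t=6e-3, delta=2.5e-6):
    """Average signal J in photo-electrons for the impulse camera:
    image-plane illuminance R * I_src / (4 (F/#)^2) lux, times roughly
    4e15 photons/s/m^2 per lux of visible light, integrated over pixel
    area delta^2 (m^2) and exposure t (s), at quantum efficiency q."""
    return 4e15 * q * t * delta**2 * R * I_src / (4.0 * f_number**2)

def capture(signal, read_std=4.0, rng=np.random.default_rng(0)):
    """Simulate a noisy capture under the article's Gaussian model:
    photon noise ~ N(0, signal) plus read noise ~ N(0, read_std^2)."""
    photon = rng.normal(0.0, np.sqrt(np.maximum(signal, 0.0)), np.shape(signal))
    read = rng.normal(0.0, read_std, np.shape(signal))
    return signal + photon + read
```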

For each illumination level, we used the photon noise level J to calculate the MMSE analytically from the multiplexing matrix of a given CI technique, with signal priors taken into account. We used GMM prior parameters learned from a large collection of about 50 million image patches.14 For defocus deblurring, we used 16×16-pixel image patches; for motion deblurring, the patches were 256×4 pixels. The prior parameters were the covariance, mean, and mixing weight of each of the approximately 1700 Gaussian components in our GMM. The SNR gain of the multiplexing system relative to the impulse imaging system is given by 10 log10(MSE_impulse / MSE_multiplexing), where MSE is the mean square error. Our results, shown in Figures 3 and 4, demonstrate that the maximum increase in SNR from multiplexing can be significantly greater than that from using signal priors alone. The maximum SNR gain was just over 7dB for motion deblurring (see Figure 3) and about 9dB for EDOF cameras (see Figure 4).
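A Monte Carlo version of this computation, reusing the sketches above, could look as follows. The gmm object is a hypothetical helper exposing sample() and params, and photon noise is approximated as signal independent at the impulse signal level; the article instead evaluates the MMSE analytically.14

```python
import numpy as np

def snr_gain_curve(H_ci, lux_levels, gmm, read_std=4.0, trials=200,
                   rng=np.random.default_rng(1)):
    """Estimate 10*log10(MSE_impulse / MSE_ci) at each illumination level by
    Monte Carlo, using gmm_mmse_estimate() and mean_photoelectrons() from the
    sketches above. `gmm` is a hypothetical helper: sample() draws a
    normalized patch from the prior, params is (weights, means, covs)."""
    n = H_ci.shape[1]
    H_impulse = np.eye(n)
    gains = []
    for lux in lux_levels:
        J = mean_photoelectrons(lux)   # impulse-camera signal level (e-)
        sigma2 = J + read_std**2       # photon + read noise variance (Gaussian model)
        mse = {}
        for name, H in (("impulse", H_impulse), ("ci", H_ci)):
            err = 0.0
            for _ in range(trials):
                x = gmm.sample()                                   # scene patch
                y = J * (H @ x) + rng.normal(0.0, np.sqrt(sigma2), n)
                xhat = gmm_mmse_estimate(y, J * H, sigma2, *gmm.params)
                err += np.mean((x - xhat) ** 2)
            mse[name] = err / trials
        gains.append(10.0 * np.log10(mse["impulse"] / mse["ci"]))
    return gains
```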


Figure 3. The analytic SNR gain (relative to impulse imaging) vs. illumination level (in lux) for the motion-invariant24 and flutter shutter23 cameras. The motion-invariant camera achieves a peak SNR gain of 7.3dB and an average SNR gain of about 4.5dB.

Figure 4. SNR gain (relative to impulse imaging) of various extended depth-of-field (EDOF) systems as a function of the illumination level (in lux). The EDOF systems considered are cubic phase wavefront coding,20 the focal sweep camera,22 and coded aperture designs.17,19 Wavefront coding achieves a peak SNR gain of 8.8dB and an average SNR gain of about 7dB.

While the results reported here are specific to EDOF and motion deblurring cameras, the framework can be applied to analyze any linear CI camera. We plan to use our framework to learn priors and analyze multiplexing performance for other types of datasets such as videos, hyperspectral volumes, light fields, and reflectance fields. Analyzing underdetermined linear CI systems with fewer measurements than unknowns (i.e., compressed sensing) is of particular interest. We would like to apply our framework to the problem of optimizing multiplexing matrices for compressed sensing systems.

Kaushik Mitra and Ashok Veeraraghavan acknowledge support through National Science Foundation Grants NSF-IIS:1116718 and NSF-CCF:1117939, and a Samsung Global Research Outreach grant.


Oliver Cossairt
Electrical Engineering and Computer Science Department
Northwestern University
Evanston, IL

Oliver Cossairt is an assistant professor with research interests in optics, computer vision, and computer graphics. He earned his PhD from Columbia University, where his research focused on computational imaging. He earned his MS from the Massachusetts Institute of Technology, where he focused on 3D displays.

Kaushik Mitra, Ashok Veeraraghavan
Electrical and Computer Engineering Department
Rice University
Houston, TX

Kaushik Mitra is currently a postdoctoral researcher with interests in computational imaging, computer vision, and statistical signal processing. He earned his PhD in electrical and computer engineering from the University of Maryland, College Park, where his research focus was the development of statistical models and optimization algorithms for computer vision problems.

Ashok Veeraraghavan is an assistant professor. He previously worked at Mitsubishi Electric Research Labs in Cambridge, MA. He received his master's and PhD degrees from the Department of Electrical and Computer Engineering at the University of Maryland, College Park, in 2004 and 2008, respectively. At Rice, he directs the Computational Imaging and Vision Lab. His research interests are broadly in the areas of computational imaging, computer vision, and robotics.


References:
1. M. Harwit, N. J. Sloane, Hadamard Transform Optics, Academic Press, New York, 1979.
2. O. Cossairt, Tradeoffs and Limits in Computational Imaging. PhD thesis, Columbia University Department of Computer Science, 2011.
3. O. Cossairt, M. Gupta, S. K. Nayar, When does computational imaging improve performance?, IEEE Trans. Image Process. 22(1–2), p. 447-458, 2013.
4. G. Wetzstein, I. Ihrke, W. Heidrich, On plenoptic multiplexing and reconstruction, Int'l J. Comp. Vis. 101(2), p. 384-400, 2013.
5. Y. Schechner, S. Nayar, P. Belhumeur, Multiplexing for optimal lighting, IEEE Trans. Pattern Anal. Mach. Intell. 29(8), p. 1339-1354, 2007.
6. A. Wuttig, Optimal transformations for optical multiplex measurements in the presence of photon noise, Appl. Opt. 44, p. 2710-2719, 2005.
7. N. Ratner, Y. Schechner, Illumination multiplexing within fundamental limits, Proc. IEEE Conf. Comp. Vis. Pattern Recog., 2007. doi:10.1109/CVPR.2007.383162
8. S. W. Hasinoff, K. N. Kutulakos, Light-efficient photography, Computer Vision — ECCV 2008, Part IV 5305, p. 45-59, 2008.
9. S. Hasinoff, K. Kutulakos, F. Durand, W. Freeman, Time-constrained photography, Proc. 12th IEEE Int'l Conf. Comp. Vis., p. 333-340, 2009.
10. M. Aharon, M. Elad, A. Bruckstein, K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation, IEEE Trans. Signal Process. 54(11), p. 4311-4322, 2006.
11. K. Dabov, A. Foi, V. Katkovnik, K. Egiazarian, Image denoising by sparse 3-D transform-domain collaborative filtering, IEEE Trans. Image Process. 16(8), p. 2080-2095, 2007. doi:10.1109/TIP.2007.901238
12. G. Yu, G. Sapiro, S. Mallat, Solving inverse problems with piecewise linear estimators: from Gaussian mixture models to structured sparsity, IEEE Trans. Image Process. 21(5), p. 2481-2499, 2012. doi:10.1109/TIP.2011.2176743
13. K. Mitra, A. Veeraraghavan, Light field denoising, light field superresolution and stereo camera based refocussing using a GMM light field patch prior, IEEE Conf. Comp. Vis. Patt. Recog. Workshops, p. 22-28, 2012. doi:10.1109/CVPRW.2012.6239346
14. K. Mitra, O. Cossairt, A. Veeraraghavan, A framework for analysis of computational imaging systems with practical applications. arXiv:1308.1981v2 [cs.CV], 23 Oct 2013.
15. H. W. Sorenson, D. L. Alspach, Recursive Bayesian estimation using Gaussian sums, Automatica 7, p. 465-479, 1971.
16. K. N. Plataniotis, D. Hatzinakos, Gaussian mixtures and their applications to signal processing, Advanced Signal Processing Handbook: Theory and Implementation for Radar, Sonar, and Medical Imaging Real Time Systems, ch.3, CRC Press, 2000.
17. A. Levin, R. Fergus, F. Durand, W. Freeman, Image and depth from a conventional camera with a coded aperture, ACM Trans. Graphics—Proc. ACM Siggraph 2007 26(3), p. 70, 2007. doi:10.1145/1276377.1276464
18. A. Veeraraghavan, R. Raskar, A. Agrawal, A. Mohan, J. Tumblin, Dappled photography: Mask enhanced cameras for heterodyned light fields and coded aperture refocusing, ACM Trans. Graphics—Proc. ACM Siggraph 2007 26(3), p. 69, 2007. doi:10.1145/1276377.1276463
19. C. Zhou, S. Nayar, What are good apertures for defocus deblurring?, IEEE Int'l Conf. Comput. Photo., 2009. doi:10.1109/ICCPHOT.2009.5559018
20. E. R. Dowski Jr., W. T. Cathey, Extended depth of field through wave-front coding, Appl. Opt. 34(11), p. 1859-1866, 1995.
21. G. Häusler, A method to increase the depth of focus by two step image processing, Opt. Commun. 6, p. 38-42, 1972.
22. S. Kuthirummal, H. Nagahara, C. Zhou, S. K. Nayar, Flexible depth of field photography, IEEE Trans. Pattern Anal. Mach. Intell., 2010.
23. R. Raskar, A. Agrawal, J. Tumblin, Coded exposure photography: motion deblurring using fluttered shutter, ACM Trans. Graphics—Proc. ACM Siggraph 2006 25(3), p. 795-804, 2006. doi:10.1145/1141911.1141957
24. A. Levin, P. Sand, T. Cho, F. Durand, W. Freeman, Motion-invariant photography, ACM Trans. Graphics—Proc. ACM Siggraph 2008 27(3), p. 71, 2008. doi:10.1145/1360612.1360670
25. T. Cho, A. Levin, F. Durand, W. Freeman, Motion blur removal with orthogonal parabolic exposures, IEEE Int'l Conf. Comput. Photo., 2010. doi:10.1109/ICCPHOT.2010.5585100