Analyzing computational imaging systems
Over the last decade, a number of computational imaging (CI) systems have been proposed for tasks such as motion deblurring, defocus deblurring, and multispectral imaging. In conventional imaging, a lens produces an image of a scene on a sensor. In CI, additional optics are used to produce a multiplexed measurement of the scene, increasing the amount of light reaching the sensor. However, the image recorded at the sensor is not directly perceptually meaningful, and the deleterious effects of multiplexing, such as blurred edges and loss of image detail, must be removed by an appropriate reconstruction algorithm. Given the widespread appeal of CI techniques in low-light imaging applications such as photography with consumer camera phones, microscopy, and military night vision, a detailed performance analysis of the benefits conferred by this approach is important. A comparison between conventional and computational imaging is shown in Figure 1.
The question of exactly how much multiplexing improves performance has received a fair amount of attention.1–9 It is well understood that multiplexing gives the greatest advantage at low light levels, where signal-independent read noise dominates, but that this advantage diminishes with increasing light, where signal-dependent photon noise dominates.1 Previous work has analyzed the properties of the optical coding schemes but largely ignored the effects of signal priors,1–7 which are statistical models used in image restoration algorithms to recover images from noisy and blurry measurements with high fidelity. Signal priors are at the heart of every state-of-the-art reconstruction algorithm, including dictionary learning,10 block-matching and 3D filtering (BM3D),11 and Gaussian mixture model (GMM) algorithms,12,13 and so it is impractical to study the effects of multiplexing alone. Signal priors can dramatically increase performance in problems of deblurring and denoising, typically with greater improvement as noise increases and the light level decreases. Both signal priors and multiplexing increase performance at low light levels, but the former requires only algorithmic changes, whereas the latter often requires hardware modifications. Thus, it is imperative to understand the improvement due to multiplexing above and beyond that due to signal priors. However, a comprehensive analysis of the effect of signal priors on CI systems has remained elusive because state-of-the-art priors often use signal models unfavorable to analysis.
We have developed an analysis framework that jointly takes into account multiplexing, noise, and signal priors.14 We use a GMM as a signal prior for two reasons. First, the GMM is universal in that it can be used to approximate any probability density function, given an appropriate number of mixture components.15,16 Second, the GMM allows us to derive simple expressions for the minimum mean square error (MMSE), which we use as a metric to characterize the CI systems' performance. Our analysis shows that signal priors and new CI optical designs both lead to performance improvements, and multiplexing provides significant performance gains above and beyond those from signal priors.
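To illustrate why a GMM prior is convenient for analysis, the following is a sketch of the standard linear-Gaussian estimation results it builds on. The notation (H for the multiplexing matrix, C_n for the noise covariance, and α_k, μ_k, C_k for the mixing weight, mean, and covariance of component k) is assumed here for illustration and is not taken from the article; the final line is only a lower bound obtained by supposing the active mixture component is known, not the article's own MMSE expression.

```latex
% Assumed measurement model and GMM prior (notation is illustrative)
y = Hx + n, \qquad n \sim \mathcal{N}(0, C_n), \qquad
x \sim \sum_{k} \alpha_k \, \mathcal{N}(\mu_k, C_k)

% Per-component (Wiener) estimate and posterior covariance
\hat{x}_k(y) = \mu_k + C_k H^{T} \left( H C_k H^{T} + C_n \right)^{-1} (y - H\mu_k)
C_{k|y} = C_k - C_k H^{T} \left( H C_k H^{T} + C_n \right)^{-1} H C_k

% MMSE estimate: posterior-weighted mixture of the per-component estimates
\hat{x}(y) = \sum_{k} \tilde{\alpha}_k(y) \, \hat{x}_k(y), \qquad
\tilde{\alpha}_k(y) \propto \alpha_k \, \mathcal{N}\!\left( y;\, H\mu_k,\, H C_k H^{T} + C_n \right)

% Lower bound: error if the active component were known in advance
\mathrm{MMSE} \;\ge\; \sum_{k} \alpha_k \, \mathrm{tr}\!\left( C_{k|y} \right)
```

Each C_{k|y} depends only on H, C_n, and the prior parameters, not on the measured data, which is what makes analytic performance characterization tractable.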
Most CI systems can be modeled as a linear multiplexing system due to the linearity of light and its interaction with traditional optics. As a consequence, CI systems developed for a large range of problems can be uniformly described and analyzed using linear operators. While our framework can be used to analyze all linear CI systems, we have focused on CI systems designed to handle image blur due to defocus or motion. The conventional method of eliminating blur is to reduce aperture size and shutter exposure time, which lowers the amount of incident light and hence also the signal-to-noise ratio (SNR) of captured images. Following the convention developed by Cossairt et al.,3 we refer to this method of blur removal as impulse imaging. Several CI techniques have been introduced that produce well-conditioned blur without sacrificing light. For defocus deblurring, extended depth-of-field (EDOF) systems have been proposed that encode defocus blur using attenuation masks,17–19 refractive masks,20 or motion.21,22 Motion deblurring cameras have been proposed that encode motion blur using either a fluttered shutter23 or camera motion.24,25
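As a small illustration of what "well-conditioned blur" means, the sketch below compares the frequency response of a conventional open-shutter (box) motion blur with that of a coded exposure, treating the blur as circular so that the multiplexing operator is diagonal in the Fourier basis. The four-tap code, the signal length of 8, and the function name are hypothetical choices made for this example; they are not the codes used in the cited cameras.

```python
import cmath


def dft_magnitudes(kernel, n):
    """Return |DFT| of `kernel` zero-padded to length n.

    For a circular blur these magnitudes are the gains that the
    multiplexing operator applies to each spatial frequency.
    """
    padded = list(kernel) + [0.0] * (n - len(kernel))
    magnitudes = []
    for k in range(n):
        total = sum(v * cmath.exp(-2j * cmath.pi * k * m / n)
                    for m, v in enumerate(padded))
        magnitudes.append(abs(total))
    return magnitudes


box = [1, 1, 1, 1]    # shutter open for the whole exposure
coded = [1, 0, 1, 1]  # hypothetical fluttered-shutter code (illustrative)

box_gains = dft_magnitudes(box, 8)
coded_gains = dft_magnitudes(coded, 8)

# A frequency with zero gain is lost by the blur and cannot be restored
# by inversion alone; the smallest gain is a simple conditioning measure.
print("smallest box gain:  ", min(box_gains))
print("smallest coded gain:", min(coded_gains))
```

In this toy case the box blur nulls some frequencies entirely, while the coded exposure keeps every gain away from zero at the cost of blocking a quarter of the light.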
Signal priors improve image quality for both motion and defocus deblurring cameras, as well as their impulse imaging counterparts. For impulse imaging, application of a signal prior is equivalent to simply denoising the image. The key question is how much improvement can be achieved using a CI technique above and beyond the use of signal priors. As an example, consider the simulations in Figure 2, which show captured and reconstructed images for flutter-shutter, motion-invariant, and impulse imaging (the last using a shorter exposure time). The captured impulse image has a very meager SNR of −1.7 dB. After deblurring with the GMM prior, both the flutter-shutter (bottom middle of Figure 2) and motion-invariant (bottom right) cameras produce significantly greater SNR. This would seem to indicate that CI improves performance more than impulse imaging, but the signal prior has not been applied to the impulse image and so this is not a fair comparison. After denoising the impulse image using the GMM prior (bottom left image), the motion-invariant camera produces a 7.4 dB SNR gain over impulse imaging, indicating that, for this experiment, CI can be used to improve performance relative to simply using a shorter exposure time.
Our goal is to characterize the SNR gain produced by CI techniques. However, the SNR of all the cameras we considered depends on the scene illumination level. We considered two noise types: photon noise (signal dependent) and read noise (signal independent). Photon noise is typically measured in terms of the number of photogenerated electrons (e−) freed during the photoelectric conversion process. For example, a photon noise of J = 4e− corresponds to four electrons generated from incident photons. Likewise, it is typical to measure the read noise in terms of the number of randomly generated free electrons that corrupt the photogenerated electrical signal. We calculated the average signal level J of the impulse camera in photoelectrons (also equal to the photon noise variance) using the expression given by Cossairt et al.,3 in which Isrc is the illumination level (in lux), R is the average scene reflectivity, and the camera parameters are the aperture setting described by the F-number (F/#), exposure time (t), sensor quantum efficiency (q), and pixel size (δ). In our experiments, we assumed R = 0.5, q = 0.5, F/# = 11, and t = 6 ms. We used a pixel size of δ = 2.5 μm, which is typical of a machine vision camera. We assumed a sensor read noise of 4e−, which is typical for today's CMOS sensors. We modeled photon noise as a zero-mean Gaussian with variance equal to J.
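The signal-level calculation can be sketched numerically as follows. The snippet assumes the expression has the form J = 10^15 (F/#)^-2 t Isrc R q δ^2 with δ in meters. The constant 10^15, the function names, and the helper that adds the squared read noise to J to obtain a total noise variance are assumptions made for illustration, and should be checked against Cossairt et al.3 before being relied on.

```python
def photoelectrons(i_src_lux, reflectivity=0.5, quantum_eff=0.5,
                   f_number=11.0, exposure_s=6e-3, pixel_m=2.5e-6,
                   scale=1e15):
    """Average signal level J (in e-) for the impulse camera.

    Defaults are the experimental settings quoted in the text.
    `scale` is an ASSUMED unit-conversion constant, not a value
    stated in the article.
    """
    return (scale * f_number ** -2 * exposure_s * i_src_lux
            * reflectivity * quantum_eff * pixel_m ** 2)


def noise_variance(j, read_noise_e=4.0):
    """Total noise variance in e-^2 (assumed additive model).

    Photon noise is approximated as Gaussian with variance J, and the
    read noise figure is treated as a standard deviation in electrons.
    """
    return j + read_noise_e ** 2


# Signal level and noise variance over a range of illumination levels.
for lux in (1.0, 10.0, 100.0, 1000.0):
    j = photoelectrons(lux)
    print(lux, j, noise_variance(j))
```

Under the assumed constant, J scales linearly with illumination at roughly 0.08 e− per lux for these settings, so read noise dominates until the illumination reaches a few hundred lux.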
For different illumination levels, we used the photon noise J to calculate the MMSE analytically from the multiplexing matrix for a given CI technique with signal priors taken into account. We used a set of GMM prior parameters learned from a large collection of about 50 million image patches.14 For defocus deblurring, we used 16×16 pixel image patches. For motion deblurring, the patches were 256×4 pixels. The prior parameters were the covariance, mean, and mixing weight for each of the approximately 1700 Gaussian mixture components used for our GMM. The SNR gain of the multiplexing system relative to the impulse imaging system was given by 10 log10(MSE_impulse / MSE_multiplexing), where MSE is the mean square error. Our results are shown in Figures 3 and 4, which demonstrate that the maximum increase in SNR from multiplexing can be significantly greater than that from using signal priors. The maximum SNR gain for motion deblurring was 7 dB (see Figure 3) and for EDOF cameras about 9 dB (see Figure 4).
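The SNR-gain metric can be illustrated with a deliberately simplified sketch. Instead of the full GMM prior, it assumes a single zero-mean Gaussian prior that is diagonal in the same basis as the multiplexing operator (for example, the Fourier basis for a circular blur), so that the MMSE separates into independent scalar terms. The function names and this diagonal simplification are assumptions for illustration and are not the article's analysis.

```python
import math


def gaussian_mmse(gains, prior_vars, noise_var):
    """Average per-coefficient MMSE for y_k = h_k * x_k + n_k.

    x_k ~ N(0, prior_vars[k]) and n_k ~ N(0, noise_var). Each term is
    the scalar Wiener error s * v / (|h|^2 * s + v).
    """
    terms = [s * noise_var / (abs(h) ** 2 * s + noise_var)
             for h, s in zip(gains, prior_vars)]
    return sum(terms) / len(terms)


def snr_gain_db(mse_impulse, mse_multiplexing):
    """SNR gain of multiplexing over impulse imaging, in dB."""
    return 10.0 * math.log10(mse_impulse / mse_multiplexing)


# Hypothetical numbers: impulse imaging has unit gain at every
# frequency; the multiplexed camera collects more light (larger gains)
# but also more photon noise (larger noise variance).
prior = [1.0, 1.0, 1.0, 1.0]
mse_impulse = gaussian_mmse([1.0, 1.0, 1.0, 1.0], prior, noise_var=17.0)
mse_mux = gaussian_mmse([8.0, 3.0, 2.0, 1.5], prior, noise_var=24.0)
print(snr_gain_db(mse_impulse, mse_mux))
```

A positive result means the multiplexed measurement reconstructs with lower error than the denoised impulse image under the same prior, which is the comparison the SNR gain is meant to capture.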
While the results reported here are specific to EDOF and motion deblurring cameras, the framework can be applied to analyze any linear CI camera. We plan to use our framework to learn priors and analyze multiplexing performance for other types of datasets such as videos, hyperspectral volumes, light fields, and reflectance fields. Analyzing underdetermined linear CI systems with fewer measurements than unknowns (i.e., compressed sensing) is of particular interest. We would like to apply our framework to the problem of optimizing multiplexing matrices for compressed sensing systems.
Kaushik Mitra and Ashok Veeraraghavan acknowledge support through National Science Foundation Grants NSF-IIS:1116718 and NSF-CCF:1117939, and a Samsung Global Research Outreach grant.
Oliver Cossairt is an assistant professor with research interests in optics, computer vision, and computer graphics. He earned his PhD from Columbia University, where his research focused on computational imaging. He earned his MS from the Massachusetts Institute of Technology, where he focused on 3D displays.
Kaushik Mitra is currently a postdoctoral researcher with interests in computational imaging, computer vision, and statistical signal processing. He earned his PhD in electrical and computer engineering from the University of Maryland, College Park, where his research focus was the development of statistical models and optimization algorithms for computer vision problems.
Ashok Veeraraghavan is an assistant professor. He previously worked at Mitsubishi Electric Research Labs in Cambridge, MA. He received his master's and PhD degrees from the Department of Electrical and Computer Engineering at the University of Maryland, College Park, in 2004 and 2008, respectively. At Rice, he directs the Computational Imaging and Vision Lab. His research interests are broadly in the areas of computational imaging, computer vision, and robotics.