Imaging plays a key role in many diverse areas of application, such as astronomy, remote sensing, microscopy, and tomography. Owing to imperfections of measuring devices (e.g., optical degradations, limited size of sensors) and instability of the observed scene (e.g., object motion, media turbulence), acquired images can be indistinct, noisy, and may exhibit insufficient spatial and temporal resolution. In particular, several external effects blur images. We will call these effects volatile blurs to emphasize their unpredictable and transitory behavior.
Techniques for recovering the original image include blind deconvolution (to remove blur) and superresolution.1,2 The stability of these methods depends on having more than one image of the same scene. Differences between images are necessary to provide new information, but they can be almost imperceptible, for example, subtle spatial shifts or slight modifications of the acquisition parameters, such as focal length and aperture size.
For a single observation g(i,j) the problem of recovering the original undegraded image is underdetermined (i.e., there are more unknowns than knowns) and lacks a stable solution. To partially overcome this ambiguity, we need to have K (K>1) images of the original scene, which leads to the so-called multichannel (multiframe) problem. Superresolution (SR) is the process of combining a sequence of low-resolution (LR) images to produce a higher-resolution result. It is unrealistic to assume that the superresolved image can recover the original scene o(x,y) exactly. Rather, a reasonable goal of SR is a discrete version of o(x,y) that has spatial resolution higher than that of the LR images and that is free of blurs (deconvolved). The acquisition model then becomes

$$g_k(i,j) = D\big([v_k * W_k(o)](x,y)\big) + n_k(i,j),$$
where k = 1,…,K is the acquisition index, v_k is the volatile blur, n_k(i,j) is additive noise, and W_k denotes the geometric deformation (warping), which in general differs for each acquisition. We assume that W_k is only an unknown translation, so it can be absorbed into v_k and is estimated automatically along with it. D(·) is the decimation operator, which models the function of charge-coupled device (CCD) sensors. It consists of convolution with a sensor blur followed by a sampling operator, which we define as multiplication by a sum of delta functions placed on a grid. This model represents the state of the art in that it accounts for all of the degradations listed above: warping, volatile blur, sensor blur, sampling, and noise.
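The acquisition model can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: it assumes integer translations on the HR grid for the warping, circular (FFT-based) convolution for the volatile blur, a box-shaped sensor blur, and Gaussian noise. The function names `conv2_same`, `decimate`, and `acquire` are hypothetical.

```python
import numpy as np

def conv2_same(image, kernel):
    """Circular 2-D convolution via the FFT, kernel centered on its middle pixel."""
    padded = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Move the kernel center to index (0, 0) so the output is not displaced.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

def decimate(image, factor):
    """D(.): box sensor blur followed by sampling on a coarser grid (assumed form)."""
    h, w = image.shape
    h, w = h - h % factor, w - w % factor
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def acquire(original, blur, shift, factor, noise_sigma, rng):
    """Simulate one LR frame: warp, volatile blur, decimation, additive noise."""
    warped = np.roll(original, shift, axis=(0, 1))   # W_k: translation in HR pixels
    blurred = conv2_same(warped, blur)               # convolution with v_k
    low_res = decimate(blurred, factor)              # sensor blur + sampling
    return low_res + rng.normal(0.0, noise_sigma, low_res.shape)  # + n_k

# Four frames of a synthetic scene, each shifted by at most one HR pixel,
# which is half an LR pixel at a decimation factor of 2.
rng = np.random.default_rng(0)
scene = rng.random((64, 64))
blur = np.ones((3, 3)) / 9.0
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
frames = [acquire(scene, blur, s, 2, 0.01, rng) for s in shifts]
```

Because the shifts are applied on the HR grid before decimation, they appear as subpixel shifts in the LR frames, which is the kind of difference between acquisitions that the multiframe methods rely on.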
Current multiframe blind deconvolution techniques require no or very little prior information about the blurs, and they are sufficiently robust to noise to provide satisfying results in most real applications. However, they can hardly cope with low-resolution images, since in this case the decimation D and the convolution with v_k do not commute. State-of-the-art SR techniques achieve remarkable results in resolution enhancement by estimating the subpixel shifts between images, but they lack any apparatus for calculating the blurs. These SR methods assume either that no blur exists or that the blur can be estimated by other means.
The standard SR approach consists of subpixel registration, overlaying the LR images on a high-resolution (HR) grid, and interpolating the missing values. The subpixel shift between images thus constitutes an essential feature. Considering volatile blurs in the model explicitly brings about a more general and robust technique, with the shift being a special case thereof.

Blind superresolution
Recently, we proposed a unifying method that simultaneously estimates the volatile blurs and the HR image.3 The only prior knowledge required is an estimate of the blur size and of the noise level in the LR images, which makes it a truly blind SR (BSR) method. The key idea is to determine subpixel shifts by calculating volatile blurs. As these are estimated in the HR scale, the positions of their centroids correspond to subpixel shifts. By estimating blurs, we automatically estimate shifts with subpixel accuracy, which is essential for good SR performance.
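The relation between blur centroids and subpixel shifts can be made concrete with a short sketch. This is an illustration under stated assumptions rather than the published algorithm: the blur is taken to be a nonnegative kernel already estimated on the HR grid, the geometric center of the kernel is taken as the zero-shift reference, and the helper names `blur_centroid` and `lr_shift_from_blur` are hypothetical.

```python
import numpy as np

def blur_centroid(blur):
    """Return the (row, col) center of mass of a nonnegative blur kernel."""
    blur = np.asarray(blur, dtype=float)
    total = blur.sum()
    rows, cols = np.indices(blur.shape)
    return (rows * blur).sum() / total, (cols * blur).sum() / total

def lr_shift_from_blur(blur, sr_factor):
    """Offset of the blur centroid from the kernel center, in LR pixels."""
    cy, cx = blur_centroid(blur)
    center_y = (blur.shape[0] - 1) / 2.0   # assumed zero-shift reference
    center_x = (blur.shape[1] - 1) / 2.0
    return (cy - center_y) / sr_factor, (cx - center_x) / sr_factor

# A 5x5 HR-scale blur whose mass sits one HR pixel to the right of the center.
blur = np.zeros((5, 5))
blur[2, 3] = 1.0
shift = lr_shift_from_blur(blur, sr_factor=2)   # (0.0, 0.5)
```

With an SR factor of 2, a displacement of one HR pixel corresponds to half an LR pixel, so integer positions on the HR grid already encode shifts with subpixel accuracy at the LR scale.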
We formulate the problem as constrained least squares with appropriate regularization terms, which guarantee a close-to-perfect solution in the noiseless case. Considering the problem in 3D allows us to apply the proposed method not only to volumes (as in imaging from confocal microscopy or electron tomography) but also to video sequences, where the third dimension is time.
We solve the complex SR problem by minimizing a regularized energy function, with regularization carried out in both the image and blur domains. Image regularization is based on variational integrals and the resulting anisotropic diffusion, which has good edge-preserving capabilities. The blur regularization term is based on our generalized result of blur estimation in the SR case. To tackle the minimization task, we use an alternating minimization approach that amounts to solving two simple linear equations in turn. We have presented details of the method elsewhere.3

Performance experiments
The first experiment is a license plate recognition test. Using an Olympus C5050Z digital camera, we took eight photos, registered them with cross-correlation, and cropped each to 156 × 56 pixels. Figure 1(a) shows three of the LR images enlarged with zero-order interpolation. The BSR method returned a well-reconstructed HR image, Figure 1(b), which is comparable to the ground truth acquired with the optical zoom, Figure 1(d). The so-called SR-only technique performs worse: see Figure 1(c). SR-only is a maximum a posteriori (MAP) formulation of the SR problem proposed, for example, by the Hardie and Segall groups.4,5 This method uses a MAP framework to jointly estimate image registration parameters and the HR image, assuming only sensor blur and no volatile blurs. For an image prior, we used edge-preserving Huber–Markov random fields (MRFs).2 Note that as the SR factor increases, we need to take more LR images, and the stability of BSR decreases. Hence we restrict the SR factor to between 1 and 2.5 in most practical applications.
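The cross-correlation registration step mentioned above can be sketched as follows. This is a minimal illustration, assuming pure integer translation between frames and circular boundary conditions; it is not necessarily the procedure used in the experiment, and the function name `register_translation` is hypothetical.

```python
import numpy as np

def register_translation(reference, moving):
    """Estimate the integer (dy, dx) shift that maps `reference` onto `moving`."""
    ref = reference - reference.mean()
    mov = moving - moving.mean()
    # Circular cross-correlation computed with the FFT.
    corr = np.real(np.fft.ifft2(np.fft.fft2(mov) * np.conj(np.fft.fft2(ref))))
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peak indices past the midpoint correspond to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))

def align(reference, moving):
    """Undo the estimated shift so that `moving` overlays `reference`."""
    dy, dx = register_translation(reference, moving)
    return np.roll(moving, (-dy, -dx), axis=(0, 1))
```

This coarse alignment only removes whole-pixel offsets. In the BSR setting, the residual subpixel displacement is left to the blur estimation.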
Figure 1. (a) Shown are three low-resolution frames. (b) The BSR result. (c) The SR-only result. (d) Optical zoom reference.
For the second experiment we used a handheld webcam (a Logitech Quickcam for Notebooks camera) to capture a short video sequence of a toy dog (see Figure 2). Then we extracted eight consecutive frames. The slow shutter speed (1/10 s) together with inevitable hand motion introduced blurring into the images. Figure 2(b) shows a reference HR ground truth acquired with the optical zoom, whereas Figure 2(c) shows the BSR result. In this experiment the SR factor was set to 2. The BSR algorithm removed blurring and performed SR correctly, as can be seen by comparing Figure 2(e) and (f).
Figure 2. (a) Shown is one low-resolution frame from a sequence of eight images. (b) Reference (optical zoom ×2). (c) BSR result. (d–f) Zoomed versions of the same region of the previous three images.
We have described a general method for blind deconvolution and resolution enhancement, and have shown that the SR problem permits a stable solution even in the case of unknown blurs. The fundamental idea is to split radiometric deformations into sensor and volatile parts and assume that only the sensor part is known. A regularized energy-minimization approach ensures that the solution is robust. The proposed BSR method goes far beyond the standard SR techniques. The introduction of volatile blurs makes the method particularly appealing for real situations. Further research is needed to deal with more complex misalignment between images and to overcome problems involving compressed images.
Gabriel Cristobal, Elena Gil
Instituto de Optica
Consejo Superior de Investigaciones Científicas (CSIC)
Gabriel Cristobal is a research scientist at the CSIC. From 1989 to 1992, he was a visiting scholar at the same institution. His current interests are joint representations, vision modeling, resolution enhancement, and image compression.
Elena Gil is a PhD candidate at the CSIC. Her current focus is SR and medical imaging.
Filip Sroubek, Jan Flusser
Institute of Information Theory and Automation
Academy of Sciences
Prague, Czech Republic
Filip Sroubek is with the Institute of Information Theory and Automation and the Institute of Photonics and Electronics, Academy of Sciences, Czech Republic. From 2005 to 2006 he was a postdoctoral fellow at the CSIC. His current interests are image fusion, blind deconvolution, SR, and related topics.
Jan Flusser has been with the Institute of Information Theory and Automation since 1985. From 1995 to 2006 he held the position of head of the Department of Image Processing. In 2007, he was appointed director of the institute. His current interests include all aspects of image processing and pattern recognition. He has authored and coauthored more than 150 research publications in these areas.