Mutual information improves image fusion quality assessments
The goal of image fusion techniques is to combine the important visual information present in multiple input images and preserve it in a single output image. In many applications the quality of the fused image is of fundamental importance, and it is usually assessed by visual analysis, which is subjective to the interpreter. Many objective quality metrics exist for image fusion,1 but when no clearly defined ground truth exists, an ideal fused image must be constructed to serve as a reference against which the experimental results are compared.
Among the available ways to measure quality, the mean square error (MSE) and signal-to-noise ratio (SNR) metrics are widely employed because they are easy to calculate and have low computational costs. Other metrics, such as the Wigner signal-to-noise ratio (SNRw)2 and the structural similarity quality index (SSIM),3 have recently been proposed, but these metrics require a reference image together with the processed image.
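As a minimal illustration, MSE and SNR can be computed directly from two images. The function names, and the choice of treating the reference image as the signal, are ours:

```python
import numpy as np

def mse(reference, test):
    """Mean square error between a reference image and a test image."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    return np.mean((reference - test) ** 2)

def snr_db(reference, test):
    """Signal-to-noise ratio in dB, treating the reference as the signal
    and the difference (reference - test) as the noise."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    signal_power = np.sum(reference ** 2)
    noise_power = np.sum((reference - test) ** 2)
    return 10.0 * np.log10(signal_power / noise_power)
```

Both run in a single pass over the pixels, which is why these metrics are so cheap compared with structure-aware measures such as SSIM.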
Non-reference metrics are much more difficult to define, since no knowledge of the ground truth is assumed: these metrics are not computed relative to an original image.4 Here we use mutual information (MI) as an information measure for evaluating image fusion performance. It represents how much of the information in the final fused image was obtained from the input images.

Mutual information and Tsallis entropy
Mutual information has been used previously in image registration.5 As detailed below, MI measures the statistical dependence between two random variables. It is related to the Kullback–Leibler distance and is defined for two images as:
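In standard notation, with the distributions defined below, this reads:

```latex
I(A,B) = \sum_{a,b} p(a,b)\,\log\frac{p(a,b)}{p(a)\,p(b)}
```

which is exactly the Kullback–Leibler distance between the joint distribution p(a,b) and the product of the marginals p(a)p(b).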
where p(a,b) is the joint distribution of two images with p(a) and p(b) as marginal probability functions.
If we consider the image intensity values a and b of a pair of corresponding pixels in the two images as random variables A and B, estimations of the joint and marginal distributions can be obtained by normalization of the joint and marginal histograms of both images as:
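The normalization takes the standard form:

```latex
p(a,b) = \frac{h(a,b)}{\sum_{a,b} h(a,b)}, \qquad
p(a) = \sum_{b} p(a,b), \qquad
p(b) = \sum_{a} p(a,b)
```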
where h(a,b) is the joint histogram.
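This histogram-based estimate can be sketched in Python, with NumPy's `histogram2d` standing in for the joint histogram h(a,b); the function name is ours:

```python
import numpy as np

def mutual_information(img_a, img_b, bins=256):
    """Estimate I(A,B) from the normalized joint histogram of two
    equally sized grayscale images."""
    a = np.asarray(img_a).ravel()
    b = np.asarray(img_b).ravel()
    # h[i, j] counts pixel pairs falling in bin i of A and bin j of B.
    h, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = h / h.sum()          # joint distribution p(a,b)
    p_a = p_ab.sum(axis=1)      # marginal p(a)
    p_b = p_ab.sum(axis=0)      # marginal p(b)
    # Sum only over nonzero joint entries; 0*log(0) is taken as 0.
    nz = p_ab > 0
    return np.sum(p_ab[nz] * np.log(p_ab[nz] / np.outer(p_a, p_b)[nz]))
```

For identical images the estimate reduces to the image entropy, while for statistically independent images it tends to zero.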
Tsallis proposed a divergence measure6 that represents the degree of dependence between two discrete random variables as:
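A common form of this divergence, which recovers the Kullback–Leibler distance in the limit q→1, is:

```latex
D_q(p \,\|\, r) = \frac{1}{q-1}\left(\sum_i p_i^{\,q}\, r_i^{\,1-q} - 1\right)
```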
In this case, p and r denote the probability distributions of interest and q is a real parameter with q≠ 1.
Using Equation 5 and replacing p and r by the joint and marginal density functions of fused image F and input image A as follows:
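Substituting the joint distribution p(f,a) for p and the product of marginals p(f)p(a) for r gives a Tsallis mutual information of order q:

```latex
I_q(F,A) = \frac{1}{q-1}\left(\sum_{f,a} p(f,a)^{\,q}\,\bigl(p(f)\,p(a)\bigr)^{1-q} - 1\right)
```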
and using the same procedure with fused image F and input image B, the image fusion performance measure can be defined as:
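One natural form (the symbol M_q is our notation here) sums the two Tsallis mutual information terms, so that the score grows with the information the fused image shares with each input:

```latex
M_q(A,B;F) = I_q(F,A) + I_q(F,B)
```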
From the definition of MI for two random variables I(A,B)= H(A)+H(B)−H(A,B), we define a non-reference normalized quality metric of order q:
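An order-q metric along these lines can be sketched in Python. The divergence-form I_q and the Tsallis entropy H_q follow the definitions above, but the specific normalization, scaling each I_q term by the sum of the marginal entropies of the two images involved and averaging, is an assumption chosen for illustration, not necessarily the exact formula of NMqFAB:

```python
import numpy as np

def joint_prob(img_x, img_y, bins=256):
    """Normalized joint histogram of two equally sized images."""
    h, _, _ = np.histogram2d(np.ravel(img_x), np.ravel(img_y), bins=bins)
    return h / h.sum()

def marginal_prob(img, bins=256):
    """Normalized intensity histogram of a single image."""
    h, _ = np.histogram(np.ravel(img), bins=bins)
    return h / h.sum()

def tsallis_entropy(p, q):
    """Tsallis entropy H_q(p) = (1 - sum p_i^q) / (q - 1), for q != 1."""
    p = p[p > 0]
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def tsallis_mi(img_x, img_y, q, bins=256):
    """Divergence-form Tsallis mutual information I_q of two images."""
    p_xy = joint_prob(img_x, img_y, bins)
    prod = np.outer(p_xy.sum(axis=1), p_xy.sum(axis=0))
    nz = (p_xy > 0) & (prod > 0)
    return (np.sum(p_xy[nz] ** q * prod[nz] ** (1.0 - q)) - 1.0) / (q - 1.0)

def normalized_fusion_metric(img_a, img_b, img_f, q=0.43137, bins=256):
    """Hypothetical normalization: each I_q term is scaled by the sum of
    the marginal Tsallis entropies of the two images involved, and the
    two scaled terms are averaged."""
    h_f = tsallis_entropy(marginal_prob(img_f, bins), q)
    h_a = tsallis_entropy(marginal_prob(img_a, bins), q)
    h_b = tsallis_entropy(marginal_prob(img_b, bins), q)
    term_a = tsallis_mi(img_f, img_a, q, bins) / (h_f + h_a)
    term_b = tsallis_mi(img_f, img_b, q, bins) / (h_f + h_b)
    return 0.5 * (term_a + term_b)
```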
We tested the proposed normalized fusion metric and evaluated several image fusion algorithms using a toolbox developed in Matlab7 (see Figure 1). The evaluation tool also presents joint histograms to document the alignment of the images (see Figure 2). Misregistered images produce a scattered joint histogram; if the images are correctly aligned, the joint histogram is concentrated along the line y=x.
In the following experiments we considered two grayscale input images of size 256×256 (see Figure 3). In scheme 1, we fused the images using the discrete wavelet transform (DWT),8 selecting the larger of the low-pass and detail coefficients to reconstruct the new image. In scheme 2, the low-pass and detail coefficients were averaged to reconstruct the new image. Finally, scheme 3 applied a feature-level fusion rule based on a previously described algorithm,9 which preserves the edges of the input images, detected with a Canny filter, when reconstructing the new image. Image results from all three schemes are shown in Figure 4.
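Schemes 1 and 2 can be sketched with a hand-rolled one-level 2D Haar transform, a simplification of the DWT used in the experiments above; all function names here are ours:

```python
import numpy as np

def haar2d(img):
    """One-level 2D Haar transform: (low-pass, (LH, HL, HH) details)."""
    x = np.asarray(img, dtype=np.float64)
    lo = (x[:, 0::2] + x[:, 1::2]) / 2.0   # row averages
    hi = (x[:, 0::2] - x[:, 1::2]) / 2.0   # row differences
    ll = (lo[0::2] + lo[1::2]) / 2.0       # column averages of lo
    lh = (lo[0::2] - lo[1::2]) / 2.0
    hl = (hi[0::2] + hi[1::2]) / 2.0
    hh = (hi[0::2] - hi[1::2]) / 2.0
    return ll, (lh, hl, hh)

def ihaar2d(ll, details):
    """Inverse of haar2d (perfect reconstruction)."""
    lh, hl, hh = details
    lo = np.zeros((ll.shape[0] * 2, ll.shape[1]))
    hi = np.zeros_like(lo)
    lo[0::2], lo[1::2] = ll + lh, ll - lh
    hi[0::2], hi[1::2] = hl + hh, hl - hh
    x = np.zeros((lo.shape[0], lo.shape[1] * 2))
    x[:, 0::2], x[:, 1::2] = lo + hi, lo - hi
    return x

def fuse_max(img_a, img_b):
    """Scheme-1-style rule: keep the coefficient of larger magnitude."""
    ca, (ha, va, da) = haar2d(img_a)
    cb, (hb, vb, db) = haar2d(img_b)
    pick = lambda u, v: np.where(np.abs(u) >= np.abs(v), u, v)
    return ihaar2d(pick(ca, cb), (pick(ha, hb), pick(va, vb), pick(da, db)))

def fuse_average(img_a, img_b):
    """Scheme-2-style rule: average all coefficients."""
    ca, (ha, va, da) = haar2d(img_a)
    cb, (hb, vb, db) = haar2d(img_b)
    return ihaar2d((ca + cb) / 2, ((ha + hb) / 2, (va + vb) / 2, (da + db) / 2))
```

Because the transform is linear, coefficient averaging is equivalent to pixel averaging; maximum selection, by contrast, favors the input with the stronger local detail.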
The resulting performance metrics for the various schemes are given in Table 1. In the Canny filter, the sigma value smooths the images to eliminate noise, while the hysteresis thresholds are used to detect true edges. (For this experiment we used σ=0.6, a low threshold of 10, and a high threshold of 30.)

Table 1. Fusion performance for three schemes and their Tsallis entropy values for q=0.43137. In the fourth column we compare the fused image with an artificial ground truth using the structural similarity quality index (SSIM).3 Note that both NMqFAB and SSIM are normalized between 0 and 1.
We have proposed a normalized metric for image fusion based on mutual information and Tsallis entropy, and developed a Matlab evaluation tool for perceptual quality assessment of images. We evaluated the performance of three different fusion schemes, demonstrating that the metric calculation is concise and explicit. The method is limited to the two-image case, because the mutual information of more than two random variables is not necessarily positive, which makes it inadequate as an image similarity measure. For two images, however, the procedure provides a valuable quality assessment for image fusion.
Rodrigo Nava received a Bachelor of Computer Engineering degree in 2004 and a Master of Computer Science and Engineering degree in 2007, both with honors, from the National Autonomous University of Mexico.
Boris Escalante–Ramírez received a Bachelor of Electrical Engineering degree from the National University of Mexico in 1985, a Master of Electronic Engineering degree from the Philips International Institute of Technological Studies, Eindhoven, The Netherlands, in 1987, and a PhD degree from the Eindhoven University of Technology in 1992. Since then he has been with the Graduate Division of the School of Engineering, National University of Mexico.
Gabriel Cristóbal received MSc and PhD degrees in telecommunication engineering from the Universidad Politécnica of Madrid, Spain, in 1979 and 1986, respectively. He is currently a research scientist at the Instituto de Optica, Spanish Research Council (CSIC). He was a postdoctoral fellow at the International Computer Science Institute and the Electronic Research Lab (UC Berkeley) from 1989 to 1992. His current research interests are in joint representations, vision modeling and image compression.