Enhancing passport images for face recognition
Face recognition (FR) has a variety of uses in commercial and government applications that include searching for potential terrorists and criminals, performing additional security measures for automated teller machines (ATMs), and preventing people from obtaining fake identification. Face images can be captured at a distance without physical or other interaction with the individual concerned, and this makes FR valuable. It can also serve as a crime deterrent because previously recorded face images can be used later to identify a person.
According to a recent report from the National Institute of Standards and Technology, facial recognition has improved significantly for ideal cases such as visa and mugshot photographs.1 However, much remains to be explored under non-ideal conditions. For instance, in passport facial matching, a border officer may need to compare a security-marked passport photo with a higher-resolution database mugshot or visa photo. Factors related to the individual's appearance, to the physical photograph, or to the device used to record the image data can all hinder successful identification.
Variations between the passport and data images are due to three types of factors. Person-related factors are individual variations such as changes in hairstyle, expression, and pose. Aging is also included because days or even years could have passed between the acquisition of passport photos and their high-resolution counterparts. Document-related factors, which can significantly perturb the biometric content of the image, include security watermarks embedded on passport photos (which can distort the facial image), variations in image quality, tonality across the face, and color cast of the photographs. Device-related factors are noise introduced by the device (scanner, fax machine, etc.) used to extract facial images. Examples include limited device resolution, lighting artifacts, variations in document photos, image file format or compression, and operator variability.
The holder's identity may be obscured by a security watermark, and automated FR systems may then make a false match. An official may have to stop the system, search through a multitude of photos, and make a visual verification, during which the officer may be distracted and is likely to make errors. We are working to create a sophisticated process to eliminate diverse watermark traces, regardless of the passport country of origin, while improving overall identification.
We developed techniques that focus on the restoration and evaluation of passport photos for improved performance. The experimental setup is as previously described.2, 3 To restore photos, we ‘inpaint’ or interpolate the effects of a security watermark by treating that area as a damaged region and minimizing the total variation (TV) of the image.2 As shown in Figure 1, minimizing the TV is advantageous because it is effective at smoothing away noise while preserving edges. Additionally, we adopted an existing general thresholding and ‘denoising’ approach to account for device-related noise.3, 4 To evaluate success, we wanted to assess image quality as well as identification. To measure the improvement in image quality, we employed Wang and Bovik's universal image quality (UIQ) metric.5 The UIQ considers three main factors: loss of correlation, luminance distortion, and contrast distortion.
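The inpainting idea can be illustrated with a minimal Python sketch. It is not the implementation used in this work: it assumes NumPy, a grayscale image scaled to [0, 1], and a watermark mask that is already known, and it swaps in a simple iteratively reweighted neighbor average that approximately minimizes an anisotropic, smoothed form of TV. The names tv_inpaint, iterations, and eps are hypothetical.

```python
import numpy as np

def tv_inpaint(image, mask, iterations=300, eps=0.05):
    """Fill pixels where mask is True; pixels outside the mask are never changed.

    Assumption: a reweighted neighbor average stands in for a full TV solver.
    eps smooths the TV term so the weights stay finite in flat regions.
    """
    u = np.asarray(image, dtype=float).copy()
    mask = np.asarray(mask, dtype=bool)
    u[mask] = u[~mask].mean()              # neutral starting guess for damaged pixels
    for _ in range(iterations):
        p = np.pad(u, 1, mode="edge")      # replicate borders
        # up, down, left, and right neighbors of every pixel
        neighbors = (p[:-2, 1:-1], p[2:, 1:-1], p[1:-1, :-2], p[1:-1, 2:])
        num = np.zeros_like(u)
        den = np.zeros_like(u)
        for q in neighbors:
            # small weight across a large jump, so edges are not averaged away
            w = 1.0 / np.sqrt((u - q) ** 2 + eps ** 2)
            num += w * q
            den += w
        u[mask] = (num / den)[mask]        # update only the damaged region
    return u

# Synthetic example: a vertical edge crossed by a horizontal "watermark" stripe.
image = np.zeros((16, 16))
image[:, 8:] = 1.0
mask = np.zeros(image.shape, dtype=bool)
mask[7:9, 2:14] = True
damaged = image.copy()
damaged[mask] = 0.5
restored = tv_inpaint(damaged, mask)
print(np.abs(restored - image)[mask].mean())   # mean error inside the stripe
```

Because each update is a weighted average of neighboring pixels, restored values stay within the range of the surrounding data. The weights shrink across large intensity jumps, which is what allows an edge to survive the fill.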
Understanding facial recognition performance is necessary to assess improvement in identification. We did this using academic facial recognition algorithms developed by WVU (that, together with UIQ, determined the performance baseline) and a commercial algorithm (known as G8) that is typically part of a state-of-the-art FR system. Figure 1 demonstrates that, when using the UIQ quality evaluation metric, the combination of image inpainting and denoising results in better restored face images, compared to other restoration approaches. One might conclude that the general denoising3, 4 actually does more harm than good in terms of image restoration, but this is not the case when we compare facial recognition performance by matching the test passport photo and its high-quality counterpart image of the same subject (see Figure 2).
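For reference, the UIQ index can be written as the product of a correlation term, a luminance term, and a contrast term. The Python sketch below assumes NumPy and computes a single global score for a pair of grayscale images; the published metric is usually averaged over local sliding windows, which this sketch omits. The function name uiq is hypothetical, and this is not the evaluation code used for Figure 1.

```python
import numpy as np

def uiq(reference, test):
    """Global universal image quality index for two same-sized grayscale images.

    Assumption: one score over the whole image (no sliding windows).
    Undefined when either image is constant or both means are zero.
    """
    x = np.asarray(reference, dtype=float).ravel()
    y = np.asarray(test, dtype=float).ravel()
    mean_x, mean_y = x.mean(), y.mean()
    var_x, var_y = x.var(ddof=1), y.var(ddof=1)
    cov_xy = np.sum((x - mean_x) * (y - mean_y)) / (x.size - 1)

    correlation = cov_xy / np.sqrt(var_x * var_y)                   # loss of correlation
    luminance = 2 * mean_x * mean_y / (mean_x ** 2 + mean_y ** 2)   # luminance distortion
    contrast = 2 * np.sqrt(var_x * var_y) / (var_x + var_y)         # contrast distortion
    return correlation * luminance * contrast

# Example: compare an image with itself and with a brightened copy.
x = np.arange(1.0, 17.0).reshape(4, 4)
print(uiq(x, x))        # identical images give the maximum score
print(uiq(x, x + 10))   # a uniform brightness shift lowers only the luminance term
```

Each term reaches 1 only when the corresponding property matches exactly, so the product separates the effect of a brightness shift from that of a change in contrast or structure.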
Considering the academic algorithms (local binary patterns, LBP, and local ternary patterns, LTP) further shows that the proposed inpainting and denoising approach provides the most promising solution (see Figure 3). Rank indicates the level of similarity between images on a scale from 1 to N: the lower the rank number, the higher the similarity between the images being compared. LBP achieves 46.2% rank 1 matches (81% rank 5 matches), and LTP results are a little lower but still confirm the UIQ results. The commercial software (G8) gave mixed results. Without image restoration techniques (i.e., raw data), rank 1 matches are approximately 81%, and the results are lower when we use either general denoising or inpainting and denoising to restore the image. This is because G8 has its own restoration scheme, which operates as a ‘black box’ on top of our own restoration methodology. This hinders the proposed preprocessing methodology in addition to negatively affecting overall FR performance.
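The rank statistics quoted above can be made concrete with a short Python sketch. It assumes NumPy and a similarity matrix in which entry (i, j) scores probe photo i against gallery image j, with higher values meaning greater similarity. The function names and the toy scores are hypothetical and do not come from our experiments.

```python
import numpy as np

def match_ranks(similarity, true_index):
    """Rank (1 = best) of the correct gallery image for each probe photo."""
    scores = np.asarray(similarity, dtype=float)
    true_index = np.asarray(true_index)
    true_scores = scores[np.arange(scores.shape[0]), true_index]
    # rank = 1 + number of gallery images that score higher than the true match
    return 1 + np.sum(scores > true_scores[:, None], axis=1)

def rank_k_rate(ranks, k):
    """Fraction of probes whose correct match appears within the top k."""
    return float(np.mean(np.asarray(ranks) <= k))

# Toy example: 3 probe photos scored against a gallery of 4 images.
similarity = [
    [0.9, 0.1, 0.2, 0.3],   # correct match (index 0) scores highest
    [0.4, 0.5, 0.8, 0.1],   # correct match (index 1) is beaten by one image
    [0.2, 0.6, 0.3, 0.7],   # correct match (index 2) is beaten by two images
]
ranks = match_ranks(similarity, true_index=[0, 1, 2])
print(ranks)                  # ranks 1, 2, and 3
print(rank_k_rate(ranks, 1))  # share of rank 1 matches
```

A rank 5 rate is computed the same way with k set to 5, which is why it can never be lower than the rank 1 rate.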
In summary, we created a sophisticated process to eliminate diverse watermark traces, regardless of the passport country of origin, while improving overall identification. We interpolate the effects of a security watermark by treating that area as a damaged region and minimizing the TV of the image. We combine this with an existing general thresholding and ‘denoising’ approach to account for device-related noise. Results show that this is a promising solution, although it does not yet reach the success of a leading commercial matcher. We are continuing to engineer and test alternative methodologies for passport image restoration while improving the overall FR performance. We are collecting a data set approximately seven times larger, with more variations of images from different passports as well as images without security markings. We will use the set to develop and test more methodologies for image restoration and face recognition.
This work was funded by the Center for Identification Technology Research (CITeR) at West Virginia University (WVU). We are grateful to all faculty and students who assisted us with this work. Special acknowledgement goes to Arvind Jagannathan (West Virginia University) and Lacey Best-Rowden (Laboratory for Physical Sciences, University of Maryland) for their assistance in data collection and experiments, as well as Arun Ross and Anil Jain of Michigan State University (MSU) and Hao Min Zhou (Georgia Tech) for their expertise and direction.
University of Maryland
Antwan Clark is a research scientist at LPS. He received a PhD in Applied Mathematics at Rensselaer Polytechnic Institute (RPI) in Troy, NY. His research areas include identification sciences as well as image processing. Additionally, he holds academic appointments with RPI and West Virginia University.
West Virginia University
Thirimachos Bourlai is an assistant professor at the LDCSEE. He received a PhD in Electrical and Electronic Engineering from the University of Surrey, UK. His research interests include biometrics and biomedical imaging. He is a senior member of IEEE, and a member of the IEEE Signal Processing Society, the IEEE Biometrics Compendium, and the National Defense Industrial Association (NDIA).