Advancing state-of-the-art 3D human facial recognition

Novel approaches to 3D human facial recognition achieve improved results using the complex-wavelet structural similarity index.
13 May 2007
Shalini Gupta, Mia K. Markey, and Alan C. Bovik

Automated human face recognition has numerous applications in areas of security and surveillance, database retrieval, and human-computer interaction. Over two decades of research have resulted in successful techniques for recognizing color/intensity 2D frontal facial images.1 However, performance declines severely with variations in pose, expression, and ambient illumination.2 Achieving robust and accurate automatic face recognition remains an important and open problem.

Recently, researchers have proposed the use of 3D models.3 In addition to providing explicit information about the shape of the face, these models can easily correct for pose by rigid rotation in 3D space, and are scale and illumination invariant. However, successful 3D facial recognition techniques2 based on rigid surface matching suffer from an exceptionally high computational cost that makes them unsuitable for real-time operation. Existing 3D techniques are also unable to handle changes in facial expression. We have attempted to resolve some of these issues.

Features for face recognition are numerical quantities that vary considerably between individuals yet remain constant for different instances of the same individual. To date, 3D recognition algorithms that are based on local features have not embodied a sound understanding of discriminatory facial structural characteristics. As a starting point, we extensively investigated the existing literature on anthropometric facial proportions.4 We identified measurements reported to be highly diverse across different age and gender cohorts and ethnic groups. We employed these to design novel and effective algorithms.5

Specifically, we used 3D Euclidean and along-the-surface geodesic distances between 25 manually located facial fiducial points (i.e., loci of interest, see Figure 1), associated with the highly variable anthropometric measurements, as features for face recognition.5 We studied geodesic distances because a recent study6 suggests that changes in expression may be modeled as isometric deformations of the face, under which geodesic distances between pairs of points on the facial surface remain constant. We employed stepwise linear discriminant analysis (LDA) to select a subset of the most discriminatory features and followed it with an LDA classifier. With a gallery set containing one image of each of 105 subjects and a probe set containing 663 images of the same subjects, our algorithm produced an equal error rate of 1.4% and a rank 1 recognition rate of 98.6%. It significantly outperformed the existing baseline algorithms based on principal component analysis (PCA) and LDA applied to face range images (see Figure 2). Our proposed technique was also robust with respect to changes in facial expressions.
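As a rough illustration of the two kinds of distance features, the following sketch computes the straight-line Euclidean distance and an approximate geodesic distance between landmark pixels on a range image. It is not the authors' implementation: the grid-graph shortest path (Dijkstra with 8-connected pixels) is a swapped-in approximation of the geodesic, and the names `surface_point`, `geodesic_distance`, `distance_features`, and the `spacing` parameter are hypothetical. It also uses all landmark pairs, whereas the article uses only distances tied to selected anthropometric measurements.

```python
import heapq
import math
from itertools import combinations


def surface_point(depth, r, c, spacing=1.0):
    """3D coordinates of pixel (r, c) of a range image (assumed regular grid)."""
    return (c * spacing, r * spacing, depth[r][c])


def euclidean_distance(depth, a, b, spacing=1.0):
    """Straight-line 3D distance between two landmark pixels a and b."""
    return math.dist(surface_point(depth, *a, spacing),
                     surface_point(depth, *b, spacing))


def geodesic_distance(depth, a, b, spacing=1.0):
    """Approximate along-the-surface distance: shortest path on the
    8-connected pixel graph, with 3D edge lengths (Dijkstra)."""
    rows, cols = len(depth), len(depth[0])
    best = {a: 0.0}
    heap = [(0.0, a)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == b:
            return d
        if d > best[(r, c)]:
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                    step = math.dist(surface_point(depth, r, c, spacing),
                                     surface_point(depth, nr, nc, spacing))
                    nd = d + step
                    if nd < best.get((nr, nc), math.inf):
                        best[(nr, nc)] = nd
                        heapq.heappush(heap, (nd, (nr, nc)))
    return math.inf


def distance_features(depth, landmarks, spacing=1.0):
    """Euclidean and geodesic distance for every pair of landmarks."""
    features = []
    for a, b in combinations(landmarks, 2):
        features.append(euclidean_distance(depth, a, b, spacing))
        features.append(geodesic_distance(depth, a, b, spacing))
    return features


# Toy "tent" surface: the geodesic must climb over the ridge,
# so it is longer than the straight-line distance.
tent = [[0.0, 1.0, 2.0, 1.0, 0.0] for _ in range(3)]
print(euclidean_distance(tent, (0, 0), (0, 4)))
print(geodesic_distance(tent, (0, 0), (0, 4)))
```

On a real face the landmark pixels would be the manually located fiducial points, and the resulting feature vector would then go through feature selection and classification (stepwise LDA followed by an LDA classifier in the article).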

Figure 1. (a) The facial fiducial points associated with discriminatory facial anthropometric measurements on a color image. (b) These points on a facial 3D depth map image (reproduced from Gupta et al.5).

Figure 2. Cumulative match characteristic curves for 3D face recognition algorithms based on distances between anthropometric facial fiducial points, and principal component analysis (PCA) and linear discriminant analysis (LDA) applied to range images.

For rigid facial surface matching, the existing approaches rigidly and closely align a pair of surfaces using an optimization procedure, then compare them by means of a distance metric. These techniques suffer from the high computational cost of both the iterative alignment procedure and the distance metrics themselves, namely the point-to-closest-point mean squared error (MSECP) and the Hausdorff distance (HD). We attempted to accelerate the process by employing the recently developed complex-wavelet structural similarity metric (CW-SSIM)7 to match coarsely aligned facial range images.8 This index is more computationally efficient than MSECP and HD, and is also robust with respect to small alignment errors. It thus eliminates the need to finely align 3D faces before matching.
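To make the comparison concrete, here is a minimal sketch of the three measures under simplifying assumptions. `msecp` and `hausdorff` are brute-force versions over 3D point lists, which shows why they are costly: every point of one surface is compared against every point of the other. `cw_ssim_index` evaluates the commonly published local form of the CW-SSIM index, (2|Σ c_x c_y*| + K) / (Σ|c_x|² + Σ|c_y|² + K), on coefficients from a single window. The function names, the constant `k`, and the toy coefficient values are hypothetical, and the complex wavelet decomposition that would produce the coefficients is omitted.

```python
import cmath
import math


def msecp(surface_a, surface_b):
    """Mean squared distance from each point of A to its closest point in B
    (one-directional, brute force: len(A) * len(B) distance evaluations)."""
    total = 0.0
    for p in surface_a:
        total += min(math.dist(p, q) ** 2 for q in surface_b)
    return total / len(surface_a)


def directed_hausdorff(surface_a, surface_b):
    """Largest closest-point distance from A to B."""
    return max(min(math.dist(p, q) for q in surface_b) for p in surface_a)


def hausdorff(surface_a, surface_b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(surface_a, surface_b),
               directed_hausdorff(surface_b, surface_a))


def cw_ssim_index(coeffs_x, coeffs_y, k=0.01):
    """Local CW-SSIM index for two equally sized lists of complex
    wavelet coefficients taken from the same window position."""
    cross = sum(cx * cy.conjugate() for cx, cy in zip(coeffs_x, coeffs_y))
    energy = (sum(abs(cx) ** 2 for cx in coeffs_x)
              + sum(abs(cy) ** 2 for cy in coeffs_y))
    return (2 * abs(cross) + k) / (energy + k)


# Toy point sets: B is A shifted by one unit in depth, plus one outlier.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (5.0, 0.0, 0.0)]
print(msecp(a, b), hausdorff(a, b))

# A constant phase rotation of all coefficients leaves the index at 1.
cx = [1 + 1j, 2 - 1j, 0.5 + 0j]
cy = [c * cmath.exp(0.3j) for c in cx]
print(cw_ssim_index(cx, cy))
```

The last example hints at why the index tolerates small misalignment: a small shift of the image mainly produces a roughly constant phase change in the local wavelet coefficients, and the magnitude of the summed cross term ignores a constant phase.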

With a database of 360 3D facial images of 12 subjects (30 images per subject), our proposed technique achieved a rank 1 recognition rate of 98.6%. It significantly outperformed facial surface matching based on MSECP and HD, both in terms of accuracy and computational cost (see Figure 3). This study also extended the scope of application of CW-SSIM to 3D range images.
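For readers unfamiliar with the evaluation, the sketch below shows how a rank 1 recognition rate and a cumulative match characteristic (CMC) curve can be computed from a matrix of probe-to-gallery dissimilarities. It assumes a closed-set protocol with one gallery image per subject; the function name `cmc_curve` and the toy scores are hypothetical and are not taken from the study.

```python
def cmc_curve(distances, probe_ids, gallery_ids):
    """distances[i][j] is the dissimilarity between probe i and gallery j.
    Returns a list whose entry k is the fraction of probes whose true
    identity appears among the k + 1 closest gallery entries."""
    n_gallery = len(gallery_ids)
    hits = [0] * n_gallery
    for row, true_id in zip(distances, probe_ids):
        # Gallery indices sorted from most to least similar.
        order = sorted(range(n_gallery), key=lambda j: row[j])
        rank = next(k for k, j in enumerate(order)
                    if gallery_ids[j] == true_id)
        hits[rank] += 1
    curve, running = [], 0
    for h in hits:
        running += h
        curve.append(running / len(probe_ids))
    return curve


# Toy example: three enrolled subjects and four probe images.
gallery_ids = ["a", "b", "c"]
probe_ids = ["a", "b", "c", "a"]
distances = [
    [0.1, 0.5, 0.9],  # probe of "a": correct match is closest
    [0.4, 0.2, 0.8],  # probe of "b": correct match is closest
    [0.3, 0.2, 0.6],  # probe of "c": correct match is third closest
    [0.5, 0.4, 0.9],  # probe of "a": correct match is second closest
]
curve = cmc_curve(distances, probe_ids, gallery_ids)
print(curve[0])  # rank 1 recognition rate
print(curve)
```

With a similarity score such as CW-SSIM, where larger means more alike, the scores would first be negated or the sort reversed.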

Figure 3. Cumulative match characteristic curves for rigid facial surface matching algorithms based on different distance metrics.

To summarize, we systematically identified discriminatory facial structural information from other areas of quantified investigation and employed it to design effective 3D facial recognition algorithms. In the future we will also investigate techniques to automatically locate the associated facial fiducial points. Moreover, we have developed novel techniques for rigid facial surface matching using the CW-SSIM index that are more accurate and computationally efficient than existing techniques.

The authors gratefully acknowledge Advanced Digital Imaging Research, LLC (Friendswood, TX), for funding and 3D face data.

Shalini Gupta
Laboratory for Image and Video Engineering,
Department of Electrical and Computer Engineering,
University of Texas at Austin
Austin, TX 

Shalini Gupta is a doctoral student in the Department of Electrical and Computer Engineering at the University of Texas (UT) at Austin. She is currently developing novel techniques for 3D human facial recognition. While earning a master's degree at UT, she developed novel algorithms for computer-aided diagnosis of breast cancer. She received her bachelor's degree in electronics and electrical communication engineering from Punjab Engineering College, Chandigarh, India.

Mia K. Markey
Biomedical Informatics Laboratory,
Department of Biomedical Engineering,
University of Texas at Austin
Austin, TX

Mia K. Markey earned her BS in computational biology from Carnegie Mellon and her PhD in biomedical engineering from Duke University. She is an assistant professor in the University of Texas Department of Biomedical Engineering, where her primary research interests are biomedical informatics and biomedical signal processing.

Alan C. Bovik
Laboratory for Image and Video Engineering,
University of Texas at Austin
Austin, TX

Alan C. Bovik holds the Curry/Cullen Trust Endowed Chair in the Department of Electrical and Computer Engineering at the University of Texas at Austin, where he directs the Laboratory for Image and Video Engineering. He has published over 450 papers and numerous books, and holds two US patents. He received the Technical Achievement Award of the IEEE Signal Processing Society in 2005.
