Human identification using biometrics has been a major issue in forensics and security applications for many decades. The growing complexity that is introduced by vast databases and advanced spoofing attacks (when one person or program masquerades as another by falsifying data to gain an illegitimate advantage) has rendered human identification an increasingly challenging problem. The enhancement of classic biometric approaches—such as fingerprint,1 face, iris or retina, and palm geometry recognition2—has recently become a primary focus of many researchers. However, various drawbacks of these physical traits (e.g., their vulnerability to being copied, imitated, or falsified) and the inherent obtrusiveness in recognizing these traits (e.g., the need for touch or proximity sensors and strict protocols for their analysis) are driving current research trends towards so-called behavioral biometrics.
Behavioral biometrics are based on activity-related traits that can be extracted unobtrusively from frequent, periodic physical movements, such as walking and prehension (reaching and grasping). Because the method recognizes people's natural habits, behavioral biometric analysis can reveal unique behavioral, stylistic, and affective movement patterns related to the physical, physiological, and habitual state of the 'user' (i.e., the person being monitored). Building on our recent results on gait recognition,3 we can recognize users by monitoring specific actions and the way an individual executes them, using dynamic and activity-related features.
Figure 1. Schematic representation of the combined authentication scenario. The user walks along a random path to his office (gait recognition) and starts working at his desk (activity-related recognition).
The unobtrusiveness of activity-related biometric recognition offers two advantages. First, unlike the majority of existing commercial biometric systems, there is no specific recognition protocol that the user must follow, such as placing his or her finger or eye on a scanning device. Second, body-attached sensors that require special or uncomfortable preparation (such as wearing specific clothing) can be avoided entirely. Following these principles, we developed the first multimodal system for unobtrusive human recognition that is based solely on tracking of user activities without any body-worn sensors. In particular, we obtained samples of distinctive activities that are expected to reveal valuable information about the identity of the user. We collected these samples in an experiment in an ambient intelligence workplace environment (see Figure 1), where participants were asked to act normally, as they would in their usual workplace. We began the experiment by analyzing the walking patterns of the users as they headed to their office along non-predefined paths. Then, extending the theory of Rosenbaum et al.,4 we studied several prehension activities (such as picking up a telephone and interacting with a desk panel) that are frequently performed during a day at the office. We recorded these activities using two depth cameras that capture images at a rate of 30 frames per second.
Figure 2. Data recording and vision-based trajectory extraction from the depth camera.
From the captured images we extracted depth-enhanced gait silhouettes and applied several 1D transformations to them, such as the Radon integral transform and the Krawtchouk moments transform. Both transforms form efficient descriptors of grayscale images, yielding compact representations of the original image.3 We then tracked the head and hands of the user by intensively processing the corresponding recordings with skin-color filtering, motion detection, and background removal, combined with advanced head tracking (see Figure 2).5 We further filtered the noisy raw tracked points with a series of efficient post-processing algorithms5 to obtain smooth, uniformly sampled motion trajectories (see Figure 3). To verify the accuracy of the proposed tracker, we simultaneously tracked the movements of the head and all the joints of the arm (i.e., the shoulder, elbow, and palm) using a magnetic tracker (Ascension Technology Corp.).
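The trajectory post-processing step can be sketched in a few lines. The following is a minimal illustration in Python with NumPy, assuming a simple moving-average smoother and linear-interpolation resampling; the function names, window size, and sample count are ours for illustration, not the actual post-processing algorithms:

```python
import numpy as np

def smooth_trajectory(points, window=5):
    """Smooth a noisy (N, 3) trajectory with a centered moving-average filter.

    `window` should be odd so the padded convolution preserves length N.
    """
    kernel = np.ones(window) / window
    # Pad at both ends with the edge value so the output keeps length N.
    padded = np.pad(points, ((window // 2, window // 2), (0, 0)), mode="edge")
    return np.column_stack(
        [np.convolve(padded[:, d], kernel, mode="valid")
         for d in range(points.shape[1])]
    )

def resample_trajectory(points, n_samples=100):
    """Resample a trajectory to a fixed number of uniformly spaced points."""
    t_old = np.linspace(0.0, 1.0, len(points))
    t_new = np.linspace(0.0, 1.0, n_samples)
    return np.column_stack(
        [np.interp(t_new, t_old, points[:, d]) for d in range(points.shape[1])]
    )
```

Resampling to a common length makes trajectories directly comparable across repetitions and users, which simplifies the matching step later on.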
Figure 3. Comparison of extracted motion trajectories of (a) the same user and (b) different users during a phone conversation experiment. Rep: Repetition.
Figure 4. Relative entropy values from features extracted using a magnetic tracker during the phone conversation experiment. The rightmost bar is for the total dynamic spatial cost of all three arm parts.
By processing the spatiotemporal activity-related trajectories, we could then extract several features, including the speed, acceleration, jerk, and curvature of the movement. However, not all of these features have high discriminative capacity. We evaluated the distinctiveness of each feature according to its relative entropy and mutual information (see Figure 4). The features with the highest value for incorporation into our system were the trajectories of the head and palm, the velocity of the hand, and the total spatial cost and torsion of the palm. Although the trajectories of the elbow and shoulder are equally distinctive, they can be omitted because they are highly correlated with the trajectory of the palm (the end effector). The same finding was implied by Rosenbaum et al.4 in 1995.
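The kinematic features named above can be derived from a tracked trajectory by numerical differentiation. A minimal sketch in Python with NumPy, assuming the 30 frames per second of the depth cameras as the sampling rate; the curvature formula for a space curve, |v × a| / |v|³, is standard, but the function itself is illustrative rather than the authors' code:

```python
import numpy as np

def kinematic_features(traj, fps=30.0):
    """Derive speed, acceleration, jerk, and curvature from an (N, 3) trajectory."""
    dt = 1.0 / fps
    v = np.gradient(traj, dt, axis=0)   # velocity vectors
    a = np.gradient(v, dt, axis=0)      # acceleration vectors
    j = np.gradient(a, dt, axis=0)      # jerk vectors
    speed = np.linalg.norm(v, axis=1)
    # Curvature of a space curve: |v x a| / |v|^3 (guard against zero speed).
    cross = np.cross(v, a)
    curvature = np.linalg.norm(cross, axis=1) / np.maximum(speed**3, 1e-9)
    return {
        "speed": speed,
        "acceleration": np.linalg.norm(a, axis=1),
        "jerk": np.linalg.norm(j, axis=1),
        "curvature": curvature,
    }
```

For a straight-line movement at constant velocity the curvature is zero everywhere, while a circular arc of radius r yields a constant curvature of 1/r, independent of how fast the movement is executed.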
Table 1. Phone-conversation-based equal error rate (EER) authentication scores. ED: Euclidean distance. DTW: Dynamic time warping. HMM: Hidden Markov models.
| ED | DTW | HMM | HMM and Ergonomy |
With these features identified, we evaluated the system by estimating the equal error rates (EERs) among the subjects in the ACTIBIO dataset,5,6 which includes data from 29 subjects performing a series of workplace-related activities in different time sessions. We calculated the EER for different classifiers (see Tables 1 and 2) at the optimal performance point (as defined by a set of weighting factors for the extracted features, which follows from the findings shown in Figure 4). The first experiment involved the analysis of a short phone activity: someone picking up the telephone, bringing it to his or her ear, and then placing it back on the base. When we use a simple Euclidean distance classifier to compare feature vectors, we obtain an EER score of ∼20% (see Table 1). This score is calculated by measuring the Euclidean distance between corresponding points on two trajectories (and similarly for other features, such as velocity) and summing all the distances (the sum is known as the L1 norm). The EER improves dramatically to about 10% when the temporal information of the movement is taken into account by using either dynamic time warping (DTW) or a hidden Markov model (HMM). We used these two algorithms as classifiers that provide a matching score between the gallery subject-specific signature and the incoming probe vectors. Additionally, incorporating ergonomic restrictions leads to a further slight improvement of about 1% in recognition performance.5
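The scoring schemes compared above can be illustrated compactly. Below is a minimal, self-contained sketch in Python with NumPy of the Euclidean distance score, a textbook DTW distance, and an EER computation over sets of genuine and impostor scores; it illustrates the general techniques, not the system's actual implementation:

```python
import numpy as np

def euclidean_score(a, b):
    """Sum of pointwise Euclidean distances between two equal-length
    feature sequences (an L1 aggregation of the pointwise distances)."""
    return float(np.sum(np.linalg.norm(a - b, axis=1)))

def dtw_distance(a, b):
    """Dynamic time warping cost between sequences of shape (Na, D), (Nb, D).

    Classic O(Na*Nb) dynamic program; tolerates different execution speeds."""
    na, nb = len(a), len(b)
    cost = np.full((na + 1, nb + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, na + 1):
        for j in range(1, nb + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[na, nb]

def equal_error_rate(genuine_scores, impostor_scores):
    """EER: operating point where the false accept rate equals the false
    reject rate, for distance-like scores (lower = better match)."""
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    best_gap, eer = np.inf, None
    for t in thresholds:
        frr = np.mean(genuine_scores > t)    # genuine attempts rejected
        far = np.mean(impostor_scores <= t)  # impostor attempts accepted
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2.0
    return eer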
Although these EER scores of about 10% are very promising, traditional obtrusive biometric systems achieve much better authentication performance, with typical EER scores of ≤1% for commercial systems. Behavioral biometric modalities are therefore far from forming stand-alone recognition systems. However, their added value can be readily exploited in multimodal architectures. In this respect, we combined gait recognition3 with behavioral traits from another prehension activity: a user interacting with a personal identification number (PIN)-protected electronic locker (panel).7 The resulting EERs of the multimodal system are shown in Table 2, and similar improvements were observed in identification performance.7
Table 2. Multimodal (gait and behavioral trait recognition) system EER authentication scores. AWGN: Additive white Gaussian noise. RIT: Radon integral transform. KRM: Krawtchouk moments transform.
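Score-level fusion of two modalities can be as simple as a weighted sum of normalized matching scores. The following is a hypothetical sketch in Python with NumPy; the min-max normalization and the equal weighting are our own illustrative choices, not the fusion strategy of the cited work:

```python
import numpy as np

def min_max_normalize(scores):
    """Map raw matching scores to [0, 1] so the two modalities are
    comparable before fusion."""
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo) if hi > lo else np.zeros_like(scores)

def fuse_scores(gait_score, activity_score, w_gait=0.5):
    """Weighted-sum fusion of two normalized matching scores
    (lower = better match); w_gait balances the two modalities."""
    return w_gait * gait_score + (1.0 - w_gait) * activity_score
```

In practice the weight would be tuned on held-out data so that the stronger modality dominates without discarding the complementary information of the weaker one.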
In summary, we have presented a novel system for unobtrusive human recognition from simple activities that are performed regularly on a daily basis. A possible real-world application of such a system would be its integration into high-security controlled areas and infrastructures, such as nuclear plants or classified military areas, where only a small number of users are authorized to have access. We have used invariant and partially view-invariant activity-related traits that are extracted exclusively from camera sensors with depth estimation capabilities. Our system achieves very promising results, outperforming state-of-the-art unimodal recognition systems in both recognition performance and unobtrusiveness. However, it should be noted that, being still in their infancy, these behavioral biometric systems can currently only augment conventional security solutions. Our future work includes the efficient incorporation of soft biometric traits within an effective recognition framework to achieve further improvements in recognition performance. For example, height and stride length in the gait experiment could be used for recognition in high-security facilities and to augment existing recognition methods (such as PINs and identification cards).
This work was supported by the European Union-funded ACTIBIO Information Society Technology Specific Targeted Research Project (FP7-215372).
Informatics & Telematics Institute
Centre for Research and Technology Hellas (CERTH)
Dimitrios Tzovaras is a senior researcher whose main research interests include visual analytics, 3D object recognition, search and retrieval, behavioral biometrics, assistive technologies, multimodal interfaces, computer graphics, and virtual reality.
Department of Electrical and Electronic Engineering
Imperial College London
Anastasios Drosou holds an MEng in electrical and computer engineering and an MSc in communication engineering. Currently, he is completing his PhD in the Communications and Signal Processing Group of Imperial College London while concurrently working as a research assistant at CERTH.
1. D. Maltoni, D. Maio, A. K. Jain, S. Prabhakar, Handbook of Fingerprint Recognition, Springer Professional Computing, 2009.
2. A. K. Jain, A. Ross, S. Prabhakar, An introduction to biometric recognition, IEEE Trans. Circuits Syst. Video Technol. 14(1), p. 4-20, 2004. doi:10.1109/TCSVT.2003.818349
3. D. Ioannidis, D. Tzovaras, I. G. Damousis, S. Argyropoulos, K. Moustakas, Gait recognition using compact feature extraction transforms and depth information, IEEE Trans. Inf. Forensics Security 2(3), p. 623-630, 2007. doi:10.1109/TIFS.2007.902040
4. D. A. Rosenbaum, L. D. Loukopoulos, R. G. J. Meulenbroek, J. Vaughan, S. E. Engelbrecht, Planning reaches by evaluating stored postures, Psychol. Rev. 102(1), p. 28-67, 1995. doi:10.1037/0033-295X.102.1.28
5. A. Drosou, D. Ioannidis, K. Moustakas, D. Tzovaras, Spatiotemporal analysis of human activities for biometric authentication, Comput. Vision Image Understanding 116(3), p. 411-421, 2012. doi:10.1016/j.cviu.2011.08.009
6. The ACTIBIO Project (Unobtrusive Authentication Using Activity Related and Soft Biometrics). http://www.actibio.eu/ Accessed 19 April 2012.
7. A. Drosou, G. Stavropoulos, D. Ioannidis, K. Moustakas, D. Tzovaras, Unobtrusive multi-modal biometric recognition approach using activity-related signatures, IET Comput. Vision 5(6), p. 367-379, 2011. doi:10.1049/iet-cvi.2010.0166