Automatic Target Recognition XXI

Front Matter: Volume 8049

This PDF file contains the front matter associated with SPIE Proceedings Volume 8049, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.

Object classification using local subspace projection

Jennifer Nealy, Robert Muise

We consider the problem of object classification from image data. Significant challenges arise when objects can be imaged from different view angles and under different distortions. For example, a vehicle will appear completely different depending on the viewing angle of the sensor but must still be classified as the same vehicle. In face recognition, a person may have a variety of facial expressions, and a pattern recognition algorithm must account for these distortions. Traditional algorithms such as PCA filters are linear in nature and cannot account for the underlying nonlinear structure that characterizes an object. We examine nonlinear manifold techniques applied to the pattern recognition problem. One mathematical construct receiving significant research attention is diffusion maps, whereby the underlying training data are remapped so that Euclidean distance in the mapped data is equivalent to the manifold distance of the original dataset. This technique has been used successfully for applications such as data organization, noise filtering, and anomaly detection, with only limited experiments in object classification. For very large datasets (size N), pattern classification with diffusion maps becomes onerous because it requires the eigenvectors of an N×N matrix. We characterize the performance of a 40-person facial recognition problem with a standard K-NN classifier, a diffusion distance classifier, and standard PCA. We then develop a local subspace projection algorithm that approximates the diffusion distance without the prohibitive computations and shows comparable classification performance.

Method of recognition and pose estimation of multiple occurrences of multiple objects in visual images

Deepak Khosla, David J. Huber

This paper describes a system for multiple-object recognition and segmentation that (1) correctly identifies objects in a natural scene and provides a boundary for each object, and (2) can identify multiple occurrences of the same object (e.g., two identical objects side by side) in the scene from different training views. The algorithm is novel in that it employs statistical modeling to efficiently prune the features of an identified object from the scene without disturbing similar features elsewhere in the scene. This approach allows one to analyze complex natural scenes that contain multiple instances of the same object.

Bio-inspired 'surprise' for real-time change detection in visual imagery

David J. Huber, Deepak Khosla

This paper describes a fast and robust bio-inspired method for change detection in high-resolution visual imagery. It is based on the computation of surprise, a dynamic analogue to visual saliency or attention, and requires very little processing beyond the initial computation of saliency. This differs from prior surprise algorithms, which employ complex statistical models to describe the scene and detect anomalies. The algorithm can detect changes in a busy scene (e.g., a person crawling in bushes or a vehicle moving in a desert) in real time at typical video frame rates and can be used as a front end to a larger system that includes object recognition and scene understanding modules operating on the detected surprising regions.

Metal object detection using a forward-looking polarimetric ground penetrating radar

Ethan H.-Y. Chun, Cornell S. L. Chun

The usefulness of ground penetrating radar to detect landmines has been limited because of low signal-to-clutter ratios, which result in high false alarm rates. We describe a method using polarimetric radar to measure the polarizability angle, the relative phase, and the target magnitude. These three independent quantities are directly related to target shape and dimensions and are invariant with respect to rotation about the sensor-to-target axis. We built a forward-looking polarimetric ground penetrating radar and used it to collect data on an automobile disk brake rotor on the surface of dry sand and buried 1 inch under the surface of the sand. Measurements were made over a frequency range of 1.35-2.14 GHz. We also performed a computer simulation, using the Method of Moments, of a target roughly shaped like the rotor. For both the simulation and the measured data, the target magnitude exhibited interference patterns from scattering centers at the edges. The computer simulation revealed that a target has characteristic frequencies marking transitions from reflection being dominated by one polarization state to reflection being dominated by the orthogonal polarization state. For the rotor in uneven ground, the characteristic frequencies were found at the maxima of the polarizability angle. At these particular frequencies, the relative phase changes sign. The characteristic frequencies may be useful as a target signature.

Informative representation learning for automatic target recognition

Charles F. Hester, Kelly K. D. Risko

Informative representations are those representations that do more than reconstruct the data; they have information embedded implicitly in them and are compressive for utilization in real-time Automatic Target Recognition. In this paper we create methods for embedding information in subspace bases through sparsity and information theoretic measures. We present a theory of informative bases and demonstrate some practical examples of basis learning using infrared imagery. We will employ sparsity and entropy measures to drive the learning process to extract the most informative representation and will draw relations between informative representations and the quadratic correlation filter.

Time dependent moments for a nonstationary noise model

L. Cohen, A. Ahmad

We calculate the raw and central moments for a noise model that is the sum of elementary signals. In addition, we obtain expressions for the scintillation index. The cases where the elementary signals are real and where they are complex are both considered, and the relationships between the two are derived.
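
As a rough illustration of the quantity mentioned above, the sketch below estimates a scintillation index from intensity samples taken at one fixed time. It assumes the common definition ⟨I²⟩/⟨I⟩² − 1, since the abstract does not state the exact definition the paper uses. The function name `scintillation_index` is hypothetical.

```python
from statistics import fmean


def scintillation_index(intensities):
    """Estimate <I^2> / <I>^2 - 1 from intensity samples at a fixed time.

    Assumption: this is the common textbook definition; the paper's exact
    definition is not given in the abstract.
    """
    samples = list(intensities)
    if not samples:
        raise ValueError("at least one intensity sample is required")
    mean_i = fmean(samples)                   # first raw moment of intensity
    if mean_i == 0:
        raise ValueError("mean intensity is zero; the index is undefined")
    mean_i2 = fmean(x * x for x in samples)   # second raw moment of intensity
    return mean_i2 / (mean_i * mean_i) - 1.0
```

A constant intensity gives an index of zero, and larger values indicate stronger fluctuation relative to the mean.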

Waveform design with time and frequency constraints for optimal detection of elastic objects

Brandon M. Hamschin, Patrick J. Loughlin

We extend a recent method by Kay that maximizes the probability of detecting an elastic object in the presence of Gaussian reverberation and additive Gaussian interference. Kay's solution specifies the spectral magnitude for the optimal transmit waveform, and hence there is an unlimited number of "optimal" waveforms that can be transmitted, all with the same spectral magnitude but differing in time-domain characteristics such as duration and peak power. We extend Kay's approach to obtain a unique optimal waveform by incorporating time-domain constraints, via two optimization problem formulations. The first approach yields a waveform that preserves the optimal spectral magnitude while achieving the minimum temporal duration. The second, complementary approach considers temporal concentration rather than duration, and yields a waveform that, depending on the degree of concentration imposed, achieves the optimal spectral magnitude to varying degrees.

Probability distribution for intensity for a reverberation model

A. Ahmad, L. Cohen

We consider the probability distribution for intensity for a reverberation model that is the sum of elementary signals. The random aspect is that the initial spatial means of the elementary signals are chosen randomly. The intensity is calculated at fixed positions and times after the pulses evolve. A number of cases are solved analytically; for the others, we study the probability distribution by simulation.

Impact of range dependent propagation on classification of underwater objects by their sonar backscatter

Vikram Thiruneermalai Gomatam, Patrick Loughlin

Propagation effects, such as dispersion, absorption and multi-path, can adversely impact classification of underwater objects from their sonar backscatter. One approach to handling this problem is to extract features from the wave that are minimally affected by propagation effects, if possible. In previous work, a signal processing and feature extraction method was developed to obtain moment-like features that are invariant to dispersion and absorption. The method was developed based on linear wave propagation in range-independent environments. However, most ocean environments, especially littoral environments, exhibit range dependence. Deriving propagation invariant features for such environments remains an especially challenging task. In this paper, we explore the classification utility of the previously developed range-independent features in a range-dependent environment, via simulation of the propagation of the backscatter from two different cylinders in an ideal wedge. Our simulation results show that, while performance does drop off for increasing distances in a range-dependent environment, the previously developed invariant moment features do provide better classification performance than ordinary temporal moments.

Dismounted human detection at long ranges

Amy E. Bell

This research investigates the automatic detection of a dismounted human from a single image as a function of range. The histogram of oriented gradients (HOG) method provides the feature vector and a support vector machine performs the classification. This work presents, for the first time, an understanding of how HOG for human detection holds up as range increases. The results indicate that HOG remains effective even at long distances; for example, the average miss rate and false alarm rate were both kept to 5% for humans only 12 pixels high and 4-5 pixels wide. The impact of the amount and type of training data needed to achieve this long-range performance is examined.

Human body tracking using LMS-VSMM from monocular video sequences

Hong Han, Zhichao Chen, LC Jiao, et al.

A new model-based human body tracking framework with learning-based theory is proposed in this paper. The framework introduces a likely-model-set variable-structure multiple-model (LMS-VSMM) approach to track articulated human motion in monocular image sequences. The key joint points are selected as image features; they are detected automatically, and the undetected points are estimated with particle filters. Multiple motion models are learned from the CMU motion capture database with the ridge regression method to direct tracking. During tracking, the motion model currently in effect switches from one to another in order to match the present human motion mode. A motion model is activated according to the change in projection angle of the kinematic chain and the topological and compatibility relationships among the models. It is terminated according to its model probability. The likely-model-set scheme of VSMM is used to estimate the quaternion vectors of joint rotations. Experiments using two videos demonstrate that this tracking framework is efficient with respect to 3D pose and 2D projection.

Human detection based on curvelet transform and integrating heterogeneous features

Hong Han, Youjian Fan

A method for human detection based on the Curvelet transform and the integration of heterogeneous features is proposed in this paper. First, a descriptor based on the second-generation Curvelet transform (CTD) is proposed, which concatenates edge and texture feature vectors. To capture edge features, statistical measures such as energy, entropy, standard deviation, maximum value, and contrast are computed from blocks partitioned from the sub-bands of all scales and then concatenated. To obtain texture features, the lowest-frequency sub-band coefficients are partitioned into overlapping blocks, and four co-occurrence matrices are computed for each block. Descriptors such as angular second moment, contrast, correlation, sum of variance, sum of average, and entropy are computed from the co-occurrence matrices and concatenated to form the texture feature vector. A method integrating three feature extraction methods, namely Histogram of Oriented Gradients (HOG), Granularity-tunable Gradients Partition descriptors (GGP), and CTD, is then proposed for human detection. The Computational Cost Normalized classification Margin is used to determine the order in which the features are evaluated. Experimental results on the INRIA and MIT human databases show that CTD and the integrated heterogeneous feature method increase detection accuracy compared to HOG and GGP.

Detecting and tracking people and their body parts in infrared

Kai Jüngling, Michael Arens

In most of today's surveillance tasks, people's actions are the focus of attention. A prerequisite for action interpretation is stable tracking of people to build meaningful trajectories. In surveillance applications specifically, not only are trajectories at the agent level of interest, but interpretation at the level of limbs also provides important information for more sophisticated action recognition tasks. In this paper, we present an integrated approach to detect and track people and their body parts in thermal imagery. For that, we introduce a generic detection and tracking strategy that employs only local image features and thus works independently of underlying video data specifics like color information - making it applicable to both visible and infrared data. In addition, we show how this approach serves to detect a person's body parts and extract trajectories that can serve as input for further interpretation.

An implicit shape model based approach to identify armed persons

Stefan Becker, Kai Jüngling

In addition to detecting and tracking persons via video surveillance in public spaces like airports and train stations, another important aspect of situation analysis is the appearance of objects in the periphery of a person. Not only from a military perspective, an unidentified armed person can, in certain environments, be an indicator of a potential threat. In order to become aware of an unidentified armed person and to initiate counteractive measures, the ability to identify persons carrying weapons is needed. In this paper we present a classification approach that fits into an Implicit Shape Model (ISM) based person detection and is capable of differentiating between unarmed persons and persons in an aiming body posture. The approach relies on SIFT features and is thus completely independent of sensor-specific features that might only be perceivable in the visible spectrum. For person representation and detection, a generalized appearance codebook is used. Compared to a stand-alone person detection strategy with ISM, an additional training step is introduced that allows interpretation of a person hypothesis delivered by the ISM. During training, the codebook activations and positions of participating features are stored for the desired classes, in this case persons in an aiming posture and unarmed persons. With the stored information, one is able to calculate weight factors for every feature participating in a person hypothesis in order to derive a specific classification model. The introduced model is validated using an infrared dataset that shows persons in aiming and non-aiming body postures from different angles.

A Bayesian approach to activity detection in video using multi-frame correlation filters

Abhijit Mahalanobis, Robert Stanfill, Kenny Chen

Multi-frame correlation filters have recently been reported in the literature for the detection of moving objects. Introduced by Kerekes and Kumar [5], this technique uses a motion model to accumulate evidence over time in a Bayesian framework to improve the receiver operating characteristic (ROC) curve. In this paper, we generalize the approach to detect not only objects but also their activities, by using separate motion models to represent each activity. We also discuss results of preliminary simulations using a publicly released aerial data set to illustrate the concept.

Practical optimal processing in hyperdimensional spaces via domain-reducing mappings

Manuel Fernández, Tom Aridgides, Firooz Sadjadi

Modern multi- and hyper-dimensional processing problems, such as those encountered in many applications involving image processing, adaptive beamforming, hyperspectral IR detection, medical imaging, STAP, Volterra calibration, etc., are numerically very demanding due to the vast amounts of data involved. Further compounding the situation is the fact that many such applications require estimating a set of parameters of interest that may be so large that the data available, despite its massiveness, may not be enough to properly calculate the pertinent statistics. The approach presented here addresses such problems by projecting the available data - both modeled and measured - into a reduced-dimensionality domain where the estimation process is then performed. This strategy is extremely useful when the parameter set is not the final objective per se, but rather just a means to an end (e.g., a classification decision, detecting a signal of interest, etc.). In particular, we will concentrate on the case of finding the optimal projector for a given problem of interest where a priori information may be available. This means that the reduced-dimensionality domain must be selected as one incorporating and preserving that knowledge. We explore the use of Krylov subspaces to achieve this end, as they inherently allow the inclusion of such data. In order to maintain a semblance of practicality, we have chosen to present our developments from the perspective of the adaptive processing (filtering) problem, as this makes our presentation applicable to the wide range of optimization problems that can be addressed via a Least Squares formulation. Regularization issues, as well as extensions to non-linear filters (Taylor/Volterra/polynomial), will also be presented so as to provide additional ideas regarding the usefulness and malleability of our methods.

Baseline processing pipeline for fast automatic target detection and recognition in airborne 3D ladar imagery

Simon Roy, Jean Maheux

It has been proven that 3D ladar imagery has a strong potential for automatic target detection (ATD) and automatic target recognition (ATR); ladars enhance target information, which may then be exploited to yield higher recognition rates and lower false alarms. Although numerous techniques have been proposed for both 3D ATD and 3D ATR, no single approach has proven capable of systematically outperforming all other techniques for every possible scenario. In this context, this paper describes a set of fast 3D ATD/ATR algorithms designed to process cooperative targets in airborne 3D ladar imagery. This algorithmic chain consists of four modules: detection, segmentation, classification and recognition. In each module, fast algorithms were implemented, some of which stem from open literature while others were designed in-house. The purpose of this algorithmic chain is to provide a baseline approach for efficient processing of simple scenarios. The ultimate goal of this work is to characterize and compare algorithms with respect to increasingly complex scenarios, in hopes of progressing towards an adaptive processing pipeline for context-driven 3D ATD/ATR. In this paper, the four modules of the baseline processing pipeline are first described. Preliminary test results obtained with real airborne ladar imagery are then presented, in which fast and accurate 3D ATD/ATR is performed with a library of 20 scanned vehicles. Finally, a demonstration is presented to illustrate how this baseline approach may be expanded to tackle more complex scenarios, such as non-cooperative targets concealed under vegetation.

Integrating LPR with CCTV systems: problems and solutions

David Bissessar, Dmitry O. Gorodnichy

A new generation of high-resolution surveillance cameras makes it possible to apply video processing and recognition techniques on live video feeds for the purpose of automatically detecting and identifying objects and events of interest. This paper addresses a particular application of detecting and identifying vehicles passing through a checkpoint. This application is of interest to border services agencies and is also related to many other applications. With many commercial automated License Plate Recognition (LPR) systems available on the market, some of which are available as a plug-in for surveillance systems, this application still poses many unresolved technological challenges, the main two of which are: i) multiple and often noisy license plate readings generated for the same vehicle, and ii) failure to detect a vehicle or license plate altogether when the license plate is occluded or not visible. This paper presents a solution to both of these problems. A data fusion technique based on the Levenshtein distance is used to resolve the first problem. An integration of a commercial LPR system with the in-house built Video Analytic Platform is used to solve the latter. The developed solution has been tested in field environments and has been shown to yield a substantial improvement over standard off-the-shelf LPR systems.
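
The abstract does not spell out the fusion rule, so the following is only a minimal sketch of how the Levenshtein distance could be used to reconcile several noisy readings of one plate. It assumes a simple medoid rule: choose the reading with the smallest total edit distance to all the others. The function names `levenshtein` and `consensus_reading` and the sample readings are hypothetical, and this is not claimed to be the authors' technique.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def consensus_reading(readings):
    """Assumed fusion rule: return the reading closest, in total, to all others."""
    if not readings:
        raise ValueError("at least one reading is required")
    return min(readings, key=lambda r: sum(levenshtein(r, other) for other in readings))


if __name__ == "__main__":
    # Hypothetical readings of the same plate from consecutive frames.
    print(consensus_reading(["ABC123", "ABC128", "A8C123"]))
```

A medoid is used here because it always returns one of the observed readings; a character-wise vote over aligned readings would be another reasonable choice.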

Anomaly detection in hyperspectral imagery using stable distribution

S. Mercan, Mohammad S. Alam

In hyperspectral imaging applications, the background generally exhibits clearly non-Gaussian, impulsive behavior, where valuable information lies in the tail. In this paper, we propose a new technique in which the background is modeled using the stable distribution for robust detection of outliers. The outliers of the distribution can be considered potential anomalies or regions of interest (ROIs). We effectively utilize the stable model for detecting targets in impulsive hyperspectral data. To decrease the false alarm rate, it is necessary to compare the ROI with the known reference using a suitable technique, such as the Euclidean distance. Modeling data with the stable distribution compensates for a drawback of the Gaussian model, which is not well suited for describing signals with impulsive behavior. In addition, thresholding is considered to avoid misclassification of targets. Test results using real-life hyperspectral image datasets are presented to verify the effectiveness of the proposed technique.

Multisensor ISR in geo-registered contextual visual dataspace (CVD)

Kyungnam Kim, Yuri Owechko, Arturo Flores, et al.

Current ISR (Intelligence, Surveillance, and Reconnaissance) systems require an analyst to observe each video stream, which will result in analyst overload as systems such as ARGUS or Gorgon Stare come into use with many video streams generated by those sensor platforms. Full exploitation of these new sensors is not possible using today's one video stream per analyst paradigm. The Contextual Visual Dataspace (CVD) is a compact representation of real-time updating of dynamic objects from multiple video streams in a global (geo-registered/annotated) view that combines automated 3D modeling and semantic labeling of a scene. CVD provides a single integrated view of multiple automatically-selected video windows with 3D context. For a proof of concept, a CVD demonstration system performing detection, localization, and tracking of dynamic objects (e.g., vehicles and pedestrians) in multiple infrastructure camera views was developed using a combination of known computer vision methods, including foreground detection by background subtraction, ground-plane homography mapping, and appearance model-based tracking. Automated labeling of fixed and moving objects enables intelligent context-aware tracking and behavior analysis and will greatly improve ISR capabilities.

Integration of low level and ontology derived features for automatic weapon recognition and identification

Nikolay Metodiev Sirakov, Sang Suh, Salvatore Attardo

This paper presents a further step in research toward the development of a quick and accurate weapons identification methodology and system. A basic stage of this methodology is the automatic acquisition and updating of a weapons ontology as a source for deriving high-level weapons information. The present paper outlines the main ideas used to approach this goal. In the next stage, a clustering approach is suggested on the basis of a hierarchy of concepts. An inherent slot of every node of the proposed ontology is a low-level features vector (LLFV), which facilitates the search through the ontology. Part of the LLFV is the information about the object's parts. To partition an object, a new approach is presented that is capable of defining the object's concavities, which are used to mark the end points of weapon parts, considered as convexities. Further, an existing matching approach is optimized to determine whether an ontological object matches the objects from an input image. Objects from the derived ontological clusters are considered for the matching process. Image resizing is studied and applied to decrease the runtime of the matching approach and to investigate its rotational and scaling invariance. A set of experiments is performed to validate the theoretical concepts.

View morphing using linear prediction of sub-space features

Abhijit Mahalanobis, Phil Berkowitz, Mubarak Shah

We present a mathematical technique for estimating new perspective views of an object from a single image. Unlike traditional graphics or ray tracing methods, our approach treats the view-morphing problem as a 2-D linear prediction process. We first estimate the prediction parameters in a reduced dimensional space using features extracted from "training" images of the object. Given an arbitrary view of the object, the features of the new view are linearly predicted, and the morphed image of the object is reconstructed from them. The proposed approach can be used for rapidly incorporating new objects in the knowledge base of a computer vision system and may have advantages in low-contrast situations where it is difficult to establish correspondence between sample views.

Redefining automatic target recognition (ATR) performance standards

Donald Waagen, Charles Hester, Ben Schmid, et al.

Present descriptors for Automatic Target Recognition (ATR) performance are inadequate for comparing algorithms that are purported to be a solution to the problem. The use of receiver operating characteristic (ROC) curves is a de facto standard, but they do not communicate several key performance measures, including (i) intrinsic separation between classes in the input space, (ii) the efficacy of the mapping induced by the algorithm, (iii) the complexity of the algorithmic mapping, and (iv) a measure of the generalization of the proposed solution. Previous work by Sims et al. [2, 5] has addressed the distortion of the evaluation sets to indicate an algorithm's capability (or lack thereof) for generalization and handling of unspecified cases. This paper rethinks the summary statistics used for understanding the performance of a solution. We propose new approaches for solution characterization, allowing algorithm performance comparison in an equitable and insightful manner. This paper proffers some examples and suggests directions for new work from the community in this field.

Analytic performance model for grayscale quantization in the presence of additive noise

Adam R. Nolan, G. Steven Goley

Synthetic aperture radar (SAR) exploitation algorithms typically rely on the use of derived features to represent the target. These features are chosen to discriminate between target classes while exhibiting robustness to noise and calibration artifacts. One of the challenges in working with such features is understanding when this assumption of robustness is no longer valid. In this paper, we focus on characterizing the performance of the grayscale quantization feature in the presence of additive noise. We derive an approximation for the variance of the intraclass distance by treating the additive noise as an independent and identically distributed (iid) process. The analytic model is contrasted with empirical results for a two-class problem.

Variability and robustness of scatterers in HRR/ISAR ground target data and its influence on the ATR performance

R. Schumacher, H. Schimpf, J. Schiller

The most challenging problem in Automatic Target Recognition (ATR) is the extraction of robust and independent target features that describe the target unambiguously. These features have to be robust and invariant in several senses: over time, across aspect views (azimuth and elevation angle), under target motion (translation and rotation), and across different target variants. Especially for ground moving targets in military applications, irregular target motion is typical, so that a strong variation of the backscattered radar signal with azimuth and elevation angle makes the extraction of stable and robust features most difficult. For ATR based on High Range Resolution (HRR) profiles and/or Inverse Synthetic Aperture Radar (ISAR) images, it is crucial that the reference dataset consists of stable and robust features, which depend on, among other things, the target aspect and depression angle. It is therefore important to find an adequate data grid for efficient data coverage in the reference dataset for ATR. In this paper, the variability of the backscattered radar signals of target scattering centers is analyzed for different HRR profiles and ISAR images from turntable datasets of ground targets measured under controlled conditions. In particular, the dependency of the features on the elevation angle is analyzed with regard to ATR of large strip SAR data with a large range of depression angles, using available (I)SAR datasets as reference. The robustness of these scattering centers is analyzed by extracting their amplitude, phase, and position. For this purpose, turntable measurements under controlled conditions were performed on an artificial military reference object called STANDCAM. Measures of variability, similarity, robustness, and separability of the scattering centers are defined. The dependency of the scattering behaviour on azimuth and elevation variations is analyzed.
Additionally, generic types of features (geometrical, statistical), which can be derived especially from (I)SAR images, are applied to the ATR task. The dependence of individual feature values, as well as of the feature statistics, on aspect (i.e., azimuth and elevation) is then presented. The Kolmogorov-Smirnov distance is used to show how the feature statistics are influenced by varying elevation angles. Finally, confusion matrices are computed for the STANDCAM target across all eleven elevation angles. This helps to assess the robustness of ATR performance under aspect angle deviations between training set and test set.
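
For readers unfamiliar with the Kolmogorov-Smirnov distance, the sketch below computes the standard two-sample statistic: the largest absolute gap between two empirical cumulative distribution functions. Treating the two inputs as values of one feature collected at two elevation angles is an assumption made for illustration, and the function name `ks_distance` is hypothetical.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    # Both empirical CDFs are step functions that change only at observed
    # values, so the largest gap is attained at one of those values.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )
```

The result lies between 0 (identical empirical distributions) and 1 (no overlap), so a value that grows with the elevation difference would suggest the feature is sensitive to elevation.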

The influence of multipath on ship ATR performance

Hartmut Schimpf

Based on measurements of a ship at 17 GHz and on several simulated ships at 35 GHz, it is demonstrated how multipath changes the range profiles that form a basis for the construction of ATR features for ship classification. The fluctuation of range profiles leads to a corresponding fluctuation of feature values, which makes it difficult to define stable test feature vectors and reliable feature references in the training stage.

A comparison of machine learning methods for target recognition using ISAR imagery

Karen D. Uttecht, Cindy X. Chen, Jason C. Dickinson, et al.

The ability to accurately classify targets is critical to the performance of automated/assisted target recognition (ATR) algorithms. Supervised machine learning methods have been shown to classify data in a variety of disciplines with a high level of accuracy. The performance of machine learning techniques in classifying ground targets in two-dimensional radar imagery was compared. Three machine learning models were compared to determine which classifies targets with the highest accuracy: decision tree, Bayes, and support vector machine. X-band signature data acquired in scale-model compact ranges were used. ISAR images were compared using several techniques, including two-dimensional cross-correlation and pixel-by-pixel comparison of the image against a reference image. The highly controlled nature of the collected imagery was ideally suited for the inter-comparison of the machine learning models. The resulting data from the image comparisons were used as the feature space for testing the accuracy of the three types of classifiers. Classifier accuracy was determined using N-fold cross-validation.
