Proceedings Volume 9090

Automatic Target Recognition XXIV

View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 24 June 2014
Contents: 7 Sessions, 24 Papers, 0 Presentations
Conference: SPIE Defense + Security 2014
Volume Number: 9090

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 9090
  • Advanced Concept I
  • Advanced Concept II
  • Active Sensor Processing
  • Advanced Sensor Processing II
  • Advanced Algorithms
  • Facial and Activity Recognition
Front Matter: Volume 9090
This PDF file contains the front matter associated with SPIE Proceedings Volume 9090, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
Advanced Concept I
Unification of automatic target tracking and automatic target recognition
The subject being addressed is how an automatic target tracker (ATT) and an automatic target recognizer (ATR) can be fused together so tightly and so well that their distinctiveness becomes lost in the merger. This has historically not been the case outside of biology and a few academic papers. The biological model of ATT∪ATR arises from dynamic patterns of activity distributed across many neural circuits and structures (including the retina). The information that the brain receives from the eyes is “old news” at the time that it receives it. The eyes and brain forecast a tracked object’s future position rather than relying on received retinal position. Anticipation of the next moment, building up a consistent perception, is accomplished under difficult conditions: motion (eyes, head, body, scene background, target) and processing limitations (neural noise, delays, eye jitter, distractions). Not only does the human vision system surmount these problems, but it has innate mechanisms to exploit motion in support of target detection and classification. Biological vision doesn’t normally operate on snapshots. Feature extraction, detection, and recognition are spatiotemporal. When vision is viewed as a spatiotemporal process, target detection, recognition, tracking, event detection, and activity recognition do not seem as distinct as they are in current ATT and ATR designs. They appear as similar mechanisms taking place at varying time scales. A framework is provided for unifying ATT and ATR.
Optimized sparse representation-based classification method with weighted block and maximum likelihood model
Jun He, Tian Zuo, Bo Sun, et al.
This paper aims to apply sparse representation-based classification (SRC) to face recognition with disguise or illumination variation. Having analyzed the characteristics of general object recognition and the principle of the SRC classifier, the authors focus on evaluating blocks of a probe sample and propose an optimized SRC method based on position-preserving weighted blocks and a maximum likelihood model. The principle and implementation of the proposed method are presented, along with experiments on the Yale and AR face databases. The experimental results show that the proposed optimized SRC method performs better than existing methods.
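As a rough illustration of the residual rule that SRC builds on (not the weighted-block or maximum likelihood extensions proposed in this paper), the following sketch codes a probe over a dictionary of training faces and assigns the class with the smallest class-wise reconstruction residual. The data shapes, the Lasso solver, and the penalty value are assumptions.

```python
# Minimal sketch of plain SRC; the paper's position-preserving weighted
# blocks and maximum likelihood model are NOT reproduced here.
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(D, labels, y, alpha=0.01):
    """D: (d, n) dictionary of vectorized training faces (one per column),
    labels: (n,) class label per column, y: (d,) vectorized probe image."""
    # l1-regularized sparse code for the probe over the whole dictionary
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    coder.fit(D, y)
    x = coder.coef_
    # class-wise reconstruction residuals: keep only one class's coefficients
    residuals = {}
    for c in np.unique(labels):
        xc = np.where(labels == c, x, 0.0)
        residuals[c] = np.linalg.norm(y - D @ xc)
    return min(residuals, key=residuals.get)
```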
Maritime vessel recognition in degraded satellite imagery
Katie Rainey, Shibin Parameswaran, Josh Harguess
When object recognition algorithms are put to practice on real-world data, they face hurdles not always present in experimental situations. Imagery fed into recognition systems is often degraded by noise, occlusions, or other factors, and a successful recognition algorithm must be accurate on such data. This work investigates the impact of data degradations on an algorithm for the task of ship classification in satellite imagery by imposing such degradation factors on both training and testing data. The results of these experiments provide lessons for the development of real-world applications for classification algorithms.
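The kind of degradation imposed on the ship chips can be mimicked with a short routine like the one below; the noise level, blur width, and occlusion size here are arbitrary choices for illustration, not the study's protocol.

```python
# Illustrative degradations (noise, blur, occlusion) applied to an image chip.
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(img, noise_sigma=0.05, blur_sigma=2.0, occlude_frac=0.2, rng=None):
    rng = np.random.default_rng(rng)
    out = img.astype(float) / img.max()               # normalize to [0, 1]
    out = gaussian_filter(out, sigma=blur_sigma)      # optical blur
    out = out + rng.normal(0, noise_sigma, out.shape) # additive sensor noise
    h, w = out.shape[:2]
    oh, ow = int(h * occlude_frac), int(w * occlude_frac)
    r, c = rng.integers(0, h - oh), rng.integers(0, w - ow)
    out[r:r + oh, c:c + ow] = 0.0                     # rectangular occlusion
    return np.clip(out, 0.0, 1.0)
```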
Application of an image feature network-based object recognition algorithm to aircraft detection and classification
A network created from the distance values representing the spacing between points identified by an image feature detection algorithm can be utilized for object classification. This paper presents work on the application of this algorithm to the problem of aircraft presence detection and classification. It considers algorithm performance across a variety of scenarios, including instances where the sky has different characteristics, detection and characterization at different levels of image resolution, and detection and characterization where multiple aircraft are present in a single frame. An extension to the base algorithm, which determines the orientation of a detected aircraft, is also presented.
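A minimal sketch of the underlying idea (detect interest points, then describe the image by the spacings between them) is given below. The Harris corner detector, the scale normalization, and the distance histogram are stand-in choices, not the paper's specific network construction.

```python
# Describe an image by the pairwise distances between detected feature points.
import numpy as np
from scipy.spatial.distance import pdist
from skimage.feature import corner_harris, corner_peaks

def distance_network_descriptor(gray_image, n_bins=32):
    pts = corner_peaks(corner_harris(gray_image), min_distance=5)  # (k, 2) rows/cols
    d = pdist(pts.astype(float))          # all pairwise point spacings
    d = d / d.max()                       # normalize out global scale
    hist, _ = np.histogram(d, bins=n_bins, range=(0.0, 1.0), density=True)
    return hist                           # fixed-length feature for a classifier
```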
Fusion-based approach for long-range night-time facial recognition
Robert B. Martin, Mikhail Sluch, Kristopher M. Kafka, et al.
Long range identification using facial recognition is being pursued as a valuable surveillance tool. The capability to perform this task covertly and in total darkness greatly enhances the operators’ ability to maintain a large distance between themselves and a possible hostile target. An active-SWIR video imaging system has been developed to produce high-quality long-range night/day facial imagery for this purpose. Most facial recognition techniques match a single input probe image against a gallery of possible match candidates. When resolution, wavelength, and uncontrolled conditions reduce the accuracy of single-image matching, multiple probe images of the same subject can be matched to the watch-list and the results fused to increase accuracy. If multiple probe images are acquired from video over a short period of time, the high correlation between the images tends to produce similar matching results, which should reduce the benefit of the fusion. In contrast, fusing matching results from multiple images acquired over a longer period of time, where the images show more variability, should produce a more accurate result. In general, image variables could include pose angle, field-of-view, lighting condition, facial expression, target to sensor distance, contrast, and image background. Long-range short wave infrared (SWIR) video was used to generate probe image datasets containing different levels of variability. Face matching results for each image in each dataset were fused, and the results compared.
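The fusion step described here can be pictured with a toy score-level example: several probe frames of one subject are matched against the gallery and their scores are combined before ranking. The mean-rule fusion below is a placeholder; the paper's point is the comparison between highly correlated (short-interval) and more variable (long-interval) probe sets.

```python
# Toy score-level fusion of several probe frames against a watch-list gallery.
import numpy as np

def fuse_and_rank(score_matrix):
    """score_matrix: (n_probes, n_gallery) similarity scores, one row per
    probe image of the same subject. Returns gallery indices, best first."""
    fused = score_matrix.mean(axis=0)     # simple mean-rule fusion
    return np.argsort(fused)[::-1]        # descending fused similarity

# e.g. fuse_and_rank(np.array([[0.2, 0.7, 0.1], [0.3, 0.6, 0.2]])) -> [1, 0, 2]
```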
Advanced Concept II
Nonstationary noise propagation with sources
J. S. Ben-Benjamin, L. Cohen
We discuss a number of topics relevant to noise propagation in dispersive media. We formulate the problem of pulse propagation with a source term in phase space and show that a four dimensional Wigner distribution is required. The four dimensional Wigner distribution is that of space and time and also wavenumber and frequency. The four dimensional Wigner spectrum is equivalent to the space-time autocorrelation function. We also apply the quantum path method to improve the phase space approximation previously obtained. In addition we discuss motion in a Snell’s law medium.
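For reference, one common convention for the four-dimensional (space-time and wavenumber-frequency) Wigner distribution of a field u(x, t) is written below; the normalization and sign conventions are assumptions and may differ from the authors' formulation.

```latex
% A standard space-time Wigner distribution convention (not taken from the paper).
W(x,k,t,\omega) = \frac{1}{(2\pi)^2}
  \iint u^{*}\!\left(x-\tfrac{\mu}{2},\, t-\tfrac{\tau}{2}\right)
        u\!\left(x+\tfrac{\mu}{2},\, t+\tfrac{\tau}{2}\right)
        e^{-ik\mu + i\omega\tau}\, d\mu\, d\tau
```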
Passive detection, characterization, and localization of multiple LFMCW LPI signals
Brandon Hamschin, John Clancy, Mike Grabbe, et al.
A method for passive Detection, Characterization, and Localization (DCL) of multiple low-power, Linear Frequency Modulated Continuous Wave (LFMCW) (i.e., Low Probability of Intercept (LPI)) signals is proposed. In contrast to other detection and characterization approaches, such as those based on the Wigner-Ville Transform (WVT) [1] or the Wigner-Ville Hough Transform (WVHT) [2], our approach does not begin with a parametric model of the received signal that is specified directly in terms of its LFMCW constituents. Rather, we analyze the signal over time intervals that are short, non-overlapping, and contiguous by modeling it within these intervals as a sum of sinusoidal (i.e., harmonic) components with unknown frequencies, deterministic but unknown amplitudes, unknown order (i.e., number of harmonic components), and unknown noise autocorrelation function. Using this model of the signal, which we refer to as the Short-Time Harmonic Model (STHM), we implement a detection statistic based on Thompson's method for harmonic analysis [3], which leads to a detection threshold that is a function of the false alarm probability P_FA and not of the noise properties. By doing so we reliably detect the presence of multiple LFMCW signals in colored noise without the need for prewhitening, efficiently estimate (i.e., characterize) their parameters, provide estimation error variances for a subset of these parameters, and produce Time-of-Arrival (TOA) estimates that can be used to estimate the geographical location of (i.e., localize) each LFMCW source. Finally, by using the entire time series we refine these parameter estimates by using them as initial conditions to the Maximum Likelihood Estimator (MLE), which was originally given in [1] and later found in [2] to be too computationally expensive for multiple LFMCW signals if accurate initial conditions were not available to limit the search space. We demonstrate the performance of our approach via simulation.
Active Sensor Processing
New experiments in inverse synthetic aperture radar image exploitation for maritime surveillance
This paper provides a summary of a recent experimental study in using signatures obtained via polarimetric inverse synthetic aperture radar (ISAR) for classification of small boats in littoral environments. The first step in discerning the intention of any small boat is to classify and fingerprint it so it can be observed over an extended period of time. Currently, ISAR techniques are used for large ship classification. Large ships tend to have a rich set of discernible features, making classification straightforward. However, small boats rarely have a rich set of discernible features, and are more vulnerable to motion-based range migration that leads to severe signature blurring, thus making classification more challenging. The emphasis of this paper is on the development and use of several enhancement methods for polarimetric ISAR imagery of small boats, followed by a target classification study whereby the enhanced signatures of two boats were used to extract several separability metrics to ascertain the effectiveness of these distance measures for target classification.
Ladar ATR via probabilistic open set techniques
Target recognition algorithms trained using finite sets of target and confuser data result in classifiers limited by the training set. Algorithms trained under closed set assumptions do not account for the infinite universe of confusers found in practice. In contrast, classification algorithms developed under open set assumptions label inputs not present in the training data as unknown instead of assigning the most likely class. We present an approach to open set recognition that utilizes class posterior estimates to determine probability thresholds for classification. This is accomplished by first training a support vector machine (SVM) in a 1-vs-all configuration on a training dataset containing only target classes. A validation set containing only class data belonging to the training set is used to iteratively determine appropriate posterior probability thresholds for each target class. The testing dataset, which contains targets present in the training data as well as several confuser classes, is first classified by the 1-vs-all SVM. If the estimated posterior for an input falls below the threshold, the target is labeled as unknown. Otherwise, it is labeled with the class resulting from the SVM decision. We apply our method to automatic target recognition (ATR) of ladar range images and compare its performance to current open set and closed set recognition techniques.
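The posterior-threshold idea can be sketched as below: a 1-vs-all SVM with probability estimates, plus per-class thresholds chosen on a validation set of known classes. The threshold selection shown (a fixed low percentile of validation posteriors) is a simplification of the iterative procedure described in the abstract, and all parameter values are assumptions.

```python
# Sketch of open-set classification via posterior thresholds on a 1-vs-all SVM.
import numpy as np
from sklearn.svm import SVC

def fit_open_set(X_train, y_train, X_val, y_val, percentile=5):
    clf = SVC(kernel="rbf", probability=True, decision_function_shape="ovr")
    clf.fit(X_train, y_train)
    post = clf.predict_proba(X_val)
    y_val = np.asarray(y_val)
    thresholds = {}
    for i, c in enumerate(clf.classes_):
        # threshold: low percentile of the own-class posterior on known data
        thresholds[c] = np.percentile(post[y_val == c, i], percentile)
    return clf, thresholds

def predict_open_set(clf, thresholds, X):
    post = clf.predict_proba(X)
    labels = []
    for p in post:
        c = clf.classes_[np.argmax(p)]
        # below the class threshold -> declare the input unknown (open set)
        labels.append(c if p.max() >= thresholds[c] else "unknown")
    return labels
```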
Radar target identification using various nearest neighbor techniques
Decision-theoretic, distance-based methods have long been used for identifying unknown non-cooperative radar targets from their Radar Cross Section (RCS). This study revisits the subject using the recently developed Large Margin Nearest Neighbor (LMNN) technique in addition to other traditional nearest neighbor methods. Radar target recognition has been hampered by two performance-limiting issues: 1) azimuth ambiguity (and/or erroneous estimation of target azimuth) and 2) the presence of extraneous scatterers along the target. This study examines these different scenarios and highlights any benefits that LMNN may add to the radar target classification problem.
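The traditional baseline the study compares against is essentially a nearest-neighbor rule on RCS-derived feature vectors, as in the short sketch below; the feature extraction from raw RCS sweeps and the learned-metric (LMNN) variant are not shown, and k and the metric are assumptions.

```python
# Baseline nearest-neighbor classification of RCS feature vectors.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def knn_rcs_classifier(rcs_train, labels_train, k=3):
    """rcs_train: (n_samples, n_features) RCS-derived features per look."""
    knn = KNeighborsClassifier(n_neighbors=k, metric="euclidean")
    knn.fit(rcs_train, labels_train)
    return knn  # knn.predict(rcs_test) gives target identities
```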
Automatic detection of pulsed radio frequency (RF) targets using sparse representations in undercomplete learned dictionaries
Automatic classification of transitory or pulsed radio frequency (RF) signals is of particular interest in persistent surveillance and remote sensing applications. Such transients are often acquired in noisy, cluttered environments, and may be characterized by complex or unknown analytical models. Conventional representations using orthogonal bases, e.g., Short Time Fourier and Wavelet Transforms, can be suboptimal for classification of transients, as they provide a rigid tiling of the time-frequency space, and are not specifically designed for a particular target signal. They do not usually lead to sparse decompositions, and require separate feature selection algorithms, creating additional computational overhead. We propose a fast, adaptive classification approach based on non-analytical dictionaries learned from data. Our goal is to detect chirped pulses from a model target emitter in poor signal-to-noise and varying levels of simulated background clutter conditions. This paper builds on our previous RF classification work, and extends it to more complex target and background scenarios. We use a Hebbian rule to learn discriminative RF dictionaries directly from data, without relying on analytical constraints or additional knowledge about the signal characteristics. A pursuit search is used over the learned dictionaries to generate sparse classification features in order to identify time windows containing a target pulse. We demonstrate that learned dictionary techniques are highly suitable for pulsed RF analysis and present results with varying background clutter and noise levels. The target detection decision is obtained in almost real-time via a parallel, vectorized implementation.
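A compressed sketch of the sparse-feature detection loop is given below; a generic dictionary learner stands in for the paper's Hebbian learning rule, and the window length, atom count, and sparsity level are arbitrary.

```python
# Learn a dictionary from RF windows, then score windows by how well a few
# atoms reconstruct them (better fit -> more likely to contain a target pulse).
import numpy as np
from sklearn.decomposition import DictionaryLearning

def learn_dictionary(training_windows, n_atoms=64, sparsity=5):
    """training_windows: (n_windows, window_len) snippets of the RF stream."""
    dl = DictionaryLearning(n_components=n_atoms,
                            transform_algorithm="omp",
                            transform_n_nonzero_coefs=sparsity)
    dl.fit(training_windows)
    return dl

def window_scores(dl, windows):
    codes = dl.transform(windows)             # sparse codes via OMP pursuit
    recon = codes @ dl.components_            # reconstruction from few atoms
    err = np.linalg.norm(windows - recon, axis=1)
    return -err  # higher score = better fit to the learned (target) dictionary
```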
Autonomous underwater pipeline monitoring navigation system
Byrel Mitchell, Nina Mahmoudian, Guy Meadows
This paper details the development of an autonomous motion-control and navigation algorithm for an underwater autonomous vehicle, the Ocean Server IVER3, to track long linear features such as underwater pipelines. As part of this work, the Nonlinear and Autonomous Systems Laboratory (NAS Lab) developed an algorithm that utilizes inputs from the vehicles state of the art sensor package, which includes digital imaging, digital 3-D Sidescan Sonar, and Acoustic Doppler Current Profilers. The resulting algorithms should tolerate real-world waterway with episodic strong currents, low visibility, high sediment content, and a variety of small and large vessel traffic.
Advanced Sensor Processing II
Time-frequency filtering for classifying targets in nonstationary clutter
Vikram Thiruneermalai Gomatam, Patrick Loughlin
Classifying underwater targets from their sonar backscatter is often complicated by induced or self-noise (i.e., clutter, reverberation) arising from the scattering of the sonar pulse from non-target objects. Because clutter is inherently nonstationary, and because the propagation environment can induce nonstationarities as well, in addition to any nonstationarities or time-varying spectral components of the target echo itself, a joint phase space approach to target classification has been explored. In this paper, we apply a previously developed minimum mean square time-frequency spectral estimation method to design a bank of time-frequency filters from training data to distinguish targets from clutter. The method is implemented in the ambiguity domain in order to reduce computational requirements. In this domain, the optimal filter (more commonly called a “kernel” in the time-frequency literature) multiplies the ambiguity function of the received signal, and then the mean squared distance to each target class is computed. Simulations demonstrate that the class-specific optimal kernel better separates each target from the clutter and other targets, compared to a simple mean-squared distance measure with no kernel processing.
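The ambiguity-domain processing can be pictured with the simplified discrete sketch below: form the lag product, FFT over time to reach the (doppler, lag) plane, multiply by a kernel, and score against a class template. The asymmetric lag convention is a simplification, and the kernel and class template are assumed to have been designed from training data (the MMSE design itself is not reproduced).

```python
# Discrete sketch of ambiguity-domain kernel filtering and template scoring.
import numpy as np

def ambiguity(s, max_lag):
    n = len(s)
    A = np.zeros((2 * max_lag + 1, n), dtype=complex)
    for i, tau in enumerate(range(-max_lag, max_lag + 1)):
        prod = np.zeros(n, dtype=complex)        # lag product s[t+tau] * conj(s[t])
        if tau >= 0:
            prod[:n - tau] = s[tau:] * np.conj(s[:n - tau])
        else:
            prod[-tau:] = s[:n + tau] * np.conj(s[-tau:])
        A[i] = np.fft.fft(prod)                  # FFT over time -> doppler axis
    return A

def kernel_distance(s, template_A, kernel, max_lag):
    A = ambiguity(s, max_lag) * kernel           # kernel multiplies the ambiguity fn
    return np.linalg.norm(A - template_A)        # mean-squared-style class distance
```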
Instantaneous frequency and the Wigner-Gabor signal
Determining the amplitude and phase of a signal is important in many areas of science and engineering. The derivative of the phase is typically called the "instantaneous frequency," which in principle mathematically describes (and ideally coincides with) the common physical experiences of variable-frequency phenomena, such as a siren. However, there is an infinite number of different amplitude-phase pairs that will all generate the same real signal, and hence there is an unlimited number of "instantaneous frequencies" for a given real signal. Gabor gave a procedure for associating a specific complex signal to a given real signal, from which a unique definition of the amplitude and phase, and consequently the instantaneous frequency, of the real signal is obtained. This complex signal, called the analytic signal, is obtained by inverting the Fourier spectrum of the real signal over the positive frequency range only. We introduce a new complex signal representation by applying Gabor's idea to the Wigner time-frequency distribution. The resulting complex signal, which we call the Wigner-Gabor signal, has a number of interesting properties that we discuss and compare with the analytic signal. In general the Wigner-Gabor signal is not the analytic signal, although for a pure tone A cos(ω0t) the Wigner-Gabor and analytic signals both equal A exp(jω0t). Also, for a time-limited signal s(t) = 0, |t| > T, the analytic signal is not time-limited, but the Wigner-Gabor signal is time-limited.
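For reference, Gabor's analytic-signal construction mentioned above can be written as follows (a standard form, not taken from the paper); the Wigner-Gabor signal applies the same positive-frequency idea to the Wigner distribution rather than to the spectrum.

```latex
% Gabor's analytic signal: invert the Fourier spectrum over positive frequencies only.
z(t) = 2\int_{0}^{\infty} S(\omega)\, e^{j\omega t}\, \frac{d\omega}{2\pi},
\qquad
S(\omega) = \int_{-\infty}^{\infty} s(t)\, e^{-j\omega t}\, dt,
\qquad
s(t) = \operatorname{Re}\, z(t).
```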
Cross-spectral TDOA and FDOA estimation
We present methods for accurately estimating and tracking instantaneous frequency and relative time delay of narrowband signal components. These methods are applied to the problem of estimating the location of an emitter from the signal(s) received by one or more receivers. Both instantaneous frequency estimation and time delay estimation are based on previously reported cross-spectral methods that have been applied successfully to a variety of signal processing problems. Accurate geolocation is accomplished by matching the Doppler characteristics of the received signal to Doppler characteristics estimated from the known emitter motion and possible emitter locations.
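A generic cross-spectral time-delay estimate of the kind referred to here extracts the relative delay from the phase slope of the cross-spectrum of the two received signals, as in the sketch below; the energy-based bin selection and the unweighted least-squares slope fit are simplifications, not the reported method.

```python
# Time-delay estimate from the phase slope of the cross-spectrum.
import numpy as np

def tdoa_cross_spectral(x1, x2, fs):
    n = len(x1)
    X1, X2 = np.fft.rfft(x1), np.fft.rfft(x2)
    cross = X1 * np.conj(X2)                      # cross-spectrum
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    phase = np.unwrap(np.angle(cross))
    keep = np.abs(cross) > 0.1 * np.abs(cross).max()   # bins carrying signal energy
    slope = np.polyfit(2 * np.pi * freqs[keep], phase[keep], 1)[0]
    return slope                                  # estimated delay of x2 relative to x1 (s)
```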
Matrix superposition structure with a tree-based principle
If we assume that a natural image can be modeled as a succession of multilevel systems, we can develop an optimal matrix superposition routine. Each matrix separates the fundamental elements by a set of optimal criteria. The matrix superposition is then characterized by a tree-based principle which is applied adaptively. We also demonstrate how the missing-data constraints may be overcome by collecting additional measurements.
Advanced Algorithms
Line fitting based feature extraction for object recognition
Image feature extraction plays a significant role in image-based pattern applications. In this paper, we propose a new approach to generate hierarchical features. This new approach applies line fitting to adaptively divide regions based upon the amount of information and creates line fitting features for each subsequent region. It overcomes the feature-wasting drawback of the wavelet-based approach and demonstrates high performance in real applications. For gray-scale images, we propose a diffusion equation approach to map information-rich pixels (pixels near edges and ridge pixels) into high values, and pixels in homogeneous regions into small values near zero, to form energy map images. After the energy map images are generated, we propose a line fitting approach to divide regions recursively and create features for each region simultaneously. This new feature extraction approach is similar to wavelet-based hierarchical feature extraction, in which high-layer features represent global characteristics and low-layer features represent local characteristics. However, the new approach uses line fitting to adaptively focus on information-rich regions, so that we avoid the feature-waste problems of the wavelet approach in homogeneous regions. Finally, experiments on handwriting word recognition show that the new method provides higher performance than the regular handwriting word recognition approach.
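The per-region step can be pictured with the sketch below: fit a line to the coordinates of information-rich (high-energy) pixels and keep its parameters plus the fit error as features. The diffusion-based energy map, the recursive region splitting, and the threshold value are not reproduced and are assumptions here.

```python
# Line-fit features for one region of an energy map.
import numpy as np

def line_fit_features(energy_map, threshold=0.5):
    rows, cols = np.nonzero(energy_map > threshold * energy_map.max())
    if len(rows) < 2:
        return np.zeros(3)                       # empty / homogeneous region
    # least-squares line row = slope*col + intercept (a vertical edge would
    # need the transposed fit)
    slope, intercept = np.polyfit(cols.astype(float), rows.astype(float), 1)
    residual = np.mean((rows - (slope * cols + intercept)) ** 2)
    return np.array([slope, intercept, residual])
```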
Sparse representation for vehicle recognition
Nathan D. Monnig, Wesam Sakla
The Sparse Representation for Classification (SRC) algorithm has been demonstrated to be a state-of-the-art algorithm for facial recognition applications. Wright et al. demonstrate that under certain conditions, the SRC algorithm classification performance is agnostic to choice of linear feature space and highly resilient to image corruption. In this work, we examined the SRC algorithm performance on the vehicle recognition application, using images from the semi-synthetic vehicle database generated by the Air Force Research Laboratory. To represent modern operating conditions, vehicle images were corrupted with noise, blurring, and occlusion, with representation of varying pose and lighting conditions. Experiments suggest that linear feature space selection is important, particularly in the cases involving corrupted images. Overall, the SRC algorithm consistently outperforms a standard k nearest neighbor classifier on the vehicle recognition task.
Adaptive compressive sensing for target detection
The goal of a target detection system is to determine the location of potential targets in the field of view of the sensor. Traditionally, this is done using high-quality images from a conventional imager. For wide field-of-view scenarios, this can pose a challenge for both data acquisition and system bandwidth. In this paper, we discuss a compressive sensing technique for target detection that dramatically reduces the number of measurements required to perform the task, as compared to the number of pixels in the conventional images. This in turn can reduce the data rate from the sensor electronics, and along with it the cost, complexity, and bandwidth requirements of the system. Specifically, we discuss a two-stage approach that first adaptively searches a large area using shift-invariant masks to determine the locations of potential targets (i.e., the regions of interest), and then re-visits each location to discriminate between target and clutter using a different set of specialized masks. We show that the overall process is not only highly efficient (i.e., it dramatically reduces the number of measurements as compared to the number of pixels), but does so without appreciable loss in target detection performance.
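A toy version of the first stage is sketched below: a few coarse masked measurements per scene tile flag candidate regions of interest, which would then be revisited with discrimination masks. Random binary masks stand in for the paper's shift-invariant and specialized mask designs, and the threshold rule is an assumption.

```python
# Stage-one ROI flagging from a handful of compressive measurements per tile.
import numpy as np

def stage_one_roi(tiles, n_measurements=16, threshold=2.0, rng=None):
    """tiles: (n_tiles, tile_pixels) vectorized scene tiles."""
    rng = np.random.default_rng(rng)
    masks = rng.choice([0.0, 1.0], size=(n_measurements, tiles.shape[1]))
    y = tiles @ masks.T                       # compressive measurements per tile
    energy = np.linalg.norm(y, axis=1)
    # flag tiles whose measurement energy stands out from the background
    return np.nonzero(energy > threshold * np.median(energy))[0]
```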
Automatic target recognition using group-structured sparse representation
Bo Sun, Xuewen Wu, Jun He, et al.
The sparse representation classification method has been increasingly used in the fields of computer vision and pattern analysis, due to its high recognition rate, little dependence on the choice of features, and robustness to corruption and occlusion. However, most existing methods aim to find the sparsest representation of the test sample y in an overcomplete dictionary, and do not particularly consider the relevant structure between the atoms in the dictionary. Moreover, sufficient training samples are always required by the sparse representation method for effective recognition. In this paper we formulate the classification as a group-structured sparse representation problem using a sparsity-inducing norm minimization and propose a novel sparse representation-based automatic target recognition (ATR) framework for practical applications in which the training samples are drawn from simulation models of real targets. The experimental results show that the proposed approach improves the recognition rate of standard sparse models, and that our system can effectively and efficiently recognize targets in real environments while retaining the good characteristics of sparse representation-based classification.
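One common way to solve a group-structured sparse coding problem of this general kind is proximal gradient descent with a group soft-threshold, sketched below for an l2,1-penalized objective with one group of atoms per target class; the solver, step size, penalty, and iteration count are assumptions and this is not the paper's specific formulation.

```python
# ISTA-style solver for  min_x 0.5*||y - D x||^2 + lam * sum_g ||x_g||_2
import numpy as np

def group_sparse_code(D, y, groups, lam=0.1, n_iter=200):
    """D: (d, n) dictionary, groups: (n,) group id per atom, y: (d,) sample."""
    x = np.zeros(D.shape[1])
    step = 1.0 / np.linalg.norm(D, 2) ** 2        # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        z = x - step * (D.T @ (D @ x - y))        # gradient step on the data term
        for g in np.unique(groups):               # group soft-threshold (proximal step)
            idx = groups == g
            norm_g = np.linalg.norm(z[idx])
            x[idx] = 0.0 if norm_g == 0 else max(0.0, 1 - step * lam / norm_g) * z[idx]
    return x  # classify by smallest per-group reconstruction residual
```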
Facial and Activity Recognition
Automatic recognition of emotions from facial expressions
Henry Xue, Izidor Gertner
In the human-computer interaction (HCI) process it is desirable to have an artificial intelligence (AI) system that can identify and categorize human emotions from facial expressions. Such systems can be used in security, in the entertainment industry, and also to study visual perception, social interactions, and disorders (e.g., schizophrenia and autism). In this work we survey and compare the performance of different feature extraction algorithms and classification schemes. We introduce a faster feature extraction method that resizes and applies a set of filters to the data images without sacrificing accuracy. In addition, we have extended SVM classification to multiple dimensions while retaining its high accuracy rate. The algorithms were tested using the Japanese Female Facial Expression (JAFFE) Database and the Database of Faces (AT&T Faces).
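The resize-then-filter feature idea followed by an SVM can be sketched as below; the 32x32 size and the small Gabor filter bank are illustrative placeholders rather than the configuration evaluated in the paper.

```python
# Resize a face, apply a small Gabor filter bank, and feed an SVM.
import numpy as np
from skimage.transform import resize
from skimage.filters import gabor
from sklearn.svm import SVC

def expression_features(gray_face):
    small = resize(gray_face, (32, 32), anti_aliasing=True)
    feats = []
    for freq in (0.1, 0.2, 0.3):
        for theta in (0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4):
            real, _ = gabor(small, frequency=freq, theta=theta)
            feats.append(real.ravel())
    return np.concatenate(feats)

# clf = SVC(kernel="rbf").fit(np.stack([expression_features(f) for f in faces]), labels)
```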
Robust person and object tracking in LWIR and VIS based on a new template matching method
Template matching is one of the oldest techniques in computer vision. It has been applied in a variety of applications using cross-correlation, or derivatives of it, as the distance measurement. So far, however, its success in object tracking has been very limited despite the promising structural similarity search it performs. Based on an analysis of the underlying reasons, a new kind of measurement is proposed to open up far more of the potential that the structural search inherently offers. This new measurement does not sum up differences in color space like cross-correlation but instead outputs the number of matching pixels in percent. As a key feature, local color variations are considered in order to properly handle the different character of homogeneous and highly structured regions and to model the relations between them. Furthermore, relevant differences between templates are emphasized while irrelevant contributions to the measurement function are largely suppressed in order to avoid unnecessary distortions of the measurement and, therefore, of the search decision. The presented results document the advantages in comparison to the measurements known from the literature. Different objects and persons in LWIR and VIS image sequences are tracked to illustrate the performance and the benefit in a broad field of applications.
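A much-simplified version of such a measure is sketched below: instead of summing intensity differences, it counts the fraction of template pixels that match the image patch within a tolerance, and ties that tolerance to the local standard deviation to loosely mimic the local-variation handling described above. The paper's full model is richer than this, and the window size and scaling constants are assumptions.

```python
# Percentage-of-matching-pixels measure with a locally adaptive tolerance.
import numpy as np
from scipy.ndimage import uniform_filter

def matching_pixel_percentage(patch, template, k=1.0, win=5):
    diff = np.abs(patch.astype(float) - template.astype(float))
    # local standard deviation of the template sets a per-pixel tolerance
    mean = uniform_filter(template.astype(float), size=win)
    sq_mean = uniform_filter(template.astype(float) ** 2, size=win)
    local_std = np.sqrt(np.maximum(sq_mean - mean ** 2, 0.0))
    tol = k * (local_std + 1.0)               # +1 avoids zero tolerance in flat areas
    return 100.0 * np.mean(diff <= tol)       # percent of matching pixels
```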
Military personnel recognition system using texture, colour, and SURF features
This paper presents an automatic, machine-vision-based military personnel identification and classification system. Classification is done using a Support Vector Machine (SVM) on sets of Army, Air Force, and Navy camouflage-uniform personnel datasets. In the proposed system, the arm of service of personnel is recognised by the camouflage of a person's uniform, the type of cap, and the type of badge/logo. The detailed analyses include: camouflage-cap and plain-cap differentiation using gray level co-occurrence matrix (GLCM) texture features; classification of Army, Air Force, and Navy camouflaged uniforms using GLCM texture and colour histogram bin features; and plain-cap badge classification into Army, Air Force, and Navy using Speeded Up Robust Features (SURF). The proposed method recognised the camouflage personnel arm of service on sets of data retrieved from Google Images and selected military websites. Correlation-based Feature Selection (CFS) was used to improve recognition and reduce dimensionality, thereby speeding up the classification process. With this method, success rates recorded during the analysis include 93.8% for the camouflage appearance category, and 100%, 90%, and 100% for the plain-cap and camouflage-cap categories for Army, Air Force, and Navy, respectively. Accurate recognition was recorded using SURF for the plain-cap badge category. Substantial analysis has been carried out, and the results show that the proposed method can correctly classify military personnel into various arms of service. We show that the proposed method can be integrated into a face recognition system, which would recognise personnel in addition to determining the arm of service to which they belong. Such a system can be used to enhance the security of a military base or facility.
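The GLCM texture portion of such a pipeline (e.g. camouflage versus plain fabric) can be sketched as below, feeding an SVM; the chosen distances, angles, and GLCM properties are illustrative rather than the paper's exact settings.

```python
# GLCM texture features from a grayscale patch, classified with an SVM.
import numpy as np
from skimage.feature import graycomatrix, graycoprops  # 'greycomatrix' in older scikit-image
from sklearn.svm import SVC

def glcm_features(gray_uint8):
    glcm = graycomatrix(gray_uint8, distances=[1, 2], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.concatenate([graycoprops(glcm, p).ravel() for p in props])

# clf = SVC(kernel="rbf").fit(np.stack([glcm_features(img) for img in patches]), labels)
```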