Tracking systems are now a ubiquitous part of many domains, including marketing, logistics, sports, and military uses.1, 2 Although each tracking application is different, all systems tend to suffer from similar overriding problems. For instance, it is difficult to track a vehicle as it moves through a city because vehicle shapes are not known a priori. In addition, vehicles are often occluded as they move behind buildings and trees. Vehicles may also move erratically.
Classical vehicle tracking approaches have two particular weaknesses. First, there is a lack of high-order knowledge that can be used to respond to multisensory inputs. Second, with these classical systems, it is not possible to communicate tracking decisions to humans with descriptions based on qualia (the type of properties that a person typically perceives or experiences).3
The goal of our work is to demonstrate a proof-of-concept design that makes improvements to these classical vehicle tracking systems. We have used real-world data to show that the uncertainty involved with tracking processes can be reduced with a three-layer model. Our model (see Figure 1) consists of a data-driven lower level, a second—‘thinking’—layer (used to resolve multiple hypotheses in the data), and an even higher level that provides insight into the goals and motivations of the system. With our model, the thinking process can be communicated to a human. As such, tracking operations can be improved and results made more reliable.
Figure 1. Proposed three-layer tracking model. The bottom layer (blue) represents physical data sources, such as electro-optical (EO) cameras, synthetic aperture radar (SAR) systems, or even a Google Glass device. These low-level devices feed a second layer (orange), which uses these inputs to develop several potential views of the world that are human-centric rather than data-centric. These hypotheses are resolved and fed to the top layer (green) to inform and direct the end user to a specific goal (e.g., a user may wish to know the number of vehicles that passed through an intersection to detect unusual traffic flows).
In 2008, we acquired tracking data from airborne assets with a modified DCS Corporation radar. In this ‘GOTCHA’ experimental campaign, the region of interest was continually illuminated by a large radar beam. We also used other sensors (e.g., an optical sensing device on a King Air 90 aircraft, and an IR imaging sensor device) for these tests. The region covered by the data we obtained (see Figure 2) is centered on Wright-Patterson Air Force Base and has a radius of about 5 km. After analyzing this data, we chose a scene, consisting of a T-shaped road intersection (see Figure 3), with 60 seconds of electro-optical (EO) and synthetic aperture radar (SAR) data that are roughly aligned in time and space. During these 60 seconds, more than 20 vehicles come into view and then either turn or go straight through the intersection. Under some conditions, it is difficult for classical trackers to effectively monitor the vehicles as they stop and move off again. However, with our hypothetical thinking methodology, combined with other sources of information (i.e., the qualia of the scene as indicated by the fused EO and SAR data), these difficulties can be overcome. Although our higher-order thinking technique is more computationally intensive, we are able to analyze the scene, provide several potentially competing options, evaluate these options using heuristics and training, and correct any potential tracking errors.
Figure 2. Coincident SAR (top) and EO (bottom) images of vehicles traveling through a T-shaped intersection. These sample images were obtained during the Air Force Research Laboratory's GOTCHA experimental campaign in 2008. Colored markings represent the tracks of different vehicles through the scene.
Figure 3. SAR image of the scene chosen for testing of the three-layer model. Colored markings represent the tracks of different vehicles through the scene.
We have applied our three-layer architecture to this tracking scenario.4 The first layer of tracking was performed with the EO data and the coarse tracking was conducted using SAR data. We then communicated both these sets of data, through a novel qualia-inspired event log, to the thinking layer. We exposed the second layer's sense-making engine to ‘truthed’ data (i.e., the classifications had been verified by a human), which allowed us to establish a baseline of normal behavior. As such, the second layer ‘learned’ four basic facts about the truthed data. First, vehicles always appear in the scene far from the intersection. Second, vehicles must eventually appear near the intersection. Third, vehicles cannot start and stop instantaneously. Finally, vehicles cannot drive through each other. We subsequently applied these truths to the event log that we had generated from the scenario.4 When we used only EO data on two particular vehicles, however, an anomaly was produced: there seemed to be a violation of the rule that vehicles always first appear far from the intersection in the historical data.
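As a rough illustration of how learned facts of this kind might be checked against an event log, the following Python sketch tests two of the four rules (first appearance far from the intersection, and no instantaneous starts or stops). It is not our implementation: the `Event` fields, the rule names, and both numeric thresholds are hypothetical values chosen only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical thresholds; the article gives no numeric values.
FAR_FROM_INTERSECTION_M = 100.0  # "far" means at least this distance
MAX_ACCEL_MPS2 = 8.0             # bound on a plausible change of speed


@dataclass(frozen=True)
class Event:
    vehicle_id: str
    t: float          # seconds since the start of the scene
    dist_m: float     # distance from the intersection, in metres
    speed_mps: float  # speed, in metres per second


def find_anomalies(log):
    """Return (vehicle_id, rule) pairs for each rule that a track violates."""
    tracks = defaultdict(list)
    for ev in sorted(log, key=lambda e: e.t):
        tracks[ev.vehicle_id].append(ev)

    anomalies = []
    for vid, evs in tracks.items():
        # Rule 1: a vehicle first appears far from the intersection.
        if evs[0].dist_m < FAR_FROM_INTERSECTION_M:
            anomalies.append((vid, "appears_far"))
        # Rule 3: vehicles cannot start or stop instantaneously.
        for prev, cur in zip(evs, evs[1:]):
            dt = cur.t - prev.t
            if dt > 0 and abs(cur.speed_mps - prev.speed_mps) / dt > MAX_ACCEL_MPS2:
                anomalies.append((vid, "no_instant_start_stop"))
                break
    return anomalies


log = [
    Event("A", 0.0, 250.0, 12.0),
    Event("A", 1.0, 238.0, 12.0),
    Event("B", 5.0, 15.0, 3.0),  # first seen right next to the intersection
    Event("B", 6.0, 12.0, 3.0),
]
print(find_anomalies(log))
```

In this toy log, vehicle B is flagged because its first event lies inside the assumed "far" radius, which mirrors the kind of anomaly described above. The remaining two rules would need per-track and pairwise position checks and are omitted here.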
This anomaly invoked the sense-making component of our system. This component, within the thinking layer, generated four hypotheses (see Figure 4), based on the behavior of vehicles recorded in the historical log. These hypotheses are currently hard-coded into the system as the only options for vehicle behavior. The different outcomes, however, can also be learned by the system. Within the second layer, each hypothesis is evaluated by looking for information that would either contradict it or provide evidence for it. After this process is completed, layer 2 provides an output of four evaluations (see Figure 4). In this case, the system chose hypothesis 4 because it is not contradicted. The track of the vehicle through the intersection can therefore be corrected and the algorithm is improved. An additional benefit of our approach is that all of the hypotheses, and the reasoning for each evaluation, can be presented to a human operator in a readable format.
Figure 4. Four hypotheses are generated by the sense-making component of the three-layer system. These hypotheses are evaluated in the second layer.
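The evaluation step can be sketched in Python as follows, assuming each hypothesis is reduced to a set of predicted observations that the available evidence can either match or contradict. The hypothesis names (H1 to H4), the observation keys, and the predictions are placeholders invented for this example; they are not the four hypotheses shown in Figure 4, and the tie-break by amount of support is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    expects: dict  # observation name -> value the hypothesis predicts


def evaluate(hypothesis, evidence):
    """Sort each predicted observation into support or contradiction."""
    support, contradictions = [], []
    for key, predicted in hypothesis.expects.items():
        if key not in evidence:
            continue  # no information either way
        if evidence[key] == predicted:
            support.append(key)
        else:
            contradictions.append(key)
    return {"name": hypothesis.name, "support": support,
            "contradictions": contradictions}


def choose(hypotheses, evidence):
    """Return the best uncontradicted hypothesis and all evaluations."""
    results = [evaluate(h, evidence) for h in hypotheses]
    viable = [r for r in results if not r["contradictions"]]
    # Among uncontradicted hypotheses, prefer the one with the most support.
    best = max(viable, key=lambda r: len(r["support"]), default=None)
    return best, results


# Hypothetical observations derived from the fused EO and SAR event log.
evidence = {
    "sar_sees_vehicle_earlier": True,
    "eo_view_occluded": True,
    "speed_plausible": True,
}
hypotheses = [
    Hypothesis("H1", {"sar_sees_vehicle_earlier": False}),
    Hypothesis("H2", {"eo_view_occluded": False}),
    Hypothesis("H3", {"speed_plausible": False}),
    Hypothesis("H4", {"sar_sees_vehicle_earlier": True,
                      "eo_view_occluded": True}),
]

best, results = choose(hypotheses, evidence)
for r in results:
    # A plain-text report that a human operator could read.
    status = "contradicted by" if r["contradictions"] else "supported by"
    reasons = r["contradictions"] or r["support"]
    print(f"{r['name']}: {status} {', '.join(reasons) or 'nothing'}")
print("chosen:", best["name"] if best else "none")
```

Keeping the full list of evaluations, and not only the winner, is what allows the reasoning behind each decision to be shown to an operator.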
We have applied a novel three-layer hypothetical thinking architecture to a vehicle tracking scenario. We have also successfully demonstrated with our experimental results that tracking processes can be improved using our model. We now plan to integrate and extend our methodology to additional vehicle tracking scenarios and other areas. We hope to evaluate ‘cartoon world’ simulations in detail. As virtual-world simulators become more common with the advent of autonomous cars, it is conceivable that the artificial intelligence involved with these cars will be applied to scenes generated with SAR and EO data. We also anticipate integration of our architecture into an end-to-end system, with full human involvement, with the use of Google Glass.
Jonathan White is the Director of Computer Engineering. His research interests include sensor-based tracking, cyber security, green computing, and engineering services.
Arizona State University
Jared Culbertson, Igor Ternovskiy
Air Force Research Laboratory
1. B. D. Birrer, R. A. Raines, R. O. Baldwin, M. E. Oxley, S. K. Rogers, Using qualia and hierarchical models in malware detection, J. Inf. Assurance Security 4, p. 247-255, 2009.
2. J. P. Mansell, W. M. Riley, Vehicle tracking and security system, US Patent 5,223,844, 1993.
3. O. Larue, P. Poirier, R. Nkambou, The emergence of (artificial) emotions from cognitive and neurological processes, Biol. Inspired Cognitive Architectures 4, p. 54-68, 2013.
4. J. White, A. Helmstetter, J. Culbertson, I. Ternovskiy, Qualia centric hypothetical thinking: applications to vehicle tracking with the fusion of EO and SAR input data sources, Proc. SPIE 9458, p. 945806, 2015. doi:10.1117/12.2176582