- Front Matter: Volume 9122
- Information Fusion and Analysis
- Information Visualization
- Big Data and Information Management
- Participatory Sensing & Cognition
- Poster Session
Front Matter: Volume 9122
This PDF file contains the front matter associated with SPIE Proceedings Volume 9122 including the Title Page, Copyright information, Table of Contents, Introduction, and Conference Committee listing.
Information Fusion and Analysis
Automatic theory generation from analyst text files using coherence networks
This paper describes a three-phase process of extracting knowledge from analyst textual reports. Phase 1 involves
performing natural language processing on the source text to extract subject-predicate-object triples. In phase 2, these
triples are then fed into a coherence network analysis process, using a genetic algorithm optimization. Finally, the
highest-value subnetworks are processed into a semantic network graph for display. Initial work on a well-known data
set (a Wikipedia article on Abraham Lincoln) has shown excellent results without any specific tuning. Next, we ran the
process on the SYNthetic Counter-INsurgency (SYNCOIN) data set, developed at Penn State, yielding interesting and
potentially useful results.
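The phase-2 scoring idea can be sketched as follows. The triples, entity names, and the size penalty below are invented for illustration, and exhaustive enumeration stands in for the genetic-algorithm optimization (feasible only at this toy scale):

```python
import itertools

# Hypothetical (subject, predicate, object) triples of the kind phase 1 might extract.
TRIPLES = [
    ("lincoln", "born_in", "kentucky"),
    ("lincoln", "elected", "president"),
    ("lincoln", "signed", "emancipation"),
    ("kentucky", "part_of", "usa"),
    ("einstein", "developed", "relativity"),  # unconnected triple: should be dropped
]

def coherence(chosen):
    """Score a candidate subnetwork: +1 for each pair of triples sharing an
    entity, minus a size penalty so unconnected triples are pruned."""
    score = 0.0
    for i in range(len(chosen)):
        for j in range(i + 1, len(chosen)):
            if {chosen[i][0], chosen[i][2]} & {chosen[j][0], chosen[j][2]}:
                score += 1.0
    return score - 0.5 * len(chosen)

def best_subnetwork(triples):
    """Exhaustively search all non-empty subsets for the highest-coherence one."""
    best, best_score = [], float("-inf")
    for r in range(1, len(triples) + 1):
        for combo in itertools.combinations(triples, r):
            s = coherence(list(combo))
            if s > best_score:
                best, best_score = list(combo), s
    return best, best_score

subnet, score = best_subnetwork(TRIPLES)
```

The connected Lincoln/Kentucky chain survives while the isolated triple is discarded, which is the multiplier effect the model relies on: evidence for one element raises the value of its neighbors.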
Using Complex Event Processing (CEP) and vocal synthesis techniques to improve comprehension of sonified human-centric data
Jeff Rimland,
Mark Ballora
The field of sonification, which uses auditory presentation of data to replace or augment visualization techniques, is
gaining popularity and acceptance for analysis of “big data” and for assisting analysts who are unable to utilize
traditional visual approaches due to either: 1) visual overload caused by existing displays; 2) concurrent need to perform
critical visually intensive tasks (e.g. operating a vehicle or performing a medical procedure); or 3) visual impairment due
to either temporary environmental factors (e.g. dense smoke) or biological causes.
Sonification tools typically map data values to sound attributes such as pitch, volume, and localization to enable them to
be interpreted via human listening. In more complex problems, the challenge is in creating multi-dimensional
sonifications that are both compelling and listenable, and that have enough discrete features that can be modulated in
ways that allow meaningful discrimination by a listener.
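A minimal sketch of such a data-to-sound mapping (the value and pitch ranges are invented defaults, not those of any particular sonification tool): a data value is scaled linearly onto a MIDI note range and converted to a frequency.

```python
def map_to_pitch(value, lo, hi, midi_lo=48, midi_hi=84):
    """Linearly map a data value onto a MIDI note number (C3..C6 by default)."""
    value = max(lo, min(hi, value))          # clamp out-of-range readings
    frac = (value - lo) / (hi - lo)
    return round(midi_lo + frac * (midi_hi - midi_lo))

def midi_to_hz(note):
    """Equal-temperament conversion: A4 (MIDI note 69) = 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)
```

Volume and localization can be mapped the same way; the challenge noted above is keeping several such simultaneous mappings discriminable to a listener.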
We propose a solution to this problem that incorporates Complex Event Processing (CEP) with speech synthesis. Some
of the more promising sonifications to date use speech synthesis, which is an "instrument" that is amenable to extended
listening, and can also provide a great deal of subtle nuance. These vocal nuances, which can represent a nearly limitless
number of expressive meanings (via a combination of pitch, inflection, volume, and other acoustic factors), are the basis
of our daily communications, and thus have the potential to engage the innate human understanding of these sounds.
Additionally, recent advances in CEP have facilitated the extraction of multi-level hierarchies of information, which is
necessary to bridge the gap between raw data and this type of vocal synthesis. We therefore propose that CEP-enabled
sonifications based on the sound of human utterances could be considered the next logical step in human-centric "big
data" compression and transmission.
A data fusion approach to indications and warnings of terrorist attacks
Indications and Warning (I&W) of terrorist attacks, particularly IED attacks, require detection of networks of agents and patterns of behavior. Social Network Analysis tries to detect a network; activity analysis tries to detect anomalous activities. This work builds on both to detect elements of an activity model of terrorist attack activity – the agents, resources, networks, and behaviors. The activity model is expressed as RDF triple statements where the tuple positions are elements or subsets of a formal ontology for activity models. The advantage of a model is that elements are interdependent and evidence for or against one will influence others so that there is a multiplier effect. The advantage of the formality is that detection could occur hierarchically, that is, at different levels of abstraction. The model matching is expressed as a likelihood ratio between input text and the model triples. The likelihood ratio is designed to be analogous to track correlation likelihood ratios common in JDL fusion level 1. This required development of a semantic distance metric for positive and null hypotheses as well as for complex objects. The metric uses the Web 1T corpus of one- to five-gram frequencies for priors. This size requires the use of big data technologies, so a Hadoop cluster is used in conjunction with OpenNLP natural language processing and Mahout clustering software. Distributed data fusion MapReduce jobs distribute parts of the data fusion problem to the Hadoop nodes. For the purposes of this initial testing, open source models and text inputs of similar complexity to terrorist events were used as surrogates for the intended counter-terrorist application.
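The likelihood-ratio idea can be illustrated as follows. The background counts below are invented stand-ins for the Web 1T frequencies, and the uniform model distribution is a deliberate simplification of the paper's semantic distance metric:

```python
import math

# Invented background unigram counts standing in for Web 1T frequencies.
BACKGROUND = {"vehicle": 9000, "truck": 1200, "fertilizer": 80, "detonator": 5}
TOTAL = sum(BACKGROUND.values())

def log_likelihood_ratio(observed_terms, model_terms):
    """Sum, over observed terms predicted by the model, of
    log(P(term | model) / P(term | background)); rare terms that match the
    activity model contribute the most evidence for it."""
    p_model = 1.0 / len(model_terms)     # crude uniform distribution over model terms
    llr = 0.0
    for term in observed_terms:
        if term in model_terms:
            # add-one smoothing over the background vocabulary
            p_background = (BACKGROUND.get(term, 0) + 1) / (TOTAL + len(BACKGROUND))
            llr += math.log(p_model / p_background)
    return llr
```

A report mentioning rare, model-relevant terms ("fertilizer", "detonator") scores far higher than one mentioning a common term ("truck"), mirroring how track-correlation likelihood ratios weight improbable matches.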
Warfighter information services: lessons learned in the intelligence domain
S. E. Bray
A vision was presented in a previous paper of how a common set of services within a framework could be used to
provide all the information processing needs of Warfighters. Central to that vision was the concept of a “Virtual
Knowledge Base”. The paper presents an implementation of these ideas in the intelligence domain. Several innovative
technologies were employed in the solution, which are presented and their benefits explained. The project was
successful, validating many of the design principles for such a system which had been proposed in earlier work. Many of
these principles are discussed in detail, explaining lessons learned. The results showed that it is possible to make vast
improvements in the ability to exploit available data, making it discoverable and queryable, wherever it resides, from anywhere
within a participating network; and to exploit machine reasoning to make faster and better inferences from available data,
enabling human analysts to spend more of their time doing more difficult analytical tasks rather than searching for
relevant data. It was also demonstrated that a small number of generic Information Processing services can be combined
and configured in a variety of ways (without changing any software code) to create “fact-processing” workflows, in this
case to create different intelligence analysis capabilities. It is yet to be demonstrated that the same generic services can
be reused to create analytical/situational awareness capabilities for logistics, operations, planning or other military
functions but this is considered likely.
A survey of automated methods for sensemaking support
Complex, dynamic problems in general present a challenge for the design of analysis support systems and tools
largely because there is limited reliable a priori procedural knowledge descriptive of the dynamic processes in the
environment. Problem domains that are non-cooperative or adversarial impose added difficulties involving
suboptimal observational data and/or data containing the effects of deception or covertness. The fundamental nature
of analysis in these environments is based on composite approaches involving mining or foraging over the evidence,
discovery and learning processes, and the synthesis of fragmented hypotheses; together, these can be labeled as
sensemaking procedures. This paper reviews and analyzes the features, benefits, and limitations of a variety of
automated techniques that offer possible support to sensemaking processes in these problem domains.
Information Visualization
Neural network based visualization of collaborations in a citizen science project
Citizen science projects are those in which volunteers are asked to collaborate in scientific projects, usually by volunteering idle computer time for distributed data processing efforts or by actively labeling or classifying information - shapes of galaxies, whale sounds, historical records are all examples of citizen science projects in which users access a data collecting system to label or classify images and sounds.
In order to be successful, a citizen science project must captivate users and keep them interested in the project and in the science behind it, thereby increasing the time the users spend collaborating with the project. Understanding the behavior of citizen scientists and their interaction with the data collection systems may help increase the involvement of the users, categorize them according to different parameters, facilitate their collaboration with the systems, design better user interfaces, and allow better planning and deployment of similar projects and systems.
Users' behavior can be actively monitored or derived from their interaction with the data collection systems. Records of the interactions can be analyzed using visualization techniques to identify patterns and outliers. In this paper we present some results on the visualization of more than 80 million interactions of almost 150 thousand users with the Galaxy Zoo I citizen science project. Visualization of the attributes extracted from their behaviors was done with a clustering neural network (the Self-Organizing Map) and a selection of icon- and pixel-based techniques. These techniques allow the visual identification of groups of similar behavior in several different ways.
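The Self-Organizing Map idea can be sketched minimally as below: a 1-D map trained on 2-D toy "behaviour" vectors (the data, map size, and schedules are invented; real analyses use 2-D maps over many behavioural attributes):

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def train_som(data, n_units=4, epochs=100, seed=0):
    """Train a tiny 1-D Self-Organizing Map: each sample pulls its best-matching
    unit (BMU) and that unit's neighbours toward it, with decaying learning rate
    and shrinking neighbourhood radius."""
    rng = random.Random(seed)
    weights = [[rng.random(), rng.random()] for _ in range(n_units)]
    for epoch in range(epochs):
        lr = 0.5 * (1 - epoch / epochs)
        radius = max(1, round((n_units / 2) * (1 - epoch / epochs)))
        for x in data:
            bmu = min(range(n_units), key=lambda i: dist2(weights[i], x))
            for i in range(n_units):
                if abs(i - bmu) <= radius:
                    weights[i] = [w + lr * (xd - w) for w, xd in zip(weights[i], x)]
    return weights

# Two invented behaviour clusters: casual users vs. heavy classifiers.
data = [(0.05, 0.05), (0.1, 0.0), (0.0, 0.1), (0.95, 0.9), (1.0, 1.0), (0.9, 0.95)]
weights = train_som(data)
```

After training, users whose behaviour vectors map to the same unit form a visually identifiable group, which is the basis for the icon- and pixel-based displays described above.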
Visualizing common operating picture of critical infrastructure
This paper presents a solution for visualizing the common operating picture (COP) of the critical infrastructure (CI). The
purpose is to improve the situational awareness (SA) of the strategic-level actor and the source system operator in order
to support decision making. The information is obtained through the Situational Awareness of Critical Infrastructure and
Networks (SACIN) framework. The system consists of an agent-based solution for gathering, storing, and analyzing the
information, and a user interface (UI) is presented in this paper.
The UI consists of multiple views visualizing information from the CI in different ways. Different CI actors are
categorized in 11 separate sectors, and events are used to present meaningful incidents. Past and current states, together
with geographical distribution and logical dependencies, are presented to the user. The current states are visualized as
segmented circles to represent event categories. Geographical distribution of assets is displayed with a well-known map
tool. Logical dependencies are presented in a simple directed graph, and users also have a timeline to review past events.
The objective of the UI is to provide an easily understandable overview of the CI status. Therefore, testing methods, such
as a walkthrough, an informal walkthrough, and the Situation Awareness Global Assessment Technique (SAGAT), were
used in the evaluation of the UI. Results showed that users were able to obtain an understanding of the current state of
CI, and the usability of the UI was rated as good. In particular, the designated display for the CI overview and the
timeline were found to be efficient.
Visualization of multi-INT fusion data using Java Viewer (JVIEW)
Visualization is important for multi-intelligence fusion and we demonstrate issues for presenting physics-derived (i.e.,
hard) and human-derived (i.e., soft) fusion results. Physics-derived solutions (e.g., imagery) typically involve sensor
measurements that are objective, while human-derived (e.g., text) typically involve language processing. Both results
can be geographically displayed for user-machine fusion. Attributes of an effective and efficient display are not well
understood, so we demonstrate issues and results for filtering, correlation, and association of data for users - be they
operators or analysts. Operators require near-real time solutions while analysts have the opportunities of non-real time
solutions for forensic analysis. In a use case, we demonstrate examples using the JVIEW concept that has been applied
to piloting, space situation awareness, and cyber analysis. Using the open-source JVIEW software, we showcase a big
data solution for multi-intelligence fusion application for context-enhanced information fusion.
A visual analytic framework for data fusion in investigative intelligence
Intelligence analysis depends on data fusion systems to provide capabilities of detecting and tracking important objects,
events, and their relationships in connection to an analytical situation. However, automated data fusion technologies are
not mature enough to offer reliable and trustworthy information for situation awareness. Given the trend of increasing
sophistication of data fusion algorithms and the loss of transparency in the data fusion process, analysts are left out of the data fusion process cycle with little to no control over, or confidence in, the data fusion outcome. Following the recent rethinking of data fusion as a human-centered process, this paper proposes a conceptual framework for developing an alternative data fusion architecture. This idea is inspired by recent advances in our understanding of human cognitive systems,
the science of visual analytics, and the latest thinking about human-centered data fusion. Our conceptual framework is
supported by an analysis of the limitations of existing fully automated data fusion systems, where the effectiveness of important algorithmic decisions depends on the availability of expert knowledge or knowledge of the analyst’s mental
state in an investigation. The success of this effort will result in next generation data fusion systems that can be better
trusted while maintaining high throughput.
Human terrain exploitation suite: applying visual analytics to open source information
This paper presents the concept development and demonstration of the Human Terrain Exploitation Suite (HTES) under development at the U.S. Army Research Laboratory’s Tactical Information Fusion Branch. The HTES is an amalgamation of four complementary visual analytic capabilities that target the exploitation of open source information. Open source information, specifically news feeds, blogs and other social media, provides a unique opportunity to collect and examine salient topics and trends. Analysis of open source information provides valuable insights into determining opinions, values, cultural nuances and other socio-political aspects within a military area of interest. The early results of the HTES field study indicate that the tools greatly increased the analysts’ ability to exploit open source information, but improvement through greater cross-tool integration and correlation of their results is necessary for further advances.
Big Data and Information Management
Profile-based autonomous data feeding: an approach to the information retrieval problem in a high communications latency environment
This paper proposes the use of user profiles for data selection and prioritization for transmission. This approach has three
parts. First, a profile can be created for an individual user. This may provide the best results; however, it requires
transmitting a separate profile up for each prospective user. Second, user correspondence with a set of profiles can be
tracked. Finally, this can be extended to match a user not just with a single profile but with (possibly different) profiles
for each dimension tracked. The benefits of each of these approaches are discussed and the implementation pathway is
considered.
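The single-profile case can be sketched as follows (the topic vectors and item fields are invented; a deployed system would track many more dimensions): items whose topic weights best match the user's profile are queued for transmission first over the high-latency link.

```python
import math

def cosine(u, v):
    """Cosine similarity between two topic-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def prioritize(items, profile):
    """Order data items for transmission, best profile match first."""
    return sorted(items, key=lambda item: cosine(item["topics"], profile), reverse=True)

profile = [1.0, 0.0, 0.0]                      # the user cares about topic 0 only
items = [
    {"id": "b", "topics": [0.0, 1.0, 0.0]},
    {"id": "c", "topics": [0.5, 0.5, 0.0]},
    {"id": "a", "topics": [1.0, 0.0, 0.0]},
]
queue = prioritize(items, profile)
```

The per-dimension extension described above would replace the single `profile` vector with one matched profile per tracked dimension and combine the resulting scores.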
Exploiting social media for Army operations: Syrian crisis use case
Sue E. Kase,
Elizabeth K. Bowman,
Tanvir Al Amin,
et al.
Millions of people exchange user-generated information through online social media (SM) services. The prevalence of
SM use globally and its growing significance to the evolution of events has attracted the attention of the Army and other
agencies charged with protecting national security interests. The information exchanged in SM sites and the networks of
people who interact with these online communities can provide value to Army intelligence efforts. SM could facilitate
the Military Decision Making Process by providing ongoing assessment of military actions from a local citizen
perspective. Despite potential value, there are significant technological barriers to leveraging SM. SM collection and
analysis are difficult in the dynamic SM environment and deception is a real concern. This paper introduces a credibility
analysis approach and prototype fact-finding technology called the “Apollo Fact-finder” that mitigates the problem of
inaccurate or falsified SM data. Apollo groups data into sets (or claims), corroborating specific observations, then
iteratively assesses both claim and source credibility resulting in a ranking of claims by likelihood of occurrence. These
credibility analysis approaches are discussed in the context of a conflict event, the Syrian civil war, and applied to tweets
collected in the aftermath of the Syrian chemical weapons crisis.
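The iterative credibility idea can be illustrated with a generic Sums/TruthFinder-style fixed-point sketch (not the actual Apollo algorithm; the sources and claims are invented):

```python
def fact_find(claims_by_source, iterations=20):
    """Iteratively score claims and sources: a claim is believable if credible
    sources report it, and a source is credible if it reports believable claims.
    Scores are max-normalized each round to keep the iteration stable."""
    sources = list(claims_by_source)
    claims = {c for cs in claims_by_source.values() for c in cs}
    cred = {s: 1.0 for s in sources}
    belief = {}
    for _ in range(iterations):
        belief = {c: sum(cred[s] for s in sources if c in claims_by_source[s])
                  for c in claims}
        top = max(belief.values())
        belief = {c: b / top for c, b in belief.items()}
        cred = {s: sum(belief[c] for c in claims_by_source[s]) for s in sources}
        top = max(cred.values())
        cred = {s: v / top for s, v in cred.items()}
    return belief, cred

# Three sources corroborate one observation; a lone source makes a conflicting claim.
reports = {"s1": {"shelling"}, "s2": {"shelling"}, "s3": {"shelling"}, "s4": {"hoax"}}
belief, cred = fact_find(reports)
```

Ranking claims by the resulting `belief` scores yields the likelihood-of-occurrence ordering described above, with the lone uncorroborated source losing credibility each round.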
A qualitative readiness-requirements assessment model for enterprise big-data infrastructure investment
In the last three decades, there has been an exponential growth in the area of information technology providing the
information processing needs of data-driven businesses in government, science, and private industry in the form of
capturing, staging, integrating, conveying, analyzing, and transferring data that will help knowledge workers and
decision makers make sound business decisions. Data integration across enterprise warehouses is one of the most
challenging steps in the big data analytics strategy. Several levels of data integration have been identified across
enterprise warehouses: data accessibility, common data platform, and consolidated data model. Each level of integration
has its own set of complexities that requires a certain amount of time, budget, and resources to implement. Such levels of
integration are designed to address the technical challenges inherent in consolidating the disparate data sources. In this
paper, we present a methodology based on industry best practices to measure the readiness of an organization and its
data sets against the different levels of data integration. We introduce a new Integration Level Model (ILM) tool, which
is used for quantifying an organization and data system’s readiness to share data at a certain level of data integration. It
is based largely on the established and accepted framework provided in the Data Management Association’s Data Management Body of Knowledge (DAMA-DMBOK).
It comprises several key data management functions and supporting activities, together with several
environmental elements that describe and apply to each function. The proposed model scores the maturity of a system’s
data governance processes and provides a pragmatic methodology for evaluating integration risks. The higher the
computed scores, the better managed the source data system and the greater the likelihood that the data system can be
brought in at a higher level of integration.
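The scoring idea can be sketched as follows. The thresholds, level names, and function names below are invented for illustration and are not the actual ILM rubric:

```python
# Hypothetical integration levels and minimum mean maturity scores for each.
LEVELS = [
    (1, "data accessibility", 0.4),
    (2, "common data platform", 0.7),
    (3, "consolidated data model", 0.9),
]

def readiness_level(scores):
    """scores: mapping of data-management functions (e.g. governance, quality,
    metadata) to 0..1 maturity scores. Returns the highest integration level
    whose threshold the mean score meets, plus the mean itself."""
    mean = sum(scores.values()) / len(scores)
    achieved = 0
    for level, _name, threshold in LEVELS:
        if mean >= threshold:
            achieved = level
    return achieved, mean
```

This mirrors the model's intent: higher computed scores indicate better-managed source systems and a greater likelihood that the system can be brought in at a higher level of integration.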
Utilizing semantic Wiki technology for intelligence analysis at the tactical edge
Challenges exist for intelligence analysts to efficiently and accurately process large amounts of data collected from a
myriad of available data sources. These challenges are even more evident for analysts who must operate within small
military units at the tactical edge. In such environments, decisions must be made quickly without guaranteed access to
the kinds of large-scale data sources available to analysts working at intelligence agencies. Improved technologies must
be provided to analysts at the tactical edge to make informed, reliable decisions, since this is often a critical collection
point for important intelligence data. To aid tactical edge users, new types of intelligent, automated technology
interfaces are required to allow them to rapidly explore information associated with the intersection of hard and soft data
fusion, such as multi-INT signals, semantic models, social network data, and natural language processing of text.
The ability to fuse these types of data is paramount to providing decision superiority. For these types of applications, we
have developed BLADE. BLADE allows users to dynamically add, delete and link data via a semantic wiki, allowing
for improved interaction between different users. Analysts can see information updates in near-real-time due to a
common underlying set of semantic models operating within a triple store that allows for updates on related data points
from independent users tracking different items (persons, events, locations, organizations, etc.). The wiki can capture
pictures, videos and related information. New information added directly to pages is automatically updated in the triple
store, and its provenance and pedigree are tracked over time, making that data more trustworthy and easily integrated with
other users’ pages.
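The triple-store idea underlying such a wiki can be sketched minimally as below (an in-memory toy, not BLADE's actual store; real systems use RDF stores queried with SPARQL):

```python
class TripleStore:
    """Minimal in-memory (subject, predicate, object) store with wildcard queries."""

    def __init__(self):
        self.triples = set()

    def add(self, s, p, o):
        """Add one triple; duplicates are ignored."""
        self.triples.add((s, p, o))

    def query(self, s=None, p=None, o=None):
        """None acts as a wildcard, like a variable in a SPARQL pattern."""
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

# Invented example: two analysts' edits land in the shared store.
store = TripleStore()
store.add("alice", "reported", "event1")
store.add("bob", "reported", "event2")
store.add("event1", "located_at", "grid42")
```

Because all pages update the same store, a query such as `store.query(p="reported")` surfaces related data points from independent users in near-real time.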
Participatory Sensing & Cognition
User-centric incentive design for participatory mobile phone sensing
Mobile phone sensing is a critical underpinning of pervasive mobile computing, and is one of the key factors for improving
people’s quality of life in modern society via collective utilization of the on-board sensing capabilities of people’s
smartphones. The increasing demands for sensing services and ambient awareness in mobile environments highlight the
necessity of active participation of individual mobile users in sensing tasks. User incentives for such participation have
been continuously offered from an application-centric perspective, i.e., as payments from the sensing server, to compensate
users’ sensing costs. These payments, however, are manipulated to maximize the benefits of the sensing server, ignoring
the runtime flexibility and benefits of participating users. This paper presents a novel framework of user-centric incentive
design, and develops a universal sensing platform which translates heterogeneous sensing tasks into a generic sensing plan
specifying the task-independent requirements of sensing performance. We use this sensing plan as input to reduce three
categories of sensing costs, which together cover the possible sources hindering users’ participation in sensing.
Conversational sensing
Recent developments in sensing technologies, mobile devices and context-aware user interfaces have made it possible to represent information fusion and situational awareness for Intelligence, Surveillance and Reconnaissance (ISR) activities as a conversational process among actors at or near the tactical edges of a network. Motivated by use cases in the domain of Company Intelligence Support Team (CoIST) tasks, this paper presents an approach to information collection, fusion and sense-making based on the use of natural language (NL) and controlled natural language (CNL) to support richer forms of human-machine interaction. The approach uses a conversational protocol to facilitate a flow of collaborative messages from NL to CNL and back again in support of interactions such as: turning eyewitness reports from human observers into actionable information (from both soldier and civilian sources); fusing information from humans and physical sensors (with associated quality metadata); and assisting human analysts to make the best use of available sensing assets in an area of interest (governed by management and security policies). CNL is used as a common formal knowledge representation for both machine
and human agents to support reasoning, semantic information fusion and generation of rationale for inferences,
in ways that remain transparent to human users. Examples are provided of various alternative styles for user
feedback, including NL, CNL and graphical feedback. A pilot experiment with human subjects shows that a
prototype conversational agent is able to gather usable CNL information from untrained human subjects.
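The CNL idea can be illustrated with a toy single-pattern parser. The report pattern and field names below are invented; real controlled-English grammars (such as ITA Controlled English) are far richer:

```python
import re

# A toy CNL pattern for eyewitness-style reports: "there is a <type> named <name> at <place>".
PATTERN = re.compile(r"there is an? (?P<thing>[\w ]+?) named (?P<name>\w+) at (?P<place>\w+)")

def parse_report(sentence):
    """Turn a report matching the toy CNL pattern into a structured record;
    return None for free text the grammar cannot handle."""
    m = PATTERN.match(sentence.lower())
    if not m:
        return None
    return {"type": m.group("thing"), "name": m.group("name"), "place": m.group("place")}
```

A conversational protocol would hand the `None` cases back to the user for rephrasing, moving messages from NL toward CNL, while the structured records feed reasoning and semantic fusion.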
Using cognitive architectures to study issues in team cognition in a complex task environment
Cognitive social simulation is a computer simulation technique that aims to improve our understanding of the dynamics of socially-situated and socially-distributed cognition. This makes cognitive social simulation techniques particularly appealing as a means to undertake experiments into team cognition. The current paper reports on the results of an ongoing effort to develop a cognitive social simulation capability that can be used to undertake studies into team cognition using the ACT-R cognitive architecture. This capability is intended to support simulation experiments using a team-based problem solving task, which has been used to explore the effect of different organizational environments on collective problem solving performance. The functionality of the ACT-R-based cognitive social simulation capability is presented and a number of areas of future development work are outlined. The paper also describes the motivation for adopting cognitive architectures in the context of social simulation experiments and presents a number of research areas where cognitive social simulation may be useful in developing a better understanding of the dynamics of team cognition. These include the use of cognitive social simulation to study the role of cognitive processes in determining aspects of communicative behavior, as well as the impact of communicative behavior on the shaping of task-relevant cognitive processes (e.g., the social shaping of individual and collective memory as a result of communicative exchanges). We suggest that the ability to perform cognitive social simulation experiments in these areas will help to elucidate some of the complex interactions that exist between cognitive, social, technological and informational factors in the context of team-based problem-solving activities.
Language and dialect identification in social media analysis
Stephen Tratz,
Douglas Briesch,
Jamal Laoudi,
et al.
Historically-unwritten Arabic dialects are increasingly appearing online in social media texts and are often intermixed with other languages, including Modern Standard Arabic, English, and French. The next generation analyst will need new capabilities to quickly distinguish among the languages appearing in a given text and to identify informative patterns of language switching that occur within a user’s social network—patterns that may correspond to socio-cultural aspects such as participants’ perceived and projected group identity. This paper presents work to (i) collect texts written in Moroccan Darija, a low-resource Arabic dialect from North Africa, and (ii) build an annotation tool that (iii) supports development of automatic language and dialect identification and (iv) provides social and information network visualizations of languages identified in tweet conversations.
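A classic baseline for such language identification is character n-gram profiling; the sketch below (with invented English/French samples standing in for Darija/MSA/French data) labels a text with the language of its most similar training profile:

```python
from collections import Counter

def ngram_profile(text, n=2):
    """Character n-gram counts, a standard low-resource language-ID feature."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def similarity(p, q):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(p[g] * q[g] for g in set(p) & set(q))
    norm = (sum(v * v for v in p.values()) * sum(v * v for v in q.values())) ** 0.5
    return dot / norm if norm else 0.0

def identify(text, labelled_samples):
    """Return the label of the training sample most similar to `text`."""
    probe = ngram_profile(text)
    return max(labelled_samples,
               key=lambda label: similarity(probe, ngram_profile(labelled_samples[label])))
```

Running the classifier over a tweet conversation token by token would expose the language-switching patterns the paper seeks to visualize.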
Poster Session
Application of the JDL data fusion process model to hard/soft information fusion in the condition monitoring of aircraft
Joseph T. Bernardo
Hard/soft information fusion has been proposed as a way to enhance diagnostic capability for the condition monitoring
of machinery. However, there is a limited understanding of where hard/soft information fusion could and should be
applied in the condition monitoring of aircraft. Condition-based maintenance refers to the philosophy of performing
maintenance when the need arises, based upon indicators of deterioration in the condition of the machinery. The addition
of the multisensory capability of human cognition to electronic sensors may create a fuller picture of machinery
condition.
Since 1988, the Joint Directors of Laboratories (JDL) data fusion process model has served as a framework for
information fusion research. Advances are described in the application of hard/soft information fusion in condition
monitoring using terms that condition-based maintenance professionals in aviation will recognize. Emerging literature
on hard/soft information fusion in condition monitoring is organized into the levels of the JDL data fusion process
model. Gaps in the literature are identified, and the author’s ongoing research is discussed. Future efforts will focus on
building domain-specific frameworks and experimental design, which may provide a foundation for improving flight
safety, increasing mission readiness, and reducing the cost of maintenance operations.
Predicting student success using analytics in course learning management systems
Educational data analytics is an emerging discipline, concerned with developing methods for exploring the unique types
of data that come from the educational context. For example, predicting college student performance is crucial for both
the student and educational institutions. It can support timely intervention to prevent students from failing a course, increase the efficacy of advising functions, and improve the course completion rate. In this paper, we present the efforts carried out at Oak Ridge National Laboratory (ORNL) toward applying predictive analytics to academic data collected from 2009 through 2013 and available in one of the most commonly used learning management systems, Moodle. First, we identified the data features useful for predicting student outcomes, such as students’ scores on homework assignments, quizzes, and exams, in addition to their activity in discussion forums and their total GPA in the term they enrolled in the course. Then, logistic regression and neural network predictive models are used to identify, as early as possible, students who are in danger of failing the course in which they are currently enrolled. These models compute
the likelihood of any given student failing (or passing) the current course. Numerical results are presented to evaluate
and compare the performance of the developed models and their predictive accuracy.
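The logistic-regression half of such a pipeline can be sketched from scratch as below. The two features and the training data are invented, not the ORNL feature set or results:

```python
import math

def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=2000):
    """Stochastic gradient descent on the logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def pass_probability(w, b, x):
    """Likelihood that a student with feature vector x passes the course."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Invented feature vectors: (quiz average, forum-activity score); 1 = passed.
X = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.3, 0.2]]
y = [1, 1, 0, 0]
w, b = train_logreg(X, y)
```

Students whose predicted pass probability falls below a chosen threshold would be flagged early for advising intervention.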