Proceedings Volume 11018

Signal Processing, Sensor/Information Fusion, and Target Recognition XXVIII

Purchase the printed version of this volume at proceedings.com or access the digital version at SPIE Digital Library.

Volume Details

Date Published: 22 August 2019
Contents: 13 Sessions, 49 Papers, 31 Presentations
Conference: SPIE Defense + Commercial Sensing 2019
Volume Number: 11018

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 11018
  • Multisensor Fusion, Multitarget Tracking and Resource Management I
  • Multisensor Fusion, Multitarget Tracking and Resource Management II
  • Information Fusion Methodologies and Applications I
  • Information Fusion Methodologies and Applications II
  • Information Fusion Methodologies and Applications III
  • Information Fusion Methodologies and Applications IV
  • Information Fusion Methodologies and Applications V
  • Signal and Image Processing, and Information Fusion Applications I
  • Signal and Image Processing, and Information Fusion Applications II
  • Signal and Image Processing, and Information Fusion Applications III
  • Signal and Image Processing, and Information Fusion Applications IV
  • Poster Session
Front Matter: Volume 11018
Front Matter: Volume 11018
This PDF file contains the front matter associated with SPIE Proceedings Volume 11018, including the Title Page, Copyright information, Table of Contents, Author and Conference Committee lists.
Multisensor Fusion, Multitarget Tracking and Resource Management I
Extracting fast targets from an EO sensor
Andrew Finelli, Yaakov Bar-Shalom, Peter Willett, et al.
This work describes a method for measurement extraction of fast point targets leaving an extended signature in the pixelated focal plane array of an EO sensor. The extraction method and subsequent statistics are derived from a physics-based model in which the spatial quantization regions, or pixels, are separated by dead zones. Furthermore, the intensity in a pixel is corrupted by approximately Gaussian noise whose variance is proportional to the pixel area. This noise model is based on a Poisson assumption on the number of photons that contribute to the noise intensity. The signal portion of the image is a spatially quantized version of the target's point spread function (PSF), modeled as a Gaussian PSF moving at constant velocity, a realistic assumption during an exposure time that may have been adjusted (and lengthened) to enhance target SNR. The measurement extraction is done using a Maximum Likelihood (ML) method for which we provide an appropriate Cramér-Rao Lower Bound (CRLB) on the estimation error of the target's 2-D starting and ending positions. We then define the signal-to-noise ratio (SNR) in an image using a matched filter (MF). Next, we present Monte Carlo simulations to confirm the derived results and find that the measurement extractor is efficient for SNRs ≥ 12 dB (using our SNR definition). We then develop a solution to the problem of detecting fast targets in images, presenting approximate distributions for the test statistic under the null (H0, target absent) and alternative (H1, target present) hypotheses that can be used to set a threshold for specific probabilities of detection PD and false alarm PFA. Finally, we verify these distributions with Monte Carlo simulations.
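The threshold-setting logic described in the final step can be illustrated with a generic Gaussian detection sketch. This is not the paper's actual test statistic (whose approximate distributions are derived in the paper); the N(0, 1) / N(d, 1) model and the 20·log10 amplitude-SNR convention below are illustrative assumptions.

```python
from statistics import NormalDist

def gaussian_detection(snr_db, pfa):
    """Threshold and detection probability for a unit-variance Gaussian
    test statistic: H0 ~ N(0, 1), H1 ~ N(d, 1) with amplitude d set by SNR."""
    nd = NormalDist()
    tau = nd.inv_cdf(1.0 - pfa)        # threshold chosen to meet the desired PFA
    d = 10.0 ** (snr_db / 20.0)        # amplitude corresponding to snr_db
    pd = 1.0 - nd.cdf(tau - d)         # probability the shifted statistic exceeds tau
    return tau, pd
```

In this toy model, 12 dB with PFA = 0.01 already yields PD above 0.95, consistent with the regime in which the extractor is reported to be efficient.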
Target tracking in over the horizon radar
Andrew Finelli, Yaakov Bar-Shalom, Peter Willett, et al.
This paper describes the application of the Deep Target Extractor (DTE), developed from the Maximum Likelihood Probabilistic Multiple Hypothesis Tracker (ML-PMHT), to measurements from an over-the-horizon radar (OTHR) observing a highly maneuvering target (HMT). We describe a motion model for HMTs that start at very high speeds and make extremely sharp turns. We then present a measurement model for OTHR that uses ray-tracing software to produce detections across multiple signal paths resulting from ionospheric refractions. Next, we describe the DTE operation in the multiple-signal-path framework of the OTHR. Finally, we present a test scenario in which the DTE shows low root-mean-square error (RMSE) for HMT motion parameter estimation.
On-demand track-to-track fusion using local IMM inside information
R. Visina, Y. Bar-Shalom, P. Willett, et al.
The fusion of state estimates from Interacting Multiple Model (IMM) estimators using inside information (mixture estimates and probabilities) is described in this paper. Fusion is performed on-demand, i.e., without conditioning on past track information. The local trackers run IMM estimators to track a target and transmit mode-conditioned estimates and mode probabilities to a Fusion Center. The fused state posterior probability density is a Gaussian mixture whose parameters can be computed recursively. The likelihood functions of the state and mode are derived, yielding consistent data fusion. Simulations show that this method outperforms the fusion of the local IMM estimator outputs both in terms of error during target maneuvers and in the consistency of the mean-squared error.
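The moment-matching step for a set of mode-conditioned estimates weighted by mode probabilities can be sketched as follows. This is the generic Gaussian-mixture-to-single-Gaussian reduction used inside IMM-style estimators, not the paper's recursive fused posterior; the function and variable names are our own.

```python
import numpy as np

def mixture_moments(means, covs, probs):
    """Collapse a Gaussian mixture (e.g. IMM mode-conditioned estimates
    weighted by mode probabilities) to a single mean and covariance."""
    probs = np.asarray(probs, dtype=float)
    means = np.asarray(means, dtype=float)
    x = probs @ means                            # probability-weighted mean
    P = np.zeros((means.shape[1], means.shape[1]))
    for mu_j, P_j, p_j in zip(means, covs, probs):
        d = (mu_j - x)[:, None]
        P += p_j * (P_j + d @ d.T)               # mode covariance + spread-of-means
    return x, P
```

The spread-of-means term is what keeps the collapsed covariance consistent when the modes disagree.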
CRLB for multi-sensor rotational bias estimation for passive sensors without target state estimation
Michael Kowalski, Yaakov Bar-Shalom, Peter Willett, et al.
Bias estimation is a significant problem in target tracking applications, and passive sensors present additional challenges in this field. Biases in passive sensors are commonly represented as unknown rotations of the sensor coordinate frame, and it is necessary to correct for such errors. Many methods have used simultaneous target state and bias estimation to register the sensors; however, it may be advantageous to decouple state and bias estimation to simplify the estimation problem. In this way, bias estimation can be done for arbitrary target motion. If measurements are converted into Cartesian coordinates and differenced, it is possible to isolate the effects of the biases. This bias pseudo-measurement approach has been used for many types of biases and sensors, and this paper applies the method to 3D passive sensors with rotational biases. The Cramér-Rao Lower Bound for the bias estimates is evaluated and shown to be attained, i.e., the bias estimates are statistically efficient.
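The core idea of the pseudo-measurement approach, that differencing two sensors' converted measurements of the same target cancels the target state and leaves only the bias terms, can be shown with a toy additive-bias example. The paper treats rotational biases of 3D passive sensors; the 2-D additive biases and noise levels below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
b1 = np.array([0.5, -0.2])                  # sensor 1 bias (assumed additive here)
b2 = np.array([-0.1, 0.3])                  # sensor 2 bias
diffs = []
for _ in range(2000):
    x = rng.uniform(-100.0, 100.0, size=2)  # arbitrary, unknown target position
    z1 = x + b1 + rng.normal(0.0, 0.05, 2)  # biased, noisy converted measurements
    z2 = x + b2 + rng.normal(0.0, 0.05, 2)
    diffs.append(z1 - z2)                   # target state cancels in the difference
bias_diff_est = np.mean(diffs, axis=0)      # estimates b1 - b2 with no target tracking
```

No target state estimation is needed anywhere in the loop, which is precisely the decoupling the abstract describes.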
The cross-covariance for heterogeneous track-to-track fusion
Track-to-track fusion (T2TF) has been studied widely for both homogeneous and heterogeneous cases, these cases denoting common and disparate state models, respectively. However, as opposed to homogeneous fusion, the cross-covariance for heterogeneous local tracks in different state spaces, accounting for the relationship between the process noises of the heterogeneous models, seems not to be available in the literature. The present work derives the cross-covariance for heterogeneous local tracks of different dimensions where the local states are related by a nonlinear transformation (with no inverse transformation). First, the relationship between the process noise covariances of the motion models in different state spaces is obtained. The cross-covariance of the local estimation errors is then derived in a recursive form by taking into account the relationship between the local state model process noises. In our simulations, linear minimum mean square error (LMMSE) fusion is carried out for a scenario of two tracks of a target from two local trackers, one from an active sensor and one from a passive sensor.
Stone Soup: announcement of beta release of an open-source framework for tracking and state estimation
David Last, Paul Thomas, Steven Hiscocks, et al.
Tracking and state estimation technologies are used in a variety of domains that include astronomy, air surveillance, maritime situational awareness, biology, and the internet. Algorithms for tracking and state estimation are becoming increasingly complex, and it is difficult for researchers and skilled practitioners to implement and systematically evaluate these state-of-the-art algorithms. System designers also need to objectively assess the performance of algorithms against operational requirements, and tools to conveniently perform such systematic assessment have been lacking. Recognising this problem, an initiative was taken to create an open-source framework called "Stone Soup", which would be used for the development, demonstration, and evaluation of tracking and state estimation algorithms. Stone Soup was made openly available in April 2019 as a beta version (V0.1b1). This paper introduces the Stone Soup framework and describes how users can take advantage of this framework to develop their own algorithms, set up experiments with real-world data, and evaluate algorithms.
Multisensor Fusion, Multitarget Tracking and Resource Management II
Experimental results in bearings-only tracking using the sequential Monte-Carlo probability hypothesis density filter
We evaluate the use of a probability hypothesis density (PHD) filter in a bearings-only tracking application. The main feature of a PHD filter is that it propagates the first-order statistical moment of a multisource posterior distribution. Multisource estimation using a PHD filter has been shown to reliably track multiple simulated targets in the bearings-only case. In this paper we evaluate the utility of the sequential Monte-Carlo PHD filter for tracking surface ships using bearings-only data acquired from a Bluefin-21 unmanned underwater vehicle in Boston Harbor. The unmanned underwater vehicle was equipped with a rigidly mounted planar hydrophone array that measures the bearing angle to sources of acoustic noise, of which shipping traffic is the dominant source. We further evaluate several target maneuvering models, including clockwise and counter-clockwise coordinated turns. The combination of the coordinated turn models with a constant velocity model is used in a multiple model PHD filter. The results of the multiple model PHD filter are compared to the results of a PHD filter using only a constant velocity model.
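The heart of the SMC-PHD filter is the measurement update of the particle weights; because the PHD integrates to the expected number of targets, the updated weights sum to the posterior target-count estimate. A minimal single-sensor sketch of that update with the standard Poisson clutter model follows; the function and variable names are our own, and this omits prediction, birth, and resampling.

```python
import numpy as np

def phd_weight_update(weights, likelihoods, p_d, clutter_intensity):
    """SMC-PHD weight update for one scan.

    weights:      (N,) particle weights approximating the predicted PHD
    likelihoods:  (M, N) array, g(z_m | x_n) for each measurement/particle pair
    """
    updated = (1.0 - p_d) * weights                 # missed-detection term
    for g_z in likelihoods:                         # one row per measurement
        num = p_d * g_z * weights
        updated += num / (clutter_intensity + num.sum())
    return updated
```

Summing the returned weights gives the expected number of targets, which is how the PHD filter estimates target count without explicit data association.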
"Self-Intoxication" and the effects on track quality
In tactical operations, maintaining the pedigree of data may be problematic due to limitations in data links. If the pedigree information is missing or incomplete, the information that arrives at a sensor platform may include information that the platform itself created. If the platform uses this data again, it will become unjustifiably more confident in that information. This condition is called "self-intoxication," and it is a system-of-systems problem. This paper assesses the effects of self-intoxication on covariance consistency, i.e., how accurately the platform's track covariance reflects the true uncertainty. We analyze the covariance consistency of a track relative to truth.
Problems with information fusion under extreme multiple-counting conditions
This document concerns three information fusion methods: Information Matrix Fusion (IMF), Covariance Intersection (CI), and Sampling Covariance Intersection (SCI). These methods are compared for performance under extreme multiple-counting conditions, that is, when an estimate is improperly fused into a track multiple times as if it had been repeatedly produced by independent measurements. This situation can occur in networked fusion systems where data pedigree is not properly maintained, especially when an information relay is implemented to handle degraded communication environments. This research demonstrates that the normally preferable methods, IMF and SCI, are prone to falsely optimistic covariance values in such situations. All three fusion methods result in the state estimate approaching the estimate being repeatedly fused; the more conservative CI method also results in the covariance approaching that of the repeated estimate. We obtain these results through inference from the governing equations and examination of behavior in Monte Carlo simulations.
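The CI behavior described above can be reproduced in a few lines: fusing the same estimate into a track repeatedly drives the track covariance toward that of the repeated estimate. This is a scalar sketch with a fixed ω = 0.5; the weight choice and numeric values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def covariance_intersection(x1, P1, x2, P2, w):
    """CI fusion: P^-1 = w*P1^-1 + (1-w)*P2^-1, which is consistent for
    any unknown cross-correlation between the two estimates."""
    I1, I2 = np.linalg.inv(P1), np.linalg.inv(P2)
    P = np.linalg.inv(w * I1 + (1.0 - w) * I2)
    x = P @ (w * I1 @ x1 + (1.0 - w) * I2 @ x2)
    return x, P

# Repeatedly fuse the same estimate (x2, P2) into the track, mimicking
# extreme multiple counting when data pedigree is lost.
x, P = np.array([0.0]), np.array([[4.0]])
x2, P2 = np.array([1.0]), np.array([[1.0]])
for _ in range(50):
    x, P = covariance_intersection(x, P, x2, P2, w=0.5)
```

After the loop both the state and the covariance have converged to the repeated estimate's values, matching the fixed point of the CI recursion.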
A fuzzy inference system for ship-ship collision alert generation
Constantinos Rizogiannis, Stelios C. A. Thomopoulos
Ship-ship collisions are rare events mainly caused by human error. In an ever-growing global trade world, the avoidance of such accidents is of great importance, as they may have significant human, environmental, and financial consequences. A collision warning system providing maritime authorities (Vessel Traffic Services, Coast Guard authorities, Port authorities, and others) with real-time alerts for ship-ship encounters may significantly contribute to reducing collision accidents and increasing navigation safety. In this paper, a fuzzy inference system (FIS) is proposed to estimate collision risk for pairs of ships, receiving as input parameters calculated from data obtained from the Automatic Identification System (AIS) installed onboard ships. The proposed system operates in real time, examining pairs of ships that recently transmitted an AIS signal while discarding, at an early processing stage, those pairs that do not satisfy specific conditions, in order to reduce computational burden. Experiments have been carried out with real data, and the results show that the proposed approach is effective in automatic alert generation for ships involved in close ship-ship encounter situations.
Maritime situational awareness with OCULUS Sea C2I and forensics tools for a Common Information Sharing Environment (CISE)
Stelios C. A. Thomopoulos
CISE stands for Common Information Sharing Environment and refers to an architecture and set of protocols, procedures and services for the exchange of data and information across Maritime Authorities of EU (European Union) Member States (MS’s). In the context of enabling the implementation and adoption of CISE by different MS’s, EU has funded a number of projects that enable the development of subsystems and adaptors intended to allow MS’s to connect and make use of CISE. In this context, the Integrated Systems Laboratory (ISL) has led the development of the corresponding Hellenic and Cypriot CISE by developing a Control, Command and Information (C2I) system that unifies all partial maritime surveillance systems into one National Situational Picture Management (NSPM) system, and adaptors that allow the interconnection of the corresponding national legacy systems to CISE and the exchange of data, information and requests between the two MS’s. Furthermore, a set of forensics tools that allow geospatial and time filtering and detection of anomalies, risk incidents, fake MMSIs, suspicious speed changes, collision paths, and gaps in AIS (Automatic Identification System), have been developed by combining motion models, AI, deep learning and fusion algorithms using data from different databases through CISE. This paper discusses these developments within the EU CISE-2020, Hellenic CISE and CY-CISE projects and the benefits from the sharing of maritime data across CISE for both maritime surveillance and security.
Information Fusion Methodologies and Applications I
A GLMB filter for unified multitarget multisensor management
The generalized labeled multi-Bernoulli (GLMB) filter of Vo and Vo is an exact closed-form solution of the multitarget Bayes filter and is, therefore, provably Bayes-optimal. Its recent implementations are extremely fast, with computational order O(n²m), where n and m are the current numbers of tracks and measurements, respectively. This paper generalizes the GLMB filter to fully integrated multitarget tracking and sensor management, in which dynamically moving sensors can appear and disappear and in which the states of these sensors are estimated via measurements collected by internal actuator sensors.
Information Fusion Methodologies and Applications II
Sequential and parallel fusion of detection and classification systems
This paper investigates the fusion of various detection and classification systems. The architecture for combining these systems is the main interest of this work. We assume the detection and classification systems are known legacy systems, so that we know their receiver operating characteristic (ROC) functions, or approximations of them. Given an objective function, we seek the optimal architecture that maximizes it. Combining detection systems sequentially has been done for decades, especially in the biomedical field, where tests are performed sequentially such that the outcome of one test determines which test is performed next. In military applications, we often use multiple detection systems in parallel and combine the outputs in a "fusion" center to determine the final answer. We conjecture that a mixture of parallel and series combinations might yield better performance. Part of determining this mixture is determining which systems go "where" in the mix. We investigate this architecture.
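For two detectors with known operating points, the classic two-label sequential rules have closed-form combined operating points. The formulas below assume conditionally independent detector outputs, an assumption of this sketch rather than a requirement of the paper.

```python
def believe_the_positive(pd_a, pfa_a, pd_b, pfa_b):
    """Declare 'target' if detector A fires; otherwise defer to detector B.
    Raises PD at the cost of a higher PFA (independent detectors assumed)."""
    return pd_a + (1 - pd_a) * pd_b, pfa_a + (1 - pfa_a) * pfa_b

def believe_the_negative(pd_a, pfa_a, pd_b, pfa_b):
    """Declare 'target' only if detector A fires and detector B confirms.
    Lowers PFA at the cost of a lower PD (independent detectors assumed)."""
    return pd_a * pd_b, pfa_a * pfa_b
```

Sweeping each detector along its ROC curve and applying these maps traces out the combined ROC of the sequential architecture, which is the kind of object the architecture search described above would optimize over.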
Sequence theory for classification in multi-label ATR classification tasks
This work examines ATR classification in multi-label settings using the framework of a classification sequence. Classification tasks are often composed of a sequence of identification tasks that together generate an overall classification. For instance, objects may first be sorted and classified as one particular target type, and those targets are then further identified. Rather than passing all objects through each classifier, a sequence of classifiers may be used to identify objects without the need to process data through each classifier. Such sequences exist for two-label outcomes (such as target and non-target) and have been called Believe the Negative, Believe the Positive, and Believe the Extremes. In each of these sequences, the first classification system identifies objects such that only a portion of objects must be passed to the second system for identification. However, to extend these sequences to k labels, a new definition of the ordering on the labels must be generated in order to incorporate all k labels into the classification sequence. In this work, we develop the mathematical structures that exist for a k-label classification sequence, provide formulas for both the optimal performance and operational cost of these sequences, and examine the performance of such sequences under a variety of operating conditions. We begin by demonstrating these results with a 3-label ATR system. In conclusion, this work demonstrates the utility of using a sequence to fuse information in a multi-label classification task.
Information Fusion Methodologies and Applications III
Extremely deep Bayesian learning with Gromov's method
We have invented two new Bayesian deep learning algorithms using stochastic particle flow to compute Bayes’ rule. These learning algorithms have a continuum of layers, in contrast with 10 to 100 discrete layers in standard deep learning neural nets. We compute Bayes’ rule for learning using a stochastic particle flow designed with Gromov’s method. Both deep learning and standard particle filters suffer from the curse of dimensionality, and we mitigate this problem by using stochastic particle flow to compute Bayes’ rule. The intuitive explanation for the dramatic reduction in computational complexity is that stochastic particle flow adaptively moves particles to the correct region of d dimensional space to represent the multivariate probability density of the state vector conditioned on the data. There is nothing analogous to this in standard neural nets (deep or shallow), where the geometry of the network is fixed.
Information fusion to estimate resilience of dense urban neighborhoods
Anthony Palladino, Elisa J. Bienenstock, Bradley M. West, et al.
Diverse sociocultural influences in rapidly growing dense urban areas may strain civil services and reduce the resilience of those areas to exogenous and endogenous shocks. We present a novel approach, with foundations in computer and social sciences, to estimate the resilience of dense urban areas at finer spatiotemporal scales than the state-of-the-art. We fuse multi-modal data sources to estimate resilience indicators from social science theory and leverage a structured ontology for factor combinations to enhance explainability. Estimates of destabilizing areas can improve the decision-making capabilities of civil governments by identifying critical areas needing increased social services.
An agent-administrator-based security mechanism for distributed sensors and drones for smart grid monitoring
Distributed sensors are the eyes and ears of a smart grid, providing information vital for monitoring and controlling the entire power generation, transmission, and distribution system. Secure exchange of information among the sensing and decision-making entities is essential, as failures may bring the entire system to its knees. Alongside the rapid growth in the number of distributed sensors, drones have found a myriad of applications. A swarm of drones could also be deployed in war zones and disaster-stricken areas, where secure intercommunication is of paramount importance for survivability and successful mission completion. In this paper, a secure mechanism based on mobile agents is proposed to secure information exchange with minimum overhead. An Agent Administrator (AA) automatically clones and sends a secure mobile agent (SMA) to the target sensors or drones to scan and check their security status. The dispatched SMAs then send feedback to the server AA or other members. In the case of sensors, the closest terminal unit to which the sensors are directly connected is designated as an AA, which is capable of checking authentication and scanning for vulnerabilities. In the case of drones, any one or several of them can be designated as the AA, and the flagged feedback is broadcast to all other nodes or drones, thereby providing them with security status updates. A modified Nagle's algorithm is also proposed to support real-time video transmission. The experimental results validate the effectiveness and convenience of the proposed system.
Object recognition, identification and classification for intelligent surveillance and reconnaissance platforms
Raymond Ptucha, Aneesh Bhat, Aravindh Kuppusamy, et al.
We research solutions to enable intelligence, surveillance, and reconnaissance by means of near-real-time target recognition, identification, and classification. Cloud services and intelligence, surveillance, target acquisition, and reconnaissance capabilities are of importance for C4+iSTAR systems. These platforms are expected to ensure high-level cognitive autonomy to accomplish complex missions and tasks in rapidly changing adverse environments. We research and apply deep learning concepts and algorithms to enable high-confidence awareness and advanced situational analysis. We examine engineering solutions for object identification and classification. Our findings ensure a sufficient level of fidelity in object recognition and classification likelihood, with high identification probability, acceptable processing latency on low-power ARM CPUs, and integration capabilities. Advanced concepts in moving target recognition and object classification using flight data are researched with low-fidelity experimental substantiation. We modify the YOLOv3 object detection method to detect bounding boxes for arbitrary orientation angles, angles of view, and corner shapes, which are of importance in aerial applications. Our results demonstrate adequate detection capability while maintaining the fast computational performance of the original YOLOv3 architecture. The proposed algorithms and computing schemes are supported by code in C++.
Information Fusion Methodologies and Applications IV
Long lasting effects of awareness training methods on reducing overall cyber security risk
Georgios Pouraimis, Konstantinos-Georgios Thanos, Athanassios Grigoriadis, et al.
Social engineering poses one of the most critical threats to public and private organizations. In this paper, we focus on phishing threats by measuring the positive long-term impact that awareness methods may provide to companies and public bodies. The assessment uses two phishing attacks over a period of 18 weeks. Each phishing attack comprises a hook mail containing a link to a credentials-harvesting website. Users' reactions and user-agent fingerprints are used to calculate a risk score for each victim. By applying chi-square tests, it was found that there is a statistically significant score improvement for participants who were trained via the awareness methods. Furthermore, a risk analysis is conducted to identify, quantify, and prioritize potential risks that could negatively affect end-user operations. The main idea behind the proposed technique is that the assessment methods can help employees develop the skills and abilities to use the digital world safely, avoiding phishing attacks. The risk analysis findings indicate that the awareness approach yields significant, long-lasting risk reduction. The study was conducted as part of the European Horizon 2020 DOGANA project, which aims to deploy effective mitigation strategies and reduce the risk created by modern Social Engineering 2.0 attack techniques. The results obtained in this paper corroborate those obtained by the EU-funded project SAINT from its econometric analysis and modeling of the cybercrime and cyber security markets.
Automated real-time risk assessment for airport passengers using a deep learning architecture
Stelios C. A. Thomopoulos, Stelios Daveas, Antonios Danelakis
Airport control checkpoints are required to operate and maintain modern security systems to prevent malicious actions. This paper presents a methodology, introduced in the context of the FLYSEC project [30], that provides real-time risk assessment for airport passengers based on their trajectories. The proposed methodology implements a deep learning architecture. It is fully automated, reducing the workload of video surveillance operators and leading to less error-prone conclusions. It has been integrated with the Command and Control (C2) system of iCrowd, a crowd simulation platform developed by the Integrated Systems Lab of the Institute of Informatics and Telecommunications at NCSR Demokritos. iCrowd features a highly configurable, high-fidelity agent-based behavior simulator and provides a realistic environment that enables behaviors of simulated actors (e.g. passengers, personnel, malicious actors), instantiates the functionality of hardware security technologies, and simulates passenger facilitation and customer service. iCrowd has been used to conduct experiments on simulated scenarios in order to evaluate the proposed risk assessment scheme. The experimental results indicate that the proposed scheme is very promising and can reliably be used in an airport security setting for evaluating and/or enveloping the performance of security tracking systems.
A framework for context change detection and management in probabilistic models for context in fusion
In a prior paper, a probabilistic model for using context in fusion was developed. It was shown that context-based fusion could be represented by a Bayesian probabilistic model that contains situation and context data, as well as conditional probabilities for the random variables. In the same paper, a conceptual model of an adaptive real-time context management system was proposed to monitor fusion performance and select the appropriate context in order to improve fusion performance. This paper extends that work by developing a framework for adaptive, general real-time context management, with application to optimizing the tracking performance of an airborne platform.
Information Fusion Methodologies and Applications V
Deep learning in AI and information fusion panel discussion
During the 2018 SPIE DSS conference, panelists were invited to highlight trends in the use of artificial intelligence and deep learning (AI/DL) for information fusion. This paper highlights the common issues raised in the panel discussion. The key issues include leveraging AI/DL, coordinated with information fusion, for: (1) knowledge representation and reasoning, (2) information fusion enhancement, (3) object recognition and tracking, (4) fusion of data with models, and (5) deep multimodal fusion cognition strategies to support the user.
Applying cognitive psychology principles to the (dis)information environment: an examination of discourse comprehension, memory, and fusion of news articles
Disinformation campaigns such as fake news have evolved beyond traditional propaganda and are being used in strategic ways. Further, sources of information have shifted from traditional media (newspapers, fliers, radio) to multiple social media platforms (Twitter, YouTube, Reddit). This paper addresses how cognitive psychology research on traditional texts and news discourse applies to information garnered in the changing social media environment, and especially to false information. Specifically, we review the memory representations that readers create when they read text and how they integrate complex ideas from multiple sources (including images). An emphasis is placed on the fragility of memory representations, such as their susceptibility to information reconstruction, and how that vulnerability can interact with disinformation. Potential ways to counter disinformation are explored, such as perspective shifting, changing the context, and thinking analytically. In addition, this paper identifies the gaps that need to be addressed in the growing research areas of misinformation and fake news.
Evaluation of algorithms for fake news identification
Today, information spreads quickly through communities by means of simple messaging, group chats, and social media platforms. Because of the ease of use these services provide, misinformation has become a common trend. The term "fake news" has emerged as a way to refer to information shared in a manner meant to mislead a reader into thinking something is true when it is not. Combating fake news has become a major topic, and many are attempting to find ways of detecting whether something is real or made up. In this paper, we look at a database of news articles that have been classified as either real or fake and apply machine learning to automatically determine whether an article is deliberately misleading. Algorithms have been developed to make judgements, classify articles in the database, and judge new articles based on learned knowledge. The model combines multiple factors that may raise or lower confidence in an article being legitimate or illegitimate and provides a single confidence metric. This paper presents the development of these algorithms for assessing articles. It discusses the efficacy of this approach and compares it to other classification approaches. It then presents the results of using the system to classify numerous articles and discusses whether the system's accuracy is sufficient for multiple applications. Finally, it discusses next steps in the fake news detection project and how these algorithms fit within them.
Variational adversarial generative models for information fusion with applications to cross-domain causal reasoning, hypothesis construction and identity learning (Conference Presentation)
Many problems in data fusion require reasoning about heterogeneous data obtained from different sources (news media, first-hand reporters, etc.) and containing diverse modalities (image, video, text, audio, etc.). Construction of situation hypotheses, prediction of future events or inference of event causes, identification of the actors, and forensic analyses require automated decision aids that can provide fast and reliable extraction of relevant information and association of the sensory inputs. While humans can effectively fuse multiple sensory signals, machine learning-based methods cannot yet effectively replicate these cognitive processes. The challenges are due to the presence of a heterogeneity gap in multi-modal, multi-source data, which creates inconsistent distributions and representations of the input signals. Researchers have attempted to solve this problem by modeling explicit cross-modal correlation and learning common joint representations of the heterogeneous data. However, only marginal successes have been achieved, and domains have so far been limited to imagery/video-to-text association with no promise of generalizing to arbitrary knowledge fusion. In this paper, we explore the application of three recent modeling techniques to multi-source, multi-modal data fusion: adversarial generative modeling, variational inference, and inverse reinforcement learning. Generative adversarial networks (GANs) are capable of estimating a generative model through an adversarial training process, in which the component that learns the generative data distribution competes against a discriminative component that attempts to classify the generated data. GAN-learned distributions are capable of producing high-quality data samples, which is essential for producing meaningful cross-modality association decisions.
First, we use GANs for modeling the joint distribution over the heterogeneous data of different sources and modalities, learning the common representation and improving the cross-modal correlation estimates. Our approach is based on generative modeling of explicit inter-modality correlation and intra-modality reconstruction, while the discriminative component judges the quality of generated samples both within the same modality and between different modalities. Second, we use variational inference to add classification reasoning to our model, making our solution capable of producing both high-quality cross-modal correlation decisions and classifications of the fused inputs as specific events or activities. Variational inference provides the approximation necessary for reasoning over large-scale data by decomposing automated maximum likelihood estimation into perception and control, and by allowing the learning of disentangled representations that are essential for generalizing across different domains. Finally, we use concepts from inverse reinforcement learning to update the parameters of the common joint multi-modal representation. We conclude this paper by studying the application of the proposed model to two use cases: construction of hypotheses and patterns of life from multimedia data whose modalities contain text and imagery artifacts, and distributed decision-making in a geospatial environment where the input data can come from different overlapping and potentially conflicting observations by distinct observers, and contain information at different conceptual levels, such as entity movement, skills, goals, and reactions.
Archaeological dating using a data fusion approach
A new approach for dating archaeological sites is described. The method is inspired by Hapgood’s hypothesis that patterns of glaciation and ice ages can be explained by shifts in the geographic location of the North Pole. We have identified over fifty archaeological sites throughout the world that could have once been aligned to north (i.e., to one of these past poles) when the sites were first established but are now misaligned due to subsequent pole shifts. An algorithm is described that fuses the location and orientation of these sites with Hapgood’s original climate-dated pole locations to infer the date of construction of the associated sites. The results suggest that these sites may be far older than is currently thought.
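The fusion step compares each site's structural azimuth with the bearing from the site toward a hypothesized former pole location. Taking the paper's premise as given, the geometric core can be sketched as follows; the site coordinates, azimuth, and pole position below are illustrative, not values from the paper.

```python
import math

def initial_bearing(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing in degrees from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

def misalignment(site_azimuth, site_lat, site_lon, pole_lat, pole_lon):
    """Smallest angular difference between a site's measured azimuth
    and the bearing toward a candidate (possibly former) pole."""
    b = initial_bearing(site_lat, site_lon, pole_lat, pole_lon)
    d = abs(site_azimuth - b) % 360.0
    return min(d, 360.0 - d)

# Illustrative: a site at (30 N, 31 E) whose main axis points 10 degrees
# east of true north, checked against the current geographic pole.
print(misalignment(10.0, 30.0, 31.0, 90.0, 0.0))
```

A pole candidate that drives this misalignment toward zero across many sites would, under the paper's hypothesis, associate those sites with the corresponding climate-dated pole epoch.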
Signal and Image Processing, and Information Fusion Applications I
icon_mobile_dropdown
ULearn: understanding and reacting to student frustration using deep learning, mobile vision and NLP
ULearn is a system that uses deep learning, computer vision and NLP to assist students with the task of web-based learning. ULearn’s goal is to detect when the student is experiencing higher levels of frustration and then present meaningful alternative content. The ULearn app features a web browser; while the intention is to have ULearn assist students in learning scenarios, it is equally applicable to other web-based tasks. While the user is browsing, ULearn monitors them using the front-facing camera, and when negative emotions are detected the user is presented with a set of “tips”. The first step in ULearn is to perform face detection, which returns an ROI that is fed into an emotion detection system. A deep-learning CNN is used to perform the emotion detection, yielding one of anger, fear, disgust, surprise, neutral, or happy. If a significant negative emotion is detected, ULearn generates a set of alternative content called “tips”: a set of links to web pages with content similar to the page currently being viewed. These links can be found by scraping the currently viewed web page for content that is used directly in a search, or by first passing this information to an NLP stage. The NLP stage gives the saliency of the most prominent entities in the current web page content. Real test results are given, and the successes and challenges faced by ULearn are presented along with future avenues of work.
Drone based user and heading detection using deep learning and stereo vision
The problem of assisting a low-vision person with environment awareness using a drone is addressed. Specifically, we tackle the first-stage task of detecting the user and their heading, which does not require any user-adaptive training. The modalities of 3D and 2D vision on a drone are compared for this task. 3D data is provided by a stereo sensor mounted on the drone that communicates over RF with a mobile-device-based Android application. For the task of localization, a Single Shot MultiBox Detector is utilized. Networks with different input modalities and related structures are developed, including a 2D-only network and a 3D+2D fused network. The performance of these networks is compared and the results discussed. In addition, a comparison of retrained networks versus training from scratch is made. In all cases, approximately 34,000 user heading samples were collected for training. Real data from outdoor drone flights that communicate with the Android-based application are shown. Detecting both the user in the scene and their heading is an important first step for a drone-based system that helps low-vision persons with environment awareness. The successes and challenges faced are presented along with future avenues of work.
Deep learning on hyperspectral data to obtain water properties and bottom depths
Kristen Nock, Elizabeth Gilmour, Paul Elmore, et al.
Developing accurate methods to determine bathymetry, bottom type, and water column optical properties from hyperspectral imagery is an ongoing scientific problem. Recent advances in deep learning have made convolutional neural networks (CNNs) a popular method for classification and regression on complex datasets. In this paper, we explore the use of CNNs to extract water depth, bottom type, and inherent optical properties (IOPs) from hyperspectral imagery (HSI) of water. We compare the CNN results to other machine learning algorithms: k-nearest-neighbors (KNN), stochastic gradient descent (SGD), random forests (RF), and extremely randomized trees (ET). This work is an inverse problem in which we seek to find the water properties that impact the reflectance and hence the collected HSI. The data include optically shallow water, in which the bottom can be seen, and optically deep water, in which the bottom cannot be seen and does not affect the reflectance. The scalar optical properties we find through regression are chlorophyll (CHL), colored dissolved organic matter (CDOM), and total suspended sediments (TSS). For the case of optically shallow water, we classify the bottom type among 114 different substrates. The results demonstrate that for finding water depth, bottom type, and IOPs in the case of optically shallow water, the CNN has better performance than the other machine learning methods. For regression of the IOPs in optically deep water, the extremely randomized trees method outperforms the CNN. We further investigate the mechanisms behind these results and discuss hyperparameter tuning strategies that may improve deep learning accuracy.
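Of the baseline regressors compared, k-nearest-neighbors is the simplest to sketch. The fragment below is a generic illustration on invented two-band "spectra", not the configuration or data used in the study.

```python
import math

def knn_regress(train_X, train_y, x, k=3):
    """Predict a scalar water property (e.g., CHL) for spectrum x as the
    mean of the k nearest training spectra by Euclidean distance."""
    nearest = sorted(
        (math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y)
    )[:k]
    return sum(y for _, y in nearest) / k

# Invented 2-band spectra whose property grows linearly with reflectance.
train_X = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
train_y = [0.0, 1.0, 2.0, 3.0]
print(knn_regress(train_X, train_y, (1.1, 1.1), k=3))
```

In the study the inputs are full hyperspectral pixels (hundreds of bands), but the mechanism, averaging the property values of the spectrally closest training samples, is unchanged.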
Characterizing hyperspectral signatures of human faces in the shortwave infrared spectrum
Robert J. Goldman, Minas Benyamin, Matthew Thielke, et al.
Hyperspectral imaging provides extensive spectral reflectance information useful for material classification and discrimination not available with conventional broadband imaging. In this work, we first seek to characterize the hyperspectral signature of human faces in the shortwave infrared (SWIR) band. A hyperspectral SWIR face dataset of 100 subjects was collected as part of this study. Regions of interest (ROI) were defined for each subject, and the mean and variance of each ROI were computed. The results show that hyperspectral signatures are similar between male and female subjects for the cheek, forehead, and hair ROIs. Furthermore, this study investigated whether the hyperspectral face signatures from the ROIs contained discriminative information for gender classification. We implemented and trained five different classifiers for gender classification. Results from the machine learning experiments indicate that hyperspectral facial signatures in the SWIR band are only weakly discriminative with respect to gender.
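The per-ROI statistics step is straightforward: for each spectral band, collect the pixel values inside the ROI and compute their mean and variance. A minimal sketch on a synthetic two-band cube (all dimensions and values illustrative):

```python
from statistics import fmean, pvariance

def roi_band_stats(cube, roi):
    """Per-band mean and population variance over an ROI.
    cube: cube[band][row][col]; roi: list of (row, col) pixel coordinates."""
    stats = []
    for band in cube:
        values = [band[r][c] for r, c in roi]
        stats.append((fmean(values), pvariance(values)))
    return stats

# Synthetic 2-band, 3x3 cube and a 2-pixel ROI.
cube = [
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]],
    [[1.0, 1.1, 1.2], [1.3, 1.4, 1.5], [1.6, 1.7, 1.8]],
]
roi = [(0, 0), (1, 1)]
print(roi_band_stats(cube, roi))
```

Plotting the per-band means against wavelength gives the ROI's mean signature, and the per-band variances indicate how stable that signature is across the ROI.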
Comparison of neural network classifiers for automatic target recognition
Mark Carlotto, Mark Nebrich, David Ramirez
We consider a challenge problem involving the automatic detection of large commercial vehicles such as trucks, buses, and tractor-trailers in Quickbird EO pan imagery. Three target classifiers are evaluated: a “bagged” perceptron algorithm (BPA) that uses an ensemble method known as bootstrap aggregation to increase classification performance, a convolutional neural network (CNN) implemented using the MobileNet architecture in TensorFlow, and a memory-based classifier (MBC), which also uses bagging to increase performance. As expected, the CNN significantly outperformed the BPA. Surprisingly, the performance of the MBC was only slightly below that of the CNN. We discuss these results and their implications for this and other similar applications.
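Bootstrap aggregation, the ensemble method behind both the BPA and the MBC, can be sketched generically. This toy version (not the authors' implementation) trains perceptrons on bootstrap resamples of the training set and combines them by majority vote:

```python
import random

def train_perceptron(data, epochs=20, lr=0.1):
    """Train one perceptron on (features, label) pairs with labels in {-1, +1}."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            s = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * s <= 0:  # misclassified (or on the boundary): update
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def bagged_predict(models, x):
    """Majority vote over the ensemble (the 'aggregation' in bagging)."""
    votes = sum(
        1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
        for w, b in models
    )
    return 1 if votes > 0 else -1

def bagging(data, n_models=9, seed=0):
    """Train each base learner on a bootstrap resample of the data."""
    rng = random.Random(seed)
    return [
        train_perceptron([rng.choice(data) for _ in range(len(data))])
        for _ in range(n_models)
    ]

# Linearly separable toy data: label +1 iff x0 + x1 > 1.
data = [((0.0, 0.0), -1), ((0.2, 0.3), -1), ((1.0, 1.0), 1), ((0.9, 0.8), 1)]
models = bagging(data)
print(bagged_predict(models, (1.1, 0.9)))
```

The variance reduction from voting over resampled learners is what lifts the bagged perceptron and memory-based classifier above their single-model versions.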
Signal and Image Processing, and Information Fusion Applications II
icon_mobile_dropdown
Deep learning architecture advancements for accurate and robust image registration
Derek J. Walvoord, Doug W. Couwenhoven
Registration of image collections and video sequences is a critical component in algorithms designed to extract actionable intelligence from remotely sensed data. While methodologies for registration continue to evolve, the accuracy of alignment remains dependent on how well the approach tolerates changes in capture geometry, sensor characteristics, and scene content. Differences in imaging modality and field-of-view present additional challenges. Registration techniques have progressed from simple, global correlation-based algorithms, to higher-order model fitting using salient image features, to two-stage approaches leveraging high-fidelity sensor geometry, to new methods that exploit high-performance computing and convolutional neural networks (ConvNets). The latter offers important advantages by removing model assumptions and learning feature extraction directly through the minimization of a registration cost function. Deep learning approaches to image registration are still relatively unexplored for overhead imaging, and their ability to accommodate a large problem domain offers potential for several new developments. This work presents a new network architecture that improves accuracy and generalization capabilities over our modality-agnostic deep learning approach to registration that recently advanced the state of the art. A thoroughly tested ConvNet pyramid remains the core of our network approach, and has been optimized for registration and generalized to begin addressing derivative applications such as mosaic generation. Further modifications, such as objective function masking and reduced interpolation, have also been implemented to improve the overall registration process. As before, the trained network ingests image frames, applies a vector field, and returns a version of the input image that has been warped to the reference. 
Qualitative and quantitative performance of the new architecture is evaluated using several overhead still and full-motion video (FMV) data sets.
Radar target identification using HRRP-based features and Extreme Learning Machines
Radar target recognition performance using extreme learning machines (ELMs) is examined in this study and compared with optimal classifiers. Classification under various adverse scenarios is examined, involving additive noise, azimuth ambiguity, azimuth mismatch between library and unknown target, presence of extraneous scatterers, signature occlusion, absolute phase knowledge, etc. ELMs can be trained expeditiously and are well suited for radar target recognition, particularly with large training databases. The effectiveness of ELMs (single-layer or multilayered) as a target recognition tool is the focus of this study, which relies on real radar data collected in a compact range environment using a stepped-frequency system.
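The fast training that makes ELMs attractive for large databases comes from their closed form: the hidden layer weights are random and fixed, and only the output layer is solved by least squares. A minimal single-hidden-layer sketch on synthetic data (not the compact-range radar set):

```python
import numpy as np

def elm_train(X, y, n_hidden=50, seed=0):
    """Single-hidden-layer ELM: random input weights, least-squares output."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                        # random nonlinear features
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # only this layer is trained
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Toy "range profiles": class depends on which half carries more energy.
rng = np.random.default_rng(1)
X = rng.random((200, 8))
y = (X[:, :4].sum(axis=1) > X[:, 4:].sum(axis=1)).astype(float)
model = elm_train(X, y)
acc = ((elm_predict(model, X) > 0.5) == (y > 0.5)).mean()
print(f"training accuracy: {acc:.2f}")
```

Because training is one linear solve, retraining on an enlarged target library is cheap, which is the property the abstract highlights.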
A comparative study of conventional and deep learning approaches for demosaicing Mastcam images
The Bayer pattern is a low-cost approach to generating RGB images in commercial digital cameras. In NASA's mast cameras (Mastcams) onboard the Mars rover Curiosity, a Bayer pattern has also been used in capturing the RGB bands. It is well known that debayering (also known as demosaicing) introduces color and zipper artifacts. Currently, NASA is using a demosaicing algorithm developed in the early 2000s. It is probably the right time to assess some state-of-the-art algorithms and recommend a more recent and powerful approach to NASA for its future missions. In this paper, we present the results of a comparative study on the use of conventional and deep learning algorithms for demosaicing Mastcam images. Due to the lack of ground truth, subjective evaluation has been used in our study.
Signal and Image Processing, and Information Fusion Applications III
icon_mobile_dropdown
Perceptually lossless compression of Mastcam images with error recovery
We present a high performance image compression framework for Mastcam images in the Mars rover Curiosity. First, we aim to achieve perceptually lossless compression. Four well-known image codecs from the literature were evaluated, and their performance was assessed using four well-known performance metrics. Second, we investigated the impact of error concealment algorithms for handling corrupted pixels due to transmission errors in communication channels. Extensive experiments using actual Mastcam images have been performed to demonstrate the proposed framework.
Fusion of landsat and worldview images
Pansharpened Landsat images have 15 m spatial resolution with 16-day revisit periods. On the other hand, Worldview images have 0.5 m resolution after pansharpening, but their revisit times are uncertain. We present some preliminary results for a challenging image fusion problem that fuses Landsat and Worldview (WV) images to yield a high temporal resolution image sequence at the spatial resolution of WV images. Since the ratio of spatial resolutions between Landsat and Worldview is 30 to 1, our preliminary results are mixed: objective performance metrics such as peak signal-to-noise ratio (PSNR) and correlation coefficient (CC) sometimes showed good fusion performance, but at other times showed poor results. This indicates that more fusion research is still needed in this niche application.
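The two objective metrics mentioned, PSNR and the correlation coefficient, are easy to state precisely. A stand-alone sketch on illustrative pixel values (flattened to 1-D lists for brevity):

```python
import math
from statistics import fmean, pstdev

def psnr(reference, fused, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = fmean((r - f) ** 2 for r, f in zip(reference, fused))
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def correlation(a, b):
    """Pearson correlation coefficient (CC) between two pixel lists."""
    ma, mb = fmean(a), fmean(b)
    cov = fmean((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / (pstdev(a) * pstdev(b))

ref = [10, 20, 30, 40, 50]      # illustrative reference pixels
fused = [12, 19, 33, 41, 48]    # illustrative fused-image pixels
print(psnr(ref, fused), correlation(ref, fused))
```

High PSNR and CC near 1 indicate the fused sequence tracks the reference; the mixed results in the abstract correspond to these metrics fluctuating across scenes.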
Ground object detection in worldview images
Ground object detection is important for many civilian applications. Counting the number of cars in parking lots can provide very useful information to shop owners. Tent detection and counting can help humanitarian agencies assess and plan logistics to help refugees. In this paper, we present some preliminary results on ground object detection using high resolution Worldview images. Our approach is simple and semi-automated. A user first manually selects some object signatures from a given image and builds a signature library. We then use the spectral angle mapper (SAM) to automatically search for objects. Finally, all the objects are counted for statistical data collection. We have applied our approach to tent detection for a refugee camp near the Syrian-Jordanian border. Both multispectral Worldview images with eight bands at 2 m resolution and pansharpened images with four bands at 0.5 m resolution were used. Moreover, synthetic hyperspectral (HS) images derived from the above multispectral (MS) images were also used for object detection. Receiver operating characteristic (ROC) curves as well as detection maps were used in all of our studies.
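The SAM search step scores each pixel by the angle between its spectrum and a library signature, which makes it insensitive to overall illumination scaling. A minimal sketch with invented four-band signatures:

```python
import math

def spectral_angle(pixel, reference):
    """Spectral angle (radians) between a pixel spectrum and a library signature."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    n_pix = math.sqrt(sum(p * p for p in pixel))
    n_ref = math.sqrt(sum(r * r for r in reference))
    # Clamp to guard against floating-point values just outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / (n_pix * n_ref))))

tent = [0.30, 0.45, 0.60, 0.55]    # illustrative library signature
pix_a = [0.33, 0.44, 0.62, 0.52]   # tent-like pixel
pix_b = [0.70, 0.20, 0.10, 0.05]   # dissimilar background pixel
print(spectral_angle(pix_a, tent), spectral_angle(pix_b, tent))
```

Sweeping the angle threshold that separates detections from background is what traces out the ROC curves reported in the study.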
Challenges in object detection in above-water imagery
Many existing methods of object detection, including edge detection, blob detection, and background subtraction (implemented in libraries such as OpenCV), have proven enormously successful when applied to many types of video datasets. However, detecting objects over water presents challenges that are unique and not easily accommodated by pre-existing algorithms available in popular image processing libraries. In this paper, existing approaches are briefly reviewed, and the challenges encountered in above-water video datasets are highlighted. A recently proposed approach to object detection in radar images, a novel thresholding approach based on pixel-intensity statistics, is then reviewed. In this paper, this approach is successfully applied to EO/IR datasets as well, extending the implementation to other types of image datasets.
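A pixel-intensity-statistic threshold in the spirit of the reviewed approach (the exact statistic in the paper may differ) can be sketched as flagging pixels that exceed the frame mean by k standard deviations:

```python
from statistics import fmean, pstdev

def detect(pixels, k=3.0):
    """Flag pixels brighter than mean + k standard deviations of the frame,
    a simple pixel-intensity-statistic threshold."""
    mu, sigma = fmean(pixels), pstdev(pixels)
    threshold = mu + k * sigma
    return [i for i, p in enumerate(pixels) if p > threshold]

# Mostly water clutter around gray level 50 with one bright object at index 7.
frame = [48, 52, 50, 49, 51, 50, 47, 220, 53, 50, 49, 51]
print(detect(frame))
```

Because the threshold adapts to each frame's own statistics, it tolerates the slowly varying glint and wave clutter that defeat fixed-threshold and background-subtraction methods over water.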
Calibration and synchronization of geospatial metadata for aerial sensor systems
B. Erdnüß
Given aerial imagery showing an object of interest, one often asks for the exact location of the object in the world. There are several approaches to answering this question, but one of the easiest and computationally cheapest is to use geospatial metadata, such as that from a global navigation satellite system (GNSS) and inertial navigation system (INS), together with a digital elevation model (DEM) of the observed area to estimate the target location. The quality of the result depends greatly on the precision of the metadata and the accuracy of the synchronization of the metadata to the individual imagery frames. This paper discusses how to quantitatively describe the accuracy of the metadata of an aerial motion imagery system. The aim is to have this information available for information fusion that improves the object location presumed from the metadata with information from image recognition algorithms.
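The geopositioning that the metadata enables can be sketched for the simplest case of flat terrain; a real system intersects the camera ray with the DEM rather than a constant-height plane, and all numbers below are illustrative.

```python
def ground_intersection(cam_pos, look_dir, ground_height=0.0):
    """Intersect a camera ray with a flat terrain plane z = ground_height.
    cam_pos: (x, y, z) in a local metric frame; look_dir: (dx, dy, dz), dz < 0.
    A DEM lookup would replace the constant ground_height in practice."""
    x, y, z = cam_pos
    dx, dy, dz = look_dir
    if dz >= 0:
        raise ValueError("look direction must point downward")
    t = (ground_height - z) / dz   # ray parameter at the terrain plane
    return (x + t * dx, y + t * dy)

# Camera 1000 m above flat ground, looking 45 degrees forward and down.
print(ground_intersection((0.0, 0.0, 1000.0), (1.0, 0.0, -1.0)))
```

Errors in the GNSS/INS metadata perturb `cam_pos` and `look_dir`, and a timing offset pairs a frame with the wrong pose entirely, which is why the paper's quantitative accuracy description matters for the fused location estimate.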
Signal and Image Processing, and Information Fusion Applications IV
icon_mobile_dropdown
Extrinsic parameter calibration of 2D LiDAR-camera using edge matching and removal of infrared cut filter
Recently, autonomous driving based on multiple sensors has been studied actively in the fields of automobiles and unmanned robots. Extrinsic parameter calibration is the first step in integrating a camera with Light Detection And Ranging (LiDAR). This paper proposes an extrinsic parameter calibration method using camera images and single 2D LiDAR points. The removal of the infrared cut filter makes the line of laser scan points visible in the camera image. The scan line of the laser points on the calibration target is detected using edge matching in the camera images, and the laser points are mapped to image coordinates with an initial value of the extrinsic parameters. We estimate the extrinsic parameters using edge matching and a top-down method. The proposed method is verified by experiments at varying distances to the target.
Radar applications of quantum squeezing
David Luong, Bhashyam Balaji
Recent experimental results have demonstrated gains in sensing capability using novel possibilities offered by quantum mechanics. In particular, a prototype radar which uses quantum techniques to enhance detection ability has been built in a laboratory, showing that quantum radars at RF frequencies are feasible. This prototype is called a quantum two-mode squeezing radar (QTMS radar). In this paper, we use the QTMS radar as a springboard to review the concept of quantum squeezing. We find that Josephson parametric amplifiers (JPAs), one of which was used in the QTMS radar prototype, can be employed to produce two-mode and one-mode squeezed states at RF frequencies. We then briefly discuss some of the possible applications of such states to the field of radar engineering.
An adaptive smooth variable structure filter based on the static multiple model strategy
Andrew Lee, S. Andrew Gadsden, Stephen A. Wilkerson
Estimation theory is an important field in mechanical and electrical engineering, comprising strategies used to predict, estimate, or smooth important system states and parameters. The most popular and well-studied estimation strategy was developed over 60 years ago and is referred to as the Kalman filter (KF). The KF yields the optimal solution in terms of estimation error for linear, well-known systems. Other variants of the KF have been developed to handle modeling uncertainties, non-Gaussian noise, and nonlinear systems and measurements. Although KF-based methods typically work well, they lack robustness to uncertainties and external disturbances, which are prevalent in signal processing and target tracking problems. The smooth variable structure filter (SVSF) was introduced in an effort to provide a more robust estimation strategy. To improve the robustness and filtering strategy further, this paper introduces an adaptive form of the SVSF based on the static multiple model strategy.
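For reference, the KF recursion that the SVSF builds upon reduces, in the scalar random-walk case, to a few lines. This generic sketch is not the paper's filter; the noise values are illustrative.

```python
def kalman_step(x, P, z, q=0.01, r=1.0):
    """One predict/update cycle of a scalar Kalman filter
    (random-walk state model; q = process noise, r = measurement noise)."""
    # Predict: a random walk expects the state to stay put, uncertainty grows.
    x_pred = x
    P_pred = P + q
    # Update: blend prediction and measurement via the Kalman gain.
    K = P_pred / (P_pred + r)
    x_new = x_pred + K * (z - x_pred)
    P_new = (1 - K) * P_pred
    return x_new, P_new

# Filter noisy measurements of a constant true value of 5.0.
x, P = 0.0, 10.0
for z in [5.3, 4.8, 5.1, 4.9, 5.2, 5.0]:
    x, P = kalman_step(x, P, z)
print(round(x, 2), round(P, 3))
```

The optimality of the gain K rests on the model being correct; the SVSF trades some of that optimality for robustness by replacing K with a switching gain bounded by a smoothing boundary layer.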
Quantum radar, quantum networks, not-so-quantum hackers
David Luong, Bhashyam Balaji
Many quantum technologies, such as quantum computers, rely on a phenomenon called entanglement. One reason why quantum networks are being studied is because they can distribute entanglement to their users. In this paper, we describe how quantum radars, particularly the recently-developed quantum two-mode squeezing radar (QTMS radar), can be used with quantum networks. On a related note, we also point out how QTMS radar can be vulnerable to interception if an adversary has access to the measurement record that the radar uses to distinguish signal from noise.
A secure adaptive beamforming mechanism exploiting deafness in directional beamforming MANET
Vincenzo Inzillo, Abdon Serianni Sr., Alfonso Ariza Quintana
Ensuring a high level of security in wireless networks is one of the most important tasks addressed by IEEE 802.1x standards for modern wireless network environments such as Mobile Ad Hoc Networks (MANETs). In this field, the use of directional antennas can help reduce the negative effects of the majority of network security attacks, such as eavesdropping and DoS (Denial of Service). However, classic directional antennas are often not sufficient to mitigate these kinds of issues because of their limited hardware architecture. One of the most significant issues in directional MAC (Medium Access Control) communications is the deafness problem; in this paper, we present a novel approach that exploits the deafness issue and beamforming with the goal of mitigating the main security threats, through a node insulation mechanism that improves the overall network security level in directional MANETs.
The design of a portable fusion system of an uncooled infrared bolometer and a CCD camera
Rongguo Fu, Zhenwei Du, Ming Zhou, et al.
A portable image fusion system for infrared and visible-light images is designed. The device is composed of image collection, processing, and display parts. The image collection part includes an uncooled infrared bolometer and a CCD camera. The response wavelength range of the uncooled infrared bolometer is 8-14 μm and that of the CCD is 0.2-1.1 μm. The image processing part is built around a DSP642, a TVP5150, an SAA7121H, SDRAM storage, and other electronic components, while the display part is a computer or LCD. The focal length of the uncooled infrared bolometer is 50 mm and that of the CCD is 8-50 mm to match the bolometer. The optical axes of the two sensors are carefully rectified to be parallel. A multi-dimensional mechanical structure is designed specifically to hold the uncooled infrared bolometer and CCD camera, so that the parallelism of the axes can be adjusted. The rectification is carried out with an optical system of 3 m focal length, whose resolution angle is 0.05 mrad. The bolometer and CCD images are decoded in the TVP5150 input circuit into BT.656 digital signals, and the decoded video is then transferred to the DSP642 circuit. There the bolometer and CCD images are fused using a weighted-average algorithm, and the output image is encoded to PAL format to maintain real-time response; more complex fusion algorithms such as the Laplacian pyramid or wavelets are not yet implemented on the DSP642. Theoretically, the fused image contains more information than a single infrared or CCD image. To verify the fusion result, a group of experiments was carried out to detect a man behind bushes. By comparing the infrared, visible, and fused images, a preliminary conclusion is reached that the weight coefficients influence the fusion effect differently in different circumstances.
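The weighted-average fusion rule running on the DSP642 is simple enough to sketch directly; the gray levels and the weight below are illustrative.

```python
def fuse(ir, visible, alpha=0.5):
    """Weighted-average fusion of co-registered IR and visible frames
    (2D lists of gray levels); alpha weights the IR contribution."""
    return [
        [alpha * i + (1.0 - alpha) * v for i, v in zip(ir_row, vis_row)]
        for ir_row, vis_row in zip(ir, visible)
    ]

ir = [[200, 200], [10, 10]]    # warm target bright in IR
vis = [[20, 20], [90, 90]]     # background detail in visible
print(fuse(ir, vis, alpha=0.6))
```

The single weight alpha is the coefficient whose circumstance-dependent influence the experiments examine; multiresolution schemes such as the Laplacian pyramid replace it with per-scale weights.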
Neural connectivity analysis by using 3D TMS-EEG with source localization and sliding window coherence techniques
Deepa Gupta, Xiaoming Du, Elliot Hong, et al.
Studying electroencephalography (EEG) in response to transcranial magnetic stimulation (TMS) is gaining popularity for investigating the dynamics of the brain's complex neural architecture. For example, the primary motor cortex (M1) executes voluntary movements through complex connections with other associated subnetworks. To understand these connections better, we analyzed the EEG signal response to TMS at left M1 from schizophrenia patients and healthy controls, and contrasted it with resting-state EEG recordings. After removing artifacts from the EEG, we conducted 2D-to-3D sLORETA conversion, a well-established source localization method, to estimate the signal strength of 68 source dipoles or cortical regions inside the brain. Next, we studied dynamic connectivity by computing the time-evolving spatial coherence of 2278 (=68*(68-1)/2) pairs of cortical regions, using a sliding window technique with a 200 ms window size and 20 ms shift over 1 s of data. Pairs with consistent coherence (coherence > 0.8 during 200+ sliding windows of patients and controls combined) were chosen for identifying stable networks. For example, we found that during the resting state, the precuneus was steadily coherent with the middle and superior temporal gyri in the left hemisphere in both patients and controls. Their connectivity pattern over the sliding windows differed significantly between patients and controls (p-value < 0.05). For M1, the same was true for two other coherent pairs, namely the supramarginal gyrus with the lateral occipital gyrus in the right hemisphere and the medial orbitofrontal gyrus with the fusiform gyrus in the left hemisphere. The TMS-EEG dynamic connectivity results can help differentiate patients from healthy subjects and also help to better understand brain architecture and mechanisms.
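The windowing arithmetic in the study is easy to reproduce. The sketch below uses Pearson correlation as a simple stand-in for the spectral coherence actually computed, and checks the 2278-pair count; the toy signals and the 1 kHz sampling rate are illustrative.

```python
from math import sqrt
from itertools import combinations

def sliding_correlation(x, y, win, shift):
    """Pearson correlation of two equal-length series over sliding windows,
    a simple stand-in for the spectral coherence used in the study."""
    def pearson(a, b):
        n = len(a)
        ma, mb = sum(a) / n, sum(b) / n
        cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
        va = sum((u - ma) ** 2 for u in a)
        vb = sum((v - mb) ** 2 for v in b)
        return cov / sqrt(va * vb)
    return [
        pearson(x[s:s + win], y[s:s + win])
        for s in range(0, len(x) - win + 1, shift)
    ]

# 68 cortical regions give 68*67/2 = 2278 region pairs.
n_pairs = len(list(combinations(range(68), 2)))

# 1 s of data at 1 kHz: 200-sample (200 ms) windows shifted by 20 samples.
t = [i / 1000.0 for i in range(1000)]
x = [ti * 2.0 for ti in t]          # perfectly correlated toy signals
y = [ti * 3.0 + 1.0 for ti in t]
coh = sliding_correlation(x, y, win=200, shift=20)
print(n_pairs, len(coh), round(min(coh), 3))
```

Tracking how such a per-window score evolves across the 41 windows per second is what lets the study compare connectivity trajectories between patients and controls rather than single static values.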
Poster Session
icon_mobile_dropdown
A novel maximum likelihood target detection with antenna selection for active phase array radar system
Kuei-Jang Huang, Jhong-Wei Siao, Der-Hong Ting, et al.
Active phased array radar is currently the development trend in radar systems. It consists of thousands of separate transmit/receive (T/R) modules with power transmission capability to achieve phase and amplitude modulation. For a radar system, target detection is always the most important issue. Basically, some antennas perform the tracking task, while the remaining antennas conduct the search function after the radar system detects the targets. With such multiple T/R modules, or in other words a multiple-input multiple-output (MIMO) scheme, the radar gains a larger gain advantage to increase its search region or capacity. However, as the number of antennas adopted by the active phased array radar increases, the required hardware and computational complexity also become a serious concern for practical realizations of the radar system. In the literature, one of the solutions for avoiding this drawback is to select the antennas properly and effectively for target detection and tracking. In this paper, we propose a novel target detection method based on the maximum likelihood (ML) criterion to predict and locate targets correctly and optimally for an active phased array radar system. In addition, an antenna selection method is also proposed and combined with the target detection to reduce the required hardware and computational complexity. The simulation results show that the proposed methods not only reduce hardware and computational complexity, but also maintain the performance of the radar system.
A multiple model adaptive SVSF-KF estimation strategy
Jacob M. Goodman, Stephen A. Wilkerson, Charles Eggleton, et al.
State estimation strategies play a critical role in obtaining accurate information about the state of dynamic systems as they evolve. Such information can be important on its own and critical for precise and predictable control of such systems. The Kalman filter (KF) is a classic algorithm and among the most powerful tools in state estimation. The Kalman filter, however, can be sensitive to modeling uncertainty and sudden changes in system dynamics. The Smooth Variable Structure Filter (SVSF) is a relatively new estimation strategy that operates on variable structure concepts. In general, the SVSF has the advantage that it can be quite robust to modeling uncertainty and sudden fault conditions. Recent advancements to the SVSF, such as the addition of a covariance formulation and the derivation of a time-varying smoothing boundary layer (VBL), have allowed for combined SVSF-KF strategies. In a typical SVSF-KF approach, the VBL is used to detect the presence of a system fault and switch from the more optimal KF gain to the more robust SVSF gain. While this approach has proven effective in several cases, there are circumstances where the VBL will fail to indicate the presence of an ongoing fault. A new form of the SVSF-KF is proposed, based on the framework of the Multiple Model Adaptive Estimator.
Measuring and monitoring the QoS and QoE in software defined networking environments
Jan Rozhon, Filip Rezac, Jakub Safarik, et al.
Software Defined Networks (SDN) are gaining traction with the expanding use of complex data center infrastructures that accommodate the increasing demand for computational power related to much more feature-rich web applications and the common use of deep learning algorithms. The increased set of features being used in applications is reflected in increased demands on network architectures, starting with higher network throughput, through the need for complex high-availability schemes, and ending with near-perfect delay/loss communication characteristics. This increased demand has resulted in the need for more flexible network architectures, leading to a major change in the networking paradigm and the related shift from traditional networks to software defined ones. The quality of service (QoS) in networks and the quality of experience (QoE) of end-user services are major topics of interest in the networking community, resulting in several approaches implemented in networks to ensure resource reservation or traffic prioritization. In this paper, we propose a way to propagate an arbitrary qualitative parameter in OpenFlow messages that allows for easy monitoring of the quality of service and quality of experience. Moreover, we focus on the measurement of speech quality and the consecutive propagation of this information through the SDN network to allow SDN controllers and the OpenFlow-capable switches controlled by them to react to decreasing quality and support the services being carried through the network. The paper describes how the quality is measured, how the information is processed by the controller, and how it is encapsulated in OpenFlow messages. The assumptions are validated in simulations based on the Mininet simulation tool and the Ryu SDN controller. The implications for carried voice quality are discussed as well.