Proceedings Volume 10992

Geospatial Informatics IX


Volume Details

Date Published: 26 July 2019
Contents: 6 Sessions, 17 Papers, 10 Presentations
Conference: SPIE Defense + Commercial Sensing 2019
Volume Number: 10992

Table of Contents

  • Front Matter: Volume 10992
  • Full Motion Video Analytics
  • Environmental and Disaster Analytics
  • Geospatial Informatics Applications I
  • Geospatial Informatics Applications II
  • Poster Session
Front Matter: Volume 10992
This PDF file contains the front matter associated with SPIE Proceedings Volume 10992, including the title page, copyright information, table of contents, and author and committee lists.
Full Motion Video Analytics
Captioning of full motion video from unmanned aerial platforms
In this work, we aim to address the needs of human analysts to consume and exploit data given the proliferation of overhead imaging sensors. We have investigated automatic captioning methods capable of describing and summarizing scenes and activities by providing textual descriptions in natural language for overhead full motion video (FMV). We have integrated methods to provide three types of outputs: (1) summaries of short video clips; (2) semantic maps, where each pixel is labeled with a semantic category; and (3) dense object descriptions that capture object attributes and activities. We show results obtained on the publicly available VIRAT and Aeroscapes datasets.
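As a concrete illustration of output type (1), a minimal encoder-decoder clip captioner is sketched below; the architecture, feature sizes, and vocabulary are illustrative assumptions, not the authors' integrated models.
```python
# A minimal encoder-decoder captioner of the kind integrated for clip
# summaries: a small CNN pools frame features and an LSTM emits words.
# All sizes and the vocabulary are placeholders (assumptions).
import torch
import torch.nn as nn

class ClipCaptioner(nn.Module):
    def __init__(self, vocab=1000, dim=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, dim, 7, stride=4),
                                     nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.embed = nn.Embedding(vocab, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, frames, tokens):
        # frames: (B, T, 3, H, W) video clip; tokens: (B, L) caption so far
        b, t = frames.shape[:2]
        feats = self.encoder(frames.flatten(0, 1)).view(b, t, -1).mean(1)
        h = feats.unsqueeze(0)                     # clip feature seeds the LSTM
        y, _ = self.lstm(self.embed(tokens), (h, torch.zeros_like(h)))
        return self.out(y)                         # next-word logits

logits = ClipCaptioner()(torch.rand(2, 8, 3, 64, 64),
                         torch.randint(0, 1000, (2, 5)))
```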
Towards image and video super-resolution for improved analytics from overhead imagery
In this work, we address the problem of losing details in the overhead remote sensing image acquisition and generation process due to sensor resolution and distance to target by leveraging state-of-the-art deep neural network architectures. The goal is to recover such details by super-resolving the images acquired by overhead imaging sensors in order for human analysts to interpret data more accurately and, consequently, for automated visual exploitation algorithms to be applied more effectively. We have developed a super-resolution framework operating on overhead full motion video (FMV) and still imagery (e.g., satellite images). Our framework consists of a neural network capable of learning the mapping between low and high resolution images in order to produce plausible details about the scene. Our framework combines Generative Adversarial Networks (GANs) and Recurrent Neural Networks (RNNs) to process low resolution signals both spatially and, in the case of FMV, temporally. We have applied the output of our system to several visual perception tasks, including object detection, object tracking, and semantic segmentation. We have also applied our methods to data from different geographical areas, sensors, and even modalities to demonstrate broad and generalized applicability.
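The abstract combines GANs and RNNs but does not spell out the architecture here; the sketch below shows one generic GAN-based single-image super-resolution training step (PixelShuffle upsampling, L1 plus adversarial loss) under assumed sizes and loss weights. The recurrent temporal component used for FMV is omitted.
```python
# Minimal sketch of a GAN-based 4x super-resolution generator step. The
# networks and the 1e-3 adversarial weight are illustrative assumptions,
# not the authors' published architecture.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3 * 16, 3, padding=1),
            nn.PixelShuffle(4),   # rearranges channels into a 4x larger image
        )

    def forward(self, lr):
        return torch.sigmoid(self.net(lr))

disc = nn.Sequential(             # tiny patch discriminator: real/fake logits
    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, 4, stride=2, padding=1),
)

gen, bce = Generator(), nn.BCEWithLogitsLoss()
lr_img = torch.rand(2, 3, 32, 32)      # low-resolution input batch
hr_img = torch.rand(2, 3, 128, 128)    # matching high-resolution targets

sr = gen(lr_img)                       # (2, 3, 128, 128) super-resolved
d_fake = disc(sr)
# Generator loss: pixel fidelity plus an adversarial term (weight assumed).
g_loss = nn.functional.l1_loss(sr, hr_img) + \
         1e-3 * bce(d_fake, torch.ones_like(d_fake))
g_loss.backward()                      # one generator step; D step omitted
```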
Fully convolutional adaptive tracker with real time performance
We present a Fully Convolutional Adaptive Tracker (FCAT) based on a Siamese architecture that operates in real-time and is well suited for tracking from aerial platforms. Real time performance is achieved by using a fully convolutional network to generate a densely sampled response map in a single pass. The network is fine-tuned on the tracked target with an adaptation approach similar to the procedure used to train Discriminative Correlation Filters (DCFs). A key difference between FCAT and DCFs is that FCAT fine-tunes the template feature directly using Stochastic Gradient Descent, while a DCF regresses a correlation filter. The effectiveness of the proposed method is illustrated on surveillance-style videos, where FCAT performs competitively with state-of-the-art visual trackers while maintaining real-time tracking speeds of over 30 frames per second.
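A minimal sketch of the dense, single-pass response map that fully convolutional Siamese trackers produce; the toy backbone and crop sizes below are assumptions, not FCAT's actual network.
```python
# The search region is cross-correlated with a template embedding in a single
# convolution, yielding a densely sampled response map in one pass.
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(           # assumed toy fully convolutional embedding
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1),
)

template = torch.rand(1, 3, 31, 31)    # exemplar crop of the tracked target
search = torch.rand(1, 3, 255, 255)    # larger search region in the next frame

z = backbone(template)                 # template embedding, used as the kernel
x = backbone(search)                   # search-region embedding
response = F.conv2d(x, z)              # dense similarity map, single pass
peak = response.flatten().argmax()     # peak location -> predicted target shift
# FCAT-style adaptation (per the abstract) would make z a learnable
# nn.Parameter and run a few SGD steps against a tracking loss.
```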
Modeling and assessing VNIIRS using in-scene metrics
The Video National Imagery Interpretability Rating Scale (VNIIRS) is a useful standard for quantifying the interpretability of motion imagery. Automated, accurate assessment of VNIIRS would benefit operators by characterizing the potential utility of a video stream. For still, visible-light imagery, the general image quality equation (GIQE) provides a standard model to automatically estimate the NIIRS of an image from sensor parameters, namely the ground sample distance (GSD), the relative edge response (RER), and the signal-to-noise ratio (SNR). Typically, these parameters are associated with a specific sensor, and the metadata correspond to a specific image acquisition. For many tactical video sensors, however, these sensor metadata are not available, and it is necessary to estimate these parameters from information available in the imagery. We present methods for estimating the RER and SNR through analysis of the scene, i.e., the raw pixel data. By estimating the RER and SNR directly from the video data, we can compute accurate VNIIRS estimates for the video. We demonstrate the method on a set of video data.
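For reference, one published form of the GIQE (version 4, Leachtenauer et al., 1997) maps the three parameters the abstract names to a NIIRS estimate; the nominal overshoot and noise-gain defaults below are assumptions.
```python
# General Image Quality Equation, version 4. GSD is in inches (geometric
# mean); H (edge overshoot) and G (noise gain) default to nominal values
# here purely for illustration.
import math

def giqe4(gsd_inches, rer, snr, overshoot=1.0, noise_gain=1.0):
    if rer >= 0.9:
        a, b = 3.32, 1.559
    else:
        a, b = 3.16, 2.817
    return (10.251 - a * math.log10(gsd_inches) + b * math.log10(rer)
            - 0.656 * overshoot - 0.344 * noise_gain / snr)

print(giqe4(gsd_inches=12.0, rer=0.9, snr=50))   # ~5.9 NIIRS at 1 ft GSD
```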
Environmental and Disaster Analytics
Geospatial analytics of Hurricane Florence flooding effects using overhead imagery
Mark W. Roberson, Abigail E. Bell, Laura E. Roberson, et al.
In September 2018, Hurricane Florence struck the southeastern United States, depositing an estimated ten trillion gallons of water on the states of North Carolina and South Carolina. The resulting floodwaters caused the loss of human lives and inflicted tremendous economic costs upon the states. The flooding was particularly damaging to both livestock and crops, and due to the importance of agriculture to the economy of North Carolina, this damage is of special concern. Overhead sensing modalities, including synthetic aperture radar (SAR) and optical imagery, provide tools to study the affected regions using time-varying geospatial analytics. We discuss our work on the analysis of floodwaters as it relates to livestock waste waters and crop health. We describe several collection platforms and sensor geometries in order to examine the performance trade-offs. We process overhead data sets to analyze floodwater surface areas and the normalized difference vegetation index (NDVI) before and after the hurricane to understand the effects of the storm on agriculture in North Carolina.
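The NDVI referenced above has a standard per-pixel definition; a minimal before/after sketch (with synthetic rasters and an assumed change threshold) follows.
```python
# NDVI computed per pixel from red and near-infrared reflectance bands;
# comparing pre- and post-storm NDVI flags vegetation change.
import numpy as np

def ndvi(nir, red, eps=1e-6):
    nir, red = nir.astype(np.float64), red.astype(np.float64)
    return (nir - red) / (nir + red + eps)

# Hypothetical pre-/post-storm rasters; real data would come from the sensors.
pre_nir, pre_red = np.random.rand(512, 512), np.random.rand(512, 512)
post_nir, post_red = np.random.rand(512, 512), np.random.rand(512, 512)

delta = ndvi(post_nir, post_red) - ndvi(pre_nir, pre_red)
damaged = delta < -0.2    # assumed threshold for storm-related vegetation loss
```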
Design of a cloud-based geo-location service in a disaster incident command system (Conference Presentation)
Prasad Calyam, Osunkoya Opeoluwa, Andrew Krall, et al.
In the aftermath of a disaster, it is usually difficult to communicate and coordinate the activities of the various emergency incident responders responsible for triage at multiple disaster incident scenes. Incident scenes are often spread out across a large geo-physical area, and emergency response usually occurs with limited manpower. Thus, the challenge is to rapidly set up a disaster incident command system (ICS) and integrate disaster-wide real-time location and status information across the various staff, patients, and incidents to provide a cohesive picture across the disaster incident scenes. In this paper, we address these concerns through a novel design of a cloud-based distributed geo-location service for a next-generation ICS. We describe software services that integrate wireless mesh elements, geo-location, messaging, incident and responder information management, and video streaming services. These services are designed for deployment in austere physical environments, where existing communication channels may be unavailable due to disaster impact. Our design fits within a hierarchical cloud-fog platform with a suite of visual and geolocation applications that provides a centralized view of the various incidents in a disaster scenario. The platform supports the ability to integrate variants of user interface dashboards and IoT devices (e.g., heads-up displays for real-time two-way communication, virtual beacons to collect contextual geolocation information), as well as response protocols that guide an emergency responder’s actions by providing a context-related checklist to ensure the right procedures are followed. The contextual information includes the present and previous locations of patients and staff, a video stream of a staff member’s camera, the location and details of the various incidents across the disaster scene, and the statuses of the patients. Thus, the platform has the potential to improve situational awareness to prioritize triage resources, reduce the frequency of human error in incident response, leverage manpower effectively at incident sites, and improve overall triage accuracy. We conclude the paper with a demonstration of our implementation that simulates real-time dynamic map marker functionality during management of several incidents and provides benefits for: (a) medical triage, and (b) protest crowd incident management.
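The paper's actual schema is not given here; the sketch below shows one hypothetical shape for the real-time location/status updates such a service might exchange, with all field names assumed.
```python
# Hypothetical location/status update flowing from a responder's device to
# the cloud-fog map service. Field names are assumptions, not the paper's
# actual message schema.
from dataclasses import dataclass, asdict
import json, time

@dataclass
class LocationUpdate:
    entity_id: str     # responder, patient, or incident identifier
    role: str          # "staff" | "patient" | "incident"
    lat: float
    lon: float
    status: str        # e.g., triage category for patients
    timestamp: float

msg = LocationUpdate("medic-07", "staff", 38.9517, -92.3341,
                     "on-scene", time.time())
payload = json.dumps(asdict(msg))   # published to the ICS dashboard service
```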
Detection of illegal fishing
Sustainment of fishing stocks can be accomplished by reducing illegal fishing. Enforcement requires timely intelligence. Often the perpetrators escape the enforcement zone to meet up with the fish buyers at sea, where they conduct illegal transactions. Transshipments at sea enable criminal endeavors of all kinds. This paper addresses detecting fishing-related behaviors from track data, associating RF and satellite imagery to identify the vessels, and using the evidence to build a confident case to support prosecution and deterrence efforts.
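One simple track-based behavior cue consistent with the abstract is a rendezvous test: two vessels loitering close together at low speed for a sustained period. The thresholds below are illustrative assumptions, not the paper's method.
```python
# Flag a possible at-sea transshipment from time-aligned vessel tracks.
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = np.sin((lat2 - lat1) / 2) ** 2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    return 6371.0 * 2 * np.arcsin(np.sqrt(a))

def longest_run(mask):
    best = cur = 0
    for m in mask:
        cur = cur + 1 if m else 0
        best = max(best, cur)
    return best

def possible_transshipment(track_a, track_b, max_km=0.5, max_kts=2.0,
                           min_samples=30):
    """Tracks: (N, 3) arrays of (lat, lon, speed_kts), sampled once a minute."""
    d = haversine_km(track_a[:, 0], track_a[:, 1],
                     track_b[:, 0], track_b[:, 1])
    loitering = (d < max_km) & (track_a[:, 2] < max_kts) \
                            & (track_b[:, 2] < max_kts)
    return longest_run(loitering) >= min_samples   # >=30 min close and slow
```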
A sensor selection model in simultaneous monitoring of multiple types of disaster
This paper proposes a conceptual design of a priority-based sensor selection model for use in the simultaneous monitoring and management of multiple types of disasters, especially in the detection and response phases. Sensors that measure different types of energy are critical components of a real-time monitoring environment in disaster management. Moreover, the use of appropriate sensors in disaster monitoring plays a vital role in avoiding the production of inaccurate or useless data, as inappropriate systems result in unreliable detection. In addition, building a disaster monitoring system with emergent technology products (sensors) may incur significant installation and operational costs, which makes the selection process more crucial; low-cost and efficient monitoring systems are more likely to be widely adopted. In fact, while data gathered by most monitoring systems can be used in the detection of and response to different types of disasters, these systems are usually designed for a single type of disaster. In order to reduce the installation and operational costs of monitoring systems and to increase monitoring availability, sensor systems that enable simultaneous monitoring of different disaster types would be beneficial. The proposed model also provides useful preference tables for making an appropriate selection of sensor systems according to disaster type.
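A toy reading of how a priority-based preference table might drive selection is sketched below; the sensor names, suitability scores, and costs are invented for illustration and are not the paper's tables.
```python
# Greedy selection from an assumed preference table: for each monitored
# disaster type, pick the cheapest sufficiently suitable sensor within budget.
PREFERENCE = {   # disaster type -> {sensor: suitability score in [0, 1]}
    "flood":      {"water_level": 0.9, "rain_gauge": 0.8, "seismic": 0.1},
    "earthquake": {"seismic": 0.95, "accelerometer": 0.9, "rain_gauge": 0.0},
    "wildfire":   {"thermal_ir": 0.9, "smoke": 0.85, "water_level": 0.0},
}
COST = {"water_level": 3, "rain_gauge": 1, "seismic": 5,
        "accelerometer": 2, "thermal_ir": 4, "smoke": 1}

def select(disasters, min_score=0.8, budget=10):
    chosen, spent = set(), 0
    for d in disasters:
        # cheapest sensor suitable for this disaster type (assumes one exists)
        cost, sensor = min((COST[s], s) for s, v in PREFERENCE[d].items()
                           if v >= min_score)
        if sensor not in chosen and spent + cost <= budget:
            chosen.add(sensor)
            spent += cost
    return chosen

print(select(["flood", "earthquake", "wildfire"]))
```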
Geospatial Informatics Applications I
Into the wild: a study in rendered synthetic data and domain adaptation methods
Marissa Dotter, Chelsea Mediavilla, Jonathan Sato, et al.
Rendering synthetic imagery from gaming engine environments allows us to create data featuring any number of object orientations, conditions, and lighting variations. This capability is particularly useful in classification tasks, where there is an overwhelming lack of labeled data needed to train state-of-the-art machine learning algorithms. However, the use of synthetic data is not without limits: in the case of imagery, training a deep learning model on purely synthetic data typically yields poor results when applied to real-world imagery. Previous work shows that "domain adaptation," mixing real-world and synthetic data, improves performance on a target dataset. In this paper, we train a deep neural network with synthetic imagery, including ordnance and overhead ship imagery, and investigate a variety of methods to adapt our model to a dataset of real images.
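A minimal sketch of the mixing baseline: one PyTorch loader that draws from synthetic and real pools, with an assumed 50/50 sampling ratio; the datasets are placeholders.
```python
# Mix scarce real images with abundant synthetic ones in a single loader,
# oversampling the real pool so each batch is ~50% real (assumed ratio).
import torch
from torch.utils.data import (ConcatDataset, DataLoader, TensorDataset,
                              WeightedRandomSampler)

synthetic = TensorDataset(torch.rand(900, 3, 64, 64),
                          torch.randint(0, 5, (900,)))
real = TensorDataset(torch.rand(100, 3, 64, 64),
                     torch.randint(0, 5, (100,)))

mixed = ConcatDataset([synthetic, real])
weights = [0.5 / len(synthetic)] * len(synthetic) + [0.5 / len(real)] * len(real)
sampler = WeightedRandomSampler(weights, num_samples=len(mixed))
loader = DataLoader(mixed, batch_size=32, sampler=sampler)

images, labels = next(iter(loader))   # feed to the classifier as usual
```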
Classification of maritime vessels using capsule networks
Cameron Hilton, Shibin Parameswaran, Marissa Dotter, et al.
Capsule networks have shown promise in their ability to perform classification tasks with viewpoint invariance, outperforming the accuracy of other models in some cases. This capability applies to maritime classification tasks, where there is a lack of labeled data and an inability to collect all object viewpoints needed to train machine learning algorithms. Capsule networks lend themselves well to the maritime vessel BCCT dataset, which exhibits characteristics aligned with their theorized strengths. Comparing them against traditional CNN architectures and data augmentation techniques provides a potential roadmap for incorporating them into future classification tasks involving imagery in data-starved domains that rely heavily on viewpoint invariance. We present our results on the classification of ships using capsule networks and explore their usefulness at this task given their current state of development.
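The core capsule operation, the "squash" nonlinearity of Sabour et al. (2017), is shown below; the capsule dimensions are placeholders.
```python
# "Squash" keeps a capsule's direction (pose) while compressing its length
# into (0, 1), so the length can encode the probability that an entity
# (here, a vessel class) is present.
import torch

def squash(s, dim=-1, eps=1e-8):
    norm2 = (s ** 2).sum(dim=dim, keepdim=True)
    return (norm2 / (1.0 + norm2)) * s / torch.sqrt(norm2 + eps)

caps = torch.rand(32, 10, 16)      # batch of 10 class capsules, 16-D each
v = squash(caps)
class_prob = v.norm(dim=-1)        # vector length in (0, 1): class presence
```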
Large-scale overhead image summarization (Conference Presentation)
Gordon Christie, Ryan Amundsen, Scott Almes, et al.
In this work, we aim to address the needs of human analysts to automatically summarize the content of large swaths of overhead imagery. We present our approach to this problem using deep neural networks, providing detection and segmentation information to enable fine-grained description of scene content for human consumption. Four different perception systems were run on blocks of large-scale satellite imagery: (1) semantic segmentation of roads, buildings, and vegetation; (2) zone segmentation to identify commercial, industrial, residential, and airport zones; (3) classification of objects such as helipads, silos, and water towers; and (4) object detection to find vehicles. Results are filtered based on a user's zoom level in the swath, and subsequently summarized as textual bullets and statistics. Our framework blocks the image swaths at a resolution of approximately 30 cm for each perception system. For semantic segmentation, overlapping imagery is processed to avoid edge artifacts and improve segmentation results by voting for the category label of each pixel in the scene visible from multiple chips. Our approach to zone segmentation is based on classification models that vote for a chip belonging to a particular zone type. Regions surrounded by chips classified as a particular category are assigned a higher score. We also provide an overview of our experience using OpenStreetMap (OSM) for pixel-wise annotation (for semantic segmentation), image-level labels (for classification), and end-to-end captioning methods (image to text). These capabilities are envisioned to aid the human analyst through an interactive user interface, whereby scene content is automatically summarized and updated as the user pans and zooms within the imagery.
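A sketch of the overlapped chipping and per-pixel voting described above; the chip size, stride, and stand-in segmenter are assumptions.
```python
# Each overlapping chip is segmented independently and votes for the labels
# of the pixels it covers, smoothing out chip-edge artifacts.
import numpy as np

def segment_chip(chip):                 # placeholder for the real model
    return np.random.randint(0, 4, chip.shape[:2])   # 4 classes, per pixel

def segment_swath(img, chip=512, stride=256, n_classes=4):
    # assumes swath dimensions are multiples of the stride, for brevity
    h, w = img.shape[:2]
    votes = np.zeros((h, w, n_classes), dtype=np.int32)
    for y in range(0, h - chip + 1, stride):
        for x in range(0, w - chip + 1, stride):
            labels = segment_chip(img[y:y + chip, x:x + chip])
            ys, xs = np.mgrid[y:y + chip, x:x + chip]
            votes[ys, xs, labels] += 1   # each chip votes for its pixels
    return votes.argmax(axis=-1)         # majority label per pixel

mask = segment_swath(np.zeros((1024, 1024, 3), dtype=np.uint8))
```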
Geospatial Informatics Applications II
Representation of predicted accuracy of 3D geospatial products and their subsequent fusion with other products
This paper describes a method to represent the predicted accuracy of an arbitrary 3D geospatial product from a specific type or class of products; for example, the class of 3D point clouds generated from EO imagery by vendor “abc” within date range “xyz”. The predicted accuracy is based on accuracy assessments of previous products from the same type or class of products; in particular, it is based on corresponding sample statistics of geolocation error computed using ground-truth or surveyed geolocations. The representation of predicted accuracy is theoretically rigorous, flexible, and practical, and is based on the underlying concepts of Mixed Gaussian Random Fields (MGRF). It also allows the predicted accuracy to vary over the product, allowing for increased geolocation uncertainty in a priori “problem areas” of the product. The MGRF-based approach for the representation of predicted accuracy is particularly applicable to 3D geospatial products that do not have product-specific predicted accuracies generated with the product itself. This is the typical situation, particularly for commodities-based geospatial products. The paper also describes a method for the near-optimal adjustment of a geospatial product based on its predicted accuracy and its fusion with other products.
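A minimal sketch of the kind of sample statistics such a class-based prediction rests on, using standard normal-theory factors (LE90 ≈ 1.6449·RMSEz for vertical error; CE90 ≈ 2.1460·σc for roughly circular, zero-mean horizontal error); this is not the paper's MGRF formulation.
```python
# Summary accuracy statistics for one product class, computed from
# checkpoint residuals against surveyed ground truth.
import numpy as np

def accuracy_stats(err_xyz):
    """err_xyz: (N, 3) residuals in meters (product minus surveyed truth)."""
    rmse = np.sqrt((err_xyz ** 2).mean(axis=0))
    sigma_c = np.sqrt((rmse[0] ** 2 + rmse[1] ** 2) / 2)  # circular std. error
    return {"rmse_xyz": rmse,
            "CE90": 2.1460 * sigma_c,   # 90% circular horizontal error
            "LE90": 1.6449 * rmse[2]}   # 90% linear vertical error

stats = accuracy_stats(np.random.normal(0, 0.5, size=(200, 3)))
```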
Deep learning for automatic ordnance recognition
Explosive Ordnance Disposal (EOD) technicians are on call to respond to a wide variety of military ordnance. As experts in conventional and unconventional ordnance, they are tasked with ensuring the secure disposal of explosive weaponry. Before EOD technicians can render ordnance safe, the ordnance must be positively identified. However, identification of unexploded ordnance (UXO) in the field is made difficult due to a massive number of ordnance classes, object occlusion, time constraints, and field conditions. Currently, EOD technicians collect photographs of unidentified ordnance and compare them to a database of archived ordnance. This task is manual and slow - the success of this identification method is largely dependent on the expert knowledge of the EOD technician. In this paper, we describe our approach to automatic ordnance recognition using deep learning. Since the domain of ordnance classification is unique, we first describe our data collection and curation efforts to account for real-world conditions, such as object occlusion, poor lighting conditions, and non-iconic poses. We apply a deep learning approach using ResNet to this problem on our collected data. While the results of these experiments are quite promising, we also discuss remaining challenges and potential solutions to deploying a real system to assist EOD technicians in their extremely challenging and dangerous role.
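A minimal transfer-learning setup in the spirit of the described ResNet approach, using torchvision; the class count and layer-freezing policy are assumptions.
```python
# Replace the final layer of a pretrained ResNet-50 for ordnance classes and
# fine-tune only the later layers on scarce field imagery.
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 50   # placeholder; the real ordnance taxonomy is much larger
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Freeze early layers so limited data only trains the last block and head
# (an assumed policy, not necessarily the authors' training recipe).
for name, p in model.named_parameters():
    p.requires_grad = name.startswith(("layer4", "fc"))
```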
Precision sensing of AC magnetic fields from moving platforms
Mark W. Roberson, Abigail E. Bell, Steve Waller, et al.
Mapping alternating current (AC) magnetic field strengths over time provides detailed information about human activities and the presence of facilities. Unlike viewing from a few fixed locations, achieving measurement accuracy with a moving detector is more challenging. The use of real-time geospatial analytics requires several changes in detection methodologies and data dissemination. For small unmanned airborne systems (sUAS), ensuring the rotational stability of the sensor while reducing self-sensing noise presents design challenges in the airframe. For ground vehicles, the rapidly changing values of the AC magnetic fields require rapid and accurate position updates. This information must be precisely indexed to magnetic field strength, frequency, and phase information. We discuss our work with the collection of multi-axis magnetic field data from both multi-rotor sUAS and ground vehicles. We model the power lines using detailed measurements of the conductors and field modeling software. We study the sUAS magnetic field data as a function of height above the ground plane of the experimental sites for both the fundamental and harmonic frequencies and phases relative to a fixed-location reference time signal. The collection regions include rural, interstate, and suburban areas with both overhead and buried power lines contributing to the signals.
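A sketch of extracting fundamental and harmonic amplitude and phase, referenced to a fixed-site timing signal, via lock-in style demodulation; the 60 Hz line frequency, sample rate, and synthetic field trace are assumptions.
```python
# Synchronous demodulation at each frequency of interest recovers complex
# amplitude; harmonic phases are referenced to the fixed-location signal.
import numpy as np

FS, F0 = 10_000.0, 60.0                  # sample rate (Hz), line fundamental

def tone(x, t, f):
    """Complex amplitude of x at frequency f (lock-in demodulation)."""
    return 2.0 * np.mean(x * np.exp(-2j * np.pi * f * t))

t = np.arange(0, 1.0, 1.0 / FS)          # 1 s = 60 full cycles at 60 Hz
field = 3e-7 * np.sin(2 * np.pi * F0 * t + 0.4) \
      + 5e-8 * np.sin(2 * np.pi * 3 * F0 * t)    # synthetic B-field (tesla)
reference = np.sin(2 * np.pi * F0 * t)           # fixed-location reference

for k in (1, 3, 5):                              # fundamental + odd harmonics
    z = tone(field, t, k * F0)
    phase = np.angle(z) - k * np.angle(tone(reference, t, F0))
    print(f"{k * F0:5.0f} Hz: |B| = {abs(z):.2e} T, phase = {phase:+.2f} rad")
```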
Object extraction in the context of an image registration workflow
With the advent of affordable drone systems, imagery from airborne sensors has become available for addressing many different tasks in various fields of application. For some of these tasks, the imagery has to come with a georeference satisfying certain accuracy requirements. If we want to perform such a task and the accuracy of GPS and INS sensors onboard the sensor platform cannot match the accuracy requirements, or location information is faulty or unavailable, we need to establish a georeference or improve the inaccurate existing one. We do this with our image registration workflow. It matches the contours of objects present both in the imagery and in a reference image that comes with a georeference satisfying the accuracy requirement of the task to be performed. This approach has proven to be both feasible and robust to appearance dissimilarity between the image and the reference, enabling the use of a reference that differs considerably in appearance from the image. The workflow comprises four steps: extracting the objects, extracting their contours, reducing the number of contour points, and finally matching them. To improve the performance of our workflow, we aspire to improve the performance of each of the four steps individually. While previous work focused on fine-tuning the three latter steps, keeping the object extraction method (and thus step one) fixed for the time being, the scope of this work was the implementation of a novel object extraction method and its evaluation in the context of the workflow. Long, line-shaped objects such as road networks are likely to be present in both the image and the reference despite their possible dissimilarity in appearance. The method extracts such objects after growing them by merging smaller individual line-shaped objects when certain merge criteria are met.
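The four workflow steps, sketched with OpenCV on synthetic line imagery; the simple edge/morphology extractor stands in for the novel line-merging method described above.
```python
# 1. extract objects, 2. extract contours, 3. reduce contour points,
# 4. match contours between image and reference.
import cv2
import numpy as np

img = np.zeros((400, 400), np.uint8)
cv2.line(img, (50, 380), (350, 20), 255, 5)    # synthetic "road" in the image
ref = np.zeros((400, 400), np.uint8)
cv2.line(ref, (60, 390), (360, 30), 200, 7)    # same road, rendered differently

def reduced_contours(gray):
    edges = cv2.Canny(gray, 50, 150)                      # step 1 (stand-in)
    edges = cv2.dilate(edges, np.ones((3, 3), np.uint8))  #   merge fragments
    cs, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,    # step 2
                             cv2.CHAIN_APPROX_SIMPLE)
    cs = sorted(cs, key=cv2.contourArea, reverse=True)[:20]
    return [cv2.approxPolyDP(c, epsilon=2.0, closed=True)  # step 3
            for c in cs]

# step 4: matchShapes tolerates scale/rotation differences, which helps with
# appearance dissimilarity between image and reference.
score = cv2.matchShapes(reduced_contours(img)[0], reduced_contours(ref)[0],
                        cv2.CONTOURS_MATCH_I1, 0.0)
print(f"contour match score (lower is better): {score:.4f}")
```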
Quantitative assessment of image quality for maritime surveillance applications
Ross Eaton, Ian M. Gingrich, John M. Irvine
Analysis and measurement of perceived image quality has been an active area of research for decades. Although physical measurements of image parameters often correlate with human perceptions, user-centric approaches have focused on the observer’s ability to perform certain tasks with the imagery. This task-based orientation has led to the development of the Johnson Criteria and the National Imagery Interpretability Ratings Scale as standards for quantifying the interpretability of an image. A substantial literature points to three primary factors affecting human perception of image interpretability: spatial resolution, image sharpness as measured by the relative edge response, and perceived noise measured by the signal-to-noise ratio. For maritime and ocean surveillance applications, however, these factors do not fully represent the characteristics of the imagery. Images looking at the ocean surface can encompass a wide range of spatial resolutions. Fog, sun glint, and color distortion can degrade image interpretability. In this paper, we explore both the general factors and the domain specific concerns for quantifying image interpretability. In particular, we propose new metrics to assess the dynamic range and color balance for maritime surveillance imagery. We will present the new metrics and illustrate their performance on relevant image data.
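Illustrative versions of the two proposed metric families, a dynamic-range score from robust luminance percentiles and a gray-world color-balance score from per-channel means; the exact metric definitions in the paper may differ.
```python
import numpy as np

def dynamic_range_score(gray):
    """gray: 2-D luminance in [0, 255]. 1.0 = full usable range,
    near 0 = flat contrast (e.g., fog or heavy glint)."""
    lo, hi = np.percentile(gray, [2, 98])   # robust to specular outliers
    return (hi - lo) / 255.0

def color_balance_score(rgb):
    """rgb: HxWx3. 0 = balanced channels; larger = stronger color cast."""
    means = rgb.reshape(-1, 3).mean(axis=0)
    return float(np.abs(means - means.mean()).max() / 255.0)

frame = (np.random.rand(480, 640, 3) * 255).astype(np.uint8)
print(dynamic_range_score(frame.mean(axis=2)), color_balance_score(frame))
```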
Poster Session
Fusing sensor data with publicly available information (PAI) for autonomy applications
Lei Qian, Vadas Gintautas
Publicly available information (PAI) provides data for reasoning about an environment when direct sensing is constrained, and augments scarce direct measurements. Specifically, PAI sources providing movement information over time support analysis of behavior patterns. In this paper we describe sources and uses of PAI, discuss a tool for automated data analysis with emphasis on detection of anomalous behavior, discuss challenges present in exploiting PAI, and explore how anomalous activity in social media PAI is correlated with anomalous traffic observed using movement PAI.
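A toy version of the final correlation step, scoring daily activity counts with a robust z-score on synthetic data; thresholds and data are illustrative.
```python
# Flag anomalous days in movement PAI and social-media PAI independently,
# then check whether the two anomaly series line up.
import numpy as np

def robust_z(x):
    med = np.median(x)
    mad = np.median(np.abs(x - med)) + 1e-9
    return 0.6745 * (x - med) / mad    # ~standard normal under normality

days = 90
movement = np.random.poisson(100, days).astype(float)
social = np.random.poisson(50, days).astype(float)
movement[60] += 80                     # injected correlated anomaly
social[60] += 40

m_anom = np.abs(robust_z(movement)) > 3
s_anom = np.abs(robust_z(social)) > 3
corr = np.corrcoef(robust_z(movement), robust_z(social))[0, 1]
print(f"co-anomalous days: {np.flatnonzero(m_anom & s_anom)}, corr: {corr:.2f}")
```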
Evaluation of unsupervised optical flow methods for deep learning in real world datasets
The creation of large labeled datasets for optical flow is often infeasible due to the difficulty of measuring dynamic objects in real scenes. Current datasets from real-world scenes are often sparse in terms of ground truth. Generating synthetic datasets, where ground truth can be easily obtained, tends to be the easiest way to acquire the large labeled datasets required to achieve good performance. However, the switch from synthetic to real-world imagery often leads to a drop in performance. Recently, with the development of differentiable image warping layers, unsupervised methods, which require no ground-truth optical flow, can be applied to train a deep neural network (DNN) model for optical flow tasks, allowing training on unlabeled video. The brightness constancy assumption is the underlying principle that enables unsupervised learning of optical flow. Violations of the brightness constancy assumption, particularly at occlusions, result in large outlier errors that are harmful to the learning process. Robust regression loss functions and outlier prediction methods attempt to alleviate the problem of outliers. In this paper, we conduct experiments to compare the performance of various unsupervised optical flow methods by exploring different robust cost functions and outlier handling methods.
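The core unsupervised objective described above, differentiable backward warping plus a robust (Charbonnier) photometric penalty, is sketched below in PyTorch under assumed frame sizes; the flow here is a stand-in for a DNN's output.
```python
# Warp frame 2 back to frame 1 with the predicted flow via a differentiable
# warping layer, and penalize brightness-constancy violations robustly.
import torch
import torch.nn.functional as F

def warp(img, flow):
    """img: (B, C, H, W); flow: (B, 2, H, W) in pixels. Backward warp."""
    b, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(img) + flow  # target coords
    gx = 2.0 * grid[:, 0] / (w - 1) - 1.0    # normalize to [-1, 1]
    gy = 2.0 * grid[:, 1] / (h - 1) - 1.0
    return F.grid_sample(img, torch.stack((gx, gy), dim=-1),
                         align_corners=True)

def charbonnier(x, eps=1e-3, alpha=0.45):
    return ((x ** 2 + eps ** 2) ** alpha).mean()  # robust to outliers/occlusion

f1, f2 = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
flow = torch.zeros(1, 2, 64, 64, requires_grad=True)  # stand-in for DNN output
loss = charbonnier(f1 - warp(f2, flow))               # photometric loss
loss.backward()                                       # trains without labels
```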