Proceedings Volume 10199

Geospatial Informatics, Fusion, and Motion Video Analytics VII


Purchase the printed version of this volume at proceedings.com or access the digital version at SPIE Digital Library.

Volume Details

Date Published: 7 June 2017
Contents: 4 Sessions, 15 Papers, 12 Presentations
Conference: SPIE Defense + Security 2017
Volume Number: 10199

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 10199
  • Video Analysis
  • Photogrammetry and Uncertainty Propagation
  • Geospatial Analysis
Front Matter: Volume 10199
Front Matter: Volume 10199
This PDF file contains the front matter associated with SPIE Proceedings Volume 10199, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
Video Analysis
Pilot study on real-time motion detection in UAS video data by human observer and image exploitation algorithm
Jutta Hild, Wolfgang Krüger, Stefan Brüstle, et al.
Real-time motion video analysis is a challenging and exhausting task for the human observer, particularly in safety- and security-critical domains. Hence, customized video analysis systems providing functions for the analysis of subtasks like motion detection or target tracking are welcome. While such automated algorithms relieve the human operators from performing basic subtasks, they impose additional interaction duties on them. Prior work shows that, e.g., for interaction with target tracking algorithms, a gaze-enhanced user interface is beneficial. In this contribution, we present an investigation on interaction with an independent motion detection (IDM) algorithm. Besides identifying an appropriate interaction technique for the user interface – again, we compare gaze-based and traditional mouse-based interaction – we focus on the benefit an IDM algorithm might provide for a UAS video analyst. In a pilot study, we exposed ten subjects to the task of moving target detection in UAS video data twice, once with automatic support and once without it. We compare the two conditions considering performance in terms of effectiveness (correct target selections). Additionally, we report perceived workload (measured using the NASA-TLX questionnaire) and user satisfaction (measured using the ISO 9241-411 questionnaire). The results show that a combination of gaze input and automated IDM algorithm provides valuable support for the human observer, increasing the number of correct target selections up to 62% and reducing workload at the same time.
Data transpositioning with content-based image retrieval
Michael J. Manno, Daqing Hou
Currently, when data is collected, it is usually collected for a specific need or situation. This includes text and image data. When a new need or situation arises, the data collection process repeats, often without referencing the original data collected for previous situations. Data Transpositioning is a search methodology that leverages the context of a previous manual search process to formulate a new automated search with new results. As a result, the data collection process for one situation can quickly be applied to another situation with less user effort. Thus, a set of new results can quickly be constructed without the user manually revisiting each of the originating sources. In the case of Content-Based Image Retrieval, the idea is to identify the content attributes of an image, such as a particular color, shape, or texture, apply changes to the originating query, and return a new set of results with similar attributes. Data Transpositioning has been successfully applied to result sets that contain text. Our goal is to continue this research beyond text to solve more complex problems in other domains, especially when image data are involved.
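As a rough illustration of the Content-Based Image Retrieval step described above, the sketch below (Python with OpenCV and NumPy; all function names are our own illustration, not the paper's implementation) ranks candidate images against a query image's color-histogram attribute. It is only a minimal stand-in for the attribute extraction and re-querying that Data Transpositioning would automate.

    import numpy as np
    import cv2  # OpenCV for histogram computation

    def color_signature(image_bgr, bins=16):
        """Normalized per-channel color histogram as a simple content attribute."""
        chans = cv2.split(image_bgr)
        hist = np.concatenate(
            [cv2.calcHist([c], [0], None, [bins], [0, 256]).ravel() for c in chans])
        return hist / (hist.sum() + 1e-9)

    def rank_by_similarity(query_sig, candidate_images):
        """Rank candidate images by histogram intersection with the query signature."""
        scores = []
        for img in candidate_images:
            sig = color_signature(img)
            scores.append(np.minimum(query_sig, sig).sum())  # histogram intersection
        return np.argsort(scores)[::-1]  # best matches first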
An analysis of optical flow on real and simulated data with degradations
Estimating the motion of moving targets from a moving platform is an extremely challenging problem in unmanned systems research. One common and often successful approach is to use optical flow for motion estimation to account for ego-motion of the platform and to then track the motion of surrounding objects. However, in the presence of video degradation such as noise, compression artifacts, and reduced frame rates, the performance of state-of-the-art optical flow algorithms greatly diminishes. We consider the effects of video degradation on two well-known optical flow datasets as well as on real-world video data. To highlight the need for robust optical flow algorithms in the presence of real-world conditions, we present both qualitative and quantitative results on these datasets.
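A minimal sketch of the kind of experiment described above, assuming grayscale uint8 frames and using OpenCV's Farneback dense optical flow (the paper does not specify which algorithms it evaluates): degrade a frame pair with additive Gaussian noise and measure how far the estimated flow drifts from a reference flow.

    import numpy as np
    import cv2

    def dense_flow(frame0, frame1):
        """Dense Farneback optical flow between two grayscale frames (H x W x 2)."""
        return cv2.calcOpticalFlowFarneback(frame0, frame1, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)

    def degrade(frame, sigma=10.0):
        """Add Gaussian noise to simulate sensor/compression degradation."""
        noisy = frame.astype(np.float32) + np.random.normal(0, sigma, frame.shape)
        return np.clip(noisy, 0, 255).astype(np.uint8)

    def endpoint_error(flow_est, flow_ref):
        """Mean endpoint error between an estimated and a reference flow field."""
        return float(np.mean(np.linalg.norm(flow_est - flow_ref, axis=2)))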
Geopositioning with a quadcopter: extracted feature locations and predicted accuracy without a priori sensor attitude information
John Dolloff, Bryant Hottel, David Edwards, et al.
This paper presents an overview of the Full Motion Video-Geopositioning Test Bed (FMV-GTB) developed to investigate algorithm performance and issues related to the registration of motion imagery and subsequent extraction of feature locations along with predicted accuracy. A case study is included corresponding to a video taken from a quadcopter. Registration of the corresponding video frames is performed without the benefit of a priori sensor attitude (pointing) information. In particular, tie points are automatically measured between adjacent frames using standard optical flow matching techniques from computer vision; an a priori estimate of sensor attitude is then computed based on supplied GPS sensor positions contained in the video metadata and a photogrammetric/search-based structure-from-motion algorithm; and a Weighted Least Squares adjustment of all a priori metadata across the frames is then performed. Extraction of absolute 3D feature locations, including their predicted accuracy based on the principles of rigorous error propagation, is then performed using a subset of the registered frames. Results are compared to known locations (check points) over a test site. Throughout this entire process, no external control information (e.g., surveyed points) is used other than for evaluation of solution errors and corresponding accuracy.
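For a linearized observation model, a Weighted Least Squares adjustment with rigorous error propagation of the kind mentioned above reduces to the standard normal-equation form. The NumPy sketch below is our own illustration of that general idea, not the FMV-GTB code.

    import numpy as np

    def wls_adjust(A, y, R):
        """Weighted least squares: minimize (y - Ax)^T R^-1 (y - Ax).
        Returns the adjusted parameters and their predicted error covariance
        (A^T R^-1 A)^-1, which is what rigorous error propagation reports."""
        W = np.linalg.inv(R)      # measurement weight matrix
        N = A.T @ W @ A           # normal matrix
        P = np.linalg.inv(N)      # predicted covariance of the estimate
        x = P @ (A.T @ W @ y)     # adjusted parameters
        return x, P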
Photogrammetry and Uncertainty Propagation
Using image quality metrics to identify adversarial imagery for deep learning networks
Josh Harguess, Jeremy Miclat, Julian Raheema
Deep learning has continued to gain momentum in applications across many critical areas of research in computer vision and machine learning. In particular, deep learning networks have had much success in image classification, especially when training data are abundantly available, as is the case with the ImageNet project. However, several researchers have exposed potential vulnerabilities of these networks to carefully crafted adversarial imagery. Additionally, researchers have shown the sensitivity of these networks to some types of noise and distortion. In this paper, we investigate the use of no-reference image quality metrics to identify adversarial imagery and images of poor quality that could potentially fool a deep learning network or dramatically reduce its accuracy. Results are shown on several adversarial image databases with comparisons to popular image classification databases.
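A hedged sketch of the screening idea, using two simple stand-in no-reference indicators (variance of the Laplacian for sharpness and a high-pass residual for noise) rather than the specific metrics evaluated in the paper; the thresholds are placeholders, not values from the paper.

    import numpy as np
    import cv2

    def no_reference_scores(gray):
        """Two simple no-reference indicators for a grayscale uint8 image:
        sharpness (variance of Laplacian) and a crude noise estimate
        (std of a high-pass residual)."""
        sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
        residual = gray.astype(np.float64) - cv2.GaussianBlur(gray, (5, 5), 0)
        return float(sharpness), float(residual.std())

    def flag_suspect(gray, sharp_min=50.0, noise_max=12.0):
        """Flag images whose quality scores fall outside nominal ranges observed
        on clean training data, so they can be inspected before classification."""
        sharpness, noise = no_reference_scores(gray)
        return sharpness < sharp_min or noise > noise_max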
Methods for the specification and validation of geolocation accuracy and predicted accuracy
John Dolloff, Jacqueline Carr
The specification of geolocation accuracy requirements and their validation is essential for the proper performance of a Geolocation System and for trust in resultant three-dimensional (3D) geolocations. This is also true for predicted accuracy requirements and their validation for a Geolocation System, which assumes that each geolocation produced (extracted) by the system is accompanied by an error covariance matrix that characterizes its specific predicted accuracy. The extracted geolocation and its error covariance matrix are standard outputs of (near) optimal estimators, either associated (internally) with the Geolocation System itself, or with a “downstream” application that inputs a subset of Geolocation System output, such as sensor data/metadata: for example, a set of images and corresponding metadata of the imaging sensor’s pose and its predicted accuracy. This output allows for subsequent (near) optimal extraction of geolocations and associated error covariance matrices based on the application’s measurements of pixel locations in the images corresponding to objects of interest. This paper presents recommended methods and detailed equations for the specification and validation of both accuracy and predicted accuracy requirements for a general Geolocation System. The specification/validation of accuracy requirements is independent of the specification/validation of predicted accuracy requirements. The methods presented in this paper are theoretically rigorous yet practical.
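One standard way to validate predicted accuracy of the kind described above is a chi-square consistency check of geolocation errors against their predicted error covariances over a set of check points. The sketch below is our own illustration of that idea, not the paper's specific equations.

    import numpy as np
    from scipy.stats import chi2

    def validate_predicted_accuracy(errors, covariances, alpha=0.05):
        """Check that 3D geolocation errors are consistent with their predicted
        error covariance matrices: each normalized error e^T P^-1 e should fall
        within the central (1 - alpha) interval of a chi-square with 3 dof."""
        lo, hi = chi2.ppf([alpha / 2, 1 - alpha / 2], df=3)
        inside = 0
        for e, P in zip(errors, covariances):
            q = float(e @ np.linalg.inv(P) @ e)
            inside += lo <= q <= hi
        return inside / len(errors)  # fraction of samples passing the check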
Three-dimensional scene reconstruction from a two-dimensional image
We propose and simulate a method of reconstructing a three-dimensional scene from a two-dimensional image for developing and augmenting world models for autonomous navigation. This is an extension of the Perspective-n-Point (PnP) method, which uses a sampling of 3D scene point to 2D image point pairings and Random Sample Consensus (RANSAC) to infer the pose of the object and produce a 3D mesh of the original scene. Using object recognition and segmentation, we simulate the implementation on a scene of 3D objects with an eye to implementation on embeddable hardware. The final solution will be deployed on the NVIDIA Tegra platform.
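A minimal sketch of the RANSAC-robust PnP step, using OpenCV's solvePnPRansac on assumed 3D-2D correspondences and a known camera matrix K; the object-recognition, segmentation, and meshing stages described above are outside this snippet.

    import numpy as np
    import cv2

    def estimate_pose(object_pts, image_pts, K):
        """Recover object pose from 3D-2D correspondences with RANSAC-robust PnP."""
        ok, rvec, tvec, inliers = cv2.solvePnPRansac(
            object_pts.astype(np.float32),   # N x 3 model points
            image_pts.astype(np.float32),    # N x 2 image points
            K, None,                         # camera matrix, no lens distortion
            reprojectionError=3.0)
        if not ok:
            raise RuntimeError("PnP failed to find a consistent pose")
        R, _ = cv2.Rodrigues(rvec)           # rotation vector -> rotation matrix
        return R, tvec, inliers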
Correlation-agnostic fusion for improved uncertainty estimation in multi-view geo-location from UAVs
Clark N. Taylor, Paul O. Sundlie
When geo-locating ground objects from a UAV, multiple views of the same object can lead to improved geo-location accuracy. Of equal importance to the location estimate, however, is the uncertainty estimate associated with that location. Standard methods for estimating uncertainty from multiple views generally assume that each view represents an independent measurement of the geo-location. Unfortunately, this assumption is often violated due to correlation between the location estimates. This correlation may occur due to the measurements coming from the same platform, meaning that the error in attitude or location may be correlated across time; or it may be due to external sources (such as GPS) having the same error in multiple aircraft. In either case, the geo-location estimates are not truly independent, leading to optimistic estimates of the geo-location uncertainty. For distributed data fusion applications, correlation-agnostic fusion methods have been developed that can fuse data together regardless of how much correlation may be present between the two estimates. While the results are generally not as impressive as when correlation is perfectly known and taken into account, the fused uncertainty results are guaranteed to be conservative and an improvement on operating without fusion. In this paper, we apply a selection of these correlation-agnostic fusion techniques to the multi-view geo-location problem and analyze their effects on geo-location and predicted uncertainty accuracy. We find that significant benefits can be gained from applying these correlation-agnostic fusion techniques, but that they vary greatly in how well they estimate their own uncertainty.
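Covariance intersection is one widely used correlation-agnostic fusion rule of the kind discussed above; whether it is among the specific techniques evaluated in the paper is not stated here. A minimal sketch that grid-searches the mixing weight to minimize the trace of the fused covariance:

    import numpy as np

    def covariance_intersection(x1, P1, x2, P2, n_grid=101):
        """Fuse two estimates with unknown cross-correlation via covariance
        intersection; the weight omega minimizes the fused covariance trace."""
        best = None
        for w in np.linspace(0.0, 1.0, n_grid):
            info = w * np.linalg.inv(P1) + (1.0 - w) * np.linalg.inv(P2)
            P = np.linalg.inv(info)
            if best is None or np.trace(P) < np.trace(best[1]):
                x = P @ (w * np.linalg.inv(P1) @ x1
                         + (1.0 - w) * np.linalg.inv(P2) @ x2)
                best = (x, P)
        return best  # fused state and a guaranteed-conservative covariance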
Geospatial Analysis
Geoparsing text for characterizing urban operational environments through machine learning techniques
Noah W. Garfinkle, Lucas Selig, Timothy K. Perkins, et al.
Increasing worldwide internet connectivity and access to sources of print and open social media has increased the near real-time availability of textual information. Capabilities to structure and integrate textual data streams can contribute to more meaningful representations of operational environment factors (i.e., Political, Military, Economic, Social, Infrastructure, Information, Physical Environment, and Time [PMESII-PT]) and tactical civil considerations (i.e., Areas, Structures, Capabilities, Organizations, People, and Events [ASCOPE]). However, relying upon human analysts to encode this information as it arrives quickly proves intractable. While human analysts possess an ability to comprehend context in unstructured text far beyond that of computers, automated geoparsing (the extraction of locations from unstructured text) can empower analysts by automating the sifting of datasets for areas of interest. This research evaluates existing approaches to geoparsing and initiates the research and development of locally improved methods of tagging parts of text as possible locations, resolving possible locations into coordinates, and interfacing such results with human analysts. The objective of this ongoing research is to develop a more contextually complete picture of an area of interest (AOI), including human-geographic context for events. In particular, our research is working to improve geoparsing (i.e., the extraction of spatial context from documents), which requires the development, integration, and validation of named-entity recognition (NER) tools, gazetteers, and entity attribution. This paper provides an overview of NER models and methodologies as applied to geoparsing, explores several challenges encountered, presents preliminary results from the creation of a flexible geoparsing research pipeline, and introduces ongoing and future work with the intention of contributing to the efficient geocoding of information containing valuable insights into human activities in space.
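A toy sketch of the two geoparsing stages named above (named-entity recognition followed by gazetteer resolution), assuming spaCy with its small English model installed and a hand-rolled gazetteer standing in for a real service such as GeoNames; none of this is the paper's pipeline.

    import spacy

    # Toy gazetteer mapping place names to (lat, lon); a real pipeline would
    # query GeoNames or a similar gazetteer service instead.
    GAZETTEER = {"Baghdad": (33.31, 44.37), "Kabul": (34.53, 69.17)}

    nlp = spacy.load("en_core_web_sm")  # pretrained NER model (assumed installed)

    def geoparse(text):
        """Extract location entities from text and resolve them to coordinates."""
        doc = nlp(text)
        hits = []
        for ent in doc.ents:
            if ent.label_ in ("GPE", "LOC") and ent.text in GAZETTEER:
                hits.append((ent.text, GAZETTEER[ent.text]))
        return hits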
A machine learning pipeline for automated registration and classification of 3D lidar data
Abhejit Rajagopal, Karthik Chellappan, Shivkumar Chandrasekaran, et al.
Despite the large availability of geospatial data, registration and exploitation of these datasets remain a persistent challenge in geoinformatics. Popular signal processing and machine learning algorithms, such as non-linear SVMs and neural networks, rely on well-formatted input models as well as reliable output labels, which are not always immediately available. In this paper we outline a pipeline for gathering, registering, and classifying initially unlabeled wide-area geospatial data. As an illustrative example, we demonstrate the training and testing of a convolutional neural network to recognize 3D models in the OGRIP 2007 LiDAR dataset using fuzzy labels derived from OpenStreetMap as well as other datasets available on OpenTopography.org. When auxiliary label information is required, various text and natural language processing filters are used to extract and cluster keywords useful for identifying potential target classes. A subset of these keywords is subsequently used to form multi-class labels, with no assumption of independence. Finally, we employ class-dependent geometry extraction routines to identify candidates from both training and testing datasets. Our regression networks are able to identify the presence of six structural classes, including roads, walls, and buildings, in volumes as large as 8,000 m³ in as little as 1.2 seconds on a commodity 4-core Intel CPU. Due to the registration process, the presented framework is limited to neither a particular dataset nor a particular sensor modality, and is capable of multi-sensor data fusion.
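A hedged sketch of one plausible preprocessing step for such a pipeline: voxelizing a lidar point-cloud patch into a binary occupancy grid that a 3D convolutional network could consume. The 20 m extent matches the 8,000 m³ volumes reported above, but the grid resolution and this discretization scheme are our own assumptions, not the paper's configuration.

    import numpy as np

    def voxelize(points, origin, extent=20.0, grid=32):
        """Convert an N x 3 point-cloud patch into a grid^3 binary occupancy
        volume suitable as input to a 3D convolutional network."""
        idx = np.floor((points - origin) / extent * grid).astype(int)
        keep = np.all((idx >= 0) & (idx < grid), axis=1)  # drop out-of-bounds points
        vol = np.zeros((grid, grid, grid), dtype=np.float32)
        vol[tuple(idx[keep].T)] = 1.0
        return vol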
General linear hypothesis test: a method for algorithm selection
Paul Singerman, Erik Blasch, Michael Giansiracusa, et al.
Algorithm selection is paramount in determining how to implement a process. When the results can be computed directly, an algorithm that reduces computational complexity is selected. When the results are less binary, there can be difficulty in choosing the proper implementation, and the effect of different pieces of the algorithm on the final result can be difficult to determine. In this research, we propose using a statistical analysis tool known as the General Linear Hypothesis test to find the effect of different pieces of an algorithm implementation on the end result. This will be done with transform-based image fusion techniques. This study will weigh the effect of different transforms, fusion techniques, and evaluation metrics on the resulting images. We will find the best no-reference metric for image fusion algorithm selection and test this method on multiple types of image sets. This assessment will provide a valuable tool for algorithm selection to augment current techniques when results are not binary.
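The general linear hypothesis test itself is available in standard statistics packages. A small illustrative example using statsmodels with synthetic data; the design factors here only stand in for the transforms, fusion rules, and metrics studied in the paper.

    import numpy as np
    import statsmodels.api as sm

    # Hypothetical data: each row scores one fused image; columns are design
    # factors (illustrative), y is a quality metric for the fused result.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 3))
    y = X @ np.array([1.5, 0.0, 0.2]) + rng.normal(scale=0.5, size=60)

    model = sm.OLS(y, sm.add_constant(X)).fit()

    # General linear hypothesis: do factors 2 and 3 jointly contribute nothing?
    R = np.array([[0, 0, 1, 0],
                  [0, 0, 0, 1]])   # rows select the coefficients to test
    print(model.f_test(R))         # F statistic and p-value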
Matrix sketching for big data reduction (Conference Presentation)
In recent years, the concept of Big Data has become a more prominent issue as the volume of data, as well as the velocity at which it is produced, exponentially increases. By 2020, the amount of data being stored is estimated to be 44 zettabytes, and currently over 31 terabytes of data are being generated every second. Algorithms and applications must be able to scale effectively to the volume of data being generated. One such application designed to work effectively and efficiently with Big Data is IBM’s Skylark. Part of DARPA’s XDATA program, an open-source catalog of tools for dealing with Big Data, Skylark (Sketching-based Matrix Computations for Machine Learning) is a library of functions designed to reduce the complexity of large-scale matrix problems that also implements kernel-based machine learning tasks. Sketching reduces the dimensionality of matrices through randomization and compresses matrices while preserving key properties, speeding up computations. Matrix sketches can be used to find accurate solutions to computations in less time, or can summarize data by identifying important rows and columns. In this paper, we investigate the effectiveness of sketched matrix computations using IBM’s Skylark versus non-sketched computations. We judge effectiveness based on computational complexity and the validity of outputs. Initial results from testing with smaller matrices are promising, showing that Skylark achieves a considerable reduction ratio while still accurately performing matrix computations.
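The core idea of sketching can be shown in a few lines: project a tall matrix with a random map and solve the smaller problem. The NumPy sketch below illustrates a generic Gaussian sketch for least squares; it is our own example, not Skylark code.

    import numpy as np

    def sketched_lstsq(A, b, sketch_rows):
        """Approximate least squares via a Gaussian sketch: solve on (S A, S b)
        instead of the full tall system, trading a little accuracy for speed."""
        m, _ = A.shape
        S = np.random.normal(size=(sketch_rows, m)) / np.sqrt(sketch_rows)
        x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
        return x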
Double-density and dual-tree based methods for image super resolution
Michael Giansiracusa, Erik Blasch, Paul Singerman, et al.
When several low-resolution images are taken of the same scene, they often contain aliasing and differing subpixel shifts, each capturing the scene slightly differently. Super-resolution imaging is a technique that can be used to construct high-resolution imagery from these low-resolution images. By combining images, high-frequency components are amplified while blurring and artifacts are removed. Super-resolution reconstruction techniques include methods such as the Non-Uniform Interpolation Approach, which is low resource and allows for real-time applications, or the Frequency Domain Approach. These methods make use of aliasing in low-resolution images as well as the shifting property of the Fourier transform. Problems arise with both approaches, such as the limited types of blurred images that can be used or non-optimal reconstructions. Many methods of super-resolution imaging use the Fourier transform or wavelets, but the field is still evolving for other wavelet techniques such as the Dual-Tree Discrete Wavelet Transform (DTDWT) and the Double-Density Discrete Wavelet Transform (DDDWT). In this paper, we propose a super-resolution method using these wavelet transformations for generating higher-resolution imagery. We evaluate the performance and validity of our algorithm using several metrics, including Spearman Rank Order Correlation Coefficient (SROCC), Pearson’s Linear Correlation Coefficient (PLCC), Structural Similarity Index Metric (SSIM), Root Mean Square Error (RMSE), and Peak Signal-to-Noise Ratio (PSNR). Initial results are promising, indicating that extensions of the wavelet transformations produce a more robust high-resolution image when compared to traditional methods.
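A minimal wavelet-domain fusion sketch using PyWavelets' standard 2D DWT as a stand-in for the dual-tree and double-density transforms studied in the paper; it assumes two registered, same-size images and only illustrates the general fuse-in-the-transform-domain idea.

    import numpy as np
    import pywt

    def wavelet_fuse(img_a, img_b, wavelet="db2"):
        """Fuse two registered images in the wavelet domain: average the
        approximation bands, keep the larger-magnitude detail coefficients."""
        cA1, (cH1, cV1, cD1) = pywt.dwt2(img_a.astype(np.float64), wavelet)
        cA2, (cH2, cV2, cD2) = pywt.dwt2(img_b.astype(np.float64), wavelet)
        fuse = lambda a, b: np.where(np.abs(a) >= np.abs(b), a, b)
        fused = ((cA1 + cA2) / 2.0,
                 (fuse(cH1, cH2), fuse(cV1, cV2), fuse(cD1, cD2)))
        return pywt.idwt2(fused, wavelet)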
Standardized acquisition, storing and provision of 3D enabled spatial data
B. Wagner, S. Maier, E. Peinsipp-Byma
When working with spatial data, in addition to the classic two-dimensional geometric data (maps, aerial images, etc.), the need for three-dimensional spatial data (city models, digital elevation models, etc.) is increasing. Due to this increased demand, the acquisition, storage, and provision of 3D-enabled spatial data in Geographic Information Systems (GIS) is more and more important. Existing proprietary solutions quickly reach their limits during data exchange and data delivery to other systems. They generate a large workload, which is very costly. However, these expenses and costs can generally be reduced significantly by using standards. The aim of this research is therefore to develop a concept in the field of three-dimensional spatial data that relies on existing standards whenever possible. In this research, military image analysts are the preferred user group of the system. To achieve the objective of the widest possible use of standards for spatial 3D data, existing standards, proprietary interfaces, and standards under discussion have been analyzed. Since the GIS of the Fraunhofer IOSB used here already uses and supports OGC (Open Geospatial Consortium) and NATO STANAG (NATO Standardization Agreement) standards for the most part, special attention was paid to their standards. The most promising standard is the OGC standard 3DPS (3D Portrayal Service) with its variants W3DS (Web 3D Service) and WVS (Web View Service). A demonstration system was created using a standardized workflow for data acquisition, storage, and provision, showing the benefit of our approach.
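For orientation, a key-value request in the style of an OGC 3DPS/W3DS GetScene call might be assembled as below; the endpoint is hypothetical and the parameter names approximate the GetScene pattern, so they should be checked against the service's capabilities document rather than taken from this sketch.

    from urllib.parse import urlencode

    # Hypothetical service endpoint; parameters follow the general GetScene
    # pattern (bounding box, CRS, requested layer, output format).
    ENDPOINT = "https://example.org/w3ds"

    params = {
        "SERVICE": "3DPS",
        "REQUEST": "GetScene",
        "VERSION": "1.0",
        "CRS": "EPSG:4326",
        "BOUNDINGBOX": "8.40,49.00,8.42,49.02",
        "LAYERS": "city_model",
        "FORMAT": "model/gltf+json",
    }
    print(ENDPOINT + "?" + urlencode(params))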
Concept for a common operational picture in a guidance vehicle
Boris Wagner, Ralf Eck, Sebastian Maier
A Common Operational Picture (COP) shows many operational aspects in coded form inside a geodata representation such as a map. To build this picture, many specialized groups produce information. Besides the operating forces, these include intelligence, logistics, and the leader's own planning group. Operations in which a COP is used are typically disaster management or military actions. Existing software for Interactive Visualization of Integrated Geodata runs on tablet PCs, PCs, Digital Map Tables, and video walls. It is already used by the Deutsche Führungsakademie (military academy) for the education of staff officers, and the German civil disaster management agency has decided to use the Digital Map Table for its intelligence analysis. In a mobile scenario, however, novel requirements have to be taken into account to adapt the software to the new environment. This paper investigates these requirements as well as the possible adaptations to provide a COP across multiple players on the go. When acting together, the groups do so in a distributed manner: they are physically dispersed and use a variety of software and hardware to produce their contributions. This requires hardware that is ruggedized, mobile, and supports a variety of interfaces. The limited bandwidth in such a setting poses the main challenge for the software, which has to synchronize while exchanging a minimum of information. Especially for mobile participants, a solution is planned that scales the amount of data (maps/intelligence data) to the available equipment, the upcoming mission, and the underlying theatre. Special focus is placed on a guidance vehicle leading a convoy.