- Front Matter: Volume 10199
- Video Analysis
- Photogrammetry and Uncertainty Propagation
- Geospatial Analysis
Front Matter: Volume 10199
This PDF file contains the front matter associated with SPIE Proceedings Volume 10199, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
Video Analysis
Pilot study on real-time motion detection in UAS video data by human observer and image exploitation algorithm
Real-time motion video analysis is a challenging and exhausting task for the human observer, particularly in safety- and security-critical domains. Hence, customized video analysis systems that provide functions for subtasks such as motion detection or target tracking are welcome. While such automated algorithms relieve the human operators of basic subtasks, they impose additional interaction duties on them. Prior work shows that, for example, for interaction with target tracking algorithms, a gaze-enhanced user interface is beneficial.
In this contribution, we present an investigation on interaction with an independent motion detection (IDM) algorithm.
Besides identifying an appropriate interaction technique for the user interface – again, we compare gaze-based and
traditional mouse-based interaction – we focus on the benefit an IDM algorithm might provide for a UAS video analyst.
In a pilot study, we exposed ten subjects to the task of moving target detection in UAS video data twice, once performing
with automatic support, once performing without it. We compare the two conditions considering performance in terms of
effectiveness (correct target selections). Additionally, we report perceived workload (measured using the NASA-TLX
questionnaire) and user satisfaction (measured using the ISO 9241-411 questionnaire).
The results show that a combination of gaze input and an automated IDM algorithm provides valuable support for the human observer, increasing the number of correct target selections by up to 62% while reducing workload at the same time.
Data transpositioning with content-based image retrieval
Currently, when data is collected, it is usually collected for a specific need or situation. This includes text and image data. When a new need or situation arises, the data collection process repeats, often without referencing the data collected for previous situations. Data Transpositioning is a search methodology that leverages the context of a previous manual search process to formulate a new automated search with new results. The data collection process for one situation can thus be applied to another with less user effort, and a set of new results can be constructed without the user manually revisiting each of the originating sources. In the case of Content-Based Image Retrieval, the idea is to identify the content attributes of an image, such as a particular color, shape, or texture, apply changes to the originating query, and return a new set of results with similar attributes. Data Transpositioning has been successfully applied to result sets that contain text. Our goal is to continue this research beyond text to solve more complex problems in other domains, especially when image data are involved.
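The content-attribute matching the abstract describes can be sketched with the simplest such attribute, an intensity histogram. This is an illustrative stand-in for the paper's richer color/shape/texture descriptors, not its actual method; all names below are assumptions.

```python
# Minimal content-based retrieval sketch: images are described by a coarse
# normalized intensity histogram and ranked by histogram intersection.

def histogram(pixels, bins=4):
    """Quantize 0-255 intensity values into a normalized coarse histogram."""
    h = [0] * bins
    for p in pixels:
        h[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in h]

def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def retrieve(query_pixels, database):
    """Rank database images (name -> pixel list) by similarity to the query."""
    q = histogram(query_pixels)
    ranked = sorted(database.items(),
                    key=lambda kv: similarity(q, histogram(kv[1])),
                    reverse=True)
    return [name for name, _ in ranked]
```

A transposed query would then be a modification of `q` (e.g. shifting mass toward a different color bin) before ranking.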
An analysis of optical flow on real and simulated data with degradations
Estimating the motion of moving targets from a moving platform is an extremely challenging problem in unmanned systems research. One common and often successful approach is to use optical flow for motion estimation to account for ego-motion of the platform and to then track the motion of surrounding objects. However, in the presence of video degradation such as noise, compression artifacts, and reduced frame rates, the performance of state-of-the-art optical flow algorithms diminishes greatly. We consider the effects of video degradation on two well-known optical flow datasets as well as on real-world video data. To highlight the need for robust optical flow algorithms under real-world conditions, we present both qualitative and quantitative results on
these datasets.
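The degradation effect the abstract studies can be illustrated on a far simpler motion estimator than dense optical flow: a global-translation search by sum of absolute differences (SAD). This is a toy sketch, not any of the evaluated algorithms; the noise model and sizes are assumptions.

```python
import random

# Toy illustration of motion estimation under degradation: recover a global
# horizontal shift between two frames by exhaustive SAD search.

def shift_right(img, dx):
    """Translate each row right by dx pixels (wrap-around keeps it exact)."""
    return [row[-dx:] + row[:-dx] for row in img]

def add_noise(img, sigma, seed=0):
    """Degrade the frame with additive uniform noise of strength sigma."""
    rng = random.Random(seed)
    return [[p + rng.uniform(-sigma, sigma) for p in row] for row in img]

def estimate_shift(prev, curr, search=4):
    """Return the horizontal shift minimizing SAD between the two frames."""
    h, w = len(prev), len(prev[0])
    best_sad, best_dx = float("inf"), 0
    for dx in range(-search, search + 1):
        sad = sum(abs(prev[y][x] - curr[y][(x + dx) % w])
                  for y in range(h) for x in range(w))
        if sad < best_sad:
            best_sad, best_dx = sad, dx
    return best_dx
```

Raising `sigma` toward the image's own contrast eventually makes the SAD minimum ambiguous, mirroring how noise and compression artifacts degrade real optical flow estimates.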
Geopositioning with a quadcopter: extracted feature locations and predicted accuracy without a priori sensor attitude information
This paper presents an overview of the Full Motion Video-Geopositioning Test Bed (FMV-GTB) developed to
investigate algorithm performance and issues related to the registration of motion imagery and subsequent extraction of
feature locations along with predicted accuracy. A case study is included corresponding to a video taken from a
quadcopter. Registration of the corresponding video frames is performed without the benefit of a priori sensor attitude
(pointing) information. In particular, tie points are automatically measured between adjacent frames using standard optical flow matching techniques from computer vision. An a priori estimate of sensor attitude is then computed from the GPS sensor positions contained in the video metadata using a photogrammetric, search-based structure-from-motion algorithm. Finally, a Weighted Least Squares adjustment of all a priori metadata across the frames is performed.
Extraction of absolute 3D feature locations, including their predicted accuracy based on the principles of rigorous error
propagation, is then performed using a subset of the registered frames. Results are compared to known locations (check
points) over a test site. Throughout this entire process, no external control information (e.g. surveyed points) is used
other than for evaluation of solution errors and corresponding accuracy.
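The Weighted Least Squares principle behind the adjustment step can be sketched in its simplest, scalar form: fusing several estimates of one quantity weighted by inverse variance. This is a one-dimensional illustration of the principle only, not the paper's multi-frame adjustment.

```python
def wls_fuse(estimates):
    """Inverse-variance weighted least squares for scalar measurements.

    estimates: list of (value, variance) pairs from independent observations.
    Returns the fused value and its (smaller) fused variance - the scalar
    analogue of adjusting a priori metadata across many frames.
    """
    info = sum(1.0 / var for _, var in estimates)        # total information
    value = sum(v / var for v, var in estimates) / info  # information-weighted mean
    return value, 1.0 / info
```

Note how the fused variance (here 0.5 from two unit-variance inputs) is what rigorous error propagation then carries forward into the predicted accuracy of extracted 3D feature locations.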
Photogrammetry and Uncertainty Propagation
Using image quality metrics to identify adversarial imagery for deep learning networks
Deep learning has continued to gain momentum in applications across many critical areas of research in computer vision and machine learning. In particular, deep learning networks have had much success in image classification, especially when training data are abundantly available, as is the case with the ImageNet project. However, several researchers have exposed potential vulnerabilities of these networks to carefully crafted adversarial imagery. Additionally, researchers have shown the sensitivity of these networks to some types of noise and distortion. In this paper, we investigate the use of no-reference image quality metrics to identify adversarial imagery and images of poor quality that could potentially fool a deep learning network or dramatically reduce its accuracy. Results are shown on several adversarial image databases with comparisons to popular image classification databases.
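One classic no-reference quality cue is sharpness measured as the variance of the discrete Laplacian; low values flag blurred or unnaturally smooth inputs. This is a minimal sketch of the screening idea, assuming grayscale images as nested lists; the abstract's actual metrics are not specified here.

```python
def laplacian_variance(img):
    """No-reference sharpness score: variance of the discrete Laplacian.

    Low values indicate blurred or flat imagery that could degrade a
    classifier's accuracy.
    """
    h, w = len(img), len(img[0])
    lap = [img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
           - 4 * img[y][x]
           for y in range(1, h - 1) for x in range(1, w - 1)]
    mean = sum(lap) / len(lap)
    return sum((v - mean) ** 2 for v in lap) / len(lap)

def flag_suspicious(img, threshold):
    """Route images below the sharpness threshold away from the classifier."""
    return laplacian_variance(img) < threshold
```

In a deployed pipeline, a bank of such metrics would gate inputs before they reach the deep network, which is the role the abstract proposes for no-reference quality metrics.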
Methods for the specification and validation of geolocation accuracy and predicted accuracy
The specification of geolocation accuracy requirements and their validation is essential for the proper performance of a Geolocation System and for trust in resultant three-dimensional (3D) geolocations. This is also true for predicted
accuracy requirements and their validation for a Geolocation System, which assumes that each geolocation produced
(extracted) by the system is accompanied by an error covariance matrix that characterizes its specific predicted accuracy.
The extracted geolocation and its error covariance matrix are standard outputs of (near) optimal estimators, either
associated (internally) with the Geolocation System itself, or with a “downstream” application that inputs a subset of
Geolocation System output, such as sensor data/metadata: for example, a set of images and corresponding metadata of
the imaging sensor’s pose and its predicted accuracy. This output allows for subsequent (near) optimal extraction of
geolocations and associated error covariance matrices based on the application’s measurements of pixel locations in the
images corresponding to objects of interest. This paper presents recommended methods and detailed equations for the
specification and validation of both accuracy and predicted accuracy requirements for a general Geolocation System.
The specification/validation of accuracy requirements is independent of the specification/validation of predicted accuracy requirements. The methods presented in this paper are theoretically rigorous yet practical.
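The idea of validating predicted accuracy against check points can be sketched in scalar form: if each predicted sigma is realistic, the average of (error/sigma)² over check points should be near 1. This is a simplified analogue of chi-square-style consistency testing, not the paper's actual equations; the acceptance interval below is an assumption.

```python
def mean_normalized_squared_error(errors, sigmas):
    """Average of (error / predicted sigma)^2 over check points.

    Near 1 when predicted accuracy is realistic; well above 1 means the
    system is optimistic, well below 1 means it is pessimistic.
    """
    nse = [(e / s) ** 2 for e, s in zip(errors, sigmas)]
    return sum(nse) / len(nse)

def predicted_accuracy_valid(errors, sigmas, low=0.5, high=1.5):
    """Accept predicted accuracy if the statistic falls in [low, high].

    Real validation would derive the bounds from chi-square quantiles for
    the sample size; the fixed interval here is only illustrative.
    """
    return low <= mean_normalized_squared_error(errors, sigmas) <= high
```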
Three-dimensional scene reconstruction from a two-dimensional image
We propose and simulate a method of reconstructing a three-dimensional scene from a two-dimensional image for developing and augmenting world models for autonomous navigation. This is an extension of the Perspective-n-Point (PnP) method, which uses a sampling of 3D scene to 2D image point pairings and Random Sample Consensus (RANSAC) to infer the pose of the object and produce a 3D mesh of the original scene. Using object recognition and segmentation, we simulate the implementation on a scene of 3D objects with an eye to implementation on embeddable hardware. The final solution will be deployed on the NVIDIA Tegra platform.
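The RANSAC loop at the heart of this approach can be sketched on a simpler model than camera pose: fitting a line to 2D points contaminated by outliers. The loop structure (sample a minimal set, hypothesize a model, count inliers, keep the best) is the same one PnP-with-RANSAC uses; the line model and thresholds below are illustrative assumptions.

```python
import random

def ransac_line(points, iters=200, thresh=0.5, seed=0):
    """Return ((slope, intercept), inlier_count) supported by the most inliers."""
    rng = random.Random(seed)
    best_model, best_count = None, 0
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)  # minimal sample: 2 points
        if x1 == x2:
            continue  # vertical sample: skip rather than divide by zero
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        # score the hypothesis by how many points it explains within thresh
        count = sum(1 for x, y in points if abs(y - (m * x + b)) < thresh)
        if count > best_count:
            best_model, best_count = (m, b), count
    return best_model, best_count
```

For PnP, the minimal sample becomes a handful of 3D-2D correspondences and the residual becomes reprojection error, but the consensus logic is unchanged.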
Correlation-agnostic fusion for improved uncertainty estimation in multi-view geo-location from UAVs
Clark N. Taylor,
Paul O. Sundlie
When geo-locating ground objects from a UAV, multiple views of the same object can lead to improved geo-location accuracy. Of equal importance to the location estimate, however, is the uncertainty estimate associated with that location. Standard methods for estimating uncertainty from multiple views generally assume that each view represents an independent measurement of the geo-location. Unfortunately, this assumption is often violated due to correlation between the location estimates. This correlation may occur due to the measurements coming from the same platform, meaning that the error in attitude or location may be correlated across time; or it may be due to external sources (such as GPS) having the same error in multiple aircraft. In either case, the geo-location estimates are not truly independent, leading to optimistic estimates of the geo-location uncertainty.
For distributed data fusion applications, correlation-agnostic fusion methods have been developed that can fuse data together regardless of how much correlation may be present between the two estimates. While the results are generally not as impressive as when correlation is perfectly known and taken into account, the fused uncertainty results are guaranteed to be conservative and an improvement on operating without fusion. In this paper, we apply a selection of these correlation-agnostic fusion techniques to the multi-view geo-location problem and analyze their effects on geo-location and predicted uncertainty accuracy. We find that significant benefits can be gained from applying these correlation-agnostic fusion techniques, but that they vary greatly in how well they estimate their own uncertainty.
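One well-known correlation-agnostic rule is Covariance Intersection (CI); the abstract does not name its specific selection of techniques, so the scalar CI below is an illustrative assumption, not the paper's method.

```python
def covariance_intersection(xa, pa, xb, pb, steps=100):
    """Scalar Covariance Intersection: fuse two estimates of unknown correlation.

    Information is blended as w/pa + (1-w)/pb with w chosen to minimize the
    fused variance; the result is guaranteed consistent (never optimistic)
    no matter how correlated the inputs are. In the scalar case this reduces
    to keeping the lower-variance estimate - the price of correlation-agnosticism
    compared with naive independent fusion, which would (optimistically) shrink
    the variance below either input.
    """
    best = None
    for i in range(steps + 1):
        w = i / steps
        info = w / pa + (1 - w) / pb
        p = 1.0 / info
        x = p * (w * xa / pa + (1 - w) * xb / pb)
        if best is None or p < best[1]:
            best = (x, p)
    return best
```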
Geospatial Analysis
Geoparsing text for characterizing urban operational environments through machine learning techniques
Increasing worldwide internet connectivity and access to sources of print and open social media has increased near real-time availability of textual information. Capabilities to structure and integrate textual data streams can contribute to more
meaningful representations of operational environment factors (i.e., Political, Military, Economic, Social, Infrastructure,
Information, Physical Environment, and Time [PMESII-PT]) and tactical civil considerations (i.e., Areas, Structures,
Capabilities, Organizations, People and Events [ASCOPE]). However, relying upon human analysts to encode this
information as it arrives quickly proves intractable. While human analysts possess an ability to comprehend context in
unstructured text far beyond that of computers, automated geoparsing (the extraction of locations from unstructured text)
can empower analysts to automate sifting through datasets for areas of interest. This research evaluates existing approaches to geoparsing and initiates the research and development of locally-improved methods of tagging
parts of text as possible locations, resolving possible locations into coordinates, and interfacing such results with human
analysts. The objective of this ongoing research is to develop a more contextually-complete picture of an area of interest
(AOI) including human-geographic context for events. In particular, our research is working to make improvements to
geoparsing (i.e., the extraction of spatial context from documents), which requires development, integration, and
validation of named-entity recognition (NER) tools, gazetteers, and entity-attribution. This paper provides an overview
of NER models and methodologies as applied to geoparsing, explores several challenges encountered, presents
preliminary results from the creation of a flexible geoparsing research pipeline, and introduces ongoing and future work
with the intention of contributing to the efficient geocoding of information containing valuable insights into human
activities in space.
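The two geoparsing stages named above (tagging possible locations, then resolving them to coordinates) can be sketched with a capitalized-word heuristic standing in for a real NER model and a toy gazetteer. The gazetteer entries and the heuristic are illustrative assumptions, not the project's actual tools.

```python
import re

# Stage 1: candidate extraction (stand-in for NER); stage 2: gazetteer
# resolution of candidates to coordinates.

GAZETTEER = {
    "Paris": (48.8566, 2.3522),
    "Springfield": (39.7983, -89.6544),
}

def extract_candidates(text):
    """Tag capitalized tokens as possible location mentions."""
    return re.findall(r"\b[A-Z][a-z]+\b", text)

def geoparse(text):
    """Resolve candidate mentions to coordinates via the gazetteer."""
    return [(c, GAZETTEER[c]) for c in extract_candidates(text) if c in GAZETTEER]
```

The hard problems the paper tackles live exactly where this sketch is naive: ambiguous names (which Springfield?), lowercase or multi-word toponyms, and contextual disambiguation.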
A machine learning pipeline for automated registration and classification of 3D lidar data
Despite the large availability of geospatial data, registration and exploitation of these datasets remains a persistent challenge in geoinformatics. Popular signal processing and machine learning algorithms, such as non-linear SVMs and neural networks, rely on well-formatted input models as well as reliable output labels, which are not always immediately available. In this paper we outline a pipeline for gathering, registering, and classifying initially unlabeled wide-area geospatial data. As an illustrative example, we demonstrate the training and testing of a convolutional neural network to recognize 3D models in the OGRIP 2007 LiDAR dataset using fuzzy labels derived from OpenStreetMap as well as other datasets available on OpenTopography.org. When auxiliary label information is required, various text and natural language processing filters are used to extract and cluster keywords useful for identifying potential target classes. A subset of these keywords is subsequently used to form multi-class labels, with no assumption of independence. Finally, we employ class-dependent geometry extraction routines to identify candidates from both training and testing datasets. Our regression networks are able to identify the presence of 6 structural classes, including roads, walls, and buildings, in volumes as large as 8000 m³ in as little as 1.2 seconds on a commodity 4-core Intel CPU. The presented framework is neither dataset- nor sensor-modality-limited due to the registration process, and is capable of multi-sensor data fusion.
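The fuzzy, non-exclusive labels derived from map tags can be sketched as soft class memberships computed from keyword matches. The tag strings and class keywords below are illustrative assumptions, not the actual OpenStreetMap processing.

```python
# Soft multi-class labels from free-text map tags: membership in [0, 1]
# per class, with no assumption that classes are disjoint.

def fuzzy_labels(tags, classes):
    """Fraction of tags mentioning each class keyword."""
    n = len(tags)
    return {c: sum(1 for t in tags if c in t.lower()) / n for c in classes}
```

Such soft targets can feed a regression network directly, which matches the abstract's use of regression rather than hard one-hot classification.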
General linear hypothesis test: a method for algorithm selection
Algorithm selection is paramount in determining how to implement a process. When the results can be computed directly, an algorithm that reduces computational complexity is selected. When the results are less binary, choosing the proper implementation can be difficult, and the effect of different pieces of the algorithm on the final result can be hard to quantify. In this research, we propose using a statistical analysis tool known as the General Linear Hypothesis test to find the effect of different pieces of an algorithm implementation on the end result. This will be done with transform-based image fusion techniques. This study will weigh the effect of different transforms, fusion techniques, and evaluation metrics
on the resulting images. We will find the best no-reference metric for image fusion algorithm selection and test this method
on multiple types of image sets. This assessment will provide a valuable tool for algorithm selection to augment current
techniques when results are not binary.
Matrix sketching for big data reduction (Conference Presentation)
In recent years, the concept of Big Data has become more prominent as the volume of data, as well as the velocity at which it is produced, increases exponentially. By 2020 the amount of data being stored is estimated to be 44 Zettabytes, and currently over 31 Terabytes of data are generated every second. Algorithms and applications must be able to scale effectively to the volume of data being generated. One such application designed to work effectively and efficiently with Big Data is IBM’s Skylark, part of DARPA’s XDATA program, an open-source catalog of tools to deal with Big Data. Skylark (Sketching-based Matrix Computations for Machine Learning) is a library of functions designed to reduce the complexity of large-scale matrix problems that also implements kernel-based machine learning tasks. Sketching reduces the dimensionality of matrices through randomization, compressing matrices while preserving key properties and speeding up computations. Matrix sketches can be used to find accurate solutions to computations in less time, or can summarize data by identifying important rows and columns. In this paper, we investigate the effectiveness of sketched matrix computations using IBM’s Skylark versus non-sketched computations. We judge effectiveness based on two factors: computational complexity and validity of outputs. Initial results from testing with smaller matrices are promising, showing that Skylark achieves a considerable reduction ratio while still accurately performing matrix computations.
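The randomized compression idea can be sketched in miniature: multiply an m×n matrix by a k×m sign matrix with ±1/√k entries, yielding a k×n sketch that preserves norms in expectation. This is a toy pure-Python illustration of the principle, not Skylark's API or any of its specific sketch transforms.

```python
import random

def sketch(A, k, seed=0):
    """Compress an m x n matrix (list of rows) to k x n via a random sign sketch.

    S has entries +-1/sqrt(k); downstream computations (least squares,
    products) then run on the much smaller S A instead of A.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    scale = 1.0 / k ** 0.5
    S = [[rng.choice((-scale, scale)) for _ in range(m)] for _ in range(k)]
    return [[sum(S[i][r] * A[r][j] for r in range(m)) for j in range(n)]
            for i in range(k)]
```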
Double-density and dual-tree based methods for image super resolution
When several low-resolution images are taken of the same scene, they often contain aliasing and differing subpixel shifts, so that each captures the scene slightly differently. Super-resolution imaging is a technique that can be used to construct high-resolution imagery from these low-resolution images. By combining images, high-frequency components are amplified while blurring and artifacts are removed. Super-resolution reconstruction techniques include methods such as the
Non-Uniform Interpolation Approach, which is low resource and allows for real-time applications, or the Frequency
Domain Approach. These methods make use of aliasing in low-resolution images as well as the shifting property of the
Fourier transform. Problems arise with both approaches, such as limited types of blurred images that can be used or creating
non-optimal reconstructions. Many methods of super-resolution imaging use the Fourier transformation or wavelets but
the field is still evolving for other wavelet techniques such as the Dual-Tree Discrete Wavelet Transform (DTDWT) or the
Double-Density Discrete Wavelet Transform (DDDWT). In this paper, we propose a super-resolution method using these
wavelet transformations for use in generating higher resolution imagery. We evaluate the performance and validity of our
algorithm using several metrics, including Spearman Rank Order Correlation Coefficient (SROCC), Pearson’s Linear
Correlation Coefficient (PLCC), Structural Similarity Index Metric (SSIM), Root Mean Square Error (RMSE), and Peak Signal-to-Noise Ratio (PSNR). Initial results are promising, indicating that extensions of the wavelet transformations produce
a more robust high resolution image when compared to traditional methods.
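The wavelet machinery underlying the DTDWT and DDDWT can be illustrated with their simplest relative, a one-level Haar transform with its exact inverse. This sketch only shows the split into a low-pass (average) and high-pass (detail) band that super-resolution methods then manipulate; it is not the proposed algorithm.

```python
def haar_1d(signal):
    """One level of the Haar wavelet transform on an even-length signal.

    Returns the low-pass band (pairwise averages) and the high-pass band
    (pairwise half-differences).
    """
    avg = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    det = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return avg, det

def inverse_haar_1d(avg, det):
    """Perfectly reconstruct the original signal from the two bands."""
    out = []
    for a, d in zip(avg, det):
        out += [a + d, a - d]
    return out
```

Dual-tree and double-density variants add redundant, shift-tolerant filter banks on top of this split, which is what makes them attractive for combining shifted low-resolution frames.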
Standardized acquisition, storing and provision of 3D enabled spatial data
In the area of working with spatial data, in addition to classic two-dimensional geometrical data (maps, aerial images, etc.), the need for three-dimensional spatial data (city models, digital elevation models, etc.) is increasing. Due to this increased demand, the acquisition, storage, and provision of 3D-enabled spatial data in Geographic Information Systems (GIS) is more and more important. Existing proprietary solutions quickly reach their limits during data exchange and data delivery to other systems. They generate a large workload, which is very costly. However, these expenses and costs can generally be reduced significantly by using standards. The aim of this research is therefore to develop a concept in the field of three-dimensional spatial data that builds on existing standards whenever possible. In this research, military image analysts are the preferred user group of the system.
To achieve the objective of the widest possible use of standards for 3D spatial data, existing standards, proprietary interfaces, and standards under discussion were analyzed. Since the GIS of the Fraunhofer IOSB used here already uses and supports OGC (Open Geospatial Consortium) and NATO STANAG (Standardization Agreement) standards for the most part, special attention was paid to these standards.
The most promising standard is the OGC standard 3DPS (3D Portrayal Service) with its variants W3DS (Web 3D Service) and WVS (Web View Service). A demo system was created that uses a standardized workflow for data acquisition, storage, and provision, demonstrating the benefit of our approach.
Concept for a common operational picture in a guidance vehicle
Boris Wagner,
Ralf Eck,
Sebastian Maier
A Common Operational Picture (COP) shows many operational aspects in coded form within a geodata representation such as a map. Many specialized groups produce information for building this picture: besides the operating forces, these include intelligence, logistics, and the leader's own planning group. Operations in which a COP is used are typically disaster management or military actions.
Existing software for Interactive Visualization of Integrated Geodata runs on tablet PCs, desktop PCs, Digital Map Tables, and video walls. It is already used by the Deutsche Führungsakademie (military academy) for the education of staff officers, and the German civil disaster management agency has decided to use the Digital Map Table for its intelligence analysis.
In a mobile scenario, however, novel requirements have to be taken into account to adapt the software to the new environment. This paper investigates these requirements as well as possible adaptations to provide a COP across multiple players on the go. When acting together, the groups are physically dispersed and use a variety of software and hardware to produce their contributions. This requires hardware that is ruggedized, mobile, and able to support a variety of interfaces. The limited bandwidth in such a setting poses the main challenge for the software, which has to synchronize while exchanging a minimum of information. Especially for mobile participants, a solution is planned that scales the amount of data (maps/intelligence data) to the available equipment, the upcoming mission, and the underlying theatre. Special focus is laid on a guidance vehicle leading a convoy.