
Proceedings Paper

Stabilization and registration of full-motion video data using deep convolutional neural networks
Author(s): Derek J. Walvoord; Doug W. Couwenhoven; Michael A. Bayer

Paper Abstract

Stabilization and registration are common techniques applied to overhead imagery and full-motion video (FMV) during production to facilitate further exploitation by the end user. Algorithms designed to accomplish these tasks must accommodate changes in capture geometry, atmospheric effects, and sensor characteristics. Moreover, algorithms that rely on a controlled image base (CIB) reference typically require some degree of robustness with respect to differences in imaging modality. While many factors contributing to gross misalignment can be mitigated using available sensor telemetry and rigorous photogrammetric modeling, the subsequent image-based registration task often relies on loose model assumptions and poor generalizations. This work presents a modality-agnostic deep learning approach to automatically stabilize and register overhead FMV data to a reference image such as a CIB. The field of deep learning has received significant attention in recent years with advances in high-performance computing and the availability of widely adopted open-source tools for numerical computation using data flow graphs. We leverage recent developments in the use of fully differentiable spatial transformer networks to simultaneously remove coarse geometric differences and fine local misalignments in the registration process. Most importantly, no model is required. A convolutional neural network (ConvNet), complete with a spatial transformer, is trained using pairs of frames of FMV data as the input and corresponding label. Once the network has learned the deformable warp, it ingests new data and returns a version of the input image sequence that has been warped to a user-specified reference. The performance of our approach is evaluated using several real FMV data sets.
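
The abstract does not give implementation details, so the following is only a minimal sketch of the two generic building blocks of a spatial transformer: a sampling grid and a bilinear sampler. It assumes a purely affine warp in pixel coordinates, which is simpler than the deformable warp the paper describes, and every name in it (`affine_grid`, `bilinear_sample`, `warp`, `theta`) is hypothetical and not taken from the authors' code. In a trained network the warp parameters would be predicted by the ConvNet; here they are supplied by hand.

```python
import math

# Illustrative sketch only: an affine sampling grid plus bilinear sampling.
# This is NOT the paper's network; theta is hand-supplied, not learned.


def affine_grid(theta, height, width):
    """For each output pixel (row, col), compute source coordinates (x, y).

    theta is a 2x3 affine matrix [[a, b, tx], [c, d, ty]] in pixel units,
    so x = a*col + b*row + tx and y = c*col + d*row + ty.
    """
    grid = []
    for row in range(height):
        grid_row = []
        for col in range(width):
            x = theta[0][0] * col + theta[0][1] * row + theta[0][2]
            y = theta[1][0] * col + theta[1][1] * row + theta[1][2]
            grid_row.append((x, y))
        grid.append(grid_row)
    return grid


def bilinear_sample(image, x, y):
    """Sample image at real-valued (x, y); pixels outside the image count as 0."""
    height, width = len(image), len(image[0])
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0  # fractional offsets within the pixel cell

    def pixel(r, c):
        return image[r][c] if 0 <= r < height and 0 <= c < width else 0.0

    top = (1 - fx) * pixel(y0, x0) + fx * pixel(y0, x0 + 1)
    bottom = (1 - fx) * pixel(y0 + 1, x0) + fx * pixel(y0 + 1, x0 + 1)
    return (1 - fy) * top + fy * bottom


def warp(image, theta):
    """Resample image onto its own pixel grid through the affine map theta."""
    grid = affine_grid(theta, len(image), len(image[0]))
    return [[bilinear_sample(image, x, y) for (x, y) in row] for row in grid]


frame = [[0.0, 1.0, 2.0],
         [3.0, 4.0, 5.0],
         [6.0, 7.0, 8.0]]
identity = [[1, 0, 0], [0, 1, 0]]
half_pixel_shift = [[1, 0, 0.5], [0, 1, 0]]

unchanged = warp(frame, identity)
shifted = warp(frame, half_pixel_shift)
```

Because the sampled value is a weighted sum of neighboring pixels with weights that vary continuously in `x` and `y`, the output is differentiable with respect to the warp parameters almost everywhere. This is the property that lets a spatial transformer be trained end to end by backpropagation.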

Paper Details

Date Published: 27 April 2018
PDF: 11 pages
Proc. SPIE 10646, Signal Processing, Sensor/Information Fusion, and Target Recognition XXVII, 1064612 (27 April 2018); doi: 10.1117/12.2305072
Derek J. Walvoord, Harris Corp. (United States)
Doug W. Couwenhoven, Harris Corp. (United States)
Michael A. Bayer, Harris Corp. (United States)

Published in SPIE Proceedings Vol. 10646:
Signal Processing, Sensor/Information Fusion, and Target Recognition XXVII
Ivan Kadar, Editor(s)

© SPIE.