Proceedings Paper

Deep learning architecture advancements for accurate and robust image registration
Author(s): Derek J. Walvoord; Doug W. Couwenhoven

Paper Abstract

Registration of image collections and video sequences is a critical component in algorithms designed to extract actionable intelligence from remotely sensed data. While methodologies for registration continue to evolve, the accuracy of alignment remains dependent on how well the approach tolerates changes in capture geometry, sensor characteristics, and scene content. Differences in imaging modality and field-of-view present additional challenges. Registration techniques have progressed from simple, global correlation-based algorithms, to higher-order model fitting using salient image features, to two-stage approaches leveraging high-fidelity sensor geometry, to new methods that exploit high-performance computing and convolutional neural networks (ConvNets). The latter offers important advantages by removing model assumptions and learning feature extraction directly through the minimization of a registration cost function. Deep learning approaches to image registration are still relatively unexplored for overhead imaging, and their ability to accommodate a large problem domain offers potential for several new developments. This work presents a new network architecture that improves accuracy and generalization capabilities over our modality-agnostic deep learning approach to registration that recently advanced the state of the art. A thoroughly tested ConvNet pyramid remains the core of our network approach, and has been optimized for registration and generalized to begin addressing derivative applications such as mosaic generation. Further modifications, such as objective function masking and reduced interpolation, have also been implemented to improve the overall registration process. As before, the trained network ingests image frames, applies a vector field, and returns a version of the input image that has been warped to the reference. Qualitative and quantitative performance of the new architecture is evaluated using several overhead still and full-motion video (FMV) data sets.
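For readers unfamiliar with the warp-and-compare formulation summarized above, the sketch below illustrates the core operation in PyTorch: a predicted dense vector (displacement) field resamples the input frame toward the reference, and a validity mask restricts the similarity objective to usable pixels (one simple form of objective function masking). This is a minimal illustration under assumed tensor shapes and names, not the authors' implementation; `warp` and `masked_loss` are hypothetical helpers.

```python
import torch
import torch.nn.functional as F


def warp(image: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Resample `image` (N, C, H, W) with a dense displacement field
    `flow` (N, 2, H, W) given in pixels, returning the warped image."""
    n, _, h, w = image.shape
    # Identity sampling grid in pixel coordinates (x, y) at each location.
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=image.dtype, device=image.device),
        torch.arange(w, dtype=image.dtype, device=image.device),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=-1).expand(n, h, w, 2)  # (N, H, W, 2)
    # Displace the grid by the predicted vector field.
    moved = grid + flow.permute(0, 2, 3, 1)
    # Normalize to [-1, 1], the coordinate convention grid_sample expects.
    gx = 2.0 * moved[..., 0] / max(w - 1, 1) - 1.0
    gy = 2.0 * moved[..., 1] / max(h - 1, 1) - 1.0
    return F.grid_sample(image, torch.stack((gx, gy), dim=-1),
                         mode="bilinear", padding_mode="zeros",
                         align_corners=True)


def masked_loss(warped: torch.Tensor, reference: torch.Tensor,
                mask: torch.Tensor) -> torch.Tensor:
    """Mean-squared similarity evaluated only where `mask` (float, 0/1) is 1,
    illustrating objective function masking."""
    err = (warped - reference) ** 2 * mask
    return err.sum() / mask.sum().clamp(min=1.0)
```

In a training loop under these assumptions, the network's pyramid would predict `flow` from the input/reference pair, and gradients of the masked loss would propagate through the differentiable resampling back to the network weights.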

Paper Details

Date Published: 7 May 2019
PDF: 18 pages
Proc. SPIE 11018, Signal Processing, Sensor/Information Fusion, and Target Recognition XXVIII, 1101812 (7 May 2019); doi: 10.1117/12.2522044
Author Affiliations:
Derek J. Walvoord, Harris Corp. (United States)
Doug W. Couwenhoven, Harris Corp. (United States)


Published in SPIE Proceedings Vol. 11018:
Signal Processing, Sensor/Information Fusion, and Target Recognition XXVIII
Ivan Kadar; Erik P. Blasch; Lynne L. Grewe, Editor(s)

© SPIE.