
Proceedings Paper

3D hierarchical spatial representation and memory of multimodal sensory data
Author(s): Deepak Khosla; Paul A. Dow; David J. Huber

Paper Abstract

This paper describes an efficient method and system for representing, processing, and understanding multimodal sensory data. More specifically, it describes a computational method and system for processing and remembering multiple locations in multimodal sensory space (e.g., visual, auditory, somatosensory). The multimodal representation and memory are based on a biologically inspired hierarchy of spatial representations, implemented with novel analogues of representations used in the human brain. The novelty of the work lies in the computationally efficient and robust spatial representation of 3D locations in multimodal sensory space, together with an associated working memory for storage and recall of these representations at the desired level for goal-oriented action. We describe (1) a simple and efficient method for human-like hierarchical spatial representations of sensory data, and how to associate, integrate, and convert between these representations (head-centered coordinates, body-centered coordinates, etc.); (2) a robust method for training and learning a mapping of points in multimodal sensory space (e.g., camera-visible object positions, locations of auditory sources) to the above hierarchical spatial representations; and (3) a specification and implementation of a hierarchical spatial working memory, based on the above, for storage and recall at the desired level for goal-oriented actions. This work is most useful for any machine or human-machine application that must process multimodal sensory inputs, make sense of them from a spatial perspective (e.g., where the sensory information comes from with respect to the machine and its parts), and then take goal-oriented action based on this spatial understanding. A multi-level spatial representation hierarchy means that heterogeneous sensory inputs (e.g., visual, auditory, somatosensory) can map onto the hierarchy at different levels.
When controlling a machine's or robot's various degrees of freedom, the desired movements and actions can be computed from these different levels of the hierarchy. The most basic embodiment of such a machine could be a pan-tilt camera system, an array of microphones, a machine with an arm/hand-like structure, and/or a robot with some or all of the above capabilities. We describe the approach and system, and present preliminary results on a real robotic platform.
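To make the idea of a frame hierarchy concrete, the sketch below shows one minimal way such a structure could be implemented: each frame (eye-, head-, body-centered) is defined by a rigid transform relative to its parent, and a working memory stores sensed points tagged with their source frame so they can be recalled at any level of the hierarchy. This is an illustrative sketch, not the paper's implementation; all class and method names (`Frame`, `SpatialWorkingMemory`, `to_root`, etc.) and the chained-rigid-transform design are assumptions of this example.

```python
import numpy as np


class Frame:
    """A coordinate frame defined by a rigid transform (rotation + offset)
    relative to its parent frame. parent=None marks the root (body) frame."""

    def __init__(self, name, parent=None, rotation=None, offset=None):
        self.name = name
        self.parent = parent
        self.rotation = np.eye(3) if rotation is None else np.asarray(rotation, float)
        self.offset = np.zeros(3) if offset is None else np.asarray(offset, float)

    def to_parent(self, point):
        """Convert a 3D point expressed in this frame into the parent frame."""
        return self.rotation @ np.asarray(point, float) + self.offset

    def to_root(self, point):
        """Walk up the hierarchy, converting the point into the root
        (body-centered) frame one rigid transform at a time."""
        p = np.asarray(point, float)
        frame = self
        while frame is not None:
            p = frame.rotation @ p + frame.offset
            frame = frame.parent
        return p


class SpatialWorkingMemory:
    """Stores sensed 3D locations together with the frame they were sensed in,
    so heterogeneous inputs (visual, auditory, ...) enter at different levels."""

    def __init__(self):
        self.items = []  # list of (label, frame, point-in-that-frame)

    def store(self, label, frame, point):
        self.items.append((label, frame, np.asarray(point, float)))

    def recall(self, label):
        """Recall a stored location, converted to body-centered coordinates."""
        for stored_label, frame, point in self.items:
            if stored_label == label:
                return frame.to_root(point)
        raise KeyError(label)


# Example hierarchy: eye frame sits inside the head frame, which sits on the body.
body = Frame("body")
head = Frame("head", parent=body, offset=[0.0, 0.0, 0.3])
eye = Frame("eye", parent=head, offset=[0.03, 0.0, 0.1])

wm = SpatialWorkingMemory()
wm.store("mug", eye, [0.0, 0.0, 0.5])     # seen 0.5 m ahead in eye coordinates
print(wm.recall("mug"))                   # same point in body coordinates
```

A pan-tilt controller would read the head-centered level of the same memory, while an arm controller would read the body-centered level, which is the point of keeping all levels of the hierarchy available rather than collapsing to a single frame.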

Paper Details

Date Published: 13 April 2009
PDF: 8 pages
Proc. SPIE 7345, Multisensor, Multisource Information Fusion: Architectures, Algorithms, and Applications 2009, 73450O (13 April 2009); doi: 10.1117/12.820363
Author Affiliations:
Deepak Khosla, HRL Labs., LLC (United States)
Paul A. Dow, HRL Labs., LLC (United States)
David J. Huber, HRL Labs., LLC (United States)

Published in SPIE Proceedings Vol. 7345:
Multisensor, Multisource Information Fusion: Architectures, Algorithms, and Applications 2009
Belur V. Dasarathy, Editor(s)

© SPIE.