
Proceedings Paper

Three-dimensional object recognition using gradient descent and the universal 3-D array grammar
Author(s): Leemon C. Baird; Patrick S. P. Wang

Paper Abstract

A new algorithm is presented for applying Marill's minimum standard deviation of angles (MSDA) principle for interpreting line drawings without models. Even though no explicit models or additional heuristics are included, the algorithm tends to reach the same 3-D interpretations of 2-D line drawings that humans do. Marill's original algorithm repeatedly generated a set of interpretations and chose the one with the lowest standard deviation of angles (SDA). The algorithm presented here explicitly calculates the partial derivatives of SDA with respect to all adjustable parameters, and follows this gradient to minimize SDA. For a picture with lines meeting at m points forming n angles, the gradient descent algorithm requires O(n) time to adjust all the points, while the original algorithm required O(mn) time to do so. For the pictures described by Marill, this gradient descent algorithm running on a Macintosh II was found to be one to two orders of magnitude faster than the original algorithm running on a Symbolics, while still giving comparable results. Once the 3-D interpretation of the line drawing has been found, the 3-D object can be reduced to a description string using the Universal 3-D Array Grammar. This is a general grammar which allows any connected object represented as a 3-D array of pixels to be reduced to a description string. The algorithm based on this grammar is well suited to parallel computation, and could run efficiently on parallel hardware. This paper describes both the MSDA gradient descent algorithm and the Universal 3-D Array Grammar algorithm. Together, they transform a 2-D line drawing represented as a list of line segments into a string describing the 3-D object pictured. The strings could then be used for object recognition, learning, or storage for later manipulation.
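The idea of descending the SDA gradient can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the adjustable parameters are the z-coordinates of the points (x and y being fixed by the drawing), it approximates the partial derivatives by central finite differences rather than computing them explicitly as the paper does, and it only accepts steps that lower SDA. Because of the finite differences, it does not reproduce the O(n) cost per adjustment stated above. The names `sda`, `descend`, and the parallelogram example are hypothetical.

```python
import math


def angles(points, segments):
    """Angles (radians) between every pair of segments that share an endpoint."""
    neighbours = {}
    for a, b in segments:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    result = []
    for v, ns in neighbours.items():
        for i in range(len(ns)):
            for j in range(i + 1, len(ns)):
                u = [points[ns[i]][k] - points[v][k] for k in range(3)]
                w = [points[ns[j]][k] - points[v][k] for k in range(3)]
                dot = sum(uk * wk for uk, wk in zip(u, w))
                norm = math.sqrt(sum(uk * uk for uk in u)) * math.sqrt(sum(wk * wk for wk in w))
                # Clamp to guard against rounding slightly outside [-1, 1].
                result.append(math.acos(max(-1.0, min(1.0, dot / norm))))
    return result


def sda(xy, z, segments):
    """Standard deviation of angles for the 3-D interpretation (x, y, z)."""
    points = [(x, y, zi) for (x, y), zi in zip(xy, z)]
    a = angles(points, segments)
    mean = sum(a) / len(a)
    return math.sqrt(sum((t - mean) ** 2 for t in a) / len(a))


def descend(xy, segments, z0, rate=0.5, steps=200, h=1e-6):
    """Follow the (finite-difference) SDA gradient; assumed z-only parameters."""
    z = list(z0)
    best = sda(xy, z, segments)
    for _ in range(steps):
        grad = []
        for i in range(len(z)):
            z_plus = z[:]
            z_plus[i] += h
            z_minus = z[:]
            z_minus[i] -= h
            grad.append((sda(xy, z_plus, segments) - sda(xy, z_minus, segments)) / (2 * h))
        step = rate
        while step > 1e-12:
            trial = [zi - step * g for zi, g in zip(z, grad)]
            value = sda(xy, trial, segments)
            if value < best:  # accept only steps that lower SDA
                z, best = trial, value
                break
            step /= 2
        else:
            break  # no improving step found; stop early
    return z, best


# Hypothetical drawing: a parallelogram given as 2-D points and line segments.
xy = [(0.0, 0.0), (2.0, 0.0), (3.0, 1.0), (1.0, 1.0)]
segments = [(0, 1), (1, 2), (2, 3), (3, 0)]
z_start = [0.0, 0.5, 0.0, 0.0]  # slightly non-flat start; the flat drawing is a stationary point
z_final, sda_final = descend(xy, segments, z_start)
```

The start is perturbed away from the flat drawing because SDA is unchanged when every z is negated, so its gradient vanishes at z = 0 and plain gradient descent would not move from there.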

Paper Details

Date Published: 1 February 1992
PDF: 9 pages
Proc. SPIE 1607, Intelligent Robots and Computer Vision X: Algorithms and Techniques (1 February 1992); doi: 10.1117/12.57106
Leemon C. Baird, Northeastern Univ. (United States)
Patrick S. P. Wang, Northeastern Univ. (United States)

Published in SPIE Proceedings Vol. 1607:
Intelligent Robots and Computer Vision X: Algorithms and Techniques
David P. Casasent, Editor
