Proceedings Paper

Automatic Mapping Of Large Signal Processing Systems To A Parallel Machine
Authors: Harry Printz; H. T. Kung; Todd Mummert; Paul Scherer

Paper Abstract

Since the spring of 1988, Carnegie Mellon University and the Naval Air Development Center have been working together to implement several large signal processing systems on the Warp parallel computer. In the course of this work, we have developed a prototype of a software tool that can automatically and efficiently map signal processing systems to distributed-memory parallel machines, such as Warp. We have used this tool to produce Warp implementations of small test systems. The automatically generated programs compare favorably with hand-crafted code. We believe this tool will be a significant aid in the creation of high-speed signal processing systems.

We assume that signal processing systems have the following characteristics:

• They can be described by directed graphs of computational tasks; these graphs may contain thousands of task vertices.
• Some tasks can be parallelized in a systolic or data-partitioned manner, while others cannot be parallelized at all.
• The side effects of each task, if any, are limited to changes in local variables.
• Each task has a data-independent execution time bound, which may be expressed as a function of the way it is parallelized and the number of processors it is mapped to.

In this paper we describe techniques to automatically map such systems to Warp-like parallel machines. We identify and address key issues in gracefully combining different parallel programming styles, in allocating processor, memory, and communication bandwidth, and in generating and scheduling efficient parallel code. When iWarp, the VLSI version of the Warp machine, becomes available in 1990, we will extend this tool to generate efficient code for very large applications, which may require as many as 3000 iWarp processors, with an aggregate peak performance of 60 gigaflops.
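
The system model assumed in the abstract (a directed graph of tasks, each with a data-independent time bound that depends on its parallelization style and processor count) can be illustrated with a minimal Python sketch. This is not the authors' tool or algorithm. Every name here (`Task`, `critical_path_bound`, the style constants, the example tasks and their timing functions) is hypothetical, and the sketch additionally assumes the graph is acyclic, which the abstract does not state.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical style labels, named after the abstract's wording.
SEQUENTIAL = "sequential"            # cannot be parallelized at all
SYSTOLIC = "systolic"
DATA_PARTITIONED = "data-partitioned"


@dataclass
class Task:
    name: str
    style: str
    # Data-independent bound on execution time for a given processor count.
    time_bound: Callable[[int], float]
    successors: List[str] = field(default_factory=list)


def critical_path_bound(tasks: Dict[str, Task], procs: Dict[str, int]) -> float:
    """Bound on end-to-end latency: the longest path through the task graph.

    Assumes the graph is acyclic (an assumption of this sketch).
    """
    memo: Dict[str, float] = {}

    def longest_from(name: str) -> float:
        if name not in memo:
            task = tasks[name]
            # A task that cannot be parallelized always runs on one processor.
            n = 1 if task.style == SEQUENTIAL else procs.get(name, 1)
            tail = max((longest_from(s) for s in task.successors), default=0.0)
            memo[name] = task.time_bound(n) + tail
        return memo[name]

    return max(longest_from(name) for name in tasks)


# Made-up three-stage pipeline with illustrative timing functions.
tasks = {
    "fir": Task("fir", DATA_PARTITIONED, lambda n: 8.0 / n, ["fft"]),
    "fft": Task("fft", SYSTOLIC, lambda n: 4.0 / n + 1.0, ["detect"]),
    "detect": Task("detect", SEQUENTIAL, lambda n: 2.0),
}
procs = {"fir": 4, "fft": 2, "detect": 8}  # "detect" ignores its allocation

bound = critical_path_bound(tasks, procs)
print(bound)  # 7.0
```

Because each bound is data-independent, a figure like this can be computed before any code runs, which is what would let a mapping tool compare candidate processor allocations statically.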

Paper Details

Date Published: 6 December 1989
PDF: 15 pages
Proc. SPIE 1154, Real-Time Signal Processing XII, (6 December 1989); doi: 10.1117/12.962367
Author Affiliations:
Harry Printz, Carnegie Mellon University (United States)
H. T. Kung, Carnegie Mellon University (United States)
Todd Mummert, Carnegie Mellon University (United States)
Paul Scherer, Naval Air Development Center (United States)

Published in SPIE Proceedings Vol. 1154:
Real-Time Signal Processing XII
J. P. Letellier, Editor

© SPIE.