
Proceedings Paper

Parallel DSP with memory and I/O processors
Author(s): Vason P. Srini; John Thendean; Sain-Zee Ueng; Jan M. Rabaey

Paper Abstract

The design and implementation of a parallel digital signal processing system on a chip containing 64 computational processors, 16 memory processors, and 16 I/O processors are described. The processors are interconnected by two levels of segmented buses. Each computational processor has a 16-bit data path and a control unit. The instruction set of the 16-bit processor supports computations on streams of data present in video, graphics, image processing, and digital communication applications. Two's complement arithmetic, saturation arithmetic, and packed instructions are supported. Higher data precision, such as 32-bit and 64-bit, can be achieved by cascading processors. The instruction memory of each computational processor has sixteen 40-bit words. Data streaming through the processor is manipulated by the instructions in the instruction memory. Multiple operations can be performed in a single cycle in a processor. A handshake protocol is used for synchronization between the sending and receiving processors. Six programmable registers are available in each computational processor for storing data. Each memory processor has a 256 × 16 storage unit for storing additional data. The memory processors can be statically configured as a delay line, FIFO, lookup table, or random access memory. For each memory processor there are four FSMs supporting the four configurations. The I/O processors are provided for external communication. Multiple parallel processing chips, digital output from sensors, and SRAM chips can be interconnected using the I/O processors. The VLSI chip implementing the processors is organized as 16 clusters interconnected by a statically programmable hierarchical bus structure. The buses are segmented by programming the switches on the bus. Each cluster has six 16-bit data buses and four 2-bit control buses for supporting communication between four computational processors, one memory processor, and one I/O processor. In addition, adjacent processors can communicate using a bypass bus. The clusters are interconnected by sixteen 16-bit data buses and eight 2-bit control buses. Each cluster has 60 programmable switches to control the communication between the intracluster and intercluster buses. Each processor has 17 programmable switches to control the connections to the intracluster buses.
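
The abstract names both two's complement and saturation arithmetic on a 16-bit data path. As a rough illustration of how the two differ on overflow, the following Python sketch models a signed 16-bit add in each mode. It is not taken from the paper: the function names `wrap_add16` and `sat_add16` are hypothetical, and the chip's actual instruction semantics may differ.

```python
# Illustrative model only; names and behavior are assumptions, not the chip's ISA.
INT16_MIN = -(1 << 15)      # -32768
INT16_MAX = (1 << 15) - 1   #  32767


def wrap_add16(a: int, b: int) -> int:
    """Two's complement add: keep the low 16 bits and reinterpret them as signed."""
    s = (a + b) & 0xFFFF
    return s - 0x10000 if s & 0x8000 else s


def sat_add16(a: int, b: int) -> int:
    """Saturating add: clamp the result to the signed 16-bit range instead of wrapping."""
    return max(INT16_MIN, min(INT16_MAX, a + b))


if __name__ == "__main__":
    print(wrap_add16(30000, 10000))  # wraps around to -25536
    print(sat_add16(30000, 10000))   # clamps to 32767
```

Saturation is commonly preferred for video and image samples because an overflowing pixel value clips to the brightest or darkest level rather than flipping sign.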

Paper Details

Date Published: 21 September 1998
PDF: 12 pages
Proc. SPIE 3452, Parallel and Distributed Methods for Image Processing II, (21 September 1998); doi: 10.1117/12.323469
Vason P. Srini, Data Flux Systems Inc. (United States)
John Thendean, Univ. of California/Berkeley (United States)
Sain-Zee Ueng, Univ. of California/Berkeley (United States)
Jan M. Rabaey, Univ. of California/Berkeley (United States)

Published in SPIE Proceedings Vol. 3452:
Parallel and Distributed Methods for Image Processing II
Hongchi Shi; Patrick C. Coffield, Editor(s)
