Proceedings Paper

FPGA wavelet processor design using language for instruction-set architectures (LISA)
Author(s): Uwe Meyer-Bäse; Alonzo Vera; Suhasini Rao; Karl Lenk; Marios Pattichis

Paper Abstract

The design of a microprocessor is a long, tedious, and error-prone task that typically consists of three design phases: architecture exploration, software design (assembler, linker, loader, profiler), and architecture implementation (RTL generation for FPGA or cell-based ASIC) and verification. The Language for Instruction-Set Architectures (LISA) allows a microprocessor to be modeled not only from its instruction set but also from an architecture description that includes pipelining behavior, which keeps the design and development tools consistent across all levels of the design. To explore the capability of the LISA processor design platform, also known as CoWare Processor Designer, we present in this paper three microprocessor designs that implement an 8/8 wavelet transform processor of the kind typically used in today's FBI fingerprint compression scheme. First, we designed a 3-stage pipelined 16-bit RISC processor (NanoBlaze). Although RISC µPs are usually considered "fast" processors due to design concepts such as constant instruction word size, deep pipelines, and many general-purpose registers, it turns out that DSP operations consume substantial processing time in a RISC processor. In a second step we used design principles from programmable digital signal processors (PDSPs) to improve the throughput of the DWT processor. A multiply-accumulate operation along with an indirect addressing operation was the key to achieving higher throughput. A further improvement is possible with today's FPGA technology: today's FPGAs offer a large number of embedded array multipliers, and it is now feasible to design a "true" vector processor (TVP). With our TVP, a multiplication of two vectors can be done in just one clock cycle and a complete scalar product in two clock cycles. Code profiling and Xilinx FPGA ISE synthesis results are provided that demonstrate the substantial improvement that a TVP offers compared with traditional RISC or PDSP designs.
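
To illustrate the operations the abstract refers to, the following is a minimal Python sketch, not the authors' LISA models or assembly code. It contrasts a scalar product computed as a loop of multiply-accumulate steps (the PDSP style, with the loop index standing in for an indirectly addressed pointer) with the same product computed as one element-wise vector multiply followed by one reduction (the two steps the TVP is said to need). The function names, the periodic signal extension, and the 2-tap Haar filters in the usage note are assumptions chosen for brevity; the paper's 8/8 filter coefficients are not given in the abstract.

```python
def mac_dot(x, h):
    """Scalar product as a sequence of multiply-accumulate (MAC) steps.

    The index i stands in for an address register that a PDSP would
    update through indirect addressing on every step.
    """
    acc = 0
    for i in range(len(h)):
        acc += x[i] * h[i]  # one MAC per filter tap
    return acc


def vector_dot(x, h):
    """Scalar product as two vector-level steps.

    Step 1: element-wise multiply of the two vectors.
    Step 2: reduction (sum) of the products.
    """
    products = [a * b for a, b in zip(x, h)]
    return sum(products)


def dwt_stage(signal, lowpass, highpass, dot=vector_dot):
    """One DWT analysis stage: filter and downsample by 2.

    Assumes periodic extension of the signal at its boundary
    (an illustrative choice, not taken from the paper).
    """
    n = len(signal)
    taps = len(lowpass)
    approx, detail = [], []
    for k in range(0, n, 2):
        window = [signal[(k + j) % n] for j in range(taps)]
        approx.append(dot(window, lowpass))   # low-pass branch
        detail.append(dot(window, highpass))  # high-pass branch
    return approx, detail
```

For example, `dwt_stage([4, 2, 6, 8], [0.5, 0.5], [0.5, -0.5])` applies hypothetical Haar averaging and differencing filters and yields the approximation `[3.0, 7.0]` and the detail `[1.0, -1.0]`. Passing `dot=mac_dot` gives the same result, since both functions compute the same scalar product and differ only in how the work is grouped.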

Paper Details

Date Published: 9 April 2007
PDF: 12 pages
Proc. SPIE 6576, Independent Component Analyses, Wavelets, Unsupervised Nano-Biomimetic Sensors, and Neural Networks V, 65760U (9 April 2007); doi: 10.1117/12.719020
Author Affiliations:
Uwe Meyer-Bäse, Florida State Univ. (United States)
Alonzo Vera, The Univ. of New Mexico (United States)
Suhasini Rao, Florida State Univ. (United States)
Karl Lenk, Florida State Univ. (United States)
Marios Pattichis, The Univ. of New Mexico (United States)

Published in SPIE Proceedings Vol. 6576:
Independent Component Analyses, Wavelets, Unsupervised Nano-Biomimetic Sensors, and Neural Networks V
Harold H. Szu; Jack Agee, Editor(s)

© SPIE