
Proceedings Paper

SIMD-aware loop unrolling for embedded code optimization

Paper Abstract

As modern embedded media applications (EMAs) grow more complex, instruction-level parallelism (ILP) alone is no longer sufficient to meet their performance demands. Compilers must also exploit superword-level parallelism (SLP), which exposes more of the concurrency in an application, minimizes the latency of memory accesses, and hence yields more efficient code. Loops are good candidates for SLP extraction because their iterations share a parallel structure. This work analyzes the memory access patterns found in EMAs and presents a loop unrolling method that fully exploits these patterns to generate efficient Single Instruction Multiple Data (SIMD) instructions. Experiments with the H.264 encoder on the TriMedia TM-1300 processor show speedups ranging from 3 to 30 times, with an average of 12 times.
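
To illustrate the general idea of unrolling a loop so that adjacent memory accesses can be packed into SIMD operations, here is a minimal sketch in C. It is not the method from the paper: the kernel (a sum of absolute differences, common in video encoders), the function names `sad_scalar` and `sad_unrolled4`, and the unroll factor of 4 are all assumptions chosen for illustration. The unrolled body touches four consecutive bytes per iteration and keeps four independent accumulators, which is the kind of structure an SLP-aware compiler could map onto a packed 4 x 8-bit operation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Scalar reference: sum of absolute differences over n pixels. */
uint32_t sad_scalar(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (uint32_t)abs((int)a[i] - (int)b[i]);
    return sum;
}

/*
 * Hypothetical unrolled variant (illustrative only, not the paper's code).
 * Each iteration handles four adjacent bytes with four independent
 * accumulators, so the four statements are isomorphic and access
 * contiguous memory: the pattern SLP extraction looks for.
 */
uint32_t sad_unrolled4(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    /* Main unrolled loop: runs while at least four elements remain. */
    for (; i + 4 <= n; i += 4) {
        s0 += (uint32_t)abs((int)a[i]     - (int)b[i]);
        s1 += (uint32_t)abs((int)a[i + 1] - (int)b[i + 1]);
        s2 += (uint32_t)abs((int)a[i + 2] - (int)b[i + 2]);
        s3 += (uint32_t)abs((int)a[i + 3] - (int)b[i + 3]);
    }

    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; i++)
        s0 += (uint32_t)abs((int)a[i] - (int)b[i]);

    return s0 + s1 + s2 + s3;
}
```

The remainder loop keeps the result identical to the scalar version for any length. Whether the unrolled body is actually turned into packed instructions depends on the target compiler and on data alignment, which this sketch does not address.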

Paper Details

Date Published: 19 November 2003
PDF: 12 pages
Proc. SPIE 5241, Multimedia Systems and Applications VI, (19 November 2003); doi: 10.1117/12.513540
Yunyang Dai, Univ. of Southern California (United States)
Qing Li, Univ. of Southern California (United States)
Qi Zhang, Univ. of Southern California (United States)
C.-C. Jay Kuo, Univ. of Southern California (United States)

Published in SPIE Proceedings Vol. 5241:
Multimedia Systems and Applications VI
Andrew G. Tescher; Bhaskaran Vasudev; V. Michael Bove; Ajay Divakaran, Editor(s)
