Proceedings Paper

A parallel error diffusion implementation on a GPU
Author(s): Yao Zhang; John Ludd Recker; Robert Ulichney; Giordano B. Beretta; Ingeborg Tastl; I-Jong Lin; John D. Owens

Paper Abstract

In this paper, we investigate the suitability of the GPU for a parallel implementation of pinwheel error diffusion. We demonstrate a high-performance GPU implementation obtained by efficiently parallelizing and unrolling the image processing algorithm. Our GPU implementation achieves a 10-30x speedup over a two-threaded CPU error diffusion implementation while delivering comparable image quality. We have conducted experiments to study the performance and quality tradeoffs across different image block sizes. We also present a performance analysis at the assembly level to understand the performance bottlenecks.

Paper Details

Date Published: 25 January 2011
PDF: 9 pages
Proc. SPIE 7872, Parallel Processing for Imaging Applications, 78720K (25 January 2011); doi: 10.1117/12.872616
Author Affiliations
Yao Zhang, Univ. of California, Davis (United States)
John Ludd Recker, Hewlett-Packard Labs. (United States)
Robert Ulichney, Hewlett-Packard Co. (United States)
Giordano B. Beretta, Hewlett-Packard Labs. (United States)
Ingeborg Tastl, Hewlett-Packard Labs. (United States)
I-Jong Lin, Hewlett-Packard Labs. (United States)
John D. Owens, Univ. of California, Davis (United States)

Published in SPIE Proceedings Vol. 7872:
Parallel Processing for Imaging Applications
Editor(s): John D. Owens; I-Jong Lin; Yu-Jin Zhang; Giordano B. Beretta

© SPIE.