Proceedings Paper

A parallel error diffusion implementation on a GPU
Author(s): Yao Zhang; John Ludd Recker; Robert Ulichney; Giordano B. Beretta; Ingeborg Tastl; I-Jong Lin; John D. Owens

Paper Abstract

In this paper, we investigate the suitability of the GPU for a parallel implementation of pinwheel error diffusion. We demonstrate a high-performance GPU implementation by efficiently parallelizing and unrolling the image processing algorithm. Our GPU implementation achieves a 10x to 30x speedup over a two-threaded CPU error diffusion implementation with comparable image quality. We have conducted experiments to study the tradeoffs between performance and quality across different image block sizes. We also present a performance analysis at the assembly level to identify the performance bottlenecks.
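
For readers unfamiliar with the technique, here is a minimal sketch of classic serial Floyd-Steinberg error diffusion in Python. It is not the pinwheel variant studied in the paper, and it is not the authors' GPU code. The function name `floyd_steinberg` and the list-of-lists image representation are assumptions made purely for illustration. The sketch shows the data dependency that makes parallelization nontrivial: each pixel's threshold decision depends on error pushed forward from pixels processed earlier.

```python
def floyd_steinberg(image):
    """Binarize a grayscale image (rows of floats in [0, 1]) by serial error diffusion.

    Illustrative only: classic Floyd-Steinberg weights, raster scan order.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    # Work on a copy so the caller's data is left untouched.
    work = [list(row) for row in image]
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 1 if old >= 0.5 else 0  # threshold to black or white
            out[y][x] = new
            err = old - new  # quantization error for this pixel
            # Distribute the error to neighbors that have not been visited yet.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out
```

Because every pixel reads values modified by its left and upper neighbors, a naive per-pixel parallel mapping is not possible. Block-based schemes such as the one the abstract describes typically trade some of this dependency for parallelism, which is consistent with the reported study of block size against image quality.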

Paper Details

Date Published: 25 January 2011
PDF: 9 pages
Proc. SPIE 7872, Parallel Processing for Imaging Applications, 78720K (25 January 2011); doi: 10.1117/12.872616
Author Affiliations
Yao Zhang, Univ. of California, Davis (United States)
John Ludd Recker, Hewlett-Packard Labs. (United States)
Robert Ulichney, Hewlett-Packard Co. (United States)
Giordano B. Beretta, Hewlett-Packard Labs. (United States)
Ingeborg Tastl, Hewlett-Packard Labs. (United States)
I-Jong Lin, Hewlett-Packard Labs. (United States)
John D. Owens, Univ. of California, Davis (United States)

Published in SPIE Proceedings Vol. 7872:
Parallel Processing for Imaging Applications
John D. Owens; I-Jong Lin; Yu-Jin Zhang; Giordano B. Beretta, Editor(s)

© SPIE.