Proceedings Paper

GPU color space conversion
Author(s): Patrick Chase; Gary Vondran

Paper Abstract

Tetrahedral interpolation is commonly used to implement continuous color space conversions from sparse 3D and 4D lookup tables. We investigate the implementation and optimization of tetrahedral interpolation algorithms for GPUs, and compare them to the best known CPU implementations as well as to a well-known GPU-based trilinear implementation. We show that a $500 NVIDIA GTX-580 GPU is 3x faster than a $1000 Intel Core i7 980X CPU for 3D interpolation, and 9x faster for 4D interpolation. Performance-relevant GPU attributes are explored, including thread scheduling, local memory characteristics, global memory hierarchy, and cache behaviors. We consider existing tetrahedral interpolation algorithms and tune them based on the structure and branching capabilities of current GPUs. Global memory performance is improved by reordering and expanding the lookup table to ensure optimal access behaviors. Per-multiprocessor local memory is exploited to implement optimally coalesced global memory accesses, and local memory addressing is optimized to minimize bank conflicts. We explore the impact of lookup table density on computation and memory access costs. Also presented are CPU-based 3D and 4D interpolators that use SSE vector operations and are faster than any previously published solution.
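
For readers unfamiliar with the technique named in the abstract, the following is a minimal scalar sketch of 3D tetrahedral interpolation in Python. It is not the authors' GPU or SSE implementation, and it reflects none of the memory-layout optimizations the abstract describes. The function names (`tetra_interp`, `make_identity_lut`), the nested-list table layout, and the [0, 1] input range are all assumptions made for illustration.

```python
def make_identity_lut(n):
    """Hypothetical helper: an n x n x n table whose nodes map a color to itself."""
    step = 1.0 / (n - 1)
    return [[[(i * step, j * step, k * step) for k in range(n)]
             for j in range(n)]
            for i in range(n)]


def tetra_interp(lut, n, r, g, b):
    """Interpolate lut at (r, g, b), each in [0, 1], using the tetrahedron
    of the enclosing grid cell that contains the point.

    lut[i][j][k] is a tuple of output channels; n is the node count per axis.
    """
    def split(v):
        # Scale to grid coordinates, then split into cell index and fraction.
        x = min(max(v, 0.0), 1.0) * (n - 1)
        i = min(int(x), n - 2)  # keep i + 1 inside the table
        return i, x - i

    (i, fr), (j, fg), (k, fb) = split(r), split(g), split(b)

    # Sorting the fractions in descending order selects one of the six
    # tetrahedra that share the cell diagonal from (0,0,0) to (1,1,1).
    order = sorted(((fr, 0), (fg, 1), (fb, 2)), reverse=True)
    f_hi, f_mid, f_lo = (f for f, _ in order)
    weights = (1.0 - f_hi, f_hi - f_mid, f_mid - f_lo, f_lo)

    # Walk from the cell's base corner toward the opposite corner, stepping
    # along the axes in order of descending fraction; this visits the four
    # vertices of the selected tetrahedron.
    corner = [i, j, k]
    vertices = [lut[corner[0]][corner[1]][corner[2]]]
    for _, axis in order:
        corner[axis] += 1
        vertices.append(lut[corner[0]][corner[1]][corner[2]])

    channels = len(vertices[0])
    return tuple(sum(w * v[c] for w, v in zip(weights, vertices))
                 for c in range(channels))
```

Only four table nodes are read per lookup, compared with eight for trilinear interpolation. For example, `tetra_interp(make_identity_lut(3), 3, 0.3, 0.7, 0.55)` should return the input color up to floating-point rounding, since tetrahedral interpolation reproduces linear functions exactly.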

Paper Details

Date Published: 25 January 2011
PDF: 9 pages
Proc. SPIE 7872, Parallel Processing for Imaging Applications, 78720D (25 January 2011); doi: 10.1117/12.876678
Patrick Chase, Hewlett-Packard Co. (United States)
Gary Vondran, Hewlett-Packard Co. (United States)


Published in SPIE Proceedings Vol. 7872:
Parallel Processing for Imaging Applications
Editor(s): John D. Owens; I-Jong Lin; Yu-Jin Zhang; Giordano B. Beretta

© SPIE.