Deriving efficient data movement from decoupled Access/Execute specifications
If you have a question about this talk, please contact Boris Feigin.
On multi-core architectures with software-managed memories, orchestrating data movement effectively is essential to performance, but is tedious and error-prone. In our HiPEAC’09 paper we show that when the programmer can explicitly specify both the memory access pattern and the execution schedule of a computation kernel, the compiler or run-time system can derive efficient data movement, even if analysis of the kernel code is difficult or impossible. We have developed a framework of C++ classes for decoupled Access/Execute specifications, which allows automatic communication optimisations such as software pipelining and data reuse. We demonstrate the ease and efficiency of programming Sony/Toshiba/IBM’s Cell BE architecture with these classes by implementing a set of benchmarks that exhibit data reuse and non-affine access functions, and by comparing these implementations against alternatives that use hand-written DMA transfers and software-based caching.
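To make the idea concrete, the following is a minimal sketch of what a decoupled Access/Execute specification might look like. It is not the framework from the paper: every name here (`Slice`, `DecoupledKernel`, `stage`, `run`, `moving_sum3`) is hypothetical, and an ordinary copy stands in for a DMA transfer. The kernel body is an opaque function. The run-time never inspects it and relies only on the separately stated access pattern to fetch the next iteration's data into a second buffer ahead of the current iteration's compute, which is the shape of a software-pipelined loop.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical types for illustration only; not the API described in the paper.

// Access side: the contiguous input slice that one iteration reads.
struct Slice {
    std::size_t begin;
    std::size_t length;
};

struct DecoupledKernel {
    std::size_t iterations;                                // execute: iteration space 0..iterations-1
    std::function<Slice(std::size_t)> reads;               // access: input slice for iteration i
    std::function<float(const float*, std::size_t)> body;  // opaque kernel, sees local data only
};

// Stand-in for a DMA "get": copy a slice of global memory into a local buffer.
inline void stage(const std::vector<float>& global, Slice s, std::vector<float>& local) {
    assert(s.begin + s.length <= global.size());
    local.assign(global.begin() + s.begin, global.begin() + s.begin + s.length);
}

// Run-time side: a double-buffered loop derived purely from the access
// descriptor. On real hardware the staging of iteration i+1 would be an
// asynchronous transfer overlapping the compute of iteration i; here the
// copy is synchronous, so only the loop structure is illustrated.
inline std::vector<float> run(const DecoupledKernel& k, const std::vector<float>& global) {
    std::vector<float> out(k.iterations);
    if (k.iterations == 0) return out;

    std::vector<float> buffers[2];
    stage(global, k.reads(0), buffers[0]);  // prologue: fetch the first slice

    for (std::size_t i = 0; i < k.iterations; ++i) {
        if (i + 1 < k.iterations) {
            // Fetch the next slice into the buffer not currently in use.
            stage(global, k.reads(i + 1), buffers[(i + 1) % 2]);
        }
        const std::vector<float>& current = buffers[i % 2];
        out[i] = k.body(current.data(), current.size());
    }
    return out;
}

// Example kernel: a 3-point moving sum. Consecutive slices overlap by two
// elements, so the reuse opportunity is visible in the access descriptor
// (this sketch does not exploit it).
inline std::vector<float> moving_sum3(const std::vector<float>& in) {
    assert(in.size() >= 3);
    DecoupledKernel k;
    k.iterations = in.size() - 2;
    k.reads = [](std::size_t i) { return Slice{i, 3}; };
    k.body = [](const float* p, std::size_t n) {
        float sum = 0.0f;
        for (std::size_t j = 0; j < n; ++j) sum += p[j];
        return sum;
    };
    return run(k, in);
}
```

Calling `moving_sum3` on a five-element input yields three sums, one per window. The design point is that `run` would work unchanged for any `body`, including one the compiler cannot analyse, because the data movement depends only on `reads`.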
This talk is part of the Computer Laboratory Programming Research Group Seminar series.