Cache-oblivious algorithms
Massachusetts Institute of Technology
Abstract
This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT, and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms, these algorithms are cache oblivious: no variables dependent on hardware parameters, such as cache size and cache-line length, need to be tuned to achieve optimality. Nevertheless, these algorithms use an optimal amount of work and move data optimally among multiple levels of cache. For a cache with size $Z$ and cache-line length $L$, where $Z = \Omega(L^2)$, the number of cache misses for an $m \times n$ matrix transpose is $\Theta(1 + mn/L)$. The number of cache misses for either an n-point FFT or the sorting of n numbers…
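The abstract does not spell out how the transpose works, so the following is only a minimal sketch of one common cache-oblivious approach: recursive divide and conquer that halves the larger dimension until the sub-block is small. The function name `co_transpose`, the flat row-major layout, and the `base` cutoff are assumptions made for illustration and are not taken from the paper. Note that `base` only limits recursion overhead; it is not tuned to any cache size or line length, and setting it to 1 gives the fully recursive form.

```python
def co_transpose(src, rows, cols, base=8):
    """Transpose a row-major rows x cols matrix stored in a flat list.

    Illustrative sketch only: splits the larger dimension in half
    recursively, so no cache size or line length appears in the code.
    """
    dst = [None] * (rows * cols)

    def rec(r0, r1, c0, c1):
        h, w = r1 - r0, c1 - c0
        if h <= base and w <= base:
            # Small block: copy element by element into transposed position.
            for i in range(r0, r1):
                for j in range(c0, c1):
                    dst[j * rows + i] = src[i * cols + j]
        elif h >= w:
            # Block is taller than wide: split the row range.
            mid = r0 + h // 2
            rec(r0, mid, c0, c1)
            rec(mid, r1, c0, c1)
        else:
            # Block is wider than tall: split the column range.
            mid = c0 + w // 2
            rec(r0, r1, c0, mid)
            rec(r0, r1, mid, c1)

    rec(0, rows, 0, cols)
    return dst
```

For example, `co_transpose([0, 1, 2, 3, 4, 5], 2, 3)` treats the input as the 2 × 3 matrix with rows (0, 1, 2) and (3, 4, 5) and returns the flat 3 × 2 result `[0, 3, 1, 4, 2, 5]`. Because the recursion eventually produces sub-blocks that fit in any given cache level, the same code is intended to behave well across a memory hierarchy without per-machine tuning.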
Citation impact
- FWCI: 30.05
- Percentile: 100%
- References: 47
Authors
4
Topics & keywords
- Cache
- Cache-oblivious algorithm
- Computer science
- Cache algorithms
- Parallel computing
- Transpose
- Algorithm
- Sorting