Article · ACM Transactions on Mathematical Software · May 1, 2008 · Closed access

Anatomy of high-performance matrix multiplication

The University of Texas at Austin

Indexed in Crossref

Abstract

We present the basic principles that underlie the high-performance implementation of the matrix-matrix multiplication that is part of the widely used GotoBLAS library. Design decisions are justified by successively refining a model of architectures with multilevel memories. A simple but effective algorithm for executing this operation results. Implementations on a broad selection of architectures are shown to achieve near-peak performance.

Citation impact

  • Total citations: 756
  • FWCI: 36.59
  • Percentile: 100%
  • References: 21

Authors

2

Topics & keywords

Keywords
  • Computer science
  • Matrix multiplication
  • Multiplication (music)
  • Simple (philosophy)
  • Selection (genetic algorithm)
  • Implementation
  • Matrix (chemical analysis)
  • Parallel computing

Funding