Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Vol. 7, No. 3, 2016

DOI: 10.17587/prin.7.132-139
Double Block Data Layout in High Performance Matrix Multiplication Algorithm
M. V. Yurushkin, m.yurushkin@gmail.com, SFEDU, Rostov-on-Don, 344090, Russian Federation
Corresponding author: Yurushkin Mikhail V., Engineer, SFEDU, Rostov-on-Don, 344090, Russian Federation, e-mail: m.yurushkin@gmail.com
Received on November 04, 2015
Accepted on December 21, 2015

This paper proposes a matrix multiplication algorithm that uses a double block data layout. This layout markedly reduces the number of cache misses and TLB misses and allows the algorithm to achieve 97 % of peak performance. The last section compares the proposed algorithm with existing packages (MKL, PLASMA, OpenBLAS). The author notes that the proposed algorithm supports only matrices stored in block layout, in contrast to MKL and OpenBLAS, which support matrices with the standard data layout. Consequently, the proposed algorithm does not replace existing algorithms but supplements them.
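
To illustrate what a two-level block layout can look like, the sketch below maps an element (i, j) of a row-major matrix to its offset in a flat array in which large outer blocks are stored contiguously and each outer block is itself split into contiguous inner blocks. This is only an assumed interpretation of "double block data layout": the function names, the square block shapes, the row-major ordering at every level, and the divisibility requirements are illustrative choices and are not taken from the paper.

```python
def double_block_offset(i, j, n_cols, outer, inner):
    """Offset of element (i, j) in an assumed two-level blocked layout.

    Hypothetical helper: outer blocks are outer x outer, inner blocks are
    inner x inner, and blocks and elements are ordered row-major at each level.
    """
    assert n_cols % outer == 0 and outer % inner == 0
    outer_per_row = n_cols // outer   # outer blocks in one block row
    inner_per_row = outer // inner    # inner blocks in one row of an outer block

    # Split each coordinate into (outer block, inner block, element) parts.
    bi, ri = divmod(i, outer)
    bj, rj = divmod(j, outer)
    si, ei = divmod(ri, inner)
    sj, ej = divmod(rj, inner)

    outer_id = bi * outer_per_row + bj
    inner_id = si * inner_per_row + sj
    return (outer_id * outer * outer      # skip preceding outer blocks
            + inner_id * inner * inner    # skip preceding inner blocks
            + ei * inner + ej)            # position inside the inner block


def to_double_block(matrix, outer, inner):
    """Repack a list-of-lists matrix into the assumed two-level blocked layout."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    assert n_rows % outer == 0
    flat = [None] * (n_rows * n_cols)
    for i in range(n_rows):
        for j in range(n_cols):
            flat[double_block_offset(i, j, n_cols, outer, inner)] = matrix[i][j]
    return flat


if __name__ == "__main__":
    m = [[4 * r + c for c in range(4)] for r in range(4)]
    print(to_double_block(m, outer=4, inner=2))
```

In such a layout every inner block occupies one contiguous run of memory, so a kernel that works on one inner block at a time touches few cache lines and few pages, which is the kind of effect on cache and TLB misses the abstract describes.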

Keywords: cache memory, block data layout, tiling, high performance computing
pp. 132–139
For citation:
Yurushkin M. V. Double Block Data Layout in High Performance Matrix Multiplication Algorithm, Programmnaya Ingeneria, 2016, vol. 7, no. 3, pp. 132-139.