Matrix Multiplication in Multiword Arithmetic: Error Analysis and Application to GPU Tensor Cores
Authors
Massimiliano Fasi
Nicholas J. Higham
Florent Lopez
Theo Mary
Mantas Mikaitis
Abstract
In multiword arithmetic, a matrix is represented as the unevaluated sum of two or more lower precision matrices, and a matrix product is formed by multiplying the constituents in low precision. We investigate the use of multiword arithmetic for improving the performance-accuracy tradeoff of matrix multiplication with mixed precision block fused multiply–add (FMA) hardware, focusing especially on the tensor cores available on NVIDIA GPUs. Building on a general block FMA framework, we develop a comprehensive error analysis of multiword matrix multiplication. After confirming the theoretical error bounds experimentally by simulating low precision in software, we use the cuBLAS and CUTLASS libraries to implement a number of matrix multiplication algorithms using double-fp16 (double-binary16) arithmetic. When running the algorithms on NVIDIA V100 and A100 GPUs, we find that double-fp16 is not as accurate as fp32 (binary32) arithmetic despite satisfying the same worst-case error bound. Using probabilistic error analysis, we explain why this issue is likely to be caused by the rounding mode used by the NVIDIA tensor cores, and we propose a parameterized blocked summation algorithm that alleviates the problem and significantly improves the performance-accuracy tradeoff.
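The two-word representation described in the abstract can be illustrated with a small software simulation. The sketch below is not the authors' cuBLAS/CUTLASS implementation: it uses Python's `struct` module to round to binary16 and binary32, and the helper names (`split_fp16`, `matmul_acc`, `matmul_double_fp16`, `demo`) are hypothetical. It assumes each matrix is split as A ≈ A_hi + A_lo with both words in fp16, that the product is approximated by the three terms A_hi·B_hi + A_hi·B_lo + A_lo·B_hi (the A_lo·B_lo term is dropped), and that accumulation is rounded to fp32 with round-to-nearest. The summation order and the lack of any scaling against underflow are also assumptions of this sketch.

```python
import math
import random
import struct


def to_fp16(x):
    """Round a Python float to the nearest binary16 value (struct format 'e')."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest binary32 value (struct format 'f')."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def split_fp16(A):
    """Return fp16 matrices (hi, lo) such that A is approximately hi + lo."""
    hi = [[to_fp16(a) for a in row] for row in A]
    lo = [[to_fp16(a - h) for a, h in zip(ra, rh)] for ra, rh in zip(A, hi)]
    return hi, lo


def matmul_acc(A, B, C=None):
    """Return C + A @ B, rounding every accumulation step to fp32."""
    rows, inner, cols = len(A), len(B), len(B[0])
    out = [[0.0] * cols for _ in range(rows)] if C is None else [r[:] for r in C]
    for i in range(rows):
        for j in range(cols):
            s = out[i][j]
            for p in range(inner):
                # The product of two fp16 values is exact in double precision.
                s = to_fp32(s + A[i][p] * B[p][j])
            out[i][j] = s
    return out


def matmul_double_fp16(A, B):
    """Three-product double-fp16 approximation of A @ B (lo*lo term dropped)."""
    A_hi, A_lo = split_fp16(A)
    B_hi, B_lo = split_fp16(B)
    C = matmul_acc(A_lo, B_hi)          # small correction terms first (assumed order)
    C = matmul_acc(A_hi, B_lo, C)
    return matmul_acc(A_hi, B_hi, C)    # dominant term last


def max_error(C, R):
    """Largest absolute entrywise difference between C and R."""
    return max(abs(c - r) for rc, rr in zip(C, R) for c, r in zip(rc, rr))


def demo(n=8, seed=0):
    """Compare plain fp16 inputs with double-fp16 inputs against a double reference."""
    rng = random.Random(seed)
    A = [[to_fp32(rng.uniform(-1, 1)) for _ in range(n)] for _ in range(n)]
    B = [[to_fp32(rng.uniform(-1, 1)) for _ in range(n)] for _ in range(n)]
    ref = [[math.fsum(A[i][p] * B[p][j] for p in range(n)) for j in range(n)]
           for i in range(n)]
    A_hi, _ = split_fp16(A)
    B_hi, _ = split_fp16(B)
    err_fp16 = max_error(matmul_acc(A_hi, B_hi), ref)
    err_double = max_error(matmul_double_fp16(A, B), ref)
    return err_fp16, err_double


if __name__ == "__main__":
    err_fp16, err_double = demo()
    print("max error, single fp16 word:", err_fp16)
    print("max error, double-fp16     :", err_double)
```

Running `demo()` compares the maximum entrywise error of a single-word fp16 product with that of the two-word product on small random matrices. Because the simulation rounds to nearest at every step, it does not reproduce the tensor core rounding behavior that the abstract identifies as the likely cause of the accuracy gap with fp32.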
Citation
Fasi, M., Higham, N. J., Lopez, F., Mary, T., & Mikaitis, M. (2023). Matrix Multiplication in Multiword Arithmetic: Error Analysis and Application to GPU Tensor Cores. SIAM Journal on Scientific Computing, 45(1). https://doi.org/10.1137/21M1465032
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Aug 24, 2022 |
Online Publication Date | Feb 2, 2023 |
Publication Date | Feb 2, 2023 |
Deposit Date | Oct 14, 2022 |
Journal | SIAM Journal on Scientific Computing |
Print ISSN | 1064-8275 |
Electronic ISSN | 1095-7197 |
Publisher | Society for Industrial and Applied Mathematics |
Volume | 45 |
Issue | 1 |
DOI | https://doi.org/10.1137/21M1465032 |
Public URL | https://durham-repository.worktribe.com/output/1188984 |
Related Public URLs | http://eprints.maths.manchester.ac.uk/2862/ |