Tensor Core-Adapted Sparse Matrix Multiplication for Accelerating Sparse Deep Neural Networks (Open Access)
- Authors
- Han, Yoonsang; Kim, Inseo; Kim, Jinsung; Moon, Gordon Euhyun
- Issue Date
- Oct-2024
- Publisher
- MDPI
- Keywords
- sparse matrix multiplication; tensor cores; sparse deep neural networks; load balancing; data movement
- Citation
- ELECTRONICS, v.13, no.20
- Journal Title
- ELECTRONICS
- Volume
- 13
- Number
- 20
- URI
- https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/77601
- DOI
- 10.3390/electronics13203981
- ISSN
- 2079-9292
- Abstract
- Sparse matrix-matrix multiplication (SpMM) is essential for deep learning models and scientific computing. Recently, Tensor Cores (TCs) on GPUs, originally designed for dense matrix multiplication with mixed precision, have gained prominence. However, utilizing TCs for SpMM is challenging due to irregular memory access patterns and a varying number of non-zero elements in a sparse matrix. To improve data locality, previous studies have proposed reordering sparse matrices before multiplication, but this adds computational overhead. In this paper, we propose Tensor Core-Adapted SpMM (TCA-SpMM), which leverages TCs without requiring matrix reordering and uses the compressed sparse row (CSR) format. To optimize TC usage, the SpMM algorithm's dot product operation is transformed into a blocked matrix-matrix multiplication. Addressing load imbalance and minimizing data movement are critical to optimizing the SpMM kernel. Our TCA-SpMM dynamically allocates thread blocks to process multiple rows simultaneously and efficiently uses shared memory to reduce data movement. Performance results on sparse matrices from the Deep Learning Matrix Collection public dataset demonstrate that TCA-SpMM achieves up to 29.58x speedup over state-of-the-art SpMM implementations optimized with TCs.
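As background for the abstract above, the CSR-format SpMM that TCA-SpMM builds on can be sketched as a row-wise loop over non-zeros. This is a minimal illustrative baseline only (names like `spmm_csr` are ours, not from the paper), and it omits the paper's actual contributions: the blocked reformulation for Tensor Cores, dynamic thread-block allocation, and shared-memory optimizations.

```python
import numpy as np

def spmm_csr(indptr, indices, data, B):
    """Baseline SpMM: C = A @ B, where A is sparse in CSR format.

    indptr  -- row pointer array, length (n_rows + 1)
    indices -- column index of each non-zero
    data    -- value of each non-zero
    B       -- dense matrix with A's column count as its row count
    """
    n_rows = len(indptr) - 1
    C = np.zeros((n_rows, B.shape[1]), dtype=B.dtype)
    for i in range(n_rows):
        # Each non-zero A[i, j] contributes A[i, j] * B[j, :] to C[i, :].
        # The varying trip count of this inner loop per row is the load
        # imbalance the abstract refers to.
        for k in range(indptr[i], indptr[i + 1]):
            C[i] += data[k] * B[indices[k]]
    return C

# A = [[1, 0, 2],
#      [0, 3, 0]] stored in CSR form
indptr = np.array([0, 2, 3])
indices = np.array([0, 2, 1])
data = np.array([1.0, 2.0, 3.0])
B = np.eye(3)
print(spmm_csr(indptr, indices, data, B))  # recovers A as a dense matrix
```

Because each row owns a different number of non-zeros, a naive one-thread-block-per-row mapping leaves hardware idle on short rows; the paper's approach instead assigns thread blocks to multiple rows dynamically and recasts the inner dot products as blocked matrix-matrix products that Tensor Cores can execute.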
- Appears in Collections
- College of Software > School of Computer Science and Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.