HOPE: An Efficient Accelerator With Head-Wise Overlap Processing for Sparse Attention in Vision Transformer
- Authors
- Heo, Jihyeon; Rho, Soomin; Kim, Kwangrae; Chung, Ki-Seok
- Issue Date
- Jun-2025
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Keywords
- Hardware Accelerator; Sparse Attention; Vision Transformer
- Citation
- IEEE Symposium on Low-Power and High-Speed Chips and Systems, COOL CHIPS 2025 - Proceedings, pp 1 - 6
- Pages
- 6
- Indexed
- SCOPUS
- Journal Title
- IEEE Symposium on Low-Power and High-Speed Chips and Systems, COOL CHIPS 2025 - Proceedings
- Start Page
- 1
- End Page
- 6
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/207993
- DOI
- 10.1109/COOLCHIPS65488.2025.11018597
- ISSN
- 2167-9657; 2473-4683
- Abstract
- Vision Transformers (ViTs) have revolutionized computer vision tasks with superior performance, but the quadratic computational complexity of self-attention remains a critical bottleneck. Sparse attention mitigates this issue by skipping unnecessary computations, but existing accelerators execute QKV generation and sparse attention computation sequentially, which limits the reduction in execution time. In this paper, we propose HOPE, a novel sparse attention accelerator for ViTs. The HOPE accelerator introduces a head-wise overlap scheduling method and a new vector processing unit called SaVPU. The SaVPU augments conventional non-linear operation units with minimal hardware overhead to efficiently process sparse attention operations such as SDDMM and SpMM. The head-wise overlap scheduling method enables concurrent execution of QKV generation and sparse attention computation, and this overlap significantly reduces latency. Extensive experiments on ViT models with sparsity levels between 60% and 90% demonstrate that HOPE achieves up to a 1.4× speedup (1.2× on average) over the state-of-the-art ViTCoD accelerator while incurring only 1.25% additional hardware area. These results confirm that the HOPE accelerator achieves significant speedup with minimal hardware overhead.
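
The abstract names SDDMM (sampled dense-dense matrix multiplication) and SpMM (sparse-dense matrix multiplication) as the two operations that make up sparse attention. The following is a minimal software sketch of those two steps for a single attention head, assuming a binary mask that marks which query-key pairs are kept. The function names and the dictionary-based sparse format are hypothetical choices for illustration; this is not the HOPE or SaVPU hardware implementation.

```python
import math

def sddmm(Q, K, mask):
    """SDDMM: compute the dot product Q[i]·K[j] only where mask[i][j] is nonzero."""
    scores = {}
    for i, row in enumerate(mask):
        for j, keep in enumerate(row):
            if keep:
                scores[(i, j)] = sum(q * k for q, k in zip(Q[i], K[j]))
    return scores

def sparse_softmax(scores, n_rows, scale):
    """Row-wise softmax restricted to the kept (unmasked) entries."""
    probs = {}
    for i in range(n_rows):
        row = {j: s * scale for (r, j), s in scores.items() if r == i}
        if not row:
            continue  # a fully masked row contributes nothing
        row_max = max(row.values())  # subtract the max for numerical stability
        exps = {j: math.exp(s - row_max) for j, s in row.items()}
        total = sum(exps.values())
        for j, e in exps.items():
            probs[(i, j)] = e / total
    return probs

def spmm(probs, V, n_rows):
    """SpMM: multiply the sparse attention map by the dense value matrix V."""
    dim = len(V[0])
    out = [[0.0] * dim for _ in range(n_rows)]
    for (i, j), p in probs.items():
        for c in range(dim):
            out[i][c] += p * V[j][c]
    return out

def sparse_attention_head(Q, K, V, mask):
    """One attention head: SDDMM, then masked softmax, then SpMM."""
    scale = 1.0 / math.sqrt(len(Q[0]))
    scores = sddmm(Q, K, mask)
    probs = sparse_softmax(scores, len(Q), scale)
    return spmm(probs, V, len(Q))
```

In this sketch the amount of work in both `sddmm` and `spmm` scales with the number of kept mask entries rather than with the full sequence-length-squared attention map, which is the saving that sparse attention targets.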
