A Low Power Attention and Softmax Accelerator for Large Language Models Inference
- Authors
- Kim, Jeong-Hyun; Kim, Chan-Hoon; Rho, Soo-Min; Chung, Ki-Seok
- Issue Date
- Dec-2024
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Keywords
- AI accelerator; Algorithm-Hardware Co-Design; LLMs; Low Power Design; NLP; Softmax; Transformer
- Citation
- 2024 IEEE International Conference on Consumer Electronics-Asia, ICCE-Asia 2024, pp 1 - 4
- Pages
- 4
- Indexed
- SCOPUS
- Journal Title
- 2024 IEEE International Conference on Consumer Electronics-Asia, ICCE-Asia 2024
- Start Page
- 1
- End Page
- 4
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/206470
- DOI
- 10.1109/ICCE-Asia63397.2024.10773935
- Abstract
Transformer-based models, essential for high-performing Large Language Models (LLMs), surpass traditional Deep Neural Networks but require substantial computational resources. Therefore, more efficient transformer algorithms and accelerators are required to reduce the computational cost and power consumption of LLMs. We observed that as the sequence length increases, softmax operations, which are a key operation of the transformer self-attention mechanism, become the major bottleneck. In this paper, we propose Cross-Road Softmax, an optimized algorithm for the softmax operation within the attention layer, specifically tailored for LLM inference. We evaluate it in software on 8 Natural Language Processing benchmarks. Furthermore, we design Cross-Road Accel, an accelerator that uses the proposed Cross-Road Softmax to accelerate the softmax function of the self-attention layer. We implement Cross-Road Accel in RTL and synthesize it with Synopsys Design Compiler using the Nangate 15nm open cell library to obtain power and area statistics. In summary, on average, Cross-Road Accel achieves an approximately 3.5× increase in energy efficiency compared to state-of-the-art transformer accelerators.
- Appears in Collections
- Seoul College of Engineering > Seoul Department of Electronic Engineering > 1. Journal Articles