Detailed Information

A Low Power Attention and Softmax Accelerator for Large Language Models Inference

Authors
Kim, Jeong-Hyun; Kim, Chan-Hoon; Rho, Soo-Min; Chung, Ki-Seok
Issue Date
Dec-2024
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
AI accelerator; Algorithm-Hardware Co-Design; LLMs; Low Power Design; NLP; Softmax; Transformer
Citation
2024 IEEE International Conference on Consumer Electronics-Asia, ICCE-Asia 2024, pp 1 - 4
Pages
4
Indexed
SCOPUS
Journal Title
2024 IEEE International Conference on Consumer Electronics-Asia, ICCE-Asia 2024
Start Page
1
End Page
4
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/206470
DOI
10.1109/ICCE-Asia63397.2024.10773935
Abstract
Transformer-based models, essential for high-performing Large Language Models (LLMs), surpass traditional Deep Neural Networks but require substantial computational resources. Therefore, more efficient transformer algorithms and accelerators are required to reduce the computational cost and power consumption of LLMs. We observed that as the sequence length increases, the softmax operation, a key operation of the transformer self-attention mechanism, becomes the major bottleneck. In this paper, we propose Cross-Road Softmax, an optimized algorithm for the softmax operation within the attention layer, specifically tailored for LLM inference. We evaluate it in software on eight Natural Language Processing benchmarks. Furthermore, we design Cross-Road Accel, an accelerator that uses the proposed Cross-Road Softmax to accelerate the softmax function of the self-attention layer. We implement Cross-Road Accel in RTL and synthesize it with Synopsys Design Compiler using the Nangate 15nm open cell library to obtain power and area statistics. In summary, Cross-Road Accel achieves, on average, an approximately 3.5× increase in energy efficiency compared to state-of-the-art transformer accelerators.
Files in This Item
There are no files associated with this item.
Appears in
Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Chung, Ki Seok
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)