Detailed Information

NN-LUT: Neural Approximation of Non-Linear Operations for Efficient Transformer Inference

Authors
Yu, Joonsang; Park, Junki; Park, Seongmin; Kim, Minsoo; Lee, Sihwa; Lee, Dong Hyun; Choi, Jungwook
Issue Date
Jul-2022
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
look-up table; neural network; non-linear function; transformer
Citation
Proceedings - Design Automation Conference, pp. 577-582
Pages
6
Indexed
SCOPUS
Journal Title
Proceedings - Design Automation Conference
Start Page
577
End Page
582
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/172605
DOI
10.1145/3489517.3530505
ISSN
0146-7123
Abstract
Non-linear operations such as GELU, layer normalization, and Softmax are essential yet costly building blocks of Transformer models. Several prior works simplified these operations with look-up tables or integer computations, but such approximations suffer from inferior accuracy or from considerable hardware cost and long latency. This paper proposes an accurate and hardware-friendly approximation framework for efficient Transformer inference. Our framework employs a simple neural network as a universal approximator, with its structure equivalently transformed into a look-up table (LUT). The proposed framework, called neural-network-generated LUT (NN-LUT), can accurately replace all the non-linear operations in popular BERT models with significant reductions in area, power consumption, and latency.
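
The abstract's central mechanism, a small ReLU network being exactly piecewise linear and therefore convertible into a look-up table, can be sketched in a few lines. The Python below is a minimal illustration under our own assumptions (scalar input, one hidden layer of 8 units, random weights standing in for a network trained against GELU, exponential, or reciprocal-square-root targets); it is not the authors' released implementation.

```python
import numpy as np

# Minimal sketch of the network-to-LUT transformation behind NN-LUT.
# A scalar one-hidden-layer ReLU network
#     f(x) = w2 . relu(w1 * x + b1) + b2
# is piecewise linear, so it can be tabulated *exactly* as per-segment
# (slope, intercept) pairs keyed by the units' breakpoints.
# Weights here are random placeholders; NN-LUT would first train them
# to approximate GELU, exp (for Softmax), rsqrt (for LayerNorm), etc.

rng = np.random.default_rng(0)
K = 8                                    # hidden width (illustrative)
w1 = rng.standard_normal(K)
b1 = rng.standard_normal(K)
w2 = rng.standard_normal(K)
b2 = 0.1

def net(x):
    """Direct evaluation of the scalar 1-hidden-layer ReLU network."""
    return w2 @ np.maximum(w1[:, None] * x + b1[:, None], 0.0) + b2

# Hidden unit k flips on/off at x = -b1_k / w1_k; between consecutive
# sorted breakpoints the active set (hence the affine piece) is fixed.
breaks = np.sort(-b1 / w1)

# Probe one point inside each of the K+1 segments to read off its piece.
probes = np.concatenate(([breaks[0] - 1.0],
                         (breaks[:-1] + breaks[1:]) / 2.0,
                         [breaks[-1] + 1.0]))
active = ((w1[None, :] * probes[:, None] + b1[None, :]) > 0).astype(float)
slopes = active @ (w1 * w2)              # sum of w1_k * w2_k over active k
intercepts = active @ (b1 * w2) + b2     # sum of b1_k * w2_k over active k

def nn_lut(x):
    """LUT evaluation: locate the segment, then one multiply-add."""
    seg = np.searchsorted(breaks, x)
    return slopes[seg] * x + intercepts[seg]

xs = np.linspace(-5.0, 5.0, 1001)
assert np.allclose(net(xs), nn_lut(xs))  # LUT reproduces the net exactly
```

After the transformation, each evaluation costs one table lookup plus one multiply-accumulate instead of an erf, exponential, or reciprocal-square-root unit, which is the source of the area, power, and latency savings the abstract reports.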
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Choi, Jungwook
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)