NN-LUT: Neural Approximation of Non-Linear Operations for Efficient Transformer Inference
- Authors
- Yu, Joonsang; Park, Junki; Park, Seongmin; Kim, Minsoo; Lee, Sihwa; Lee, Dong Hyun; Choi, Jungwook
- Issue Date
- Jul-2022
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Keywords
- look-up table; neural network; non-linear function; transformer
- Citation
- Proceedings - Design Automation Conference, pp. 577-582
- Pages
- 6
- Indexed
- SCOPUS
- Journal Title
- Proceedings - Design Automation Conference
- Start Page
- 577
- End Page
- 582
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/172605
- DOI
- 10.1145/3489517.3530505
- ISSN
- 0146-7123
- Abstract
- Non-linear operations such as GELU, layer normalization, and Softmax are essential yet costly building blocks of Transformer models. Several prior works simplified these operations with look-up tables or integer computations, but such approximations suffer from inferior accuracy or incur considerable hardware cost and long latency. This paper proposes an accurate and hardware-friendly approximation framework for efficient Transformer inference. Our framework employs a simple neural network as a universal approximator, with its structure equivalently transformed into a look-up table (LUT). The proposed framework, called neural-network-generated LUT (NN-LUT), can accurately replace all the non-linear operations in popular BERT models with significant reductions in area, power consumption, and latency.
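The key observation behind the abstract is that a small network with piecewise-linear activations is itself a piecewise-linear function, so it can be tabulated as breakpoints with per-segment slopes and intercepts and evaluated by a cheap table lookup. Below is a minimal numpy sketch of that LUT idea applied to GELU; the segment count, input range, and use of the common tanh approximation of GELU are illustrative choices, not the paper's actual training procedure:

```python
import numpy as np

def gelu(x):
    # Common tanh approximation of GELU (the reference function we tabulate)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

# Piecewise-linear LUT: breakpoints plus per-segment slope and intercept.
breakpoints = np.linspace(-6.0, 6.0, 33)  # 32 segments (illustrative choice)
y = gelu(breakpoints)
slopes = np.diff(y) / np.diff(breakpoints)
intercepts = y[:-1] - slopes * breakpoints[:-1]

def lut_gelu(x):
    # Clamp into the table range, locate the segment, evaluate y = s*x + b.
    xc = np.clip(x, breakpoints[0], breakpoints[-1])
    idx = np.clip(np.searchsorted(breakpoints, xc, side="right") - 1,
                  0, len(slopes) - 1)
    return slopes[idx] * xc + intercepts[idx]

xs = np.linspace(-6.0, 6.0, 10001)
max_err = float(np.max(np.abs(lut_gelu(xs) - gelu(xs))))
print(max_err < 0.02)  # prints True: even a coarse table tracks GELU closely
```

In hardware, the breakpoint comparison, slope, and intercept arrays map naturally onto a small comparator tree and two multiply-add units, which is where the area, power, and latency savings claimed in the abstract would come from.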
- Appears in Collections
- College of Engineering (Seoul) > Department of Electronic Engineering (Seoul) > 1. Journal Articles
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.