Optimizing Exponent Bias for Sub-8bit Floating-Point Inference of Fine-tuned Transformers

Authors
이장환; Choi, Jung wook
Issue Date
Jun-2022
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
BERT; exponent bias; floating-point; post-training quantization; reduced-precision; SQNR; Transformer
Citation
Proceeding - IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022, pp. 98-101
Pages
4
Indexed
SCOPUS
Journal Title
Proceeding - IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022
Start Page
98
End Page
101
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/173245
DOI
10.1109/AICAS54282.2022.9869965
Abstract
Fine-tuned Transformer-based neural networks have demonstrated remarkable success in natural language processing (NLP), at the cost of a substantial computational burden. Post-training quantization (PTQ) is a promising technique for reducing this cost without expensive re-training. However, prior work either demands complex calibration or suffers noticeable accuracy degradation. This paper proposes a practical method for sub-8-bit floating-point (FP) PTQ. The proposed method optimizes the exponent bias to minimize quantization error, measured by the signal-to-quantization-noise ratio (SQNR), progressively in a manner similar to stochastic gradient descent. Our evaluation shows that the proposed method achieves close to full-precision accuracy for 6- to 8-bit FP PTQ of fine-tuned BERT on GLUE and SQuAD tasks, with negligible run-time overhead.
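To make the role of the exponent bias concrete, the following is a minimal sketch and not the paper's implementation. It assumes a simplified sign/exponent/mantissa format with subnormals, no codes reserved for inf or NaN, and saturation at the largest representable value. The function names (`fp_quantize`, `sqnr_db`, `best_bias`) are hypothetical, and the bias is chosen by a plain exhaustive search over integer candidates rather than the progressive, SGD-like update described in the abstract.

```python
import math

def fp_quantize(x, exp_bits, man_bits, bias):
    """Round x to the nearest value of a simplified low-bit FP format (assumed layout)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    e_min = 1 - bias                       # smallest normal exponent
    e_max = (2 ** exp_bits - 1) - bias     # largest exponent (nothing reserved for inf/NaN)
    max_val = (2.0 - 2.0 ** -man_bits) * 2.0 ** e_max
    # frexp gives mag = m * 2**ex with 0.5 <= m < 1, so floor(log2(mag)) == ex - 1.
    _, ex = math.frexp(mag)
    e = max(ex - 1, e_min)                 # below 2**e_min, use the fixed subnormal step
    step = 2.0 ** (e - man_bits)           # spacing between representable values
    q = round(mag / step) * step           # round to nearest (ties to even)
    return sign * min(q, max_val)          # saturate on overflow

def sqnr_db(xs, qs):
    """Signal-to-quantization-noise ratio in decibels."""
    signal = sum(x * x for x in xs)
    noise = sum((x - q) ** 2 for x, q in zip(xs, qs))
    return math.inf if noise == 0 else 10.0 * math.log10(signal / noise)

def best_bias(xs, exp_bits, man_bits, candidates):
    """Exhaustive search (an assumption, not the paper's update rule) for the max-SQNR bias."""
    def score(b):
        return sqnr_db(xs, [fp_quantize(x, exp_bits, man_bits, b) for x in xs])
    return max(candidates, key=score)

# Example: small-magnitude values quantized to 1 sign + 2 exponent + 3 mantissa bits.
values = [0.001 * i for i in range(1, 50)]
bias = best_bias(values, exp_bits=2, man_bits=3, candidates=range(-4, 16))
print("selected bias:", bias)
```

Raising the bias shifts the representable range toward smaller magnitudes, so a tensor whose values cluster near zero is better served by a larger bias than a default one; the search above simply picks whichever candidate yields the highest SQNR on the given values.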
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles
