Optimizing Exponent Bias for Sub-8bit Floating-Point Inference of Fine-tuned Transformers

이장환; Choi, Jung wook

doi:10.1109/AICAS54282.2022.9869965

Full metadata record

DC Field	Value	Language
dc.contributor.author	이장환	-
dc.contributor.author	Choi, Jung wook	-
dc.date.accessioned	2022-12-20T10:37:29Z	-
dc.date.available	2022-12-20T10:37:29Z	-
dc.date.issued	2022-06	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/173245	-
dc.description.abstract	Fine-tuned Transformer-based neural networks have demonstrated remarkable success in natural language processing (NLP), at the cost of a substantial computational burden. Post-training quantization (PTQ) is a promising technique for reducing this cost without expensive re-training. However, prior work either demands complex calibration or suffers noticeable accuracy degradation. This paper proposes a practical method for sub-8-bit floating-point (FP) PTQ. The proposed method optimizes the exponent bias to minimize quantization error, measured as signal-to-quantization-noise ratio (SQNR), progressively, in a manner similar to stochastic gradient descent. Our evaluation shows that the proposed method achieves close to full-precision model accuracy for 6- to 8-bit FP PTQ of fine-tuned BERT on GLUE and SQuAD tasks, with negligible run-time overhead.	-
dc.format.extent	4	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	-
dc.title	Optimizing Exponent Bias for Sub-8bit Floating-Point Inference of Fine-tuned Transformers	-
dc.type	Article	-
dc.publisher.location	United States	-
dc.identifier.doi	10.1109/AICAS54282.2022.9869965	-
dc.identifier.scopusid	2-s2.0-85139072464	-
dc.identifier.wosid	000859273200026	-
dc.identifier.bibliographicCitation	Proceeding - IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022, pp 98 - 101	-
dc.citation.title	Proceeding - IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022	-
dc.citation.startPage	98	-
dc.citation.endPage	101	-
dc.type.docType	Proceedings Paper	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalWebOfScienceCategory	Computer Science, Artificial Intelligence	-
dc.relation.journalWebOfScienceCategory	Computer Science, Hardware & Architecture	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.subject.keywordPlus	Digital arithmetic	-
dc.subject.keywordPlus	Natural language processing systems	-
dc.subject.keywordPlus	Optimization	-
dc.subject.keywordPlus	Quantization (signal)	-
dc.subject.keywordPlus	Stochastic systems	-
dc.subject.keywordPlus	Gradient methods	-
dc.subject.keywordPlus	BERT	-
dc.subject.keywordPlus	Exponent bias	-
dc.subject.keywordPlus	Floating points	-
dc.subject.keywordPlus	Natural languages	-
dc.subject.keywordPlus	Neural-networks	-
dc.subject.keywordPlus	Post-training quantization	-
dc.subject.keywordPlus	Quantisation	-
dc.subject.keywordPlus	Reduced precision	-
dc.subject.keywordPlus	Signal to quantization noise ratios	-
dc.subject.keywordPlus	Transformer	-
dc.subject.keywordAuthor	BERT	-
dc.subject.keywordAuthor	exponent bias	-
dc.subject.keywordAuthor	floating-point	-
dc.subject.keywordAuthor	post-training quantization	-
dc.subject.keywordAuthor	reduced-precision	-
dc.subject.keywordAuthor	SQNR	-
dc.subject.keywordAuthor	Transformer	-
dc.identifier.url	https://ieeexplore.ieee.org/document/9869965/	-
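
The abstract describes choosing an exponent bias that minimizes quantization error in terms of SQNR. The following is a minimal sketch of that idea only, not the authors' procedure: it replaces the progressive, SGD-like update with a plain search over integer biases. The number format (sign, exponent, and mantissa bits, subnormals, no codes reserved for inf/NaN) and the names `fp_quantize`, `sqnr_db`, and `best_bias` are assumptions made for illustration.

```python
import numpy as np


def fp_quantize(x, exp_bits, man_bits, bias):
    """Round x to a toy low-bit FP grid with the given exponent bias.

    Assumed format: 1 sign bit, exp_bits exponent bits, man_bits mantissa
    bits, subnormals supported, no exponent codes reserved for inf/NaN.
    """
    x = np.asarray(x, dtype=np.float64)
    sign = np.sign(x)
    mag = np.abs(x)
    e_min = 1 - bias                      # smallest normal exponent
    e_max = (2 ** exp_bits - 1) - bias    # largest exponent
    max_val = (2.0 - 2.0 ** -man_bits) * 2.0 ** e_max
    # Per-element exponent; values below 2**e_min share the subnormal spacing.
    safe = np.where(mag > 0, mag, 1.0)
    e = np.clip(np.floor(np.log2(safe)), e_min, e_max)
    step = 2.0 ** (e - man_bits)
    q = np.round(mag / step) * step
    return sign * np.minimum(q, max_val)  # saturate instead of overflowing


def sqnr_db(x, q):
    """Signal-to-quantization-noise ratio in dB."""
    noise = np.sum((x - q) ** 2)
    return 10.0 * np.log10(np.sum(x ** 2) / max(noise, 1e-30))


def best_bias(x, exp_bits, man_bits, candidates):
    """Search over candidate biases; return the one with the highest SQNR."""
    scores = {b: sqnr_db(x, fp_quantize(x, exp_bits, man_bits, b))
              for b in candidates}
    return max(scores, key=scores.get), scores


# Synthetic stand-in for a weight tensor (not data from the paper).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.05, size=10_000)

# 6-bit format: 1 sign + 3 exponent + 2 mantissa bits.
best, scores = best_bias(x, exp_bits=3, man_bits=2, candidates=range(16))
print("best bias:", best, "SQNR (dB):", round(scores[best], 2))
```

A larger bias shifts the representable range toward smaller magnitudes, so the search trades clipping error on the largest values against resolution on the small ones. The bias could also be chosen per tensor by calling `best_bias` on each tensor separately.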

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Choi, Jung wook: COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)
