Detailed Information


Optimizing Exponent Bias for Sub-8bit Floating-Point Inference of Fine-tuned Transformers

Full metadata record
DC Field: Value
dc.contributor.author: 이장환 (Lee, Janghwan)
dc.contributor.author: Choi, Jung wook
dc.date.accessioned: 2022-12-20T10:37:29Z
dc.date.available: 2022-12-20T10:37:29Z
dc.date.issued: 2022-06
dc.identifier.uri: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/173245
dc.description.abstract: Fine-tuned Transformer-based neural networks have demonstrated remarkable success in natural language processing (NLP) at the cost of a substantial computational burden. Post-training quantization (PTQ) is a promising technique for reducing the computational cost without expensive re-training. However, prior work either demands complex calibration or suffers noticeable accuracy degradation. This paper proposes a practical method for sub-8bit floating-point (FP) PTQ. The proposed method optimizes the exponent bias to minimize quantization error, measured by the signal-to-quantization-noise ratio (SQNR), progressively in a manner similar to stochastic gradient descent. Our evaluation shows that the proposed method achieves close to full-precision model accuracy for 6- to 8-bit FP PTQ of fine-tuned BERT on GLUE and SQuAD tasks with negligible run-time overhead.
dc.format.extent: 4
dc.language: English
dc.language.iso: ENG
dc.publisher: Institute of Electrical and Electronics Engineers Inc.
dc.title: Optimizing Exponent Bias for Sub-8bit Floating-Point Inference of Fine-tuned Transformers
dc.type: Article
dc.publisher.location: United States
dc.identifier.doi: 10.1109/AICAS54282.2022.9869965
dc.identifier.scopusid: 2-s2.0-85139072464
dc.identifier.wosid: 000859273200026
dc.identifier.bibliographicCitation: Proceeding - IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022, pp. 98-101
dc.citation.title: Proceeding - IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022
dc.citation.startPage: 98
dc.citation.endPage: 101
dc.type.docType: Proceedings Paper
dc.description.isOpenAccess: N
dc.description.journalRegisteredClass: scopus
dc.relation.journalResearchArea: Computer Science
dc.relation.journalResearchArea: Engineering
dc.relation.journalWebOfScienceCategory: Computer Science, Artificial Intelligence
dc.relation.journalWebOfScienceCategory: Computer Science, Hardware & Architecture
dc.relation.journalWebOfScienceCategory: Engineering, Electrical & Electronic
dc.subject.keywordPlus: Digital arithmetic
dc.subject.keywordPlus: Natural language processing systems
dc.subject.keywordPlus: Optimization
dc.subject.keywordPlus: Quantization (signal)
dc.subject.keywordPlus: Stochastic systems
dc.subject.keywordPlus: Gradient methods
dc.subject.keywordPlus: BERT
dc.subject.keywordPlus: Exponent bias
dc.subject.keywordPlus: Floating points
dc.subject.keywordPlus: Natural languages
dc.subject.keywordPlus: Neural-networks
dc.subject.keywordPlus: Post-training quantization
dc.subject.keywordPlus: Quantisation
dc.subject.keywordPlus: Reduced precision
dc.subject.keywordPlus: Signal to quantization noise ratios
dc.subject.keywordPlus: Transformer
dc.subject.keywordAuthor: BERT
dc.subject.keywordAuthor: exponent bias
dc.subject.keywordAuthor: floating-point
dc.subject.keywordAuthor: post-training quantization
dc.subject.keywordAuthor: reduced-precision
dc.subject.keywordAuthor: SQNR
dc.subject.keywordAuthor: Transformer
dc.identifier.url: https://ieeexplore.ieee.org/document/9869965/
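
Illustrative sketch (not from the paper)

The abstract describes choosing an exponent bias that minimizes quantization error as measured by SQNR. The Python sketch below is one way to picture that idea; it is not the authors' implementation. It assumes a toy floating-point format with subnormals and no inf/NaN codes, and it swaps the paper's progressive, SGD-like update for a plain exhaustive search over candidate biases. The names fp_quantize, sqnr_db and best_bias are hypothetical.

```python
import math
import random

def fp_quantize(x, exp_bits, man_bits, bias):
    """Round x to the nearest value of a toy FP format (assumed layout:
    subnormals supported, every exponent code usable, no inf/NaN)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    e_min = 1 - bias                      # smallest normal exponent
    e_max = (2 ** exp_bits - 1) - bias    # largest exponent
    max_val = (2.0 - 2.0 ** -man_bits) * 2.0 ** e_max
    # Exponent of mag, clamped to the representable range.
    # Below e_min the spacing stays fixed, which models subnormals.
    e = min(max(math.floor(math.log2(mag)), e_min), e_max)
    step = 2.0 ** (e - man_bits)          # spacing between neighbouring codes
    q = round(mag / step) * step
    return sign * min(q, max_val)         # saturate on overflow

def sqnr_db(xs, qs):
    """Signal-to-quantization-noise ratio in decibels."""
    signal = sum(x * x for x in xs)
    noise = sum((x - q) ** 2 for x, q in zip(xs, qs))
    if noise == 0.0:
        return math.inf
    return 10.0 * math.log10(signal / noise)

def best_bias(xs, exp_bits, man_bits, candidates):
    """Exhaustive search (an assumption of this sketch, not the paper's
    progressive update) for the bias with the highest SQNR."""
    def score(b):
        return sqnr_db(xs, [fp_quantize(x, exp_bits, man_bits, b) for x in xs])
    return max(candidates, key=score)

# Synthetic stand-in for a weight tensor: small-magnitude Gaussian values.
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(2000)]

EXP_BITS, MAN_BITS = 3, 2                 # 1 sign + 3 exponent + 2 mantissa = 6 bits
bias = best_bias(weights, EXP_BITS, MAN_BITS, range(16))
quantized = [fp_quantize(w, EXP_BITS, MAN_BITS, bias) for w in weights]
print("selected bias:", bias)
print("SQNR (dB):", sqnr_db(weights, quantized))
```

In this toy setup, a bias of 0 places the representable range far above the data, so nearly every value rounds to zero; shifting the bias moves the range onto the data, which is the effect the SQNR objective rewards.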
Appears in Collections: College of Engineering (Seoul) > School of Electronic Engineering (Seoul) > 1. Journal Articles