Understanding and Optimizing INT4 Convolution for Accelerated DNN Inference on Tensor Cores

Full metadata record
DC Field | Value | Language
dc.contributor.author | Choi, Junkyeong | -
dc.contributor.author | Kwon, Hyucksung | -
dc.contributor.author | Lee, Woongkyu | -
dc.contributor.author | Lim, Jieun | -
dc.contributor.author | Choi, Jungwook | -
dc.date.accessioned | 2022-12-20T05:05:00Z | -
dc.date.available | 2022-12-20T05:05:00Z | -
dc.date.created | 2022-12-07 | -
dc.date.issued | 2022-11 | -
dc.identifier.issn | 1520-6130 | -
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/172849 | -
dc.description.abstract | Convolution is one of the fundamental operations of deep neural networks with demanding matrix computation. In a graphic processing unit (GPU), Tensor Core is a specialized matrix processing hardware equipped with reduced-precision warp matrix-multiply-accumulate (WMMA) instructions to increase throughput. However, it is challenging to achieve optimal performance since the reduced-precision WMMA requires many elements grouped as a matrix operand, seriously limiting data reuse and imposing packing and layout overhead on the schedule. This work proposes three techniques to enhance INT4 WMMA utilization on Tensor Cores: duplicate-aware load for increasing the reuse of convolution input, register-level packing for alleviating overhead of handling INT4 data, and data layout optimization for coalesced data transfer. The proposed INT4 WMMA optimization techniques are evaluated on convolution operations of popular neural networks to demonstrate substantial speedup on Tensor Core compared to the state of the art. | -
dc.language | English | -
dc.language.iso | en | -
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | -
dc.title | Understanding and Optimizing INT4 Convolution for Accelerated DNN Inference on Tensor Cores | -
dc.type | Article | -
dc.contributor.affiliatedAuthor | Choi, Jungwook | -
dc.identifier.doi | 10.1109/SiPS55645.2022.9919243 | -
dc.identifier.scopusid | 2-s2.0-85141793386 | -
dc.identifier.wosid | 001081960800004 | -
dc.identifier.bibliographicCitation | IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation, v.2022-November, pp.1 - 6 | -
dc.relation.isPartOf | IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation | -
dc.citation.title | IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation | -
dc.citation.volume | 2022-November | -
dc.citation.startPage | 1 | -
dc.citation.endPage | 6 | -
dc.type.rims | ART | -
dc.type.docType | Proceedings Paper | -
dc.description.journalClass | 1 | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalResearchArea | Engineering | -
dc.relation.journalResearchArea | Telecommunications | -
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | -
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | -
dc.relation.journalWebOfScienceCategory | Telecommunications | -
dc.subject.keywordPlus | Data reduction | -
dc.subject.keywordPlus | Data transfer | -
dc.subject.keywordPlus | Deep neural networks | -
dc.subject.keywordPlus | Graphics processing unit | -
dc.subject.keywordPlus | Matrix algebra | -
dc.subject.keywordPlus | Tensors | -
dc.subject.keywordPlus | Convolution | -
dc.subject.keywordPlus | Fundamental operations | -
dc.subject.keywordPlus | matrix | -
dc.subject.keywordPlus | Matrix computation | -
dc.subject.keywordPlus | Matrix multiply | -
dc.subject.keywordPlus | Multiplyaccumulate (MAC) | -
dc.subject.keywordPlus | Processing hardware | -
dc.subject.keywordPlus | Reduced precision | -
dc.subject.keywordPlus | Reduced precision DNN | -
dc.subject.keywordPlus | Tensor core | -
dc.subject.keywordPlus | Unit tensor | -
dc.subject.keywordAuthor | convolution | -
dc.subject.keywordAuthor | reduced precision DNN | -
dc.subject.keywordAuthor | tensor core | -
dc.identifier.url | https://ieeexplore.ieee.org/document/9919243 | -
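
The abstract mentions the overhead of handling INT4 data and of packing it into matrix operands. This record does not describe the authors' kernels, so the following is only a minimal sketch of the general idea behind INT4 packing: eight signed 4-bit values are stored in one 32-bit word. The function names `pack_int4` and `unpack_int4` and the nibble order (lowest index in the lowest bits) are assumptions made for illustration, not details taken from the paper.

```python
def pack_int4(values):
    """Pack eight signed 4-bit integers (range -8..7) into one 32-bit word.

    Assumed layout (illustrative only): element i occupies bits 4*i .. 4*i+3.
    """
    if len(values) != 8:
        raise ValueError("expected exactly 8 values")
    word = 0
    for i, v in enumerate(values):
        if not -8 <= v <= 7:
            raise ValueError(f"value {v} does not fit in signed INT4")
        # Keep the low 4 bits of the two's-complement representation.
        word |= (v & 0xF) << (4 * i)
    return word


def unpack_int4(word):
    """Recover the eight signed 4-bit integers from a packed 32-bit word."""
    out = []
    for i in range(8):
        nibble = (word >> (4 * i)) & 0xF
        # Nibbles 8..15 encode the negative values -8..-1.
        out.append(nibble - 16 if nibble >= 8 else nibble)
    return out


if __name__ == "__main__":
    vals = [1, -1, 7, -8, 0, 3, -4, 2]
    packed = pack_int4(vals)
    print(hex(packed), unpack_int4(packed))
```

On a GPU the same shifts and masks would run on 32-bit registers, which is why the cost of packing matters when every operand fragment holds many 4-bit elements.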
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Choi, Jung wook
College of Engineering (School of Electronic Engineering)
