TernGEMM: GEneral Matrix Multiply Library with Ternary Weights for Fast DNN Inference
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Choi, Seokhyeon | - |
dc.contributor.author | Shim, Kyuhong | - |
dc.contributor.author | Choi, Jungwook | - |
dc.contributor.author | Sung, Wonyong | - |
dc.contributor.author | Shim, Byonghyo | - |
dc.date.accessioned | 2022-07-06T11:33:47Z | - |
dc.date.available | 2022-07-06T11:33:47Z | - |
dc.date.created | 2022-01-26 | - |
dc.date.issued | 2021-11 | - |
dc.identifier.issn | 1520-6130 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/140385 | - |
dc.description.abstract | Efficient implementation of deep neural networks on CPU-based systems is critical as applications proliferate to embedded and Internet of Things (IoT) devices. Many CPUs for personal computers and embedded systems are equipped with Single Instruction Multiple Data (SIMD) instructions, which can be used to build an efficient GEneral Matrix Multiply (GEMM) library, a key component of efficient deep neural network inference. While many deep neural networks perform well even at 1-bit or 2-bit precision, current CPU instructions and libraries do not efficiently support arithmetic below 8 bits. We propose TernGEMM, a special GEMM library built on SIMD instructions for Deep Neural Network (DNN) inference with ternary weights and sub-8-bit activations. TernGEMM improves speed by replacing slow multiply-add operations with logical operations and by accumulating many partial products without bit-expansion operations. We compared the speedup of TernGEMM with tiling optimization against GEMMLowp, an 8-bit precision GEMM library. On an Intel CPU, speedups of ×2.052, ×2.973, and ×2.986 are achieved on ResNet-50, MobileNet-V2, and EfficientNet-B0, respectively; on an ARM CPU, TernGEMM's speedups are ×2.143, ×1.765, and ×1.856, respectively. (An illustrative sketch of the ternary logical-operation idea follows the metadata table below.) | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.title | TernGEMM: GEneral Matrix Multiply Library with Ternary Weights for Fast DNN Inference | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Choi, Jungwook | - |
dc.identifier.doi | 10.1109/SiPS52927.2021.00028 | - |
dc.identifier.scopusid | 2-s2.0-85122863833 | - |
dc.identifier.wosid | 000783958100020 | - |
dc.identifier.bibliographicCitation | IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation, v.2021, no.October, pp.111 - 116 | - |
dc.relation.isPartOf | IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation | - |
dc.citation.title | IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation | - |
dc.citation.volume | 2021 | - |
dc.citation.number | October | - |
dc.citation.startPage | 111 | - |
dc.citation.endPage | 116 | - |
dc.type.rims | ART | - |
dc.type.docType | Proceedings Paper | - |
dc.description.journalClass | 1 | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalResearchArea | Telecommunications | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.relation.journalWebOfScienceCategory | Telecommunications | - |
dc.subject.keywordPlus | Deep neural networks | - |
dc.subject.keywordPlus | Embedded systems | - |
dc.subject.keywordPlus | Internet of things | - |
dc.subject.keywordPlus | Personal computers | - |
dc.subject.keywordPlus | Program processors | - |
dc.subject.keywordPlus | Bit precision | - |
dc.subject.keywordPlus | Efficient implementation | - |
dc.subject.keywordPlus | Embedded-system | - |
dc.subject.keywordPlus | Implementation | - |
dc.subject.keywordPlus | Inference | - |
dc.subject.keywordPlus | MAtrix multiplication | - |
dc.subject.keywordPlus | Matrix multiply | - |
dc.subject.keywordPlus | Network inference | - |
dc.subject.keywordPlus | Performance | - |
dc.subject.keywordPlus | Single instruction multiple data instructions | - |
dc.subject.keywordPlus | Matrix algebra | - |
dc.subject.keywordAuthor | Deep neural networks | - |
dc.subject.keywordAuthor | Implementation | - |
dc.subject.keywordAuthor | Inference | - |
dc.subject.keywordAuthor | Matrix multiplication | - |
dc.identifier.url | https://ieeexplore.ieee.org/document/9605039 | - |
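The abstract describes replacing slow multiply-adds with logical operations for ternary values and accumulating many partial products before any bit expansion. The scalar C sketch below is a minimal illustration of that general idea under assumptions of my own: a hypothetical two-bit-plane encoding of ternary values as an "nz" (nonzero) bit and a "sgn" (negative) bit, with popcount-based accumulation. It is not the TernGEMM kernel itself, which the abstract says is built on SIMD instructions rather than scalar 64-bit words.

```c
/*
 * Illustrative sketch only (not the TernGEMM kernel): a scalar, bit-packed
 * ternary dot product in the spirit of the abstract, i.e. multiply-adds
 * replaced by logical operations, with many partial products reduced by
 * popcount before any widening into a 32-bit accumulator.
 *
 * Encoding assumption (hypothetical): each ternary value v in {-1, 0, +1}
 * is stored across two bit-planes, "nz" (v != 0) and "sgn" (v < 0).
 */
#include <stdint.h>
#include <stdio.h>

/* Portable 64-bit popcount (compiler builtin when available). */
static inline int popcount64(uint64_t x) {
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_popcountll(x);
#else
    int c = 0;
    while (x) { x &= x - 1; ++c; }
    return c;
#endif
}

/* Dot product of 64 ternary activations with 64 ternary weights.
 * product is nonzero  <=>  a_nz AND w_nz
 * product is negative <=>  a_sgn XOR w_sgn   (where nonzero)
 * result = #(positive products) - #(negative products)                */
static int32_t ternary_dot64(uint64_t a_nz, uint64_t a_sgn,
                             uint64_t w_nz, uint64_t w_sgn) {
    uint64_t nz  = a_nz & w_nz;     /* lanes that contribute to the sum   */
    uint64_t neg = a_sgn ^ w_sgn;   /* lanes whose product would be -1    */
    return (int32_t)(popcount64(nz & ~neg) - popcount64(nz & neg));
}

/* Pack up to 64 values from {-1, 0, +1} into the two bit-planes. */
static void pack_ternary(const int8_t *v, int n, uint64_t *nz, uint64_t *sgn) {
    *nz = 0; *sgn = 0;
    for (int i = 0; i < n; ++i) {
        if (v[i] != 0) *nz  |= (uint64_t)1 << i;
        if (v[i] <  0) *sgn |= (uint64_t)1 << i;
    }
}

int main(void) {
    int8_t a[64], w[64];
    int32_t ref = 0;
    for (int i = 0; i < 64; ++i) {            /* deterministic toy data   */
        a[i] = (int8_t)((i % 3) - 1);         /* -1, 0, +1, -1, ...       */
        w[i] = (int8_t)(((i + 1) % 3) - 1);   /* shifted ternary pattern  */
        ref += a[i] * w[i];                   /* reference multiply-add   */
    }
    uint64_t a_nz, a_sgn, w_nz, w_sgn;
    pack_ternary(a, 64, &a_nz, &a_sgn);
    pack_ternary(w, 64, &w_nz, &w_sgn);
    printf("logical-op dot = %d, reference dot = %d\n",
           (int)ternary_dot64(a_nz, a_sgn, w_nz, w_sgn), (int)ref);
    return 0;
}
```

The point of the sketch is that, with this encoding, a whole word of ternary products collapses to an AND, an XOR, and two popcounts, so the result of many multiplications is accumulated before anything is widened. This is one way to read the abstract's claim of "accumulating a number of multiplications without bit expansion operations"; the paper's actual SIMD scheme may differ.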