Detailed Information

TernGEMM: GEneral Matrix Multiply Library with Ternary Weights for Fast DNN Inference

Authors
Choi, Seokhyeon; Shim, Kyuhong; Choi, Jungwook; Sung, Wonyong; Shim, Byonghyo
Issue Date
Nov-2021
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
Deep neural networks; Implementation; Inference; Matrix multiplication
Citation
IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation, v.2021, no.October, pp.111 - 116
Indexed
SCOPUS
Journal Title
IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation
Volume
2021
Number
October
Start Page
111
End Page
116
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/140385
DOI
10.1109/SiPS52927.2021.00028
ISSN
1520-6130
Abstract
Efficient implementation of deep neural networks on CPU-based systems is critical as applications proliferate to embedded and Internet of Things (IoT) systems. Many CPUs for personal computers and embedded systems provide Single Instruction Multiple Data (SIMD) instructions, which can be used to implement an efficient GEneral Matrix Multiply (GEMM) library, a necessary component of efficient deep neural network implementation. While many deep neural networks perform well even at 1-bit or 2-bit precision, current CPU instructions and libraries do not efficiently support arithmetic operations below 8 bits. We propose TernGEMM, a specialized GEMM library using SIMD instructions for Deep Neural Network (DNN) inference with ternary weights and activations under 8 bits. TernGEMM improves speed by replacing slow multiply-add with logical operations and by accumulating a number of multiplications without bit expansion operations. We compared the speedup of TernGEMM with tiling optimization and GEMMLowp, an 8-bit precision GEMM library. On an Intel CPU, speedups of ×2.052, ×2.973, and ×2.986 are achieved on ResNet-50, MobileNet-V2, and EfficientNet-B0, respectively. On an ARM CPU, TernGEMM's speedups are ×2.143, ×1.765, and ×1.856, respectively.
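
The idea of replacing multiply-add with logical operations can be illustrated with a minimal scalar sketch. This is not the TernGEMM kernel, which the abstract describes only at a high level and which uses SIMD instructions; the function names below (`ternary_dot`, `ternary_gemm`, `max_terms_in_int8_lane`) are hypothetical and chosen for illustration. The sketch assumes weights in {-1, 0, +1} and small integer activations, and the lane-capacity helper assumes unsigned activations accumulated in a signed 8-bit lane.

```python
def ternary_dot(acts, weights):
    """Dot product with ternary weights using masks instead of multiplies.

    Assumption: every weight is -1, 0, or +1.
    """
    acc = 0
    for a, w in zip(acts, weights):
        pos = -(w == 1)    # all-ones mask (-1) when w == +1, else 0
        neg = -(w == -1)   # all-ones mask (-1) when w == -1, else 0
        # (a & pos) keeps a where w == +1; (a & neg) keeps a where w == -1.
        acc += (a & pos) - (a & neg)
    return acc


def ternary_gemm(A, W):
    """Compute A (M x K activations) times W (K x N ternary weights)."""
    K, N = len(W), len(W[0])
    cols = [[W[k][n] for k in range(K)] for n in range(N)]
    return [[ternary_dot(row, col) for col in cols] for row in A]


def max_terms_in_int8_lane(act_bits):
    """How many ternary products fit in a signed 8-bit lane without overflow.

    Assumption: unsigned act_bits-bit activations, so each product has
    magnitude at most 2**act_bits - 1 and the running sum must stay
    within [-127, 127].
    """
    return 127 // (2 ** act_bits - 1)


if __name__ == "__main__":
    print(ternary_dot([3, 5, 2, 7], [1, -1, 0, 1]))  # 3 - 5 + 7 = 5
    print(max_terms_in_int8_lane(2))                  # 127 // 3 = 42
```

Under these assumptions, the narrower the activations, the more products can be summed in a narrow lane before widening to 16 or 32 bits, which is one plausible reading of "accumulating a number of multiplications without bit expansion operations."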
Appears in
Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Choi, Jung wook
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)