TernGEMM: GEneral Matrix Multiply Library with Ternary Weights for Fast DNN Inference
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Choi, Seokhyeon | - |
dc.contributor.author | Shim, Kyuhong | - |
dc.contributor.author | Choi, Jungwook | - |
dc.contributor.author | Sung, Wonyong | - |
dc.contributor.author | Shim, Byonghyo | - |
dc.date.accessioned | 2022-07-06T11:33:47Z | - |
dc.date.available | 2022-07-06T11:33:47Z | - |
dc.date.created | 2022-01-26 | - |
dc.date.issued | 2021-11 | - |
dc.identifier.issn | 1520-6130 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/140385 | - |
dc.description.abstract | Efficient implementation of deep neural networks on CPU-based systems is critical as applications proliferate to embedded and Internet of Things (IoT) devices. Many CPUs for personal computers and embedded systems are equipped with Single Instruction Multiple Data (SIMD) instructions, which can be used to build an efficient GEneral Matrix Multiply (GEMM) library, a key component of efficient deep neural network inference. While many deep neural networks perform well even at 1-bit or 2-bit precision, current CPU instructions and libraries do not efficiently support arithmetic below 8 bits. We propose TernGEMM, a special GEMM library built on SIMD instructions for Deep Neural Network (DNN) inference with ternary weights and sub-8-bit activations. TernGEMM improves speed by replacing slow multiply-add operations with logical operations and by accumulating many partial products without bit-expansion operations. We compared the speedup of TernGEMM with tiling optimization against GEMMLowp, an 8-bit precision GEMM library. On an Intel CPU, speedups of ×2.052, ×2.973, and ×2.986 are achieved on ResNet-50, MobileNet-V2, and EfficientNet-B0, respectively; on an ARM CPU, TernGEMM's speedups are ×2.143, ×1.765, and ×1.856, respectively. (An illustrative sketch of the ternary logical-operation idea follows the metadata table below.) | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.title | TernGEMM: GEneral Matrix Multiply Library with Ternary Weights for Fast DNN Inference | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Choi, Jungwook | - |
dc.identifier.doi | 10.1109/SiPS52927.2021.00028 | - |
dc.identifier.scopusid | 2-s2.0-85122863833 | - |
dc.identifier.wosid | 000783958100020 | - |
dc.identifier.bibliographicCitation | IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation, v.2021, no.October, pp.111 - 116 | - |
dc.relation.isPartOf | IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation | - |
dc.citation.title | IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation | - |
dc.citation.volume | 2021 | - |
dc.citation.number | October | - |
dc.citation.startPage | 111 | - |
dc.citation.endPage | 116 | - |
dc.type.rims | ART | - |
dc.type.docType | Proceedings Paper | - |
dc.description.journalClass | 1 | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalResearchArea | Telecommunications | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.relation.journalWebOfScienceCategory | Telecommunications | - |
dc.subject.keywordPlus | Deep neural networks | - |
dc.subject.keywordPlus | Embedded systems | - |
dc.subject.keywordPlus | Internet of things | - |
dc.subject.keywordPlus | Personal computers | - |
dc.subject.keywordPlus | Program processors | - |
dc.subject.keywordPlus | Bit precision | - |
dc.subject.keywordPlus | Efficient implementation | - |
dc.subject.keywordPlus | Embedded-system | - |
dc.subject.keywordPlus | Implementation | - |
dc.subject.keywordPlus | Inference | - |
dc.subject.keywordPlus | MAtrix multiplication | - |
dc.subject.keywordPlus | Matrix multiply | - |
dc.subject.keywordPlus | Network inference | - |
dc.subject.keywordPlus | Performance | - |
dc.subject.keywordPlus | Single instruction multiple data instructions | - |
dc.subject.keywordPlus | Matrix algebra | - |
dc.subject.keywordAuthor | Deep neural networks | - |
dc.subject.keywordAuthor | Implementation | - |
dc.subject.keywordAuthor | Inference | - |
dc.subject.keywordAuthor | Matrix multiplication | - |
dc.identifier.url | https://ieeexplore.ieee.org/document/9605039 | - |
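The abstract describes replacing slow multiply-adds with logical operations for ternary values and accumulating many partial products before any bit expansion. The scalar C sketch below is a minimal illustration of that general idea under assumptions of my own: a hypothetical two-bit-plane encoding of ternary values as an "nz" (nonzero) bit and a "sgn" (negative) bit, with popcount-based accumulation. It is not the TernGEMM kernel itself, which the abstract says is built on SIMD instructions rather than scalar 64-bit words.

```c
/*
 * Illustrative sketch only (not the TernGEMM kernel): a scalar, bit-packed
 * ternary dot product in the spirit of the abstract, i.e. multiply-adds
 * replaced by logical operations, with many partial products reduced by
 * popcount before any widening into a 32-bit accumulator.
 *
 * Encoding assumption (hypothetical): each ternary value v in {-1, 0, +1}
 * is stored across two bit-planes, "nz" (v != 0) and "sgn" (v < 0).
 */
#include <stdint.h>
#include <stdio.h>

/* Portable 64-bit popcount (compiler builtin when available). */
static inline int popcount64(uint64_t x) {
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_popcountll(x);
#else
    int c = 0;
    while (x) { x &= x - 1; ++c; }
    return c;
#endif
}

/* Dot product of 64 ternary activations with 64 ternary weights.
 * product is nonzero  <=>  a_nz AND w_nz
 * product is negative <=>  a_sgn XOR w_sgn   (where nonzero)
 * result = #(positive products) - #(negative products)                */
static int32_t ternary_dot64(uint64_t a_nz, uint64_t a_sgn,
                             uint64_t w_nz, uint64_t w_sgn) {
    uint64_t nz  = a_nz & w_nz;     /* lanes that contribute to the sum   */
    uint64_t neg = a_sgn ^ w_sgn;   /* lanes whose product would be -1    */
    return (int32_t)(popcount64(nz & ~neg) - popcount64(nz & neg));
}

/* Pack up to 64 values from {-1, 0, +1} into the two bit-planes. */
static void pack_ternary(const int8_t *v, int n, uint64_t *nz, uint64_t *sgn) {
    *nz = 0; *sgn = 0;
    for (int i = 0; i < n; ++i) {
        if (v[i] != 0) *nz  |= (uint64_t)1 << i;
        if (v[i] <  0) *sgn |= (uint64_t)1 << i;
    }
}

int main(void) {
    int8_t a[64], w[64];
    int32_t ref = 0;
    for (int i = 0; i < 64; ++i) {            /* deterministic toy data   */
        a[i] = (int8_t)((i % 3) - 1);         /* -1, 0, +1, -1, ...       */
        w[i] = (int8_t)(((i + 1) % 3) - 1);   /* shifted ternary pattern  */
        ref += a[i] * w[i];                   /* reference multiply-add   */
    }
    uint64_t a_nz, a_sgn, w_nz, w_sgn;
    pack_ternary(a, 64, &a_nz, &a_sgn);
    pack_ternary(w, 64, &w_nz, &w_sgn);
    printf("logical-op dot = %d, reference dot = %d\n",
           (int)ternary_dot64(a_nz, a_sgn, w_nz, w_sgn), (int)ref);
    return 0;
}
```

The point of the sketch is that, with this encoding, a whole word of ternary products collapses to an AND, an XOR, and two popcounts, so the result of many multiplications is accumulated before anything is widened. This is one way to read the abstract's claim of "accumulating a number of multiplications without bit expansion operations"; the paper's actual SIMD scheme may differ.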