BitBlade: Energy-Efficient Variable Bit-Precision Hardware Accelerator for Quantized Neural Networks

Full metadata record
DC Field: Value
dc.contributor.author: Ryu, Sungju
dc.contributor.author: Kim, Hyungjun
dc.contributor.author: Yi, Wooseok
dc.contributor.author: Kim, Eunhwan
dc.contributor.author: Kim, Yulhwa
dc.contributor.author: Kim, Taesu
dc.contributor.author: Kim, Jae-Joon
dc.date.accessioned: 2022-02-22T07:40:02Z
dc.date.available: 2022-02-22T07:40:02Z
dc.date.created: 2022-02-22
dc.date.issued: 2022-06
dc.identifier.issn: 0018-9200
dc.identifier.uri: http://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/41722
dc.description.abstract: We introduce an area- and energy-efficient precision-scalable neural-network accelerator architecture. Previous precision-scalable hardware accelerators suffer from limitations such as under-utilization of multipliers for low bit-width operations and large area overhead to support various bit precisions. To mitigate these problems, we first propose a bitwise summation scheme, which reduces the area overhead of bit-width scaling. In addition, we present a channel-wise aligning scheme (CAS) to efficiently fetch inputs and weights from on-chip SRAM buffers, and a channel-first and pixel-last tiling (CFPL) scheme to maximize the utilization of multipliers across various kernel sizes. A test chip was implemented in 28-nm CMOS technology, and the experimental results show that the throughput and energy efficiency of our chip are up to 7.7x and 1.64x higher, respectively, than those of state-of-the-art designs. Moreover, an additional 1.5-3.4x throughput gain can be achieved using the CFPL method compared to the CAS.
dc.language: English
dc.language.iso: en
dc.publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.relation.isPartOf: IEEE JOURNAL OF SOLID-STATE CIRCUITS
dc.title: BitBlade: Energy-Efficient Variable Bit-Precision Hardware Accelerator for Quantized Neural Networks
dc.type: Article
dc.identifier.doi: 10.1109/JSSC.2022.3141050
dc.type.rims: ART
dc.identifier.bibliographicCitation: IEEE JOURNAL OF SOLID-STATE CIRCUITS, v.57, no.6, pp.1924 - 1935
dc.description.journalClass: 1
dc.identifier.wosid: 000748324200001
dc.identifier.scopusid: 2-s2.0-85123683357
dc.citation.endPage: 1935
dc.citation.number: 6
dc.citation.startPage: 1924
dc.citation.title: IEEE JOURNAL OF SOLID-STATE CIRCUITS
dc.citation.volume: 57
dc.contributor.affiliatedAuthor: Ryu, Sungju
dc.type.docType: Article
dc.description.isOpenAccess: N
dc.subject.keywordAuthor: Computer architecture
dc.subject.keywordAuthor: Neural networks
dc.subject.keywordAuthor: Hardware acceleration
dc.subject.keywordAuthor: Adders
dc.subject.keywordAuthor: Arrays
dc.subject.keywordAuthor: Random access memory
dc.subject.keywordAuthor: Throughput
dc.subject.keywordAuthor: Bit-precision scaling
dc.subject.keywordAuthor: bitwise summation
dc.subject.keywordAuthor: channel-first and pixel-last tiling (CFPL)
dc.subject.keywordAuthor: channel-wise aligning
dc.subject.keywordAuthor: deep neural network
dc.subject.keywordAuthor: hardware accelerator
dc.subject.keywordAuthor: multiply-accumulate unit
dc.relation.journalResearchArea: Engineering
dc.relation.journalWebOfScienceCategory: Engineering, Electrical & Electronic
dc.description.journalRegisteredClass: scie
dc.description.journalRegisteredClass: scopus
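The bitwise summation idea mentioned in the abstract can be illustrated numerically: when k-bit operands are decomposed into small bit slices, every slice-pair product that shares the same total shift amount can be accumulated first, so the shift-add logic is applied once per shift group rather than once per partial product. The sketch below is a minimal software illustration of that decomposition under assumed parameters (2-bit slices, unsigned operands, hypothetical function names); it is not the chip's actual datapath.

```python
# Hedged sketch: variable bit-precision multiply via bit-slice decomposition.
# Slice-pair products sharing the same total shift are summed first, and the
# shift is applied once per group (the high-level "bitwise summation" idea).

def slices(x, k, s=2):
    """Split a k-bit unsigned value into k//s slices of s bits, LSB first."""
    return [(x >> (s * i)) & ((1 << s) - 1) for i in range(k // s)]

def bitwise_sum_mul(a, b, k, s=2):
    """Multiply two k-bit unsigned ints by grouping slice products by shift."""
    a_sl, b_sl = slices(a, k, s), slices(b, k, s)
    groups = {}
    for i, ai in enumerate(a_sl):
        for j, bj in enumerate(b_sl):
            shift = s * (i + j)           # total shift of this partial product
            groups[shift] = groups.get(shift, 0) + ai * bj
    # Apply each shift once per group instead of once per partial product.
    return sum(psum << sh for sh, psum in groups.items())

# Sanity check: the grouped shift-add reproduces the exact product.
assert bitwise_sum_mul(200, 113, 8) == 200 * 113
```

Because the 2-bit slice products are precision-independent, the same multiplier array can serve 2-, 4-, or 8-bit operands by changing only how many slices are fed in, which is the essence of precision scalability.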
Files in This Item: there are no files associated with this item.
Appears in Collections: College of Information Technology > ETC > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
