Power-Efficient Deep Neural Network Accelerator Minimizing Global Buffer Access without Data Transfer between Neighboring Multiplier-Accumulator Units
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Lee, Jeonghyeok | - |
| dc.contributor.author | Han, Sangwook | - |
| dc.contributor.author | Choi, Seungwon | - |
| dc.contributor.author | Choi, Jungwook | - |
| dc.date.accessioned | 2024-12-20T06:38:37Z | - |
| dc.date.available | 2024-12-20T06:38:37Z | - |
| dc.date.issued | 2022-07 | - |
| dc.identifier.issn | 2079-9292 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/203152 | - |
| dc.description.abstract | This paper presents a novel method for minimizing the power consumed by the weight data movements required for a convolutional operation performed on a two-dimensional multiplier-accumulator (MAC) array of a deep neural-network accelerator. The proposed technique employs a local register file (LRF) at each MAC unit so that, once weight pixels are read from the global buffer into the LRF, they are reused from the LRF as many times as needed instead of being repeatedly fetched from the global buffer in each convolutional operation. A key merit of the proposed method is that it requires no data transfer between neighboring MAC units. Our simulations show that the proposed method provides power savings of approximately 83.33% and 97.62% relative to the conventional methods compared, respectively, when the input data matrix is 128 × 128 and the weight matrix is 5 × 5. The power savings increase as the dimensions of the input data matrix or the weight matrix increase. | - |
| dc.format.extent | 12 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | MDPI | - |
| dc.title | Power-Efficient Deep Neural Network Accelerator Minimizing Global Buffer Access without Data Transfer between Neighboring Multiplier-Accumulator Units | - |
| dc.type | Article | - |
| dc.publisher.location | Switzerland | - |
| dc.identifier.doi | 10.3390/electronics11131996 | - |
| dc.identifier.scopusid | 2-s2.0-85132716541 | - |
| dc.identifier.wosid | 000825627500001 | - |
| dc.identifier.bibliographicCitation | ELECTRONICS, v.11, no.13, pp 1 - 12 | - |
| dc.citation.title | ELECTRONICS | - |
| dc.citation.volume | 11 | - |
| dc.citation.number | 13 | - |
| dc.citation.startPage | 1 | - |
| dc.citation.endPage | 12 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | Y | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalResearchArea | Engineering | - |
| dc.relation.journalResearchArea | Physics | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
| dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
| dc.relation.journalWebOfScienceCategory | Physics, Applied | - |
| dc.subject.keywordPlus | COPROCESSOR | - |
| dc.subject.keywordAuthor | deep learning accelerator | - |
| dc.subject.keywordAuthor | field-programmable gate array (FPGA) | - |
| dc.subject.keywordAuthor | deep neural networks (DNNs) | - |
| dc.identifier.url | https://www.mdpi.com/2079-9292/11/13/1996 | - |
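
The weight-reuse idea summarized in the abstract can be illustrated with a rough count of global-buffer weight reads. The sketch below is not the paper's power model and does not reproduce its 83.33% or 97.62% figures. It assumes a single MAC unit, stride 1, no padding, and a hypothetical baseline that re-fetches every weight for every output position; the function names are illustrative only.

```python
def weight_reads_refetch(input_size: int, kernel_size: int) -> int:
    """Global-buffer weight reads when every weight is re-fetched
    for each output position (hypothetical baseline)."""
    # Assumption: stride 1 and no padding, so the output is
    # (input_size - kernel_size + 1) positions per side.
    out_side = input_size - kernel_size + 1
    return out_side * out_side * kernel_size * kernel_size


def weight_reads_lrf(kernel_size: int) -> int:
    """Global-buffer weight reads when each weight is loaded once
    into a local register file (LRF) and then reused from there."""
    return kernel_size * kernel_size


# Dimensions quoted in the abstract: 128 x 128 input, 5 x 5 weights.
baseline = weight_reads_refetch(128, 5)
reused = weight_reads_lrf(5)
print(f"re-fetch baseline: {baseline} reads, LRF reuse: {reused} reads")
```

Under these assumptions the gap between the two counts widens as either the input or the weight matrix grows, which matches the trend the abstract reports in qualitative terms.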
