Convergence-aware neural network training
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Oh, Hyungjun | - |
| dc.contributor.author | Yu, Yongseung | - |
| dc.contributor.author | Ryu, Giha | - |
| dc.contributor.author | Ahn, Gunjoo | - |
| dc.contributor.author | Jeong, Yuri | - |
| dc.contributor.author | Park, Yongjun | - |
| dc.contributor.author | Seo, Jiwon | - |
| dc.date.accessioned | 2022-07-07T22:13:51Z | - |
| dc.date.available | 2022-07-07T22:13:51Z | - |
| dc.date.issued | 2020-07 | - |
| dc.identifier.issn | 0738-100X | - |
| dc.identifier.issn | 0146-7123 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/145403 | - |
| dc.description.abstract | Training a deep neural network (DNN) is expensive, requiring a large amount of computation time. While the training overhead is high, not all computation in DNN training is equal. Some parameters converge faster, and thus their gradient computation may contribute little to the parameter update; at near-stationary points, a subset of parameters may change very little. In this paper, we exploit parameter convergence to optimize gradient computation in DNN training. We design a lightweight monitoring technique to track parameter convergence, and we stochastically prune the gradient computation for groups of semantically related parameters, exploiting their convergence correlations. These techniques are efficiently implemented in existing GPU kernels. In our evaluation, the optimization techniques substantially and robustly improve the training throughput for four DNN models on three public datasets. | - |
| dc.format.extent | 6 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.title | Convergence-aware neural network training | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.1109/DAC18072.2020.9218518 | - |
| dc.identifier.scopusid | 2-s2.0-85093932138 | - |
| dc.identifier.bibliographicCitation | Proceedings - Design Automation Conference, v.2020-July, pp 1 - 6 | - |
| dc.citation.title | Proceedings - Design Automation Conference | - |
| dc.citation.volume | 2020-July | - |
| dc.citation.startPage | 1 | - |
| dc.citation.endPage | 6 | - |
| dc.type.docType | Conference Paper | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.subject.keywordPlus | Computer aided design | - |
| dc.subject.keywordPlus | Deep neural networks | - |
| dc.subject.keywordPlus | Parameter estimation | - |
| dc.subject.keywordPlus | Computation time | - |
| dc.subject.keywordPlus | Gradient computation | - |
| dc.subject.keywordPlus | Monitoring techniques | - |
| dc.subject.keywordPlus | Neural network training | - |
| dc.subject.keywordPlus | Optimization techniques | - |
| dc.subject.keywordPlus | Parameter convergence | - |
| dc.subject.keywordPlus | Training overhead | - |
| dc.subject.keywordPlus | Training throughputs | - |
| dc.subject.keywordPlus | Neural networks | - |
| dc.identifier.url | https://ieeexplore.ieee.org/document/9218518 | - |
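
The abstract describes two ideas: tracking how far each group of parameters has converged, and stochastically skipping gradient computation for groups that have nearly stopped changing. The record does not give the paper's actual monitoring rule or pruning schedule, so the following is only a minimal sketch under stated assumptions. The class name `ConvergenceMonitor`, the exponential moving average of update magnitudes, and the `decay`, `threshold`, and `max_skip_prob` parameters are all hypothetical choices made for illustration, not the authors' method.

```python
import random


class ConvergenceMonitor:
    """Hypothetical per-group convergence tracker (not the paper's algorithm).

    Assumption: convergence is approximated by an exponential moving
    average (EMA) of the magnitude of recent parameter updates per group.
    """

    def __init__(self, groups, decay=0.9, threshold=1e-3,
                 max_skip_prob=0.9, seed=0):
        self.ema = {g: None for g in groups}   # None until first observation
        self.decay = decay
        self.threshold = threshold
        self.max_skip_prob = max_skip_prob
        self.rng = random.Random(seed)         # seeded for reproducibility

    def observe(self, group, update_magnitude):
        """Fold the latest update magnitude for `group` into its EMA."""
        prev = self.ema[group]
        if prev is None:
            self.ema[group] = update_magnitude
        else:
            self.ema[group] = (self.decay * prev
                               + (1.0 - self.decay) * update_magnitude)

    def skip_probability(self, group):
        """Return 0 for active groups; grow toward max_skip_prob as EMA -> 0."""
        ema = self.ema[group]
        if ema is None or ema >= self.threshold:
            return 0.0
        return self.max_skip_prob * (1.0 - ema / self.threshold)

    def should_skip(self, group):
        """Stochastically decide whether to skip this group's gradient."""
        return self.rng.random() < self.skip_probability(group)


# Usage: one group has nearly converged, the other is still changing.
monitor = ConvergenceMonitor(["conv1", "fc"])
for _ in range(20):
    monitor.observe("conv1", 1e-5)   # tiny updates: close to converged
    monitor.observe("fc", 1.0)       # large updates: still training

skipped = sum(monitor.should_skip("conv1") for _ in range(1000))
```

Keeping the skip decision stochastic, rather than freezing a group outright, means a group that looks converged still receives occasional gradient updates, so its statistics can recover if it starts changing again. How the paper groups parameters and integrates the decision into GPU kernels is not stated in this record.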
