Detailed Information

Cited 0 times in Web of Science; cited 2 times in Scopus

Convergence-aware neural network training

Full metadata record
DC Field | Value | Language
dc.contributor.author | Oh, Hyungjun | -
dc.contributor.author | Yu, Yongseung | -
dc.contributor.author | Ryu, Giha | -
dc.contributor.author | Ahn, Gunjoo | -
dc.contributor.author | Jeong, Yuri | -
dc.contributor.author | Park, Yongjun | -
dc.contributor.author | Seo, Jiwon | -
dc.date.accessioned | 2022-07-07T22:13:51Z | -
dc.date.available | 2022-07-07T22:13:51Z | -
dc.date.issued | 2020-07 | -
dc.identifier.issn | 0738-100X | -
dc.identifier.issn | 0146-7123 | -
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/145403 | -
dc.description.abstract | Training a deep neural network (DNN) is expensive, requiring a large amount of computation time. While the training overhead is high, not all computation in DNN training is equal. Some parameters converge faster, and thus their gradient computation may contribute little to the parameter update; near stationary points, a subset of parameters may change very little. In this paper, we exploit parameter convergence to optimize gradient computation in DNN training. We design a lightweight monitoring technique to track parameter convergence; we prune the gradient computation stochastically for a group of semantically related parameters, exploiting their convergence correlations. These techniques are efficiently implemented in existing GPU kernels. In our evaluation, the optimization techniques substantially and robustly improve the training throughput for four DNN models on three public datasets. | -
dc.format.extent | 6 | -
dc.language | English | -
dc.language.iso | ENG | -
dc.title | Convergence-aware neural network training | -
dc.type | Article | -
dc.identifier.doi | 10.1109/DAC18072.2020.9218518 | -
dc.identifier.scopusid | 2-s2.0-85093932138 | -
dc.identifier.bibliographicCitation | Proceedings - Design Automation Conference, v.2020-July, pp 1 - 6 | -
dc.citation.title | Proceedings - Design Automation Conference | -
dc.citation.volume | 2020-July | -
dc.citation.startPage | 1 | -
dc.citation.endPage | 6 | -
dc.type.docType | Conference Paper | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scopus | -
dc.subject.keywordPlus | Computer aided design | -
dc.subject.keywordPlus | Deep neural networks | -
dc.subject.keywordPlus | Parameter estimation | -
dc.subject.keywordPlus | Computation time | -
dc.subject.keywordPlus | Gradient computation | -
dc.subject.keywordPlus | Monitoring techniques | -
dc.subject.keywordPlus | Neural network training | -
dc.subject.keywordPlus | Optimization techniques | -
dc.subject.keywordPlus | Parameter convergence | -
dc.subject.keywordPlus | Training overhead | -
dc.subject.keywordPlus | Training throughputs | -
dc.subject.keywordPlus | Neural networks | -
dc.identifier.url | https://ieeexplore.ieee.org/document/9218518 | -
Appears in Collections
College of Engineering (Seoul) > School of Computer Software (Seoul) > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
