Detailed Information

Cited 0 times in Web of Science · Cited 1 time in Scopus

SHAT: A Novel Asynchronous Training Algorithm That Provides Fast Model Convergence in Distributed Deep Learning

Full metadata record
dc.contributor.author: Ko, Yunyong
dc.contributor.author: Kim, Sang-Wook
dc.date.accessioned: 2022-07-06T02:15:36Z
dc.date.available: 2022-07-06T02:15:36Z
dc.date.created: 2022-01-26
dc.date.issued: 2022-01
dc.identifier.uri: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/138459
dc.description.abstract: The recent unprecedented success of deep learning (DL) in various fields is underpinned by its use of large-scale data and models. Training a large-scale deep neural network (DNN) model with large-scale data, however, is time-consuming. To speed up the training of massive DNN models, data-parallel distributed training based on the parameter server (PS) has been widely applied. In general, synchronous PS-based training suffers from synchronization overhead, especially in heterogeneous environments. To reduce this overhead, asynchronous PS-based training employs asynchronous communication between the PS and the workers, so that the PS processes each worker's request independently, without waiting for the others. Despite its performance benefits, however, asynchronous training inevitably causes the workers' local models to diverge from one another, and such divergence may slow model convergence. To address this problem, we propose a novel asynchronous PS-based training algorithm, SHAT, that considers (1) the scale of distributed training and (2) the heterogeneity among workers in order to reduce the difference among the workers' local models. An extensive empirical evaluation demonstrates that (1) the model trained by SHAT converges to up to 5.22% higher accuracy than those trained by state-of-the-art algorithms, and (2) the model convergence of SHAT is robust under various heterogeneous environments. (A minimal code sketch of the asynchronous PS pattern described here follows the metadata record below.)
dc.language: English
dc.language.iso: en
dc.publisher: MDPI
dc.title: SHAT: A Novel Asynchronous Training Algorithm That Provides Fast Model Convergence in Distributed Deep Learning
dc.type: Article
dc.contributor.affiliatedAuthor: Kim, Sang-Wook
dc.identifier.doi: 10.3390/app12010292
dc.identifier.scopusid: 2-s2.0-85121979484
dc.identifier.wosid: 000741122100001
dc.identifier.bibliographicCitation: APPLIED SCIENCES-BASEL, v.12, no.1, pp. 1-14
dc.relation.isPartOf: APPLIED SCIENCES-BASEL
dc.citation.title: APPLIED SCIENCES-BASEL
dc.citation.volume: 12
dc.citation.number: 1
dc.citation.startPage: 1
dc.citation.endPage: 14
dc.type.rims: ART
dc.type.docType: Article
dc.description.journalClass: 1
dc.description.isOpenAccess: Y
dc.description.journalRegisteredClass: scie
dc.description.journalRegisteredClass: scopus
dc.relation.journalResearchArea: Chemistry
dc.relation.journalResearchArea: Engineering
dc.relation.journalResearchArea: Materials Science
dc.relation.journalResearchArea: Physics
dc.relation.journalWebOfScienceCategory: Chemistry, Multidisciplinary
dc.relation.journalWebOfScienceCategory: Engineering, Multidisciplinary
dc.relation.journalWebOfScienceCategory: Materials Science, Multidisciplinary
dc.relation.journalWebOfScienceCategory: Physics, Applied
dc.subject.keywordAuthor: distributed deep learning
dc.subject.keywordAuthor: data parallelism
dc.subject.keywordAuthor: PS-based distributed training
dc.subject.keywordAuthor: heterogeneous environments
dc.identifier.url: https://www.mdpi.com/2076-3417/12/1/292
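To make the mechanism in the abstract concrete, below is a minimal sketch of asynchronous PS-based training in general, not of SHAT itself (the record does not give SHAT's update rule). All names here (ParameterServer, pull/push, the toy quadratic objective, the shifted data shards) are illustrative assumptions, not the paper's implementation.

```python
import threading

import numpy as np


class ParameterServer:
    """Toy parameter server: applies each worker's gradient as soon as it
    arrives, with no barrier across workers (asynchronous update)."""

    def __init__(self, dim, lr=0.05):
        self.params = np.zeros(dim)
        self.lr = lr
        self.lock = threading.Lock()  # serializes access to the shared model

    def pull(self):
        # A worker fetches the current global model.
        with self.lock:
            return self.params.copy()

    def push(self, grad):
        # The PS processes this request immediately, without waiting for
        # the other workers -- the asynchrony the abstract describes.
        with self.lock:
            self.params -= self.lr * grad


def worker(ps, shard, steps):
    rng = np.random.default_rng()
    for _ in range(steps):
        local = ps.pull()                    # local copy of the model
        x = shard[rng.integers(len(shard))]  # one sample from local data
        grad = 2.0 * (local - x)             # gradient of ||w - x||^2
        ps.push(grad)                        # send update and keep going


if __name__ == "__main__":
    ps = ParameterServer(dim=4)
    # Shards shifted per worker mimic heterogeneity among workers.
    shards = [np.random.default_rng(i).normal(loc=i, size=(100, 4))
              for i in range(3)]
    threads = [threading.Thread(target=worker, args=(ps, s, 200))
               for s in shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("final params:", ps.pull())
```

Because pushes interleave, each worker computes its gradient against a stale copy of the model; that staleness is exactly the "difference among the local models of workers" the abstract says slows convergence. How SHAT weighs updates by training scale and worker heterogeneity to reduce it is described in the full paper, not in this sketch.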
Files in This Item
Appears in Collections
College of Engineering (Seoul) > School of Computer Software (Seoul) > 1. Journal Articles


Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Kim, Sang-Wook
College of Engineering (School of Computer Science)
