Legion: Tailoring Grouped Neural Execution Considering Heterogeneity on Multiple Edge Devices

Choi, Kyunghwan; Lee, Seongju; Kang, Beom Woo; Park, Yong jun

doi:10.1109/ICCD53106.2021.00067

Cited 0 times in Web of Science

Cited 0 times in Scopus

Full metadata record

DC Field	Value	Language
dc.contributor.author	Choi, Kyunghwan	-
dc.contributor.author	Lee, Seongju	-
dc.contributor.author	Kang, Beom Woo	-
dc.contributor.author	Park, Yong jun	-
dc.date.accessioned	2022-07-06T10:54:03Z	-
dc.date.available	2022-07-06T10:54:03Z	-
dc.date.created	2022-03-07	-
dc.date.issued	2021-12	-
dc.identifier.issn	1063-6404	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/140065	-
dc.description.abstract	Distributing workloads that cannot be handled by a single edge device across multiple edge devices is a promising solution that minimizes the inference latency of deep learning applications by exploiting model parallelism. Several prior solutions have been proposed to partition target models efficiently, but most studies have focused on finding the optimal fused layer configurations, which minimize the data-transfer overhead between layers. However, as recent deep learning network models have become more complex and the ability to deploy them quickly has become a key challenge, the search for the best fused layer configurations of target models has become a major requirement. To solve this problem, we propose a lightweight model partitioning framework called Legion to find the optimal fused layer configurations with minimal profiling execution trials. By finding the optimal configurations using cost matrix construction and wild card selection, the experimental results showed that Legion achieved a similar performance to the full configuration search at a fraction of the search time. Moreover, Legion performed effectively even on a group of heterogeneous target devices by introducing a per-device cost-related matrix construction. With three popular networks, Legion shows only 3.4% performance loss as compared to a full searching scheme (FSS), on various different device configurations consisting of up to six heterogeneous devices, and minimizes the profiling overhead by 48.7× on average.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	IEEE	-
dc.title	Legion: Tailoring Grouped Neural Execution Considering Heterogeneity on Multiple Edge Devices	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	Park, Yong jun	-
dc.identifier.doi	10.1109/ICCD53106.2021.00067	-
dc.identifier.scopusid	2-s2.0-85123926387	-
dc.identifier.wosid	000763821700056	-
dc.identifier.bibliographicCitation	Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors, v.2021-October, pp.383 - 390	-
dc.relation.isPartOf	Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors	-
dc.citation.title	Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors	-
dc.citation.volume	2021-October	-
dc.citation.startPage	383	-
dc.citation.endPage	390	-
dc.type.rims	ART	-
dc.type.docType	Proceedings Paper	-
dc.description.journalClass	1	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalWebOfScienceCategory	Computer Science, Hardware & Architecture	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.subject.keywordPlus	Deep learning	-
dc.subject.keywordPlus	Edge device	-
dc.subject.keywordPlus	Inference	-
dc.subject.keywordPlus	Layer configuration	-
dc.subject.keywordPlus	Learning network	-
dc.subject.keywordPlus	Model partitioning	-
dc.subject.keywordPlus	Network models	-
dc.subject.keywordPlus	Parallel executions	-
dc.subject.keywordPlus	Partitioning frameworks	-
dc.subject.keywordPlus	Target model	-
dc.subject.keywordPlus	Data transfer	-
dc.subject.keywordAuthor	Deep learning	-
dc.subject.keywordAuthor	Edge devices	-
dc.subject.keywordAuthor	Inference	-
dc.subject.keywordAuthor	Layer fusion	-
dc.subject.keywordAuthor	Parallel execution	-
dc.identifier.url	https://ieeexplore.ieee.org/document/9643757	-

Appears in Collections: Seoul College of Engineering > Seoul Division of Computer Software > 1. Journal Articles
