Legion: Tailoring Grouped Neural Execution Considering Heterogeneity on Multiple Edge Devices
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Choi, Kyunghwan | - |
| dc.contributor.author | Lee, Seongju | - |
| dc.contributor.author | Kang, Beom Woo | - |
| dc.contributor.author | Park, Yong jun | - |
| dc.date.accessioned | 2022-07-06T10:54:03Z | - |
| dc.date.available | 2022-07-06T10:54:03Z | - |
| dc.date.created | 2022-03-07 | - |
| dc.date.issued | 2021-12 | - |
| dc.identifier.issn | 1063-6404 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/140065 | - |
| dc.description.abstract | Distributing workloads that cannot be handled by a single edge device across multiple edge devices is a promising solution that minimizes the inference latency of deep learning applications by exploiting model parallelism. Several prior solutions have been proposed to partition target models efficiently, but most studies have focused on finding the optimal fused layer configurations, which minimize the data-transfer overhead between layers. However, as recent deep learning network models have become more complex and the ability to deploy them quickly has become a key challenge, the search for the best fused layer configurations of target models has become a major requirement. To solve this problem, we propose a lightweight model partitioning framework called Legion to find the optimal fused layer configurations with minimal profiling execution trials. By finding the optimal configurations using cost matrix construction and wild card selection, the experimental results showed that Legion achieved a similar performance to the full configuration search at a fraction of the search time. Moreover, Legion performed effectively even on a group of heterogeneous target devices by introducing a per-device cost-related matrix construction. With three popular networks, Legion shows only a 3.4% performance loss compared with a full searching scheme (FSS) on various device configurations consisting of up to six heterogeneous devices, and reduces the profiling overhead by 48.7× on average. | - |
| dc.language | English | - |
| dc.language.iso | en | - |
| dc.publisher | IEEE | - |
| dc.title | Legion: Tailoring Grouped Neural Execution Considering Heterogeneity on Multiple Edge Devices | - |
| dc.type | Article | - |
| dc.contributor.affiliatedAuthor | Park, Yong jun | - |
| dc.identifier.doi | 10.1109/ICCD53106.2021.00067 | - |
| dc.identifier.scopusid | 2-s2.0-85123926387 | - |
| dc.identifier.wosid | 000763821700056 | - |
| dc.identifier.bibliographicCitation | Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors, v.2021-October, pp.383 - 390 | - |
| dc.relation.isPartOf | Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors | - |
| dc.citation.title | Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors | - |
| dc.citation.volume | 2021-October | - |
| dc.citation.startPage | 383 | - |
| dc.citation.endPage | 390 | - |
| dc.type.rims | ART | - |
| dc.type.docType | Proceedings Paper | - |
| dc.description.journalClass | 1 | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalResearchArea | Engineering | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Hardware & Architecture | - |
| dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
| dc.subject.keywordPlus | Deep learning | - |
| dc.subject.keywordPlus | Edge device | - |
| dc.subject.keywordPlus | Inference | - |
| dc.subject.keywordPlus | Layer configuration | - |
| dc.subject.keywordPlus | Learning network | - |
| dc.subject.keywordPlus | Model partitioning | - |
| dc.subject.keywordPlus | Network models | - |
| dc.subject.keywordPlus | Parallel executions | - |
| dc.subject.keywordPlus | Partitioning frameworks | - |
| dc.subject.keywordPlus | Target model | - |
| dc.subject.keywordPlus | Data transfer | - |
| dc.subject.keywordAuthor | Deep learning | - |
| dc.subject.keywordAuthor | Edge devices | - |
| dc.subject.keywordAuthor | Inference | - |
| dc.subject.keywordAuthor | Layer fusion | - |
| dc.subject.keywordAuthor | Parallel execution | - |
| dc.identifier.url | https://ieeexplore.ieee.org/document/9643757 | - |
