Architecture-Aware Optimization of Layer Fusion for Latency-Optimal CNN Inference
- Authors
- Yoon, Minyong; Choi, Jungwook
- Issue Date
- Jun-2023
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Keywords
- analytic cost model; convolutional neural network; dataflow optimization; layer fusion; systolic array
- Citation
- AICAS 2023 - IEEE International Conference on Artificial Intelligence Circuits and Systems, Proceedings, pp. 1 - 4
- Indexed
- SCOPUS
- Journal Title
- AICAS 2023 - IEEE International Conference on Artificial Intelligence Circuits and Systems, Proceedings
- Start Page
- 1
- End Page
- 4
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/189408
- DOI
- 10.1109/AICAS57966.2023.10168659
- Abstract
- Layer fusion is an effective technique for accelerating latency-sensitive CNN inference on resource-constrained accelerators with distributed on-chip memory, such as integrated memory-accelerator processing-in-memory (PIM) designs. However, previous research focused primarily on optimizing memory access, neglecting the significant impact of the hardware architecture on latency. This study presents an analytical latency model for a 2D systolic array accelerator that accounts for hardware factors such as array dimensions, buffer size, and bandwidth. We then investigate how the hardware architecture and fusion strategies, including weight and overlap reuse, influence performance; these aspects are insufficiently addressed by existing access-based fusion models. By applying layer fusion with our proposed latency model across different architectures, dataflows, and workloads, we achieve up to a 53.1% reduction in end-to-end network latency compared to an access-based model.
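The core idea behind an architecture-aware latency model, as opposed to a purely access-based one, can be illustrated with a simple roofline-style sketch: per-layer latency is the maximum of compute time on the systolic array and off-chip transfer time, and fusing two layers removes the intermediate feature map from the traffic term. The sketch below is an illustrative approximation under assumed parameters (array size, bandwidth, layer shapes), not the paper's actual model, which additionally accounts for buffer size, dataflow, and reuse strategies.

```python
from dataclasses import dataclass


@dataclass
class SystolicArray:
    rows: int        # PE rows
    cols: int        # PE columns
    bandwidth: int   # off-chip bytes per cycle (assumed)


def conv_macs(in_c, out_c, h, w, k):
    """MAC count of a k x k convolution producing an h x w x out_c output."""
    return h * w * out_c * in_c * k * k


def latency(arr, macs, traffic_bytes):
    """Roofline-style estimate: max of compute cycles and transfer cycles."""
    compute = macs / (arr.rows * arr.cols)  # ideal PE utilization assumed
    memory = traffic_bytes / arr.bandwidth
    return max(compute, memory)


# Two hypothetical back-to-back 3x3 conv layers on 56x56 feature maps.
arr = SystolicArray(rows=16, cols=16, bandwidth=4)
h = w = 56
l1 = dict(in_c=3, out_c=64, k=3)
l2 = dict(in_c=64, out_c=64, k=3)

def layer_traffic(in_c, out_c, k, read_ifmap=True, write_ofmap=True):
    """Bytes moved off-chip for one layer (1-byte elements assumed)."""
    t = out_c * in_c * k * k                 # weights
    t += in_c * h * w if read_ifmap else 0   # input feature map
    t += out_c * h * w if write_ofmap else 0 # output feature map
    return t

# Unfused: each layer reads its input and writes its output off-chip.
unfused = (latency(arr, conv_macs(h=h, w=w, **l1), layer_traffic(3, 64, 3))
           + latency(arr, conv_macs(h=h, w=w, **l2), layer_traffic(64, 64, 3)))

# Fused: the intermediate 64x56x56 map stays on-chip, so layer 1 skips the
# output write and layer 2 skips the input read; compute runs back-to-back.
fused_traffic = (layer_traffic(3, 64, 3, write_ofmap=False)
                 + layer_traffic(64, 64, 3, read_ifmap=False))
fused_macs = conv_macs(h=h, w=w, **l1) + conv_macs(h=h, w=w, **l2)
fused = latency(arr, fused_macs, fused_traffic)

print(f"unfused: {unfused:.0f} cycles, fused: {fused:.0f} cycles")
```

Under these assumed numbers the first layer is memory-bound and the second compute-bound, so fusion helps by eliminating the intermediate traffic; the paper's contribution is that whether fusion pays off depends on exactly such architecture parameters, which access-count-only models cannot capture.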
- Appears in
Collections - College of Engineering (Seoul) > Division of Electronic Engineering (Seoul) > 1. Journal Articles