Detailed Information


Scale-CIM: Precision-scalable computing-in-memory for energy-efficient quantized neural networks

Authors
Lee, Young Seo; Gong, Young-Ho; Chung, Sung Woo
Issue Date
Jan-2023
Publisher
ELSEVIER
Keywords
Digital-based computing-in-memory; Quantized neural networks; Precision-scalable computation
Citation
JOURNAL OF SYSTEMS ARCHITECTURE, v.134
Journal Title
JOURNAL OF SYSTEMS ARCHITECTURE
Volume
134
URI
http://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/43429
DOI
10.1016/j.sysarc.2022.102787
ISSN
1383-7621
Abstract
Quantized neural networks (QNNs), which perform multiply-accumulate (MAC) operations with low-precision weights or activations, have been widely exploited to reduce energy consumption. QNNs usually exhibit a tradeoff between energy consumption and accuracy depending on the quantized precision, so it is necessary to select an appropriate precision for energy efficiency. Nevertheless, conventional hardware accelerators such as the Google TPU are typically designed and optimized for a specific precision (e.g., 8-bit), which may degrade energy efficiency for other precisions. Though analog-based computing-in-memory (CIM) technology supporting variable precision has been proposed to improve energy efficiency, its implementation requires extremely large and power-consuming analog-to-digital converters (ADCs). In this paper, we propose Scale-CIM, a precision-scalable CIM architecture that supports MAC operations based on digital computations (not analog computations). Scale-CIM performs binary MAC operations with high parallelism, executing digital multiplication operations in the CIM array and accumulation operations in the peripheral logic. In addition, Scale-CIM supports multi-bit MAC operations without ADCs, by combining binary MAC operations with shift operations determined by the precision. Since Scale-CIM fully utilizes the CIM array across various quantized precisions (rather than a single specific precision), it achieves high compute throughput. Consequently, Scale-CIM enables precision-scalable CIM-based MAC operations with high parallelism. Our simulation results show that Scale-CIM achieves a 1.5-15.8x speedup and reduces system energy consumption by 53.7-95.7% across different quantized precisions, compared to the state-of-the-art precision-scalable accelerator.
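The shift-and-add decomposition described in the abstract can be illustrated in software. The following is a minimal Python sketch of the general principle only, not the paper's hardware design: a multi-bit MAC is split into bit-plane binary MACs (the part a digital CIM array could evaluate in parallel) plus precision-dependent shifts accumulated outside the array. The function name, operand widths, and NumPy usage are illustrative assumptions.

import numpy as np

def bit_serial_mac(weights: np.ndarray, activations: np.ndarray,
                   w_bits: int, a_bits: int) -> int:
    """Compute sum(w * a) for unsigned integers by splitting both operands
    into bit planes, so each partial product is a binary (0/1) MAC."""
    acc = 0
    for i in range(w_bits):            # weight bit plane i
        w_plane = (weights >> i) & 1
        for j in range(a_bits):        # activation bit plane j
            a_plane = (activations >> j) & 1
            binary_mac = int(np.dot(w_plane, a_plane))  # 1-bit x 1-bit MAC
            acc += binary_mac << (i + j)                # shift by bit significance
    return acc

# Usage: the decomposition matches a direct multi-bit MAC for any chosen precision.
rng = np.random.default_rng(0)
w = rng.integers(0, 2**4, size=64)     # 4-bit weights (illustrative)
a = rng.integers(0, 2**4, size=64)     # 4-bit activations (illustrative)
assert bit_serial_mac(w, a, 4, 4) == int(np.dot(w, a))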
Appears in Collections
College of Information Technology > School of Software > 1. Journal Articles


Related Researcher
Lee, Young Seo
College of Information Technology (Department of IT Convergence)
