
Direct Conversion: Accelerating Convolutional Neural Networks Utilizing Sparse Input Activation

Authors
Lee, Won-Hyu; Roh, Si-Dong; Park, Sangki; Chung, Ki Seok
Issue Date
Oct-2020
Publisher
IEEE Computer Society
Keywords
convolutional neural network; embedded system; sparsity-aware acceleration
Citation
IECON Proceedings (Industrial Electronics Conference), v.2020, no.October, pp.441 - 446
Indexed
SCOPUS
Journal Title
IECON Proceedings (Industrial Electronics Conference)
Volume
2020
Number
October
Start Page
441
End Page
446
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/3654
DOI
10.1109/IECON43393.2020.9254473
Abstract
As convolutional neural networks (CNNs) grow deeper, both the amount of computation and the number of parameters increase rapidly. Therefore, it is crucial to reduce both the computational cost and the memory usage. Pruning, which compresses a neural network, has been actively studied. Depending on the layer characteristics, the sparsity level of each layer varies significantly after pruning. If the weights are sparse, most results of the convolution operations will be zero. Although several studies have proposed methods that exploit weight sparsity to avoid carrying out meaningless operations, those studies overlook the fact that input activations may also have a high sparsity level. The Rectified Linear Unit (ReLU) is one of the most popular activation functions because it is simple and yet quite effective. Due to the properties of the ReLU function, the input activation sparsity level is often observed to be high (up to 85%). Therefore, it is important to consider both the input activation sparsity and the weight sparsity to accelerate CNNs by minimizing meaningless computation. In this paper, we propose a new acceleration method called Direct Conversion that considers the weight sparsity under the sparse input activation condition. The Direct Conversion method converts a 3D input tensor directly into a compressed format. It selectively applies one of two methods: image to Compressed Sparse Row (im2CSR) when input activations are sparse and weights are dense, and image to Compressed Sparse Overlapped Activations (im2CSOA) when both input activations and weights are sparse. Our experimental results show that Direct Conversion improves the inference speed by up to 2.82× compared to the conventional method.
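The core observation above — that ReLU makes activations highly sparse, so they can be stored and processed in a compressed format such as CSR — can be illustrated with a minimal sketch. This is not the paper's im2CSR implementation (which converts the 3D input tensor directly into the compressed format, skipping any dense intermediate); it only shows the standard CSR conversion of a post-ReLU activation slice, with all function names assumed for illustration.

```python
import numpy as np

def relu(x):
    # ReLU zeroes out all negative values, which is the source
    # of the high input activation sparsity the paper exploits.
    return np.maximum(x, 0.0)

def to_csr(mat):
    """Convert a dense 2D array to CSR (values, column indices, row pointers).

    Illustrative stand-in only: the paper's im2CSR converts the 3D input
    tensor directly to a compressed format without a dense detour.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in mat:
        nz = np.nonzero(row)[0]          # positions of nonzero activations
        values.extend(row[nz].tolist())  # store only nonzero values
        col_idx.extend(nz.tolist())
        row_ptr.append(len(values))      # running count ends each row
    return values, col_idx, row_ptr

# Post-ReLU activations: most entries become zero
act = relu(np.array([[-1.0, 2.0, -3.0],
                     [ 0.5, -0.2, 0.0]]))
vals, cols, ptrs = to_csr(act)
print(vals)  # [2.0, 0.5]
print(cols)  # [1, 0]
print(ptrs)  # [0, 1, 2]
```

With the activations in CSR form, the convolution loop can skip zero entries entirely, which is what makes sparsity-aware acceleration pay off when the sparsity level is high.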
Appears in
Collections
College of Engineering, Seoul > Division of Electronic Engineering, Seoul > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Chung, Ki Seok
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)