Improving Inference Time of Deep Learning Model with Partial Skip of ReLU-fused Matrix Multiplication Operations

Authors
Kim, Sungkyun; Kim, Jaemin; Kim, Nahun; Kang, Mincheal; Seo, Jiwon
Issue Date
Apr-2022
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
deep learning optimization; fully-connected layer; inference optimization; omitted computation
Citation
2022 International Conference on Electronics, Information, and Communication, ICEIC 2022, pp. 1-4
Indexed
SCOPUS
Journal Title
2022 International Conference on Electronics, Information, and Communication, ICEIC 2022
Start Page
1
End Page
4
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/138782
DOI
10.1109/ICEIC54506.2022.9748210
ISSN
0000-0000
Abstract
Deep learning continues to expand into new applications, and large-scale models tend to perform well. However, because such models require vast amounts of resources and computation, lengthy inference time is an inevitable consequence that must be optimized for deep learning to be used efficiently. To this end, we aim to fuse the Rectified Linear Unit (ReLU) with matrix multiplication in the inference process, which lets us reduce the total amount of computation by predicting the sign bit of each output value. We propose four prediction methods and statistically choose the optimal one for reducing inference time with low accuracy loss. © 2022 IEEE.
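
The idea in the abstract can be illustrated with a minimal sketch. Because ReLU maps every negative pre-activation to zero, a dot product whose sign is predicted to be negative need not be finished. The predictor below (accumulate a prefix of the products and compare against a threshold) is a hypothetical stand-in, not one of the four methods the paper proposes; the names relu_matvec_exact, relu_matvec_partial_skip, prefix_len, and threshold are all assumptions made for illustration.

```python
def relu_matvec_exact(weights, x):
    """Reference: y = ReLU(W x), with every dot product computed in full."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def relu_matvec_partial_skip(weights, x, prefix_len, threshold):
    """ReLU(W x) with a hypothetical early-exit sign predictor.

    For each output row, only the first `prefix_len` products are accumulated.
    If that partial sum is below `threshold`, the output is predicted to be
    negative and 0.0 is emitted without computing the remaining products.
    Returns (outputs, number_of_skipped_rows).
    """
    outputs = []
    skipped = 0
    for row in weights:
        partial = sum(w * v for w, v in zip(row[:prefix_len], x[:prefix_len]))
        if partial < threshold:
            # Predicted negative: ReLU would zero it, so skip the rest.
            outputs.append(0.0)
            skipped += 1
            continue
        rest = sum(w * v for w, v in zip(row[prefix_len:], x[prefix_len:]))
        outputs.append(max(0.0, partial + rest))
    return outputs, skipped


# Illustrative data (made up for this sketch, not taken from the paper).
W = [
    [1.0, 2.0, 0.5, 0.5],    # full dot product  4.0
    [-3.0, -2.0, 0.5, 0.5],  # full dot product -4.0
    [-1.0, -1.0, 2.0, 2.0],  # full dot product  2.0, but the prefix is negative
]
x = [1.0, 1.0, 1.0, 1.0]

exact = relu_matvec_exact(W, x)
# A conservative threshold skips only the clearly negative row.
safe, safe_skipped = relu_matvec_partial_skip(W, x, prefix_len=2, threshold=-3.0)
# An aggressive threshold skips more rows but mispredicts the third one.
fast, fast_skipped = relu_matvec_partial_skip(W, x, prefix_len=2, threshold=0.0)
```

With the conservative threshold the result matches the exact computation while one row is skipped; with the aggressive threshold two rows are skipped but the third output is wrongly zeroed. This is the trade-off between saved computation and accuracy loss that the abstract describes.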
Appears in Collections
Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles