OPTIMUS: OPTImized matrix MUltiplication Structure for Transformer neural network accelerator

Authors
Park, Junki; Yoon, Hyunsung; Ahn, Daehyun; Choi, Jungwook; Kim, Jae-Joon
Issue Date
Mar-2020
Publisher
Machine Learning and Systems
Citation
Machine Learning and Systems 2020 (MLSys 2020), pp.1 - 16
Indexed
OTHER
Journal Title
Machine Learning and Systems 2020 (MLSys 2020)
Start Page
1
End Page
16
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/1988
Abstract
We present a high-performance Transformer neural network inference accelerator named OPTIMUS. OPTIMUS has several features for performance enhancement, such as a redundant computation skipping method to accelerate the decoding process and the Set-Associative RCSC (SA-RCSC) sparse matrix format to maintain high utilization even when a large number of MACs are used in hardware. OPTIMUS also has a flexible hardware architecture to support diverse matrix multiplications, and it keeps all intermediate computation values fully local, completely eliminating DRAM access to achieve exceptionally fast single-batch inference. It also reduces the data transfer overhead by carefully matching the data compute and load cycles. Simulation using the WMT15 (EN-DE) dataset shows that the latency of OPTIMUS is 41.62×, 24.23×, and 16.01× lower than that of an Intel(R) i7 6900K CPU, an NVIDIA Titan Xp GPU, and the baseline custom hardware, respectively. In addition, the throughput of OPTIMUS is 43.35×, 25.45×, and 19.00× higher, and the energy efficiency of OPTIMUS is 2393.85×, 1464×, and 19.01× better than that of the CPU, the GPU, and the baseline custom hardware, respectively.
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles
