Lightweight Error Correction for In-Storage Acceleration of Large Language Model Inference

Jeong, Jinwoo; Ahn, Byungmin; Shin, Dongmin; Choi, Jungwook

doi:10.1109/ICEIC61013.2024.10457117

Full metadata record

DC Field	Value	Language
dc.contributor.author	Jeong, Jinwoo	-
dc.contributor.author	Ahn, Byungmin	-
dc.contributor.author	Shin, Dongmin	-
dc.contributor.author	Choi, Jungwook	-
dc.date.accessioned	2024-11-28T14:31:32Z	-
dc.date.available	2024-11-28T14:31:32Z	-
dc.date.issued	2024-01	-
dc.identifier.issn	2574-1403	-
dc.identifier.issn	2767-7699	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/196965	-
dc.description.abstract	As large language models (LLMs) grow in size, conventional GPU-based LLM inference systems face memory bandwidth and capacity limitations. An LLM inference accelerator using NAND flash storage has been proposed to overcome these challenges. However, this approach requires a significant expansion of flash channels to ensure adequate bandwidth for inference, which in turn increases error correction code (ECC) costs. This paper examines the impact of flash memory errors on LLM inference accuracy and explores the possibility of lightweight ECC by leveraging the inherent error resilience of LLMs. We analyze the impact of 1) masking of high-order bit indices for FP32 LLM parameters, 2) clipping, and 3) the dependence of error robustness on parameter type, and show that a combination of them can reduce ECC bandwidth by up to 9.38%.	-
dc.format.extent	4	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	-
dc.title	Lightweight Error Correction for In-Storage Acceleration of Large Language Model Inference	-
dc.type	Article	-
dc.publisher.location	United States	-
dc.identifier.doi	10.1109/ICEIC61013.2024.10457117	-
dc.identifier.scopusid	2-s2.0-85189238917	-
dc.identifier.bibliographicCitation	2024 International Conference on Electronics, Information, and Communication, ICEIC 2024, pp 1 - 4	-
dc.citation.title	2024 International Conference on Electronics, Information, and Communication, ICEIC 2024	-
dc.citation.startPage	1	-
dc.citation.endPage	4	-
dc.type.docType	Conference paper	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordPlus	Error correction codes	-
dc.subject.keywordPlus	Errors correction	-
dc.subject.keywordPlus	Inference systems	-
dc.subject.keywordPlus	Language model	-
dc.subject.keywordPlus	Large language model	-
dc.subject.keywordPlus	Memory bandwidths	-
dc.subject.keywordPlus	Memory capacity	-
dc.subject.keywordPlus	Model inference	-
dc.subject.keywordPlus	NAND Flash	-
dc.subject.keywordPlus	NAND flash error	-
dc.subject.keywordAuthor	error correction code	-
dc.subject.keywordAuthor	large language model	-
dc.subject.keywordAuthor	NAND flash errors	-
dc.identifier.url	https://ieeexplore.ieee.org/document/10457117	-
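
The abstract names bit-level errors in FP32 parameters, high-order bit handling, and clipping, but the record gives no implementation detail. The following is a minimal sketch of one possible reading, not the authors' method: it assumes that "masking" means shielding selected bit positions from errors (as partial ECC coverage might) and that clipping means bounding a parameter to a known value range. All function names, the `protected_mask` parameter, and the clipping range are hypothetical.

```python
import random
import struct


def float_to_bits(x: float) -> int:
    """Return the 32-bit IEEE 754 single-precision pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]


def flip_bit(x: float, index: int) -> float:
    """Flip one bit of the FP32 encoding (0 = mantissa LSB, 31 = sign)."""
    return bits_to_float(float_to_bits(x) ^ (1 << index))


def inject_errors(x: float, bit_error_rate: float,
                  protected_mask: int, rng: random.Random) -> float:
    """Flip each unprotected bit independently with the given probability.

    Bits set in protected_mask are assumed (hypothetically) to be covered
    by ECC and are never flipped.
    """
    bits = float_to_bits(x)
    for index in range(32):
        is_protected = (protected_mask >> index) & 1
        if not is_protected and rng.random() < bit_error_rate:
            bits ^= 1 << index
    return bits_to_float(bits)


def clip(x: float, lo: float, hi: float) -> float:
    """Bound a parameter to [lo, hi] (range chosen for illustration only)."""
    return max(lo, min(hi, x))


weight = 0.5
high_flip = flip_bit(weight, 30)  # top exponent bit: value becomes 2**127
low_flip = flip_bit(weight, 0)    # mantissa LSB: value barely changes
print(high_flip, clip(high_flip, -1.0, 1.0), low_flip)
```

In this toy setting a flip in the top exponent bit changes the weight by dozens of orders of magnitude while a flip in the lowest mantissa bit is negligible, which illustrates why protecting or bounding only the high-order bits could be attractive. Whether this matches the scheme evaluated in the paper cannot be determined from the abstract alone.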

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Choi, Jungwook: College of Engineering (School of Electronic Engineering)

222, Wangsimni-ro, Seongdong-gu, Seoul, 04763, Korea
+82-2-2220-1366

Certain data included herein are derived from the © Web of Science of Clarivate Analytics. All rights reserved.
You may not copy or re-distribute this material in whole or in part without the prior written consent of Clarivate Analytics.