InfiniPot: Infinite Context Processing on Memory-Constrained LLMs
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Kim, Minsoo | - |
| dc.contributor.author | Shim, Kyuhong | - |
| dc.contributor.author | Choi, Jungwook | - |
| dc.contributor.author | Chang, Simyung | - |
| dc.date.accessioned | 2025-03-11T01:00:10Z | - |
| dc.date.available | 2025-03-11T01:00:10Z | - |
| dc.date.issued | 2024-11 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/206725 | - |
| dc.description.abstract | Handling long input contexts remains a significant challenge for Large Language Models (LLMs), particularly in resource-constrained environments such as mobile devices. Our work aims to address this limitation by introducing InfiniPot, a novel KV cache control framework designed to enable pre-trained LLMs to manage extensive sequences within fixed memory constraints efficiently, without requiring additional training. InfiniPot leverages Continual Context Distillation (CCD), an iterative process that compresses and retains essential information through novel importance metrics, effectively maintaining critical data even without access to future context. Our comprehensive evaluations indicate that InfiniPot significantly outperforms models trained for long contexts in various NLP tasks, establishing its efficacy and versatility. This work represents a substantial advancement toward making LLMs applicable to a broader range of real-world scenarios. | - |
| dc.format.extent | 15 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | Association for Computational Linguistics (ACL) | - |
| dc.title | InfiniPot: Infinite Context Processing on Memory-Constrained LLMs | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.48550/arXiv.2410.01518 | - |
| dc.identifier.scopusid | 2-s2.0-85217816955 | - |
| dc.identifier.bibliographicCitation | EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, pp 16046 - 16060 | - |
| dc.citation.title | EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference | - |
| dc.citation.startPage | 16046 | - |
| dc.citation.endPage | 16060 | - |
| dc.type.docType | Conference paper | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.subject.keywordPlus | Cache memory | - |
| dc.subject.keywordPlus | Context free languages | - |
| dc.subject.keywordPlus | Memory management | - |
| dc.identifier.url | https://arxiv.org/abs/2410.01518 | - |
