InfiniPot: Infinite Context Processing on Memory-Constrained LLMs
- Authors
- Kim, Minsoo; Shim, Kyuhong; Choi, Jungwook; Chang, Simyung
- Issue Date
- Nov-2024
- Publisher
- Association for Computational Linguistics (ACL)
- Citation
- EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, pp 16046 - 16060
- Pages
- 15
- Indexed
- SCOPUS
- Journal Title
- EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
- Start Page
- 16046
- End Page
- 16060
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/206725
- DOI
- 10.48550/arXiv.2410.01518
- Abstract
- Handling long input contexts remains a significant challenge for Large Language Models (LLMs), particularly in resource-constrained environments such as mobile devices. Our work aims to address this limitation by introducing InfiniPot, a novel KV cache control framework designed to enable pre-trained LLMs to manage extensive sequences within fixed memory constraints efficiently, without requiring additional training. InfiniPot leverages Continual Context Distillation (CCD), an iterative process that compresses and retains essential information through novel importance metrics, effectively maintaining critical data even without access to future context. Our comprehensive evaluations indicate that InfiniPot significantly outperforms models trained for long contexts in various NLP tasks, establishing its efficacy and versatility. This work represents a substantial advancement toward making LLMs applicable to a broader range of real-world scenarios.
- Appears in Collections
- College of Engineering (Seoul) > Department of Electronic Engineering (Seoul) > 1. Journal Articles
