RA-LoRA: Rank-Adaptive Parameter-Efficient Fine-Tuning for Accurate 2-bit Quantized Large Language Models
- Authors
- Kim, Minsoo; Lee, Sihwa; Sung, Wonyong; Choi, Jungwook
- Issue Date
- Aug-2024
- Publisher
- ASSOC COMPUTATIONAL LINGUISTICS-ACL
- Citation
- FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, pp 15773 - 15786
- Pages
- 14
- Indexed
- SCOPUS
- Journal Title
- FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024
- Start Page
- 15773
- End Page
- 15786
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/207022
- DOI
- 10.18653/v1/2024.findings-acl.933
- Abstract
- Deploying large language models (LLMs) with their extensive parameters and high memory demands challenges computational efficiency, particularly in fine-tuning for specific applications with limited resources. Techniques like Low-Rank Adaptation (LoRA) help by training a smaller, modifiable extension of the base model to reduce memory usage. However, combining quantization with LoRA, especially in low-bit scenarios, can lead to performance losses due to quantization errors. Our innovative Rank-Adaptive LoRA (RA-LoRA) addresses this by dynamically adjusting the adapter's rank using rank-subspace analysis, optimizing performance with fewer parameters. We tested RA-LoRA on state-of-the-art LLMs for 2-bit efficient fine-tuning, showing it can improve model accuracy with minimal trainable parameters, marking a leap forward in quantization-aware fine-tuning methods and highlighting the significance of rank dynamics in optimizing quantized LLMs.
- Appears in
Collections - College of Engineering (Seoul) > Department of Electronic Engineering (Seoul) > 1. Journal Articles
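
The abstract refers to two quantities that a small sketch can make concrete: the error introduced by 2-bit quantization, and how the number of trainable LoRA parameters grows with the adapter rank. The Python below is illustrative only and is not the RA-LoRA method; it does not reproduce the paper's rank-subspace analysis. The min-max uniform quantizer, the function names `quantize_2bit` and `lora_param_count`, the sample weights, and the 4096 x 4096 layer size are all assumptions chosen for the example.

```python
def quantize_2bit(weights):
    """Round each weight to one of 4 evenly spaced levels (2 bits) spanning [min, max].

    Assumption: plain min-max uniform quantization, used here only for illustration.
    """
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return list(weights)
    step = (hi - lo) / 3  # 4 levels leave 3 intervals between them
    return [lo + round((w - lo) / step) * step for w in weights]


def lora_param_count(d_out, d_in, rank):
    """Trainable parameters of a LoRA adapter B (d_out x rank) times A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical weights: the quantization error is what a LoRA adapter
# on top of the quantized model would have to compensate for.
weights = [-0.9, -0.4, 0.1, 0.2, 0.9]
dequantized = quantize_2bit(weights)
step = (max(weights) - min(weights)) / 3
max_error = max(abs(w - q) for w, q in zip(weights, dequantized))
print(f"largest quantization error: {max_error:.3f} (bound: {step / 2:.3f})")

# Hypothetical 4096 x 4096 layer: adapter size relative to the full matrix.
full = 4096 * 4096
for rank in (4, 16, 64):
    share = lora_param_count(4096, 4096, rank) / full
    print(f"rank={rank}: {share:.2%} of the full matrix")
```

Because the adapter size is linear in the rank, assigning different ranks to different layers changes the trainable-parameter budget directly, which is the trade-off that a rank-adaptive scheme has to manage.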
