MITIGATING PARAMETER INTERFERENCE IN MODEL MERGING VIA SHARPNESS-AWARE FINE-TUNING
- Authors
- Lee, Yeoreum; Jung, Jinwook; Baik, Sungyong
- Issue Date
- Apr-2025
- Publisher
- International Conference on Learning Representations, ICLR
- Citation
- 13th International Conference on Learning Representations, ICLR 2025, pp 31270 - 31292
- Pages
- 23
- Indexed
- SCOPUS
- Journal Title
- 13th International Conference on Learning Representations, ICLR 2025
- Start Page
- 31270
- End Page
- 31292
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208489
- DOI
- 10.48550/arXiv.2504.14662
- Abstract
- The pretraining-finetuning paradigm for large-scale deep learning models has led to a surge of task-specific models fine-tuned from a common pre-trained model. Recently, several research efforts have sought to merge these large models into a single multi-task model, particularly with simple arithmetic on parameters. Such merging methods face a central challenge: interference between model parameters fine-tuned on different tasks. A few recent works have focused on designing new fine-tuning schemes that lead to small parameter interference, but at the cost of the performance of each task-specific fine-tuned model, thereby limiting that of the merged model. To improve the performance of a merged model, we note that a fine-tuning scheme should aim for (1) smaller parameter interference and (2) better performance of each fine-tuned model on the corresponding task. In this work, we aim to design a new fine-tuning objective function that works towards these two goals. In the course of this process, we find that this objective function is strikingly similar to the sharpness-aware minimization (SAM) objective function, which aims to achieve generalization by finding flat minima. Drawing upon this observation, we propose to fine-tune pre-trained models via sharpness-aware minimization. Experimental and theoretical results demonstrate the effectiveness and orthogonality of our proposed approach, which improves upon the performance of various merging and fine-tuning methods. Our code is available at https://github.com/baiklab/SAFT-Merge.
- Appears in
Collections - Seoul College of Engineering > ETC > 1. Journal Articles
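
The abstract refers to two ingredients: merging fine-tuned models with simple arithmetic on parameters, and fine-tuning with sharpness-aware minimization (SAM). The following is a minimal NumPy sketch of both on a toy problem, not the authors' implementation (see the repository linked in the abstract for that). It assumes the merge is a task-arithmetic style sum of scaled parameter differences and uses the standard two-step SAM update; the function names, the quadratic toy losses, and the values of `lam`, `lr`, and `rho` are all illustrative assumptions.

```python
import numpy as np


def task_arithmetic_merge(theta_pre, finetuned, lam=1.0):
    """Add the scaled sum of task vectors (theta_t - theta_pre) to theta_pre.

    Assumption: one common form of "simple arithmetic on parameters".
    """
    task_vectors = [theta_t - theta_pre for theta_t in finetuned]
    return theta_pre + lam * np.sum(task_vectors, axis=0)


def sam_step(theta, grad_fn, lr=0.1, rho=0.05):
    """One SAM update on parameters theta.

    Step 1: move to a nearby point along the normalized gradient (radius rho).
    Step 2: descend from theta using the gradient evaluated at that point.
    """
    g = grad_fn(theta)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # small constant avoids 0/0
    g_perturbed = grad_fn(theta + eps)
    return theta - lr * g_perturbed


# Toy setup (hypothetical): each "task" is a quadratic loss
# 0.5 * ||theta - target||^2, whose gradient is theta - target.
theta_pre = np.zeros(2)
targets = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

finetuned = []
for target in targets:
    theta = theta_pre.copy()
    grad_fn = lambda th, t=target: th - t
    for _ in range(200):
        theta = sam_step(theta, grad_fn)
    finetuned.append(theta)

# Merge the two SAM-fine-tuned parameter vectors with scaling 0.5.
merged = task_arithmetic_merge(theta_pre, finetuned, lam=0.5)
print(merged)
```

On this toy problem each fine-tuned vector settles within roughly `rho * lr` of its target, so the merged vector lies close to the midpoint of the two targets. Real models would apply the same arithmetic per parameter tensor.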
