MITIGATING PARAMETER INTERFERENCE IN MODEL MERGING VIA SHARPNESS-AWARE FINE-TUNING

Lee, Yeoreum; Jung, Jinwook; Baik, Sungyong

doi:10.48550/arXiv.2504.14662

Full metadata record

DC Field	Value	Language
dc.contributor.author	Lee, Yeoreum	-
dc.contributor.author	Jung, Jinwook	-
dc.contributor.author	Baik, Sungyong	-
dc.date.accessioned	2025-08-12T06:30:24Z	-
dc.date.available	2025-08-12T06:30:24Z	-
dc.date.issued	2025-04	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208489	-
dc.description.abstract	Large-scale deep learning models trained under the pretraining-finetuning paradigm have led to a surge of task-specific models fine-tuned from a common pre-trained model. Recently, several research efforts have sought to merge these large models into a single multi-task model, particularly through simple arithmetic on parameters. Such merging faces a central challenge: interference between model parameters fine-tuned on different tasks. A few recent works have designed new fine-tuning schemes that reduce parameter interference, but at the cost of the performance of each task-specific fine-tuned model, thereby limiting that of the merged model. To improve the performance of a merged model, we note that a fine-tuning scheme should aim for (1) smaller parameter interference and (2) better performance of each fine-tuned model on its corresponding task. In this work, we design a new fine-tuning objective function that works towards these two goals. In the process, we find this objective function to be strikingly similar to the sharpness-aware minimization (SAM) objective function, which aims to achieve generalization by finding flat minima. Drawing on this observation, we propose to fine-tune pre-trained models via sharpness-aware minimization. Experimental and theoretical results demonstrate the effectiveness and orthogonality of our proposed approach, which improves performance across various merging and fine-tuning methods. Our code is available at https://github.com/baiklab/SAFT-Merge.	-
dc.format.extent	23	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	International Conference on Learning Representations, ICLR	-
dc.title	MITIGATING PARAMETER INTERFERENCE IN MODEL MERGING VIA SHARPNESS-AWARE FINE-TUNING	-
dc.type	Article	-
dc.identifier.doi	10.48550/arXiv.2504.14662	-
dc.identifier.scopusid	2-s2.0-105010229105	-
dc.identifier.bibliographicCitation	13th International Conference on Learning Representations, ICLR 2025, pp 31270 - 31292	-
dc.citation.title	13th International Conference on Learning Representations, ICLR 2025	-
dc.citation.startPage	31270	-
dc.citation.endPage	31292	-
dc.type.docType	Conference paper	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.identifier.url	https://arxiv.org/abs/2504.14662	-
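
The abstract refers to two standard building blocks: merging fine-tuned models through simple arithmetic on parameters, and sharpness-aware minimization (SAM). The sketch below illustrates both on toy weight vectors in pure Python. It is not the authors' implementation (see the repository linked in the abstract for that). The function names `task_arithmetic_merge` and `sam_step`, the `scale`, `lr`, and `rho` values, and the quadratic toy loss are all assumptions made for illustration, and task-vector addition is only one common instance of parameter arithmetic.

```python
import math


def task_arithmetic_merge(pretrained, finetuned_models, scale=1.0):
    """Add scaled task vectors (finetuned - pretrained) to the pretrained weights.

    Hypothetical helper: one common form of 'simple arithmetic on parameters'.
    """
    merged = list(pretrained)
    for theta in finetuned_models:
        for i, (w, w0) in enumerate(zip(theta, pretrained)):
            merged[i] += scale * (w - w0)
    return merged


def sam_step(theta, grad_fn, lr=0.1, rho=0.05):
    """One generic SAM update on a flat list of weights.

    1. Move to an approximate worst-case point within an L2 ball of radius rho.
    2. Descend from the original weights using the gradient at that point.
    """
    g = grad_fn(theta)
    norm = math.sqrt(sum(x * x for x in g)) + 1e-12  # avoid division by zero
    perturbed = [w + rho * x / norm for w, x in zip(theta, g)]
    g_sam = grad_fn(perturbed)
    return [w - lr * x for w, x in zip(theta, g_sam)]


def toy_grad(theta):
    """Gradient of the assumed toy loss 0.5 * ||theta||^2, which is theta itself."""
    return list(theta)


# Toy usage: two "fine-tuned" models that each moved a different coordinate.
merged = task_arithmetic_merge([0.0, 0.0], [[1.0, 0.0], [0.0, 2.0]], scale=0.5)
updated = sam_step([1.0, 0.0], toy_grad, lr=0.1, rho=0.05)
```

In this toy case the two task vectors touch different coordinates, so they do not interfere; interference arises when task vectors modify the same parameters in conflicting ways.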

Appears in Collections: Seoul College of Engineering > ETC > 1. Journal Articles

Related Researcher

Baik, Sungyong: COLLEGE OF ENGINEERING (DEPARTMENT OF INTELLIGENCE COMPUTING)
