MITIGATING PARAMETER INTERFERENCE IN MODEL MERGING VIA SHARPNESS-AWARE FINE-TUNING

Full metadata record
DC Field | Value | Language
dc.contributor.author | Lee, Yeoreum | -
dc.contributor.author | Jung, Jinwook | -
dc.contributor.author | Baik, Sungyong | -
dc.date.accessioned | 2025-08-12T06:30:24Z | -
dc.date.available | 2025-08-12T06:30:24Z | -
dc.date.issued | 2025-04 | -
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208489 | -
dc.description.abstract | Large-scale deep learning models with a pretraining-finetuning paradigm have led to a surge of task-specific models fine-tuned from a common pre-trained model. Recently, several research efforts have been made to merge these large models into a single multi-task model, particularly with simple arithmetic on parameters. Such a merging methodology faces a central challenge: interference between model parameters fine-tuned on different tasks. A few recent works have focused on designing new fine-tuning schemes that lead to small parameter interference, but at the cost of the performance of each task-specific fine-tuned model, thereby limiting that of the merged model. To improve the performance of a merged model, we note that a fine-tuning scheme should aim for (1) smaller parameter interference and (2) better performance of each fine-tuned model on the corresponding task. In this work, we aim to design a new fine-tuning objective function that works towards these two goals. In the course of this process, we find this objective function to be strikingly similar to the sharpness-aware minimization (SAM) objective function, which aims to achieve generalization by finding flat minima. Drawing upon our observation, we propose to fine-tune pre-trained models via sharpness-aware minimization. The experimental and theoretical results showcase the effectiveness and orthogonality of our proposed approach, which improves performance upon various merging and fine-tuning methods. Our code is available at https://github.com/baiklab/SAFT-Merge. | -
dc.format.extent | 23 | -
dc.language | English | -
dc.language.iso | ENG | -
dc.publisher | International Conference on Learning Representations, ICLR | -
dc.title | MITIGATING PARAMETER INTERFERENCE IN MODEL MERGING VIA SHARPNESS-AWARE FINE-TUNING | -
dc.type | Article | -
dc.identifier.doi | 10.48550/arXiv.2504.14662 | -
dc.identifier.scopusid | 2-s2.0-105010229105 | -
dc.identifier.bibliographicCitation | 13th International Conference on Learning Representations, ICLR 2025, pp 31270 - 31292 | -
dc.citation.title | 13th International Conference on Learning Representations, ICLR 2025 | -
dc.citation.startPage | 31270 | -
dc.citation.endPage | 31292 | -
dc.type.docType | Conference paper | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scopus | -
dc.identifier.url | https://arxiv.org/abs/2504.14662 | -
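
Illustrative note (not part of the repository record): the abstract refers to two general techniques, merging fine-tuned models with "simple arithmetic on parameters" and fine-tuning with sharpness-aware minimization (SAM). The pure-Python sketch below shows what each can look like on toy parameter vectors. It is not the authors' implementation, which is in the repository linked in the abstract. The function names, the `scale` and `rho` values, and the toy quadratic loss are all assumptions chosen for illustration, and the merge shown is a task-arithmetic-style sum of parameter differences, which is only one of the merging methods the abstract alludes to.

```python
from math import sqrt

def task_vector(finetuned, pretrained):
    """Difference between fine-tuned and pre-trained parameters."""
    return [f - p for f, p in zip(finetuned, pretrained)]

def merge_task_arithmetic(pretrained, finetuned_models, scale=1.0):
    """Add the scaled task vectors of all fine-tuned models to the
    pre-trained parameters (hypothetical helper; `scale` is assumed)."""
    merged = list(pretrained)
    for finetuned in finetuned_models:
        delta = task_vector(finetuned, pretrained)
        merged = [m + scale * d for m, d in zip(merged, delta)]
    return merged

def sam_step(params, grad_fn, lr=0.1, rho=0.05):
    """One SAM update in its standard two-step form:
    1. move to the approximate worst-case point w + rho * g / ||g||,
    2. descend from w using the gradient evaluated at that point."""
    grad = grad_fn(params)
    norm = sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        # Zero gradient: no ascent direction is defined, so leave w unchanged.
        return list(params)
    perturbed = [p + rho * g / norm for p, g in zip(params, grad)]
    sharp_grad = grad_fn(perturbed)
    return [p - lr * g for p, g in zip(params, sharp_grad)]

def toy_grad(w):
    """Gradient of the toy loss sum(w_i ** 2); an assumption for the demo."""
    return [2.0 * x for x in w]

if __name__ == "__main__":
    pretrained = [1.0, 2.0]
    finetuned_a = [2.0, 2.0]   # changed only the first parameter
    finetuned_b = [1.0, 4.0]   # changed only the second parameter
    print(merge_task_arithmetic(pretrained, [finetuned_a, finetuned_b], scale=0.5))
    print(sam_step([1.0, 0.0], toy_grad, lr=0.1, rho=0.5))
```

In this toy case the two task vectors touch different coordinates, so they do not interfere when summed. Interference in the abstract's sense arises when task vectors from different fine-tuned models overlap and conflict in the same parameters.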
Appears in Collections: Seoul College of Engineering > ETC > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Baik, Sungyong
COLLEGE OF ENGINEERING (DEPARTMENT OF INTELLIGENCE COMPUTING)