Benchmarking Direct Preference Optimization for Medical Large Vision–Language Models

Kim, Dain; Lee, Jiwoo; Yun, Jaehoon; Koo, Yong Hoe; Chen, Qingyu; Kim, Hyunjae; Kang, Jaewoo

doi:10.18653/v1/2026.findings-eacl.267

Full metadata record

DC Field	Value	Language
dc.contributor.author	Kim, Dain	-
dc.contributor.author	Lee, Jiwoo	-
dc.contributor.author	Yun, Jaehoon	-
dc.contributor.author	Koo, Yong Hoe	-
dc.contributor.author	Chen, Qingyu	-
dc.contributor.author	Kim, Hyunjae	-
dc.contributor.author	Kang, Jaewoo	-
dc.date.accessioned	2026-06-01T07:30:31Z	-
dc.date.available	2026-06-01T07:30:31Z	-
dc.date.issued	2026-03	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/212923	-
dc.description.abstract	Large vision-language models (LVLMs) are gaining traction in clinical tasks such as diagnostic support, report generation, and medical question answering. Among post-training techniques, Direct Preference Optimization (DPO) has shown promise in aligning model outputs with human preferences, yet its effectiveness in high-stakes medical contexts remains underexplored. In this work, we present the first systematic evaluation of nine DPO variants applied to two leading medical LVLMs, LLaVA-Med and HuatuoGPT-Vision. We benchmark these models on five curated datasets covering diverse clinical tasks. Evaluations include both automated metrics and expert assessments. Our results show that while DPO improves alignment and reduces severe hallucinations, it yields inconsistent gains over supervised fine-tuning. We further introduce a DPO variant that better handles visual misinterpretations and enhances clinical understanding. These findings reveal both the potential and limitations of DPO in medical AI. To support future research, we will release all DPO training data, model checkpoints, and expert annotations upon acceptance.	-
dc.format.extent	16	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Association for Computational Linguistics (ACL)	-
dc.title	Benchmarking Direct Preference Optimization for Medical Large Vision–Language Models	-
dc.type	Article	-
dc.identifier.doi	10.18653/v1/2026.findings-eacl.267	-
dc.identifier.scopusid	2-s2.0-105038865684	-
dc.identifier.bibliographicCitation	19th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2026, pp 5052 - 5067	-
dc.citation.title	19th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2026	-
dc.citation.startPage	5052	-
dc.citation.endPage	5067	-
dc.type.docType	Conference paper	-
dc.description.isOpenAccess	Y	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordPlus	Computational linguistics	-
dc.subject.keywordPlus	Computer vision	-
dc.subject.keywordPlus	Natural language processing systems	-
dc.identifier.url	https://aclanthology.org/2026.findings-eacl.267/	-

Appears in Collections: Seoul College of Medicine > Seoul Department of Internal Medicine > 1. Journal Articles

Related Researcher

Yoon, Jai Hoon: Seoul College of Medicine (Department of Internal Medicine)
