MAGIC: A Multi-Hop and Graph-Based Benchmark for Inter-Context Conflicts in Retrieval-Augmented Generation

Lee, Jungyeon; Lee, Kangmin; Kim, Taeuk

doi:10.18653/v1/2025.findings-emnlp.466

Cited 0 times in Web of Science

Cited 0 times in Scopus

Full metadata record

DC Field	Value	Language
dc.contributor.author	Lee, Jungyeon	-
dc.contributor.author	Lee, Kangmin	-
dc.contributor.author	Kim, Taeuk	-
dc.date.accessioned	2026-02-20T06:00:34Z	-
dc.date.available	2026-02-20T06:00:34Z	-
dc.date.issued	2025-11	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/210875	-
dc.description.abstract	Knowledge conflict often arises in retrieval-augmented generation (RAG) systems, where retrieved documents may be inconsistent with one another or contradict the model’s parametric knowledge. Existing benchmarks for investigating the phenomenon have notable limitations, including a narrow focus on the question answering setup, heavy reliance on entity substitution techniques, and a restricted range of conflict types. To address these issues, we propose a knowledge graph (KG)-based framework that generates varied and subtle conflicts between two similar yet distinct contexts, while ensuring interpretability through the explicit relational structure of KGs. Experimental results on our benchmark, MAGIC, provide intriguing insights into the inner workings of LLMs regarding knowledge conflict: both open-source and proprietary models struggle with conflict detection, especially when multi-hop reasoning is required, and often fail to pinpoint the exact source of contradictions. Finally, we present in-depth analyses that serve as a foundation for improving LLMs in integrating diverse, sometimes even conflicting, information.	-
dc.format.extent	21	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Association for Computational Linguistics	-
dc.title	MAGIC: A Multi-Hop and Graph-Based Benchmark for Inter-Context Conflicts in Retrieval-Augmented Generation	-
dc.type	Article	-
dc.identifier.doi	10.18653/v1/2025.findings-emnlp.466	-
dc.identifier.scopusid	2-s2.0-105028984845	-
dc.identifier.bibliographicCitation	Findings of the Association for Computational Linguistics: EMNLP 2025, pp 8783 - 8803	-
dc.citation.title	Findings of the Association for Computational Linguistics: EMNLP 2025	-
dc.citation.startPage	8783	-
dc.citation.endPage	8803	-
dc.type.docType	Conference paper	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordPlus	Computational linguistics	-
dc.subject.keywordPlus	Graph structures	-
dc.subject.keywordPlus	Graphic methods	-
dc.subject.keywordPlus	Information retrieval	-
dc.subject.keywordPlus	Knowledge graph	-
dc.identifier.url	https://aclanthology.org/2025.findings-emnlp.466/	-

Appears in Collections: Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles

Related Researcher

Kim, Taeuk: COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)

222, Wangsimni-ro, Seongdong-gu, Seoul, 04763, Korea. Tel: +82-2-2220-1366

Certain data included herein are derived from the © Web of Science of Clarivate Analytics. All rights reserved.
You may not copy or re-distribute this material in whole or in part without the prior written consent of Clarivate Analytics.
