Developing a Pragmatic Benchmark for Assessing Korean Legal Language Understanding in Large Language Models

Kim, Yeeun; Choi, Jinhwan; Choi, Young Rok; Park, Hai Jin; Choi, Eunkyung; Hwang, Wonseok

doi:10.48550/arXiv.2410.08731

Full metadata record

DC Field	Value	Language
dc.contributor.author	Kim, Yeeun	-
dc.contributor.author	Choi, Jinhwan	-
dc.contributor.author	Choi, Young Rok	-
dc.contributor.author	Park, Hai Jin	-
dc.contributor.author	Choi, Eunkyung	-
dc.contributor.author	Hwang, Wonseok	-
dc.date.accessioned	2025-03-11T02:30:19Z	-
dc.date.available	2025-03-11T02:30:19Z	-
dc.date.issued	2024-11	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/206738	-
dc.description.abstract	Large language models (LLMs) have demonstrated remarkable performance in the legal domain, with GPT-4 even passing the Uniform Bar Exam in the U.S. However, their efficacy remains limited for non-standardized tasks and for tasks in languages other than English. This underscores the need for careful evaluation of LLMs within each legal system before application. Here, we introduce KBL, a benchmark for assessing the Korean legal language understanding of LLMs, consisting of (1) 7 legal knowledge tasks (510 examples), (2) 4 legal reasoning tasks (288 examples), and (3) the Korean bar exam (4 domains, 53 tasks, 2,510 examples). The first two datasets were developed in close collaboration with lawyers to evaluate LLMs in practical scenarios in a certified manner. Furthermore, considering legal practitioners' frequent use of extensive legal documents for research, we assess LLMs in both a closed-book setting, where they rely solely on internal knowledge, and a retrieval-augmented generation (RAG) setting, using a corpus of Korean statutes and precedents. The results indicate substantial room and opportunities for improvement.	-
dc.format.extent	23	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Association for Computational Linguistics (ACL)	-
dc.title	Developing a Pragmatic Benchmark for Assessing Korean Legal Language Understanding in Large Language Models	-
dc.type	Article	-
dc.identifier.doi	10.48550/arXiv.2410.08731	-
dc.identifier.scopusid	2-s2.0-85217622280	-
dc.identifier.bibliographicCitation	EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024, pp 5573 - 5595	-
dc.citation.title	EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024	-
dc.citation.startPage	5573	-
dc.citation.endPage	5595	-
dc.type.docType	Conference paper	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordPlus	Benchmarking	-
dc.subject.keywordPlus	Laws and legislation	-
dc.subject.keywordPlus	Modeling languages	-
dc.subject.keywordPlus	Natural language processing systems	-
dc.identifier.url	https://arxiv.org/abs/2410.08731	-

Appears in Collections: Seoul School of Law > Seoul School of Law > 1. Journal Articles

Related Researcher

Park, Hai Jin: School of Law
