Analysis of Multi-Source Language Training in Cross-Lingual Transfer

Lim, Seong Hoon; Yun, Taejun; Kim, Jinhyeon; Choi, Jihun; Kim, Taeuk

doi:10.48550/arXiv.2402.13562

Full metadata record

DC Field	Value	Language
dc.contributor.author	Lim, Seong Hoon	-
dc.contributor.author	Yun, Taejun	-
dc.contributor.author	Kim, Jinhyeon	-
dc.contributor.author	Choi, Jihun	-
dc.contributor.author	Kim, Taeuk	-
dc.date.accessioned	2024-11-28T08:36:07Z	-
dc.date.available	2024-11-28T08:36:07Z	-
dc.date.issued	2024-08	-
dc.identifier.issn	0736-587X	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/195360	-
dc.description.abstract	The successful adaptation of multilingual language models (LMs) to a specific language-task pair critically depends on the availability of data tailored for that condition. While cross-lingual transfer (XLT) methods have contributed to addressing this data scarcity problem, there still exists ongoing debate about the mechanisms behind their effectiveness. In this work, we focus on one of the promising assumptions about the inner workings of XLT, that it encourages multilingual LMs to place greater emphasis on language-agnostic or task-specific features. We test this hypothesis by examining how the patterns of XLT change with a varying number of source languages involved in the process. Our experimental findings show that the use of multiple source languages in XLT, a technique we term Multi-Source Language Training (MSLT), leads to increased mingling of embedding spaces for different languages, supporting the claim that XLT benefits from making use of language-independent information. On the other hand, we discover that using an arbitrary combination of source languages does not always guarantee better performance. We suggest simple heuristics for identifying effective language combinations for MSLT and empirically prove their effectiveness.	-
dc.format.extent	14	-
dc.language	English	-
dc.language.iso	ENG	-
dc.title	Analysis of Multi-Source Language Training in Cross-Lingual Transfer	-
dc.type	Article	-
dc.publisher.location	United Kingdom	-
dc.identifier.doi	10.48550/arXiv.2402.13562	-
dc.identifier.scopusid	2-s2.0-85204431784	-
dc.identifier.bibliographicCitation	Association for Computational Linguistics (ACL). Annual Meeting Conference Proceedings, v.1, pp 712 - 725	-
dc.citation.title	Association for Computational Linguistics (ACL). Annual Meeting Conference Proceedings	-
dc.citation.volume	1	-
dc.citation.startPage	712	-
dc.citation.endPage	725	-
dc.type.docType	Conference paper	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordPlus	Formal languages	-
dc.subject.keywordPlus	Translation (languages)	-

Files in This Item: There are no files associated with this item.

Appears in Collections: Seoul College of Engineering > Seoul School of Computer Science > 1. Journal Articles

Related Researcher

Kim, Taeuk: COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)
