Reusing monolingual pre-trained models by cross-connecting seq2seq models for machine translation
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Oh, Jiun | - |
| dc.contributor.author | Choi, Yong-Suk | - |
| dc.date.accessioned | 2022-07-06T13:40:14Z | - |
| dc.date.available | 2022-07-06T13:40:14Z | - |
| dc.date.issued | 2021-09 | - |
| dc.identifier.issn | 2076-3417 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/141039 | - |
| dc.description.abstract | This work uses sequence-to-sequence (seq2seq) models pre-trained on monolingual corpora for machine translation. We pre-train two seq2seq models on monolingual corpora, one for the source language and one for the target language, and then combine the encoder of the source-language model with the decoder of the target-language model; we call this the cross-connection. Because the two modules are pre-trained completely independently, we add an intermediate layer between the pre-trained encoder and decoder to help map between them. These monolingual pre-trained models can serve as a multilingual pre-trained model, because any one model can be cross-connected with a model pre-trained on any other language, while their capacity is not affected by the number of languages. We demonstrate that our method significantly improves translation performance over the random baseline. We also analyze the appropriate choice of the intermediate layer, the importance of each part of a pre-trained model, and how performance changes with the size of the bitext. | - |
| dc.format.extent | 13 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | MDPI | - |
| dc.title | Reusing monolingual pre-trained models by cross-connecting seq2seq models for machine translation | - |
| dc.type | Article | - |
| dc.publisher.location | Switzerland | - |
| dc.identifier.doi | 10.3390/app11188737 | - |
| dc.identifier.scopusid | 2-s2.0-85115338543 | - |
| dc.identifier.wosid | 000699162300001 | - |
| dc.identifier.bibliographicCitation | Applied Sciences-Basel, v.11, no.18, pp 1 - 13 | - |
| dc.citation.title | Applied Sciences-Basel | - |
| dc.citation.volume | 11 | - |
| dc.citation.number | 18 | - |
| dc.citation.startPage | 1 | - |
| dc.citation.endPage | 13 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | Y | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Chemistry | - |
| dc.relation.journalResearchArea | Engineering | - |
| dc.relation.journalResearchArea | Materials Science | - |
| dc.relation.journalResearchArea | Physics | - |
| dc.relation.journalWebOfScienceCategory | Chemistry, Multidisciplinary | - |
| dc.relation.journalWebOfScienceCategory | Engineering, Multidisciplinary | - |
| dc.relation.journalWebOfScienceCategory | Materials Science, Multidisciplinary | - |
| dc.relation.journalWebOfScienceCategory | Physics, Applied | - |
| dc.subject.keywordAuthor | natural language processing | - |
| dc.subject.keywordAuthor | transfer learning | - |
| dc.subject.keywordAuthor | neural machine translation | - |
| dc.identifier.url | https://www.mdpi.com/2076-3417/11/18/8737 | - |
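
The cross-connection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `MonolingualSeq2Seq`, `cross_connect`, `make_toy_model`, and `identity` are hypothetical, the toy encoder and decoder stand in for pre-trained networks, and the identity function stands in for an intermediate layer that would in practice have trainable weights fitted on the bitext.

```python
from dataclasses import dataclass
from typing import Callable, List

Hidden = List[List[float]]  # one vector per input token


@dataclass
class MonolingualSeq2Seq:
    """Hypothetical stand-in for a seq2seq model pre-trained on one language."""
    language: str
    encoder: Callable[[List[str]], Hidden]
    decoder: Callable[[Hidden], List[str]]


def cross_connect(source, target, intermediate):
    """Build a translator: source encoder -> intermediate layer -> target decoder."""
    def translate(tokens: List[str]) -> List[str]:
        hidden = source.encoder(tokens)   # module pre-trained on the source language
        mapped = intermediate(hidden)     # bridges the independently trained modules
        return target.decoder(mapped)     # module pre-trained on the target language
    return translate


def make_toy_model(language: str) -> MonolingualSeq2Seq:
    """Toy model (assumption): encodes a token as its length, decodes a vector as a tag."""
    return MonolingualSeq2Seq(
        language=language,
        encoder=lambda tokens: [[float(len(t))] for t in tokens],
        decoder=lambda hidden: [f"<{language}:{int(v[0])}>" for v in hidden],
    )


def identity(hidden: Hidden) -> Hidden:
    """Placeholder intermediate layer; a real one would be trained on the bitext."""
    return hidden


# One model per language; any encoder can be paired with any other decoder.
models = {lang: make_toy_model(lang) for lang in ("en", "de", "ko")}
translate_en_de = cross_connect(models["en"], models["de"], identity)
translate_ko_en = cross_connect(models["ko"], models["en"], identity)

print(translate_en_de(["hello", "world"]))
```

Because each model is trained on a single language, adding a language in this sketch means adding one entry to `models`; the existing entries are left unchanged, which mirrors the abstract's point that capacity is not affected by the number of languages.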
