Image Manipulation Using Korean Translation and CLIP: Ko-CLIP
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Kim, Sieun | - |
| dc.contributor.author | Joe, Inwhee | - |
| dc.date.accessioned | 2023-11-14T08:26:30Z | - |
| dc.date.available | 2023-11-14T08:26:30Z | - |
| dc.date.issued | 2023-04 | - |
| dc.identifier.issn | 2367-3370 | - |
| dc.identifier.issn | 2367-3389 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/192229 | - |
| dc.description.abstract | Deep learning, a field of artificial intelligence (AI), is showing good results in natural language processing (NLP) and image classification. In NLP in particular, BERT-based models have become the main focus of recent language modeling. BERT is a representative model built on pre-training followed by fine-tuning: by pre-training on vast amounts of data and then fine-tuning, more natural NLP can be implemented. CLIP recently built a huge dataset of image-text pairs using only web crawling, without manual labeling. Given an input text, the CLIP model indicates which image the text is most closely related to. However, CLIP does not recognize Korean input text, so it cannot analyze it accurately. In this paper, we propose combining a BERT model from NLP with CLIP from image processing to process images from Korean text input. The Korean text is translated into English by the BERT model and used as the input text to the CLIP model. The output produced by the two models reflected the content of the Korean text, and the output can be seen to be related to the accuracy of the Korean text. | - |
| dc.format.extent | 9 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | Springer International Publishing AG | - |
| dc.title | Image Manipulation Using Korean Translation and CLIP: Ko-CLIP | - |
| dc.type | Article | - |
| dc.publisher.location | Switzerland | - |
| dc.identifier.doi | 10.1007/978-3-031-35314-7_21 | - |
| dc.identifier.scopusid | 2-s2.0-85172735362 | - |
| dc.identifier.bibliographicCitation | Lecture Notes in Networks and Systems, v.724 LNNS, pp 222 - 230 | - |
| dc.citation.title | Lecture Notes in Networks and Systems | - |
| dc.citation.volume | 724 LNNS | - |
| dc.citation.startPage | 222 | - |
| dc.citation.endPage | 230 | - |
| dc.type.docType | Conference paper | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.subject.keywordPlus | Character recognition | - |
| dc.subject.keywordPlus | Deep learning | - |
| dc.subject.keywordPlus | Learning algorithms | - |
| dc.subject.keywordPlus | Natural language processing systems | - |
| dc.subject.keywordPlus | Translation (languages) | - |
| dc.subject.keywordPlus | Web crawler | - |
| dc.subject.keywordPlus | Computer vision | - |
| dc.subject.keywordAuthor | Computer Vision | - |
| dc.subject.keywordAuthor | Image Processing | - |
| dc.subject.keywordAuthor | Machine Learning | - |
| dc.subject.keywordAuthor | Natural Language Processing | - |
| dc.identifier.url | https://link.springer.com/chapter/10.1007/978-3-031-35314-7_21 | - |
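The abstract describes a two-stage pipeline: Korean text is first translated into English, and the English text is then matched against images by CLIP. The following is a minimal sketch of that control flow only, not the authors' implementation. Everything in it is a hypothetical stand-in: `TOY_KO_EN` replaces the BERT-based translation model with a two-entry lookup table, `toy_encode_text` replaces CLIP's text encoder, and the vectors in `IMAGES` replace CLIP image embeddings.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in for the translation model: a tiny lookup table.
TOY_KO_EN: Dict[str, str] = {
    "고양이": "a cat",
    "강아지": "a dog",
}


def translate_ko_to_en(text: str) -> str:
    """Toy translator: return the English entry, or the input unchanged."""
    return TOY_KO_EN.get(text.strip(), text)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_encode_text(text: str) -> List[float]:
    """Hypothetical text encoder: a 2-d vector over the words 'cat' and 'dog'."""
    return [float("cat" in text), float("dog" in text)]


# Hypothetical precomputed image embeddings in the same 2-d space.
IMAGES: Dict[str, List[float]] = {
    "cat.jpg": [0.9, 0.1],
    "dog.jpg": [0.2, 0.8],
}


def rank_images(
    korean_text: str,
    image_embeddings: Dict[str, Sequence[float]],
    encode_text: Callable[[str], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Translate Korean to English, embed the text, and rank images by similarity."""
    english = translate_ko_to_en(korean_text)
    query = encode_text(english)
    scores = [(name, cosine(query, emb)) for name, emb in image_embeddings.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_images("고양이", IMAGES, toy_encode_text):
        print(f"{name}: {score:.3f}")
```

In a real system the lookup table would presumably be replaced by a trained Korean-to-English translation model and the toy encoder and embeddings by a CLIP checkpoint. The sketch also makes visible why translation accuracy matters: in this toy, a phrase missing from the lookup table is passed through unchanged, so the text encoder sees untranslated Korean and ranks the images on no useful signal.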
