Image Manipulation Using Korean Translation and CLIP: Ko-CLIP

Kim, Sieun; Joe, Inwhee

doi:10.1007/978-3-031-35314-7_21

Full metadata record

DC Field	Value	Language
dc.contributor.author	Kim, Sieun	-
dc.contributor.author	Joe, Inwhee	-
dc.date.accessioned	2023-11-14T08:26:30Z	-
dc.date.available	2023-11-14T08:26:30Z	-
dc.date.issued	2023-04	-
dc.identifier.issn	2367-3370	-
dc.identifier.issn	2367-3389	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/192229	-
dc.description.abstract	Deep learning, a field of artificial intelligence (AI), is showing good results in natural language processing (NLP) and image classification. In NLP in particular, BERT-based models have become the main focus of recent language modeling. These models are representative of the pre-training and fine-tuning approach: by pre-training on vast amounts of data and then fine-tuning, more natural NLP can be achieved. CLIP was built on a huge dataset of image-text pairs collected solely by web crawling, without manual labeling. Given an input text, the CLIP model indicates which image the text is most closely related to. However, CLIP does not recognize Korean text when it is given as input, so it cannot analyze it accurately. In this paper, we propose combining a BERT model from NLP with CLIP from image processing to process images based on Korean text input. The Korean text is translated into English by the BERT model and used as the input text for the CLIP model. The output produced by the two models reflects the content of the Korean text. The results show that the output is related to the accuracy of the Korean text.	-
dc.format.extent	9	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Springer International Publishing AG	-
dc.title	Image Manipulation Using Korean Translation and CLIP: Ko-CLIP	-
dc.type	Article	-
dc.publisher.location	Switzerland	-
dc.identifier.doi	10.1007/978-3-031-35314-7_21	-
dc.identifier.scopusid	2-s2.0-85172735362	-
dc.identifier.bibliographicCitation	Lecture Notes in Networks and Systems, v.724 LNNS, pp 222 - 230	-
dc.citation.title	Lecture Notes in Networks and Systems	-
dc.citation.volume	724 LNNS	-
dc.citation.startPage	222	-
dc.citation.endPage	230	-
dc.type.docType	Conference paper	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordPlus	Character recognition	-
dc.subject.keywordPlus	Deep learning	-
dc.subject.keywordPlus	Learning algorithms	-
dc.subject.keywordPlus	Natural language processing systems	-
dc.subject.keywordPlus	Translation (languages)	-
dc.subject.keywordPlus	Web crawler	-
dc.subject.keywordPlus	Computer vision	-
dc.subject.keywordAuthor	Computer Vision	-
dc.subject.keywordAuthor	Image Processing	-
dc.subject.keywordAuthor	Machine Learning	-
dc.subject.keywordAuthor	Natural Language Processing	-
dc.identifier.url	https://link.springer.com/chapter/10.1007/978-3-031-35314-7_21	-
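
The pipeline summarized in dc.description.abstract (Korean text, translated to English, then matched against images by CLIP) can be illustrated with a minimal sketch. This is not the authors' implementation. The translator and the text and image embeddings below are hypothetical toy stand-ins (a lookup table and hand-written 2-D vectors) so the example runs without any model downloads, and the function name rank_images and the scale factor of 100 are assumptions made for illustration. Only the CLIP-style matching step is shown (cosine similarity followed by a softmax); the image manipulation step is not covered.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_images(
    korean_text: str,
    translate: Callable[[str], str],
    embed_text: Callable[[str], Vector],
    image_embeddings: Dict[str, Vector],
    scale: float = 100.0,  # assumed similarity scale factor, for illustration
) -> Tuple[str, List[Tuple[str, float]]]:
    """Translate Korean text to English, then rank images by similarity."""
    english = translate(korean_text)          # step 1: Korean -> English
    text_vec = embed_text(english)            # step 2: embed the English text
    names = list(image_embeddings)
    logits = [scale * cosine(text_vec, image_embeddings[n]) for n in names]
    probs = softmax(logits)                   # step 3: similarity -> probability
    ranking = sorted(zip(names, probs), key=lambda pair: pair[1], reverse=True)
    return english, ranking


# Hypothetical toy stand-ins for the translation model and the CLIP encoders.
TOY_DICTIONARY = {"고양이": "a photo of a cat", "강아지": "a photo of a dog"}
TOY_TEXT_VECTORS = {"a photo of a cat": [1.0, 0.1], "a photo of a dog": [0.1, 1.0]}
TOY_IMAGE_VECTORS = {"cat.jpg": [0.9, 0.2], "dog.jpg": [0.2, 0.9]}


def toy_translate(text: str) -> str:
    return TOY_DICTIONARY[text]


def toy_embed_text(text: str) -> Vector:
    return TOY_TEXT_VECTORS[text]


english, ranking = rank_images(
    "고양이", toy_translate, toy_embed_text, TOY_IMAGE_VECTORS
)
print(english)        # the translated prompt
print(ranking[0][0])  # the best-matching image name
```

In a real system, toy_translate and toy_embed_text would be replaced by a trained translation model and the CLIP text encoder, and TOY_IMAGE_VECTORS by CLIP image-encoder outputs. Passing them in as callables keeps the translation step and the matching step independent, which reflects the two-model structure the abstract describes.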

Appears in Collections: Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles
