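Prompt Designing and Analysis on Generation Method for Dataset Generation by Large Language models (대규모 언어모델을 활용한 데이터셋 생성을 위한 프롬프트 디자인 및 생성 방법론 분석)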

김강민; 채동규

Other Titles: Prompt Designing and Analysis on Generation Method for Dataset Generation by Large Language models

Authors: 김강민; 채동규

Issue Date: Jun-2023

Publisher: Korean Institute of Information Scientists and Engineers (한국정보과학회)

Citation: Korea Computer Congress 2023 (KCC 2023), pp. 337-339

Indexed: OTHER

Journal Title: Korea Computer Congress 2023 (KCC 2023)

Start Page: 337

End Page: 339

URI: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/190027

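Abstract: Recently, a number of methods have been studied for obtaining high-quality data by using large language models to generate datasets. Among these existing methods, this paper surveys how to design prompts that best elicit a language model's capabilities, and organizes the types of such prompts. It also applies a method that adjusts token generation probabilities through self-debiasing in order to build datasets suited to different tasks. Using these two approaches, the paper explores what should be considered when building Korean datasets with large language models.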
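
The abstract does not say which self-debiasing variant the authors use, so the following is only a minimal sketch of one common formulation from the self-debiasing literature, not the paper's implementation. It assumes the model has already produced next-token probabilities under a prompt for the wanted label and under prompts for the competing labels; tokens that a competing prompt prefers are damped exponentially, and the result is renormalized. The function name `self_debias`, the `decay` constant, and the toy probabilities are all hypothetical.

```python
import math


def self_debias(target_probs, other_probs_list, decay=100.0):
    """Down-weight tokens that a competing-label prompt prefers.

    target_probs: dict mapping token -> probability under the prompt
        for the wanted label.
    other_probs_list: list of dicts, one per competing-label prompt.
    decay: hypothetical strength constant (an assumption, not a value
        taken from the paper).
    """
    adjusted = {}
    for token, p in target_probs.items():
        # Highest probability any competing prompt assigns to this token.
        rival = max(
            (probs.get(token, 0.0) for probs in other_probs_list),
            default=0.0,
        )
        delta = p - rival
        # Keep tokens the target prompt prefers; damp the others.
        scale = 1.0 if delta >= 0 else math.exp(decay * delta)
        adjusted[token] = p * scale

    total = sum(adjusted.values())
    if total == 0:
        raise ValueError("all token probabilities were suppressed")
    # Renormalize so the adjusted values form a distribution again.
    return {token: value / total for token, value in adjusted.items()}


# Toy example: a "positive review" prompt versus a "negative review" prompt.
positive = {"great": 0.5, "bad": 0.3, "okay": 0.2}
negative = {"great": 0.1, "bad": 0.6, "okay": 0.2}
debiased = self_debias(positive, [negative])
```

In this toy case "bad" is more likely under the competing prompt, so nearly all of its mass moves to "great" and "okay". Sampling from the adjusted distribution is what steers generated text toward the intended label.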

Appears in Collections: College of Engineering (Seoul) > School of Computer Science (Seoul) > 1. Journal Articles

Chae, Dong Kyu: COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)
