Predicting Missing Values in Survey Data Using Prompt Engineering for Addressing Item Non-Response

Ji, Junyung; Kim, Jiwoo; Kim, Younghoon

doi:10.3390/fi16100351

Full metadata record

DC Field	Value	Language
dc.contributor.author	Ji, Junyung	-
dc.contributor.author	Kim, Jiwoo	-
dc.contributor.author	Kim, Younghoon	-
dc.date.accessioned	2024-12-10T07:30:28Z	-
dc.date.available	2024-12-10T07:30:28Z	-
dc.date.issued	2024-10	-
dc.identifier.issn	1999-5903	-
dc.identifier.uri	https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/121275	-
dc.description.abstract	Survey data play a crucial role in various research fields, including economics, education, and healthcare, by providing insights into human behavior and opinions. However, item non-response, where respondents fail to answer specific questions, presents a significant challenge by creating incomplete datasets that undermine data integrity and can hinder or even prevent accurate analysis. Traditional methods for addressing missing data, such as statistical imputation techniques and deep learning models, often fall short when dealing with the rich linguistic content of survey data. These approaches are also hampered by high time complexity for training and the need for extensive preprocessing or feature selection. In this paper, we introduce an approach that leverages Large Language Models (LLMs) through prompt engineering for predicting item non-responses in survey data. Our method combines the strengths of both traditional imputation techniques and deep learning methods with the advanced linguistic understanding of LLMs. By integrating respondent similarities, question relevance, and linguistic semantics, our approach enhances the accuracy and comprehensiveness of survey data analysis. The proposed method bypasses the need for complex preprocessing and additional training, making it adaptable, scalable, and capable of generating explainable predictions in natural language. We evaluated the effectiveness of our LLM-based approach through a series of experiments, demonstrating its competitive performance against established methods such as Multivariate Imputation by Chained Equations (MICE), MissForest, and deep learning models like TabTransformer. The results show that our approach not only matches but, in some cases, exceeds the performance of these methods while significantly reducing the time required for data processing.	-
dc.format.extent	19	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Multidisciplinary Digital Publishing Institute (MDPI)	-
dc.title	Predicting Missing Values in Survey Data Using Prompt Engineering for Addressing Item Non-Response	-
dc.type	Article	-
dc.publisher.location	Switzerland	-
dc.identifier.doi	10.3390/fi16100351	-
dc.identifier.scopusid	2-s2.0-85207660289	-
dc.identifier.wosid	001342807200001	-
dc.identifier.bibliographicCitation	Future Internet, v.16, no.10, pp 1 - 19	-
dc.citation.title	Future Internet	-
dc.citation.volume	16	-
dc.citation.number	10	-
dc.citation.startPage	1	-
dc.citation.endPage	19	-
dc.type.docType	Article	-
dc.description.isOpenAccess	Y	-
dc.description.journalRegisteredClass	scopus	-
dc.description.journalRegisteredClass	esci	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalWebOfScienceCategory	Computer Science, Information Systems	-
dc.subject.keywordPlus	IMPUTATION	-
dc.subject.keywordAuthor	survey data	-
dc.subject.keywordAuthor	item non-response	-
dc.subject.keywordAuthor	large language models	-
dc.subject.keywordAuthor	prompt engineering	-
dc.identifier.url	https://www.mdpi.com/1999-5903/16/10/351	-
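
The abstract describes predicting a missing answer by prompting an LLM with information about similar respondents and related questions. The following is a minimal sketch of how such a prompt might be assembled; it is not the authors' implementation. The similarity measure (share of matching answers on shared questions), the prompt wording, and the names `similarity` and `build_prompt` are assumptions made for illustration, and no LLM call is shown.

```python
from typing import Optional

# A respondent maps question text to an answer, or None for item non-response.
Respondent = dict[str, Optional[str]]


def similarity(a: Respondent, b: Respondent) -> float:
    """Fraction of questions answered by both respondents on which they agree.

    Assumption: a simple overlap score stands in for whatever similarity
    measure the paper actually uses.
    """
    shared = [q for q in a if a[q] is not None and b.get(q) is not None]
    if not shared:
        return 0.0
    return sum(a[q] == b[q] for q in shared) / len(shared)


def build_prompt(target: Respondent, others: list[Respondent],
                 question: str, k: int = 2) -> str:
    """Assemble a plain-text prompt asking for the target's missing answer.

    Hypothetical prompt layout: the k most similar respondents who answered
    `question` are shown first, followed by the target's known answers.
    """
    # Only respondents who answered the target question can serve as examples.
    donors = [r for r in others if r.get(question) is not None]
    donors.sort(key=lambda r: similarity(target, r), reverse=True)

    lines = ["You are completing a survey on behalf of a respondent.", ""]
    for i, donor in enumerate(donors[:k], start=1):
        lines.append(f"Similar respondent {i}:")
        for q, ans in donor.items():
            if ans is not None:
                lines.append(f"  Q: {q} A: {ans}")
        lines.append("")

    lines.append("Target respondent:")
    for q, ans in target.items():
        if ans is not None:
            lines.append(f"  Q: {q} A: {ans}")
    lines.append("")
    lines.append(f"Predict the target respondent's answer to: {question}")
    return "\n".join(lines)
```

The returned string would then be sent to an LLM of choice; because the examples are selected at prompt time, no model training or feature engineering is involved in this sketch.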

Appears in Collections: COLLEGE OF COMPUTING > DEPARTMENT OF ARTIFICIAL INTELLIGENCE > 1. Journal Articles

Related Researcher

Kim, Young hoon: ERICA College of Software Convergence (DEPARTMENT OF ARTIFICIAL INTELLIGENCE)

55 Hanyangdaehak-ro, Sangnok-gu, Ansan, Gyeonggi-do, 15588, Korea +82-31-400-4269 sweetbrain@hanyang.ac.kr

Certain data included herein are derived from the © Web of Science of Clarivate Analytics. All rights reserved.
You may not copy or re-distribute this material in whole or in part without the prior written consent of Clarivate Analytics.
