Prediction of Ad Clicks Using Early Stop Based on XGBoost (original Korean title: XGBoost 기반의 조기 중지를 활용한 광고 클릭 예측 방안)

한영진; 조인휘

doi:10.7840/kics.2021.46.6.993



Other Titles: Prediction of Ad Clicks Using Early Stop Based on XGBoost

Authors: 한영진; 조인휘

Issue Date: Jun-2021

Publisher: Korean Institute of Communications and Information Sciences (한국통신학회)

Keywords: Machine Learning; XGBoost (Extreme Gradient Boosting); Gradient Boosting Machine; Overfitting; Early Stopping

Citation: The Journal of Korean Institute of Communications and Information Sciences (한국통신학회논문지), v.46, no.6, pp. 993-1000

Pages: 8

Indexed: KCI

Journal Title: The Journal of Korean Institute of Communications and Information Sciences (한국통신학회논문지)

Volume: 46

Number: 6

Start Page: 993

End Page: 1000

URI: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/141778

DOI: 10.7840/kics.2021.46.6.993

ISSN: 1226-4717; 2287-3880

Abstract: When a machine learning algorithm that predicts whether a particular user will click on an ad on websites and social media platforms is trained for more and more iterations, performance on the training dataset keeps improving, whereas performance on the test dataset stops improving after a certain number of training iterations. This is the overfitting problem. To avoid overfitting, this paper proposes early stopping of the training process based on the XGBoost algorithm [1] in place of existing algorithms. The approach avoids overfitting [2] when training complex models: the performance of the model being trained is monitored on a separate test dataset, and training is stopped if performance on that dataset has not improved after a fixed number of training iterations. The point at which test-set performance begins to degrade is selected automatically, which preserves accuracy while avoiding the regime in which only training-set performance continues to improve. Finally, experimental results showed that the XGBoost-based method improved performance compared with the Logistic Regression and Decision Tree algorithms.
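The stopping rule described in the abstract can be illustrated with a minimal, library-free sketch. It is not the authors' implementation: the function name `early_stopping_round`, the `patience` parameter, and the validation log-loss values are all hypothetical and chosen only for illustration. The sketch assumes a lower-is-better metric and halts once the metric has failed to improve for `patience` consecutive rounds, which is the same idea that XGBoost exposes through its `early_stopping_rounds` setting.

```python
from typing import Optional, Sequence, Tuple


def early_stopping_round(
    val_losses: Sequence[float], patience: int
) -> Tuple[int, Optional[int]]:
    """Return (best_round, stop_round) for a sequence of validation losses.

    best_round: 0-based index of the lowest validation loss seen before stopping.
    stop_round: 0-based index at which training would halt, or None if the
                patience budget was never exhausted.
    """
    if patience < 1:
        raise ValueError("patience must be >= 1")
    if not val_losses:
        raise ValueError("val_losses must not be empty")

    best_round = 0
    best_loss = val_losses[0]
    for i in range(1, len(val_losses)):
        loss = val_losses[i]
        if loss < best_loss:
            # Validation performance improved: remember this round.
            best_loss, best_round = loss, i
        elif i - best_round >= patience:
            # No improvement for `patience` rounds in a row: stop here.
            return best_round, i
    return best_round, None


# Hypothetical per-round validation log loss (illustrative values only).
val_logloss = [0.69, 0.61, 0.55, 0.52, 0.50, 0.51, 0.52, 0.53, 0.55]
best, stopped = early_stopping_round(val_logloss, patience=3)
print(best, stopped)  # prints "4 7"
```

With these made-up values, the validation loss bottoms out at round 4 and training would halt at round 7, so the model from round 4 is the one to keep. In this sketch a larger `patience` tolerates longer plateaus at the cost of extra training rounds.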

