Local Metric Learning for Off-Policy Evaluation in Contextual Bandits with Continuous Actions

Full metadata record
DC Field | Value | Language
dc.contributor.author | Lee, Haanvid | -
dc.contributor.author | Lee, Jongmin | -
dc.contributor.author | Choi, Yunseon | -
dc.contributor.author | Jeon, Wonseok | -
dc.contributor.author | Lee, Byung-Jun | -
dc.contributor.author | Noh, Yung-Kyun | -
dc.contributor.author | Kim, Kee-Eung | -
dc.date.accessioned | 2024-12-04T08:30:18Z | -
dc.date.available | 2024-12-04T08:30:18Z | -
dc.date.issued | 2022-11 | -
dc.identifier.issn | 1049-5258 | -
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/199892 | -
dc.description.abstract | We consider local kernel metric learning for off-policy evaluation (OPE) of deterministic policies in contextual bandits with continuous action spaces. Our work is motivated by practical scenarios where the target policy needs to be deterministic due to domain requirements, such as the prescription of treatment dosage and duration in medicine. Although importance sampling (IS) provides a basic principle for OPE, it is ill-posed for a deterministic target policy with continuous actions. Our main idea is to relax the target policy and pose the problem as kernel-based estimation, where we learn the kernel metric in order to minimize the overall mean squared error (MSE). We present an analytic solution for the optimal metric, based on an analysis of bias and variance. Whereas prior work has been limited to scalar action spaces or kernel bandwidth selection, our work goes a step further by handling vector action spaces and optimizing the metric. We show that our estimator is consistent and, in experiments on various domains, significantly reduces the MSE compared to baseline OPE methods. | -
dc.format.extent | 13 | -
dc.language | English | -
dc.language.iso | ENG | -
dc.title | Local Metric Learning for Off-Policy Evaluation in Contextual Bandits with Continuous Actions | -
dc.type | Article | -
dc.publisher.location | United States | -
dc.identifier.wosid | 001213811605018 | -
dc.identifier.bibliographicCitation | Advances in Neural Information Processing Systems, pp 1 - 13 | -
dc.citation.title | Advances in Neural Information Processing Systems | -
dc.citation.startPage | 1 | -
dc.citation.endPage | 13 | -
dc.type.docType | Proceedings Paper | -
dc.description.isOpenAccess | N | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | -
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | -
dc.identifier.url | https://proceedings.neurips.cc/paper_files/paper/2022/hash/18fee39e2666f43cf44425138bae9def-Abstract-Conference.html | -
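
The abstract describes relaxing a deterministic target policy and posing OPE as kernel-based estimation. A minimal sketch of that general idea follows, assuming a Gaussian kernel with a fixed Mahalanobis metric. It is not the authors' implementation, and the analytic optimal metric from the paper is not reproduced here. The names `kernel_ope`, `target_actions`, `behavior_density`, `bandwidth`, and `metric` are hypothetical and chosen only for illustration.

```python
import numpy as np


def kernel_ope(actions, target_actions, rewards, behavior_density,
               bandwidth, metric=None):
    """Kernel-relaxed importance-sampling estimate of a deterministic policy's value.

    actions:          (n, d) actions logged under the behavior policy
    target_actions:   (n, d) actions the target policy would take in the same contexts
    rewards:          (n,)   observed rewards
    behavior_density: (n,)   behavior-policy density of each logged action
    bandwidth:        kernel bandwidth h > 0
    metric:           (d, d) positive-definite matrix; identity if None (assumption)
    """
    actions = np.asarray(actions, dtype=float)
    target_actions = np.asarray(target_actions, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    behavior_density = np.asarray(behavior_density, dtype=float)

    n, d = actions.shape
    if metric is None:
        metric = np.eye(d)

    # Squared Mahalanobis distance between logged and target actions.
    diff = actions - target_actions
    sq_dist = np.einsum("ni,ij,nj->n", diff, metric, diff)

    # Gaussian kernel with covariance h^2 * metric^{-1}; it integrates to 1 over actions.
    norm = np.sqrt(np.linalg.det(metric)) / ((2.0 * np.pi) ** (d / 2) * bandwidth ** d)
    kernel = norm * np.exp(-0.5 * sq_dist / bandwidth ** 2)

    # The kernel replaces the ill-posed point mass of the deterministic policy.
    return float(np.mean(kernel * rewards / behavior_density))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 5000, 2
    logged = rng.uniform(-1.0, 1.0, size=(n, d))   # uniform behavior policy
    density = np.full(n, 0.25)                     # 1 / area of [-1, 1]^2
    target = np.zeros((n, d))                      # target policy always picks 0
    reward = -np.sum(logged ** 2, axis=1)          # toy reward, best at 0
    print(kernel_ope(logged, target, reward, density, bandwidth=0.2))
```

A smaller bandwidth lowers the bias of this relaxation but raises its variance, which is the trade-off that bandwidth selection and metric learning aim to balance.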
Appears in Collections
Seoul College of Engineering > Seoul School of Computer Science > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Noh, Yung Kyun
COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)