On Practical Robust Reinforcement Learning: Adjacent Uncertainty Set and Double-Agent Algorithm
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Hwang, Ukjo | - |
| dc.contributor.author | Hong, Songnam | - |
| dc.date.accessioned | 2026-03-23T05:00:45Z | - |
| dc.date.available | 2026-03-23T05:00:45Z | - |
| dc.date.issued | 2025-04 | - |
| dc.identifier.issn | 2162-237X | - |
| dc.identifier.issn | 2162-2388 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/211437 | - |
| dc.description.abstract | Robust reinforcement learning (RRL) aims to seek a robust policy by optimizing the worst case performance over an uncertainty set. This set contains some perturbed Markov decision processes (MDPs) from a nominal MDP (N-MDP) that generate samples for training, which reflects some potential mismatches between the training simulator (i.e., N-MDP) and real-world settings (i.e., the testing environments). Unfortunately, existing RRL algorithms are only applied to the tabular setting and it is still an open problem to extend them into more general continuous state space. We contribute to this subject in the following ways. We first construct an elaborated uncertainty set, which contains plausible (perturbed) MDPs only compared with the existing sets. Based on this, we propose a sample-based RRL algorithm named adjacent robust Q-learning (ARQ-Learning) for the tabular setting and characterize its finite-time error bound. Also, it is proved that ARQ-Learning converges as fast as the standard Q-learning and robust Q-learning (Robust-Q) while guaranteeing better robustness. Our major contribution is to introduce an additional pessimistic agent that can address the major hurdle for the extension of ARQ-Learning into cases with large or continuous state spaces. Leveraging this double-agent approach, we for the first time develop (model-free) RRL algorithms for continuous state/action spaces. Via experiments, we demonstrate the effectiveness of our algorithms. | - |
| dc.format.extent | 15 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | - |
| dc.title | On Practical Robust Reinforcement Learning: Adjacent Uncertainty Set and Double-Agent Algorithm | - |
| dc.type | Article | - |
| dc.publisher.location | United States | - |
| dc.identifier.doi | 10.1109/TNNLS.2024.3385234 | - |
| dc.identifier.scopusid | 2-s2.0-85190742331 | - |
| dc.identifier.wosid | 001205847100001 | - |
| dc.identifier.bibliographicCitation | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, v.36, no.4, pp 7696 - 7710 | - |
| dc.citation.title | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | - |
| dc.citation.volume | 36 | - |
| dc.citation.number | 4 | - |
| dc.citation.startPage | 7696 | - |
| dc.citation.endPage | 7710 | - |
| dc.type.docType | Article; Early Access | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalResearchArea | Engineering | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Hardware & Architecture | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Theory & Methods | - |
| dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
| dc.subject.keywordPlus | MARKOV DECISION-PROCESSES | - |
| dc.subject.keywordAuthor | Uncertainty | - |
| dc.subject.keywordAuthor | Training | - |
| dc.subject.keywordAuthor | Standards | - |
| dc.subject.keywordAuthor | Robustness | - |
| dc.subject.keywordAuthor | Optimization | - |
| dc.subject.keywordAuthor | Testing | - |
| dc.subject.keywordAuthor | Q-learning | - |
| dc.subject.keywordAuthor | Reinforcement learning (RL) | - |
| dc.subject.keywordAuthor | robust RL (RRL) | - |
| dc.subject.keywordAuthor | robustness | - |
| dc.subject.keywordAuthor | uncertainty set | - |
| dc.identifier.url | https://ieeexplore.ieee.org/document/10499720 | - |
