On Practical Robust Reinforcement Learning: Adjacent Uncertainty Set and Double-Agent Algorithm

Hwang, Ukjo; Hong, Songnam

doi:10.1109/TNNLS.2024.3385234

Full metadata record

DC Field	Value	Language
dc.contributor.author	Hwang, Ukjo	-
dc.contributor.author	Hong, Songnam	-
dc.date.accessioned	2026-03-23T05:00:45Z	-
dc.date.available	2026-03-23T05:00:45Z	-
dc.date.issued	2025-04	-
dc.identifier.issn	2162-237X	-
dc.identifier.issn	2162-2388	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/211437	-
dc.description.abstract	Robust reinforcement learning (RRL) seeks a robust policy by optimizing the worst-case performance over an uncertainty set. This set contains Markov decision processes (MDPs) perturbed from a nominal MDP (N-MDP) that generates the training samples, reflecting potential mismatches between the training simulator (i.e., the N-MDP) and real-world settings (i.e., the testing environments). Unfortunately, existing RRL algorithms apply only to the tabular setting, and extending them to more general continuous state spaces remains an open problem. We contribute to this subject in the following ways. We first construct an elaborated uncertainty set which, in contrast to existing sets, contains only plausible (perturbed) MDPs. Based on this, we propose a sample-based RRL algorithm named adjacent robust Q-learning (ARQ-Learning) for the tabular setting and characterize its finite-time error bound. We also prove that ARQ-Learning converges as fast as standard Q-learning and robust Q-learning (Robust-Q) while guaranteeing better robustness. Our major contribution is an additional pessimistic agent that addresses the major hurdle in extending ARQ-Learning to large or continuous state spaces. Leveraging this double-agent approach, we develop, for the first time, (model-free) RRL algorithms for continuous state/action spaces. Experiments demonstrate the effectiveness of our algorithms.	-
dc.format.extent	15	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC	-
dc.title	On Practical Robust Reinforcement Learning: Adjacent Uncertainty Set and Double-Agent Algorithm	-
dc.type	Article	-
dc.publisher.location	United States	-
dc.identifier.doi	10.1109/TNNLS.2024.3385234	-
dc.identifier.scopusid	2-s2.0-85190742331	-
dc.identifier.wosid	001205847100001	-
dc.identifier.bibliographicCitation	IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, v.36, no.4, pp 7696 - 7710	-
dc.citation.title	IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS	-
dc.citation.volume	36	-
dc.citation.number	4	-
dc.citation.startPage	7696	-
dc.citation.endPage	7710	-
dc.type.docType	Article; Early Access	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalWebOfScienceCategory	Computer Science, Artificial Intelligence	-
dc.relation.journalWebOfScienceCategory	Computer Science, Hardware & Architecture	-
dc.relation.journalWebOfScienceCategory	Computer Science, Theory & Methods	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.subject.keywordPlus	MARKOV DECISION-PROCESSES	-
dc.subject.keywordAuthor	Uncertainty	-
dc.subject.keywordAuthor	Training	-
dc.subject.keywordAuthor	Standards	-
dc.subject.keywordAuthor	Robustness	-
dc.subject.keywordAuthor	Optimization	-
dc.subject.keywordAuthor	Testing	-
dc.subject.keywordAuthor	Q-learning	-
dc.subject.keywordAuthor	Reinforcement learning (RL)	-
dc.subject.keywordAuthor	robust RL (RRL)	-
dc.subject.keywordAuthor	robustness	-
dc.subject.keywordAuthor	uncertainty set	-
dc.identifier.url	https://ieeexplore.ieee.org/document/10499720	-

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Hong, Song nam: COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)
