Integrating Risk-Averse and Constrained Reinforcement Learning for Robust Decision-Making in High-Stakes Scenarios

Ahmad, Moiz; Ramzan, Muhammad Babar; Omair, Muhammad; Habib, Muhammad Salman

doi:10.3390/math12131954

Full metadata record

DC Field	Value	Language
dc.contributor.author	Ahmad, Moiz	-
dc.contributor.author	Ramzan, Muhammad Babar	-
dc.contributor.author	Omair, Muhammad	-
dc.contributor.author	Habib, Muhammad Salman	-
dc.date.accessioned	2025-06-16T06:00:25Z	-
dc.date.available	2025-06-16T06:00:25Z	-
dc.date.issued	2024-07	-
dc.identifier.issn	2227-7390	-
dc.identifier.uri	https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/125637	-
dc.description.abstract	This paper considers a risk-averse Markov decision process (MDP) with non-risk constraints as a dynamic optimization framework to ensure robustness against unfavorable outcomes in high-stakes sequential decision-making situations such as disaster response. In this regard, strong duality is proved while making no assumptions on the problem’s convexity. This is necessary for some real-world issues, e.g., in the case of deprivation costs in the context of disaster relief, where convexity cannot be ensured. Our theoretical results imply that the problem can be exactly solved in a dual domain where it becomes convex. Based on our duality results, an augmented Lagrangian-based constraint handling mechanism is also developed for risk-averse reinforcement learning algorithms. The mechanism is proved to be theoretically convergent. Finally, we have also empirically established the convergence of the mechanism using a multi-stage disaster response relief allocation problem while using a fixed negative reward scheme as a benchmark. © 2024 by the authors.	-
dc.format.extent	29	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Multidisciplinary Digital Publishing Institute (MDPI)	-
dc.title	Integrating Risk-Averse and Constrained Reinforcement Learning for Robust Decision-Making in High-Stakes Scenarios	-
dc.type	Article	-
dc.publisher.location	Switzerland	-
dc.identifier.doi	10.3390/math12131954	-
dc.identifier.scopusid	2-s2.0-85198409366	-
dc.identifier.wosid	001269761800001	-
dc.identifier.bibliographicCitation	Mathematics, v.12, no.13, pp 1 - 29	-
dc.citation.title	Mathematics	-
dc.citation.volume	12	-
dc.citation.number	13	-
dc.citation.startPage	1	-
dc.citation.endPage	29	-
dc.type.docType	Article	-
dc.description.isOpenAccess	Y	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Mathematics	-
dc.relation.journalWebOfScienceCategory	Mathematics	-
dc.subject.keywordPlus	MANAGEMENT	-
dc.subject.keywordPlus	RELIEF	-
dc.subject.keywordAuthor	augmented Lagrangian	-
dc.subject.keywordAuthor	constrained reinforcement learning	-
dc.subject.keywordAuthor	dynamic decision-making	-
dc.subject.keywordAuthor	Markov risk	-
dc.subject.keywordAuthor	non-convexities	-
dc.subject.keywordAuthor	robust decision-making	-
dc.identifier.url	https://www.mdpi.com/2227-7390/12/13/1954	-

Related Researcher

HABIB, MUHAMMAD SALMAN: Office of the ERICA Vice President, Hanyang Talent Development Institute (ERICA Institute for Creative Convergence Education)
