Maximum Norm Minimization: A Single-Policy Multi-Objective Reinforcement Learning to Expansion of the Pareto Front
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Lee, Seonjae | - |
| dc.contributor.author | Lee, Myoung Hoon | - |
| dc.contributor.author | Moon, Jun | - |
| dc.date.accessioned | 2023-08-07T07:37:16Z | - |
| dc.date.available | 2023-08-07T07:37:16Z | - |
| dc.date.issued | 2022-10 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/188825 | - |
| dc.description.abstract | In this paper, we propose Maximum Norm Minimization (MNM), a single-policy Multi-Objective Reinforcement Learning (MORL) algorithm. The main objective of MNM is to provide the Pareto optimal points constituting the Pareto front in the multi-objective space. First, MNM measures the distances among the Pareto optimal points in the current Pareto front and normalizes them using the maximum and minimum reward values of each objective. Second, MNM identifies the maximum norm, i.e., the largest of the normalized distances, and seeks a new Pareto optimal point corresponding to the middle of the two Pareto optimal points that constitute the maximum norm. By iterating these two steps, MNM expands and densifies the Pareto front, increasing the summation of the Pareto front volumes and decreasing the mean-squared distance of the Pareto optimal points. To validate the performance of MNM, we provide experimental results on five complex robotic multi-objective environments. In particular, we compare MNM with other state-of-the-art methods in terms of the summation of volumes and the mean-squared distance of the Pareto optimal points. | - |
| dc.format.extent | 10 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | ACM | - |
| dc.title | Maximum Norm Minimization: A Single-Policy Multi-Objective Reinforcement Learning to Expansion of the Pareto Front | - |
| dc.type | Article | - |
| dc.publisher.location | United States | - |
| dc.identifier.doi | 10.1145/3511808.3557389 | - |
| dc.identifier.scopusid | 2-s2.0-85140854726 | - |
| dc.identifier.wosid | 001074639601006 | - |
| dc.identifier.bibliographicCitation | ACM Conference on Information and Knowledge Management, pp 1064 - 1073 | - |
| dc.citation.title | ACM Conference on Information and Knowledge Management | - |
| dc.citation.startPage | 1064 | - |
| dc.citation.endPage | 1073 | - |
| dc.type.docType | Proceedings Paper | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
| dc.subject.keywordPlus | Maximum norm | - |
| dc.subject.keywordPlus | Maximum norm minimization | - |
| dc.subject.keywordPlus | Minimisation | - |
| dc.subject.keywordPlus | Multi objective | - |
| dc.subject.keywordPlus | Multi-objective reinforcement learning | - |
| dc.subject.keywordPlus | Pareto-optimality | - |
| dc.subject.keywordPlus | Reinforcement learnings | - |
| dc.subject.keywordPlus | Vector selections | - |
| dc.subject.keywordPlus | Weight vector | - |
| dc.subject.keywordPlus | Weight vector selection | - |
| dc.subject.keywordAuthor | maximum norm minimization | - |
| dc.subject.keywordAuthor | multi-objective reinforcement learning | - |
| dc.subject.keywordAuthor | pareto optimality | - |
| dc.subject.keywordAuthor | weight vector selection | - |
| dc.identifier.url | https://dl.acm.org/doi/abs/10.1145/3511808.3557389 | - |
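
The gap-selection step described in the abstract (normalize the distances, find the largest one, take the midpoint of its two endpoints) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a two-objective front, assumes that distances are measured only between neighbouring points after sorting by the first objective, and uses a hypothetical function name, `largest_gap_midpoint`. How the policy is then trained toward the returned target is not covered here.

```python
import numpy as np


def largest_gap_midpoint(front):
    """Find the widest normalized gap in a 2-objective front (illustrative only).

    Returns the two endpoints of the gap, their midpoint in the original
    reward scale, and the normalized gap length.
    """
    pts = np.asarray(front, dtype=float)

    # Assumption: sorting by the first objective makes neighbouring rows
    # adjacent on a two-objective Pareto front.
    pts = pts[np.argsort(pts[:, 0])]

    # Normalize each objective by its minimum and maximum reward value.
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against zero range
    normed = (pts - lo) / span

    # Euclidean length of each gap between neighbouring normalized points.
    gaps = np.linalg.norm(np.diff(normed, axis=0), axis=1)

    # The "maximum norm" in this sketch is the largest of those gaps.
    k = int(np.argmax(gaps))

    # Target for the next point: the middle of the two endpoints.
    midpoint = 0.5 * (pts[k] + pts[k + 1])
    return pts[k], pts[k + 1], midpoint, float(gaps[k])


if __name__ == "__main__":
    # Hypothetical front with a wide gap between (1, 9) and (6, 4).
    a, b, target, gap = largest_gap_midpoint([[0, 10], [1, 9], [6, 4], [10, 0]])
    print("gap endpoints:", a, b)
    print("target midpoint:", target)
    print("normalized gap length:", gap)
```

Repeating this selection after each newly found point would target the widest remaining gap each time, which matches the densifying behaviour the abstract describes, under the assumptions stated above.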
