Batch Prioritization in Multigoal Reinforcement Learning

Vecchietti, Luiz Felipe; Kim, Taeyoung; Choi, Kyujin; Hong, Junhee; Har, Dongsoo

Cited 6 times in Web of Science

Cited 9 times in Scopus

Full metadata record

DC Field	Value	Language
dc.contributor.author	Vecchietti, Luiz Felipe	-
dc.contributor.author	Kim, Taeyoung	-
dc.contributor.author	Choi, Kyujin	-
dc.contributor.author	Hong, Junhee	-
dc.contributor.author	Har, Dongsoo	-
dc.date.available	2020-08-24T01:35:16Z	-
dc.date.created	2020-08-24	-
dc.date.issued	2020-07	-
dc.identifier.issn	2169-3536	-
dc.identifier.uri	https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/78031	-
dc.description.abstract	In multigoal reinforcement learning, an agent interacts with an environment and learns to achieve multiple goals. The goal-conditioned policy is trained to effectively generalize its behavior for multiple goals. During training, the experiences collected by the agent are randomly sampled from a replay buffer. Because biased sampling of achieved goals affects the success rate of a given task, it should be avoided by considering the valid goal space, introduced here as the set of goals to achieve, and the current competence of the policy. To this end, a novel prioritization method for creation of batches, i.e., collections of samples, is proposed. Candidate batches are sampled and associated with costs; in each iteration the batch with the minimum cost is chosen to train the policy. The cost function is modeled by an intended goal, which is proposed as a hypothetical goal that the policy is trying to learn in each cycle, and the information of the valid goal space. The minimum cost of the batch selected for each iteration decreases throughout training as the policy learns to achieve goals near the center of the valid goal space. The proposed batch prioritization method is combined with hindsight experience replay (HER) for experiments in robotic control tasks presented in the OpenAI Gym suite to demonstrate learning performance comparable to that of other state-of-the-art prioritization methods. As a result, the proposed batch prioritization method can achieve improved learning performance in 4 out of 5 tasks, particularly for harder tasks. The experimental results suggest that the proposed method for the creation of training batches, using the valid goal space information and current competence of the policy, can enhance learning performance in multigoal tasks with high-dimensional goal space.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC	-
dc.relation.isPartOf	IEEE ACCESS	-
dc.title	Batch Prioritization in Multigoal Reinforcement Learning	-
dc.type	Article	-
dc.type.rims	ART	-
dc.description.journalClass	1	-
dc.identifier.wosid	000557774300001	-
dc.identifier.doi	10.1109/ACCESS.2020.3012204	-
dc.identifier.bibliographicCitation	IEEE ACCESS, v.8, pp.137449 - 137461	-
dc.description.isOpenAccess	N	-
dc.identifier.scopusid	2-s2.0-85090113302	-
dc.citation.endPage	137461	-
dc.citation.startPage	137449	-
dc.citation.title	IEEE ACCESS	-
dc.citation.volume	8	-
dc.contributor.affiliatedAuthor	Hong, Junhee	-
dc.type.docType	Article	-
dc.subject.keywordAuthor	Training	-
dc.subject.keywordAuthor	Task analysis	-
dc.subject.keywordAuthor	Learning (artificial intelligence)	-
dc.subject.keywordAuthor	Robots	-
dc.subject.keywordAuthor	Erbium	-
dc.subject.keywordAuthor	Cost function	-
dc.subject.keywordAuthor	Aerospace electronics	-
dc.subject.keywordAuthor	Experience replay	-
dc.subject.keywordAuthor	batch prioritization	-
dc.subject.keywordAuthor	goal distribution	-
dc.subject.keywordAuthor	reinforcement learning	-
dc.subject.keywordAuthor	intended goal	-
dc.subject.keywordPlus	LEVEL	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalResearchArea	Telecommunications	-
dc.relation.journalWebOfScienceCategory	Computer Science, Information Systems	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.relation.journalWebOfScienceCategory	Telecommunications	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
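
The selection step described in the abstract (sample several candidate batches, assign each a cost, train on the one with the minimum cost) can be illustrated with a minimal sketch. This record does not give the paper's cost function, so the cost used here is an assumption made only for illustration: the Euclidean distance between the mean goal of a batch and a hypothetical intended goal, taken to be the center of a toy goal space. The names `batch_cost` and `select_min_cost_batch` and all parameter values are likewise hypothetical and are not the authors' implementation.

```python
import math
import random


def batch_cost(goals, intended_goal):
    """Assumed cost: distance between the batch's mean goal and the intended goal.

    This is a stand-in, not the cost function defined in the paper.
    """
    dim = len(intended_goal)
    mean = [sum(g[d] for g in goals) / len(goals) for d in range(dim)]
    return math.dist(mean, intended_goal)


def select_min_cost_batch(buffer_goals, intended_goal, batch_size, num_candidates, rng):
    """Sample candidate batches uniformly and return the one with the lowest cost."""
    best_indices, best_cost = None, math.inf
    for _ in range(num_candidates):
        # Each candidate is a uniform random sample of buffer indices, without repeats.
        indices = rng.sample(range(len(buffer_goals)), batch_size)
        cost = batch_cost([buffer_goals[i] for i in indices], intended_goal)
        if cost < best_cost:
            best_indices, best_cost = indices, cost
    return best_indices, best_cost


# Toy replay buffer: 1000 three-dimensional goals drawn uniformly from [-1, 1]^3.
rng = random.Random(0)
buffer_goals = [tuple(rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(1000)]
intended_goal = (0.0, 0.0, 0.0)  # assumed: center of the toy goal space

indices, cost = select_min_cost_batch(
    buffer_goals, intended_goal, batch_size=64, num_candidates=16, rng=rng
)
print(f"selected {len(indices)} samples, cost = {cost:.4f}")
```

In a full training loop the selected indices would be used to gather transitions from the replay buffer for one policy update, and the intended goal would be revised as the policy's competence changes. Both steps are outside the scope of this sketch.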

Files in This Item: There are no files associated with this item.

Appears in Collections: College of IT Convergence > Department of Energy IT > 1. Journal Articles

Related Researcher

Hong, Jun Hee: College of IT Convergence (Department of smart city)
