No-regret Shannon entropy regularized neural contextual bandit online learning for robotic grasping

Lee, K.; Choy, J.; Choi, Y.; Kee, H.; Oh, S.

doi:10.1109/IROS45743.2020.9341123


Full metadata record

DC Field	Value	Language
dc.contributor.author	Lee, K.	-
dc.contributor.author	Choy, J.	-
dc.contributor.author	Choi, Y.	-
dc.contributor.author	Kee, H.	-
dc.contributor.author	Oh, S.	-
dc.date.accessioned	2022-11-28T01:57:51Z	-
dc.date.available	2022-11-28T01:57:51Z	-
dc.date.issued	2020-10	-
dc.identifier.issn	2153-0858	-
dc.identifier.uri	https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/59359	-
dc.description.abstract	In this paper, we propose a novel contextual bandit algorithm, Shannon entropy regularized neural contextual bandits (SERN), that employs a neural network as a reward estimator and uses Shannon entropy regularization to encourage exploration. In many learning-based algorithms for robotic grasping, the lack of real-world data hampers the generalization performance of a model and makes it difficult to apply a trained model to real-world problems. To handle this issue, the proposed method exploits the benefits of online learning. The proposed method trains a neural network to predict the success probability of a given grasp pose based on a depth image, which is called the grasp quality. We theoretically show that SERN has the no-regret property. We empirically demonstrate that SERN outperforms ϵ-greedy in terms of sample efficiency.	-
dc.format.extent	6	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	-
dc.title	No-regret Shannon entropy regularized neural contextual bandit online learning for robotic grasping	-
dc.type	Article	-
dc.identifier.doi	10.1109/IROS45743.2020.9341123	-
dc.identifier.bibliographicCitation	IEEE International Conference on Intelligent Robots and Systems, pp 9620 - 9625	-
dc.description.isOpenAccess	N	-
dc.identifier.scopusid	2-s2.0-85102409713	-
dc.citation.endPage	9625	-
dc.citation.startPage	9620	-
dc.citation.title	IEEE International Conference on Intelligent Robots and Systems	-
dc.type.docType	Conference Paper	-
dc.publisher.location	United States	-
dc.subject.keywordPlus	Agricultural robots	-
dc.subject.keywordPlus	E-learning	-
dc.subject.keywordPlus	End effectors	-
dc.subject.keywordPlus	Intelligent robots	-
dc.subject.keywordPlus	Learning systems	-
dc.subject.keywordPlus	Neural networks	-
dc.subject.keywordPlus	Robotics	-
dc.subject.keywordPlus	Contextual bandits	-
dc.subject.keywordPlus	Generalization performance	-
dc.subject.keywordPlus	Grasp qualities	-
dc.subject.keywordPlus	Learning-based algorithms	-
dc.subject.keywordPlus	Online learning	-
dc.subject.keywordPlus	Real-world problem	-
dc.subject.keywordPlus	Robotic grasping	-
dc.subject.keywordPlus	Shannon entropy	-
dc.subject.keywordPlus	Learning algorithms	-
dc.description.journalRegisteredClass	scopus	-
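
The abstract contrasts Shannon entropy regularized exploration with ϵ-greedy. The record does not give the paper's update rule, so the following is only a minimal sketch under stated assumptions: it uses the standard fact that the distribution maximizing expected reward plus `alpha` times Shannon entropy is a softmax of the rewards divided by `alpha`. The function names, the temperature `alpha`, and the three example grasp-quality values are hypothetical and are not taken from the paper; in SERN the reward estimates would come from a neural network rather than a fixed list.

```python
import math
import random


def entropy_regularized_policy(rewards, alpha):
    """Distribution p maximizing sum(p * r) + alpha * H(p) over the simplex.

    The closed-form maximizer is softmax(r / alpha). `alpha` is an assumed
    temperature parameter, not a value from the paper.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    scaled = [r / alpha for r in rewards]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def epsilon_greedy_policy(rewards, epsilon):
    """Baseline: probability epsilon spread uniformly, the rest on the argmax."""
    n = len(rewards)
    best = max(range(n), key=lambda i: rewards[i])
    probs = [epsilon / n] * n
    probs[best] += 1.0 - epsilon
    return probs


def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def regularized_objective(probs, rewards, alpha):
    """Expected reward plus alpha times the entropy of the policy."""
    expected = sum(p * r for p, r in zip(probs, rewards))
    return expected + alpha * shannon_entropy(probs)


def sample_action(probs, rng=random):
    """Draw one action index according to the policy."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical predicted grasp qualities for three candidate grasp poses.
predicted_quality = [0.80, 0.75, 0.10]
soft = entropy_regularized_policy(predicted_quality, alpha=0.1)
greedy = epsilon_greedy_policy(predicted_quality, epsilon=0.1)
```

With these illustrative numbers, the entropy-regularized policy spends most of its exploration on the second grasp, whose estimate is close to the best one, and almost none on the clearly poor third grasp, whereas ϵ-greedy explores both equally. This only illustrates the mechanism; it is not a reproduction of the sample-efficiency result reported in the abstract.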

Files in This Item: There are no files associated with this item.

Appears in Collections: College of Software > Department of Artificial Intelligence > 1. Journal Articles


Related Researcher


Lee, Kyungjae: College of Software (Department of AI)




