No-regret Shannon entropy regularized neural contextual bandit online learning for robotic grasping

Authors
Lee, K.; Choy, J.; Choi, Y.; Kee, H.; Oh, S.
Issue Date
Oct-2020
Publisher
Institute of Electrical and Electronics Engineers Inc.
Citation
IEEE International Conference on Intelligent Robots and Systems, pp. 9620-9625
Pages
6
Journal Title
IEEE International Conference on Intelligent Robots and Systems
Start Page
9620
End Page
9625
URI
https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/59359
DOI
10.1109/IROS45743.2020.9341123
ISSN
2153-0858
Abstract
In this paper, we propose a novel contextual bandit algorithm, Shannon entropy regularized neural contextual bandits (SERN), that employs a neural network as a reward estimator and uses Shannon entropy regularization to encourage exploration. In many learning-based algorithms for robotic grasping, the lack of real-world data hampers the generalization performance of a model and makes it difficult to apply a trained model to real-world problems. To handle this issue, the proposed method exploits the benefits of online learning. It trains a neural network to predict the success probability of a given grasp pose from a depth image; this probability is called the grasp quality. We theoretically show that SERN has a no-regret property, and we empirically demonstrate that SERN outperforms ϵ-greedy in terms of sample efficiency.
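As a rough illustration of the exploration idea in the abstract, the sketch below contrasts an entropy-regularized action distribution with ϵ-greedy. It relies on a standard fact: the distribution that maximizes expected estimated reward plus a temperature-weighted Shannon entropy is a softmax over the estimates. This is not the authors' implementation; the function names, the `temperature` parameter, and the example reward values are assumptions made for illustration, and the record does not specify how SERN computes or schedules its regularization.

```python
import math
import random


def softmax_policy(estimated_rewards, temperature):
    """Distribution p maximizing sum(p * r) + temperature * H(p).

    `temperature` is a hypothetical name for the entropy weight;
    the closed-form maximizer is a softmax over r / temperature.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    peak = max(estimated_rewards)  # subtract the max for numerical stability
    weights = [math.exp((r - peak) / temperature) for r in estimated_rewards]
    total = sum(weights)
    return [w / total for w in weights]


def epsilon_greedy_policy(estimated_rewards, epsilon):
    """Baseline: probability 1 - epsilon on the best arm, epsilon spread uniformly."""
    n = len(estimated_rewards)
    best = max(range(n), key=lambda i: estimated_rewards[i])
    probs = [epsilon / n] * n
    probs[best] += 1.0 - epsilon
    return probs


def shannon_entropy(probs):
    """Shannon entropy in nats, skipping zero-probability entries."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def sample_action(probs, rng=random):
    """Draw one action index according to `probs`."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical grasp-quality estimates for three candidate grasp poses.
    estimates = [0.2, 0.5, 0.9]
    for name, probs in [
        ("softmax", softmax_policy(estimates, temperature=0.1)),
        ("eps-greedy", epsilon_greedy_policy(estimates, epsilon=0.1)),
    ]:
        print(name, probs, shannon_entropy(probs), sample_action(probs))
```

Unlike ϵ-greedy, which explores all non-greedy candidates with equal probability, the softmax distribution explores in proportion to how promising each candidate's estimate is; a lower temperature concentrates mass on the best estimate.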
Appears in Collections
College of Software > Department of Artificial Intelligence > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Lee, Kyungjae
College of Software (Department of AI)