From Language to Grasp: Object Retrieval and Grasping Through Explicit and Implicit Linguistic Commands
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Yoon, Dongmin | - |
| dc.contributor.author | Cha, Seonghun | - |
| dc.contributor.author | Oh, Yoonseon | - |
| dc.date.accessioned | 2025-02-12T08:00:34Z | - |
| dc.date.available | 2025-02-12T08:00:34Z | - |
| dc.date.issued | 2024-10 | - |
| dc.identifier.issn | 1598-7833 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/206472 | - |
| dc.description.abstract | In human-centered environments, assistive robots are required to understand verbal commands to retrieve and grasp objects within complex scenes. We propose a novel Language Understanding Object Retrieval module (LUOR) by fine-tuning the CLIP text encoder to enhance robot manipulators' understanding of both explicit and implicit natural language commands. A new dataset with 712 verb-object pairs is created for training. This dataset includes 78 verbs associated with 244 ImageNet classes, providing a comprehensive range of scenarios. Additionally, 336 verb-object pairs cover 54 verbs for 138 ObjectNet classes, further expanding the model's applicability. Experimental results demonstrate that LUOR outperforms existing baselines in both accuracy and efficiency, particularly in handling implicit commands. The integrated system with the Multi-Task Detection module (MTD) shows strong performance in real-world robotic applications using a Panda Franka manipulator. These findings confirm the practical applicability of our approach and suggest potential for further improvements in robotic grasping and manipulation tasks. | - |
| dc.format.extent | 2 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.title | From Language to Grasp: Object Retrieval and Grasping Through Explicit and Implicit Linguistic Commands | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.23919/ICCAS63016.2024.10773029 | - |
| dc.identifier.scopusid | 2-s2.0-85214363266 | - |
| dc.identifier.bibliographicCitation | International Conference on Control, Automation and Systems, pp. 1565-1566 | - |
| dc.citation.title | International Conference on Control, Automation and Systems | - |
| dc.citation.startPage | 1565 | - |
| dc.citation.endPage | 1566 | - |
| dc.type.docType | Conference paper | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.subject.keywordPlus | Adversarial machine learning | - |
| dc.subject.keywordPlus | Content based retrieval | - |
| dc.subject.keywordPlus | Contrastive Learning | - |
| dc.subject.keywordPlus | Industrial robots | - |
| dc.subject.keywordPlus | Linguistics | - |
| dc.subject.keywordPlus | Modular robots | - |
| dc.subject.keywordPlus | Multi-task learning | - |
| dc.subject.keywordPlus | Natural language processing systems | - |
| dc.subject.keywordPlus | Object detection | - |
| dc.subject.keywordPlus | Object recognition | - |
| dc.subject.keywordPlus | Robot applications | - |
| dc.subject.keywordPlus | Robot learning | - |
| dc.subject.keywordAuthor | grasp detection | - |
| dc.subject.keywordAuthor | multi-modal learning | - |
| dc.subject.keywordAuthor | Robotic object retrieval | - |
