LUOR: A Framework for Language Understanding in Object Retrieval and Grasping
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Yoon, Dongmin | - |
| dc.contributor.author | Cha, Seonghun | - |
| dc.contributor.author | Oh, Yoonseon | - |
| dc.date.accessioned | 2025-03-05T07:30:13Z | - |
| dc.date.available | 2025-03-05T07:30:13Z | - |
| dc.date.issued | 2025-02 | - |
| dc.identifier.issn | 1598-6446 | - |
| dc.identifier.issn | 2005-4092 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/206683 | - |
| dc.description.abstract | In human-centered environments, assistive robots are required to understand verbal commands to retrieve and grasp objects within complex scenes. Previous research on natural language object retrieval tasks has mainly focused on commands explicitly mentioning an object's name. However, in real-world environments, responding to implicit commands based on an object's function is also essential. To address this problem, we propose a new dataset consisting of 712 verb-object pairs containing 78 verbs for 244 ImageNet classes and 336 verb-object pairs covering 54 verbs for 138 ObjectNet classes. Utilizing this dataset, we propose a novel language understanding object retrieval (LUOR) module by fine-tuning the CLIP text encoder. This approach enables effective learning for the downstream task of object retrieval while preserving the object classification performance. Additionally, we integrate LUOR with a YOLOv3-based multi-task detection (MTD) module for simultaneous object and grasp pose detection. This integration enables the robot manipulator to accurately grasp objects based on verbal commands in complex environments containing multiple objects. Our results demonstrate that LUOR outperforms CLIP in both explicit and implicit retrieval tasks while preserving object classification accuracy for both the ImageNet and ObjectNet datasets. Also, the real-world applicability of the integrated system is demonstrated through experiments with the Franka Panda manipulator. | - |
| dc.format.extent | 11 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | Institute of Control, Robotics and Systems (제어·로봇·시스템학회) | - |
| dc.title | LUOR: A Framework for Language Understanding in Object Retrieval and Grasping | - |
| dc.type | Article | - |
| dc.publisher.location | Republic of Korea | - |
| dc.identifier.doi | 10.1007/s12555-024-0527-7 | - |
| dc.identifier.scopusid | 2-s2.0-85218338733 | - |
| dc.identifier.wosid | 001415358700009 | - |
| dc.identifier.bibliographicCitation | International Journal of Control, Automation, and Systems, v.23, no.2, pp. 530-540 | - |
| dc.citation.title | International Journal of Control, Automation, and Systems | - |
| dc.citation.volume | 23 | - |
| dc.citation.number | 2 | - |
| dc.citation.startPage | 530 | - |
| dc.citation.endPage | 540 | - |
| dc.type.docType | Article | - |
| dc.identifier.kciid | ART003171801 | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.description.journalRegisteredClass | kci | - |
| dc.relation.journalResearchArea | Automation & Control Systems | - |
| dc.relation.journalWebOfScienceCategory | Automation & Control Systems | - |
| dc.subject.keywordPlus | Adversarial machine learning | - |
| dc.subject.keywordPlus | Linguistics | - |
| dc.subject.keywordPlus | Modular robots | - |
| dc.subject.keywordPlus | Multi-task learning | - |
| dc.subject.keywordPlus | Object detection | - |
| dc.subject.keywordPlus | Object recognition | - |
| dc.subject.keywordPlus | Problem oriented languages | - |
| dc.subject.keywordPlus | Robot applications | - |
| dc.subject.keywordPlus | Robot learning | - |
| dc.subject.keywordAuthor | Grasp detection | - |
| dc.subject.keywordAuthor | multi-modal learning | - |
| dc.subject.keywordAuthor | robotic object retrieval | - |
| dc.identifier.url | https://link.springer.com/article/10.1007/s12555-024-0527-7 | - |
