From Language to Grasp: Object Retrieval and Grasping Through Explicit and Implicit Linguistic Commands

Yoon, Dongmin; Cha, Seonghun; Oh, Yoonseon

doi:10.23919/ICCAS63016.2024.10773029

Full metadata record

DC Field	Value	Language
dc.contributor.author	Yoon, Dongmin	-
dc.contributor.author	Cha, Seonghun	-
dc.contributor.author	Oh, Yoonseon	-
dc.date.accessioned	2025-02-12T08:00:34Z	-
dc.date.available	2025-02-12T08:00:34Z	-
dc.date.issued	2024-10	-
dc.identifier.issn	1598-7833	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/206472	-
dc.description.abstract	In human-centered environments, assistive robots are required to understand verbal commands to retrieve and grasp objects within complex scenes. We propose a novel Language Understanding Object Retrieval module (LUOR) by fine-tuning the CLIP text encoder to enhance robot manipulators' understanding of both explicit and implicit natural language commands. A new dataset with 712 verb-object pairs is created for training. This dataset includes 78 verbs associated with 244 ImageNet classes, providing a comprehensive range of scenarios. Additionally, 336 verb-object pairs cover 54 verbs for 138 ObjectNet classes, further expanding the model's applicability. Experimental results demonstrate that LUOR outperforms existing baselines in both accuracy and efficiency, particularly in handling implicit commands. The integrated system with the Multi-Task Detection module (MTD) shows strong performance in real-world robotic applications using a Panda Franka manipulator. These findings confirm the practical applicability of our approach and suggest potential for further improvements in robotic grasping and manipulation tasks.	-
dc.format.extent	2	-
dc.language	English	-
dc.language.iso	ENG	-
dc.title	From Language to Grasp: Object Retrieval and Grasping Through Explicit and Implicit Linguistic Commands	-
dc.type	Article	-
dc.identifier.doi	10.23919/ICCAS63016.2024.10773029	-
dc.identifier.scopusid	2-s2.0-85214363266	-
dc.identifier.bibliographicCitation	International Conference on Control, Automation and Systems, pp 1565 - 1566	-
dc.citation.title	International Conference on Control, Automation and Systems	-
dc.citation.startPage	1565	-
dc.citation.endPage	1566	-
dc.type.docType	Conference paper	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordPlus	Adversarial machine learning	-
dc.subject.keywordPlus	Content based retrieval	-
dc.subject.keywordPlus	Contrastive Learning	-
dc.subject.keywordPlus	Industrial robots	-
dc.subject.keywordPlus	Linguistics	-
dc.subject.keywordPlus	Modular robots	-
dc.subject.keywordPlus	Multi-task learning	-
dc.subject.keywordPlus	Natural language processing systems	-
dc.subject.keywordPlus	Object detection	-
dc.subject.keywordPlus	Object recognition	-
dc.subject.keywordPlus	Robot applications	-
dc.subject.keywordPlus	Robot learning	-
dc.subject.keywordAuthor	grasp detection	-
dc.subject.keywordAuthor	multi-modal learning	-
dc.subject.keywordAuthor	Robotic object retrieval	-
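
The abstract describes retrieving a target object from a verbal command using a fine-tuned text encoder. The record does not give the scoring rule, so the following is only a minimal sketch of one common approach, assuming the command and each candidate object label are already embedded as vectors and the best match is chosen by cosine similarity. The toy 3-dimensional vectors, the `retrieve` helper, and the example command are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(command_vec, object_vecs):
    """Return (label, score) for the object embedding most similar to the command."""
    scored = ((label, cosine(command_vec, vec)) for label, vec in object_vecs.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical toy embeddings; a real system would obtain these from a text encoder.
OBJECT_VECS = {
    "cup": [0.9, 0.1, 0.0],
    "scissors": [0.0, 0.2, 0.9],
    "pen": [0.1, 0.9, 0.1],
}

# Hand-made vector standing in for an implicit command such as
# "I want to drink something" (no object is named explicitly).
IMPLICIT_DRINK = [0.8, 0.2, 0.1]

best_label, best_score = retrieve(IMPLICIT_DRINK, OBJECT_VECS)
print(best_label)  # prints "cup"
```

In this sketch an explicit command ("bring me the cup") and an implicit one ("I want to drink something") are handled the same way: both are mapped to a vector and compared against the object labels, so the quality of the result depends entirely on how well the encoder places verbs near the objects they apply to.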

Files in This Item: There are no files associated with this item.

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Oh, Yoonseon: College of Engineering (School of Electronic Engineering)
