Detailed Information

GNN-Transformer Task Planning Enhanced with Semantic-Driven Data Augmentation

Authors
Jeong, Soojin; Byeon, Seongwan; Kim, Sangwoo; Kwon, HyeokJun; Oh, Yoonseon
Issue Date
Apr-2025
Publisher
Association for the Advancement of Artificial Intelligence
Citation
Proceedings of the AAAI Conference on Artificial Intelligence, v.39, no.14, pp. 14585-14593
Pages
9
Indexed
SCOPUS
Journal Title
Proceedings of the AAAI Conference on Artificial Intelligence
Volume
39
Number
14
Start Page
14585
End Page
14593
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/207430
DOI
10.1609/aaai.v39i14.33598
ISSN
2159-5399
2374-3468
Abstract
Natural language is the most intuitive means for humans to interact with robots, making task planning based on natural language commands a longstanding area of research. Large language models (LLMs) have significantly improved task planning by enhancing understanding of language and common sense. However, current methods still face several challenges: they lack a deep understanding of physical environments, their performance relies heavily on prompt examples, LLMs are oversized and not customized for specific tasks, and the planning costs remain high. To overcome these issues, we introduce the GNN-Transformer Task Planner (GTTP), designed to predict task-level actions by leveraging the semantic environment and incorporating historical state data. The GTTP architecture is scalable through the use of GNN layers, while transformer layers facilitate understanding task progression. In addition, our model uses a text encoder to embed environments, allowing it to be trained on simulated datasets and applied directly in real-world scenarios. We also propose an automated data generation method that includes semantic augmentation, planning verification, and instruction generation via LLM. This method enables the collection of 14k instruction-annotated tasks in the VirtualHome environment with minimal human effort. The model has been validated across diverse scenes containing up to 715 objects, achieving significantly higher success rates compared to baseline models. It has also been successfully deployed on a physical mobile manipulator, demonstrating its practical applicability and effectiveness.
Appears in Collections
College of Engineering, Seoul > School of Electronic Engineering, Seoul > 1. Journal Articles

Related Researcher

Oh, Yoonseon
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)