A Large-scale 3D Object Dataset for 6-DoF Pose Estimation
- Other Titles
- Construction of a Large-Scale 3D Object Dataset for 6-DoF Pose Estimation (6자유도 자세 추정을 위한 대용량 3D 객체 데이터 구축)
- Authors
- 장재훈; 김준용; 김성흠
- Issue Date
- Dec-2023
- Publisher
- Institute of Control, Robotics and Systems (제어·로봇·시스템학회)
- Keywords
- large-scale object dataset; monocular 3D object detection; 6-DoF object pose estimation
- Citation
- Journal of Institute of Control, Robotics and Systems (제어.로봇.시스템학회 논문지), v.29, no.12, pp. 1008-1014
- Pages
- 7
- Journal Title
- Journal of Institute of Control, Robotics and Systems (제어.로봇.시스템학회 논문지)
- Volume
- 29
- Number
- 12
- Start Page
- 1008
- End Page
- 1014
- URI
- https://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/49059
- DOI
- 10.5302/J.ICROS.2023.23.0141
- ISSN
- 1976-5622; 2233-4335
- Abstract
- Deep learning systems increasingly depend on substantial human annotation to improve functionality and performance, so researchers must scrutinize existing databases and develop their own datasets with custom labels, particularly for target applications such as object detection and pose estimation. This study introduces a large-scale 3D object dataset tailored for six-degrees-of-freedom (6-DoF) pose estimation in real-world scenarios. We describe the key features of our dataset, which is available on AI Hub, emphasizing its expansive 3D object collection. Our method establishes correspondences between the eight corner points of an object cube and their locations in a 2D image, and then determines the object's pose using the conventional perspective-n-point (PnP) algorithm. To analyze the reprojection error, we use a high-quality 3D mesh model and a binary mask of the target object in the RGB image. For database validation, all object categories were tested using a representative YOLO-like convolutional neural network architecture, such as real-time single-shot pose estimation. In addition, we conduct an in-depth analysis of the current database's limitations. We released all information about the new database on AI Hub in a format consistent with our baseline database, LINEMOD, and conducted a comparative analysis against this baseline. To address the scalability concerns associated with unseen object categories, we explored a methodology that leverages vision-and-language knowledge distillation.
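The reprojection error mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the box size, the camera intrinsics, and the poses are all hypothetical values chosen for illustration, and the PnP solve itself is assumed to be done elsewhere (for example, by a library routine). The sketch only projects the eight corners of an object cube with a pinhole camera model and measures the mean pixel distance between two sets of projections.

```python
import math


def cube_corners(size_x, size_y, size_z):
    """Return the eight corners of a box centered at the object origin."""
    hx, hy, hz = size_x / 2, size_y / 2, size_z / 2
    return [(sx * hx, sy * hy, sz * hz)
            for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]


def project(points, R, t, fx, fy, cx, cy):
    """Project object-frame 3D points to pixels with a pinhole model.

    R is a 3x3 rotation (nested lists) and t a 3-vector, together mapping
    object coordinates to camera coordinates. Lens distortion is ignored.
    """
    pixels = []
    for X, Y, Z in points:
        xc = R[0][0] * X + R[0][1] * Y + R[0][2] * Z + t[0]
        yc = R[1][0] * X + R[1][1] * Y + R[1][2] * Z + t[1]
        zc = R[2][0] * X + R[2][1] * Y + R[2][2] * Z + t[2]
        pixels.append((fx * xc / zc + cx, fy * yc / zc + cy))
    return pixels


def mean_reprojection_error(observed, projected):
    """Mean Euclidean pixel distance between matched 2D points."""
    return sum(math.dist(o, p) for o, p in zip(observed, projected)) / len(observed)


# Hypothetical setup: a 10 cm cube placed 1 m in front of the camera.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
corners = cube_corners(0.1, 0.1, 0.1)
intrinsics = dict(fx=600.0, fy=600.0, cx=320.0, cy=240.0)

# "Observed" corners come from the reference pose; the estimate is off by 1 cm in x.
observed = project(corners, identity, (0.0, 0.0, 1.0), **intrinsics)
estimated = project(corners, identity, (0.01, 0.0, 1.0), **intrinsics)

print(f"mean reprojection error: {mean_reprojection_error(observed, estimated):.2f} px")
```

In practice the observed 2D corners would come from annotations or network predictions, and the estimated pose from a PnP solver; the error computation itself stays the same.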