SparseVoxNet: 3-D Object Recognition With Sparsely Aggregation of 3-D Dense Blocks
- Authors
- Karambakhsh, Ahmad; Sheng, Bin; Li, Ping; Li, Huating; Kim, Jinman; Jung, Younhyun; Chen, C. L. Philip
- Issue Date
- Jan-2024
- Publisher
- IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
- Keywords
- Feature extraction; Solid modeling; Convolutional neural networks; Object recognition; Training; Data models; Shape; 3-D convolutional network; 3-D recognition; SparseNet; surface normal; volumetric representation
- Citation
- IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, v.35, no.1, pp 532 - 546
- Pages
- 15
- Journal Title
- IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
- Volume
- 35
- Number
- 1
- Start Page
- 532
- End Page
- 546
- URI
- https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/90595
- DOI
- 10.1109/TNNLS.2022.3175775
- ISSN
- 2162-237X; 2162-2388
- Abstract
- Automatic recognition of 3-D objects in a 3-D model by convolutional neural network (CNN) methods has been successfully applied to various tasks, e.g., robotics and augmented reality. Three-dimensional object recognition is mainly performed by analyzing the object using multi-view images, depth images, graphs, or volumetric data. In some cases, volumetric data provides the most promising results. However, existing recognition techniques for volumetric data have notable drawbacks, such as the loss of object details when converting points to voxels and the large size of the input volume, which leads to substantial 3-D CNNs. Point clouds could also provide very promising results; however, point-cloud-based methods typically need sparse data entry and time-consuming training stages. Thus, a volumetric approach could be a more efficient and flexible recognizer for our special case in the School of Medicine, Shanghai Jiao Tong University. In this article, we propose a novel solution to 3-D object recognition from volumetric data using a combination of three compact CNN models, a low-cost SparseNet, and a feature representation technique. We achieve an optimized network by estimating extra geometrical information, comprising the surface normal and curvature, in two separate neural networks. These two models provide supplementary information for each voxel, which consequently improves the results. The primary network model takes advantage of all the predicted features and uses them in a Random Forest (RF) for recognition. In our experiments, our method outperforms other methods in training speed and provides accuracy comparable to the state of the art.
- Appears in Collections
- ETC > 1. Journal Articles
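
The abstract mentions two ideas that a small example can make concrete: converting points to voxels loses detail, and a surface normal can serve as supplementary per-voxel information. The sketch below is illustrative only and is not the authors' method. The paper predicts normals and curvature with neural networks, whereas this example swaps in a simple finite difference over the occupancy grid. The function names `voxelize` and `estimate_normal`, the unit-cube input range, and the grid resolution are all assumptions made for illustration.

```python
import math


def voxelize(points, resolution):
    """Map points in the unit cube [0, 1)^3 to a set of occupied voxel indices.

    Hypothetical helper: points that fall in the same cell collapse to a
    single voxel, which is where fine object detail is lost.
    """
    occupied = set()
    for point in points:
        index = tuple(min(int(c * resolution), resolution - 1) for c in point)
        occupied.add(index)
    return occupied


def _occupancy(occupied, index):
    """Return 1.0 for an occupied voxel, 0.0 otherwise (including out of range)."""
    return 1.0 if index in occupied else 0.0


def estimate_normal(occupied, index):
    """Estimate a unit normal at a voxel by central differences of occupancy.

    The difference is taken as (previous - next) along each axis, so the
    vector points from solid space toward empty space. A stand-in for the
    learned normal estimation described in the abstract, not a copy of it.
    """
    i, j, k = index
    gradient = (
        _occupancy(occupied, (i - 1, j, k)) - _occupancy(occupied, (i + 1, j, k)),
        _occupancy(occupied, (i, j - 1, k)) - _occupancy(occupied, (i, j + 1, k)),
        _occupancy(occupied, (i, j, k - 1)) - _occupancy(occupied, (i, j, k + 1)),
    )
    length = math.sqrt(sum(c * c for c in gradient))
    if length == 0.0:
        return (0.0, 0.0, 0.0)  # interior or isolated voxel: no defined normal
    return tuple(c / length for c in gradient)


# Two nearby points collapse into one voxel at resolution 4 (detail loss).
merged = voxelize([(0.10, 0.10, 0.10), (0.12, 0.11, 0.10)], 4)

# A slab filling the lower half of the cube; its top face should point along +z.
slab = [((i + 0.5) / 4, (j + 0.5) / 4, (k + 0.5) / 4)
        for i in range(4) for j in range(4) for k in range(2)]
grid = voxelize(slab, 4)
normal = estimate_normal(grid, (1, 1, 1))
```

In a fuller pipeline, the occupancy value and the estimated normal for each voxel could be stacked into a feature vector and passed to a classifier such as a Random Forest. That step is omitted here because the record does not describe how the authors do it.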