Rotational multipyramid network with bounding-box transformation for object detection

Kim, D.; Kim, S.; Jeong, S.; Ham, J.-W.; Son, S.; Moon, J.; Oh, K.-Y.

doi:10.1002/int.22513

Issue Date: Sep-2021

Publisher: John Wiley and Sons Ltd

Keywords: deep learning; hyperplane; multi-scale and multi-level feature extraction; object detection; space transformation

Citation: International Journal of Intelligent Systems, v.36, no.9, pp. 5307-5338

Pages: 32

Journal Title: International Journal of Intelligent Systems

Volume: 36

Number: 9

Start Page: 5307

End Page: 5338

URI: https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/50944

DOI: 10.1002/int.22513

ISSN: 0884-8173; 1098-111X

Abstract: The study proposes a rotational multipyramid network (RoMP Net) with bounding-box transformation for object detection. The RoMP Net is a single-stage object detection neural network featuring three characteristics. First, the network uses a rotational bounding box to minimize the effect of background images when extracting features of objects. Bounding-box transformation is proposed to compensate for the limitation of the rotational bounding boxes, which have relatively low prediction accuracy for objects with a high aspect ratio. Second, the RoMP Net introduces a multi-scale and multi-level feature pyramid network to extract distinct and semantic features efficiently. This network architecture ensures high prediction accuracy and robustness regardless of the size and complexity of objects. Third, hyperparameters in the bounding boxes are automatically determined through an unsupervised clustering method. This optimization method is also critical in improving accuracy. The performance of the proposed network and preprocessing methods is validated through image-sets comprising critical components in power transmission facilities, which have a variety of sizes and aspect ratios. This case study demonstrates the effectiveness and robustness of the three key characteristics in the RoMP Net. Furthermore, the RoMP Net outperforms other state-of-the-art deep neural networks in prediction accuracy and robustness for object detection. Specifically, the mean average precision of the RoMP Net in the validation image-sets shows that it has the highest prediction accuracy, whereas its values in the test image-sets confirm the network's robustness. The fast yet accurate RoMP Net will expand the range of object detection through deep neural networks. © 2021 Wiley Periodicals LLC
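Illustrative note (not part of the published abstract): the abstract states that a rotational bounding box reduces the amount of background included with an object. The sketch below shows the geometric intuition only and is not the RoMP Net implementation. The function names, the (cx, cy, w, h, angle) parameterization, and the "background fraction" measure are assumptions made for this example.

```python
import math


def rotated_box_corners(cx, cy, w, h, angle_deg):
    """Corners of a w-by-h box centred at (cx, cy), rotated counter-clockwise.

    The (cx, cy, w, h, angle) parameterization is an assumption for this
    sketch; the paper may encode rotated boxes differently.
    """
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [
        (cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
        for dx, dy in half
    ]


def enclosing_axis_aligned_box(corners):
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) around the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def background_fraction(w, h, angle_deg):
    """Share of the axis-aligned box area NOT covered by the rotated box.

    This is a hypothetical measure used only to illustrate the motivation
    for rotated boxes; it is not a metric taken from the paper.
    """
    corners = rotated_box_corners(0.0, 0.0, w, h, angle_deg)
    x0, y0, x1, y1 = enclosing_axis_aligned_box(corners)
    aabb_area = (x1 - x0) * (y1 - y0)
    return 1.0 - (w * h) / aabb_area


if __name__ == "__main__":
    # A long, thin object (aspect ratio 10:1) at several orientations.
    for angle in (0, 15, 45):
        print(angle, round(background_fraction(10.0, 1.0, angle), 3))
```

Under these assumptions, a 10:1 object tilted by 45 degrees fills only about one sixth of its axis-aligned box, so most of an axis-aligned crop would be background, whereas a box aligned with the object contains almost none.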

Appears in Collections: College of Engineering > School of Energy System Engineering > 1. Journal Articles