RoMP-transformer: Rotational bounding box with multi-level feature pyramid transformer for object detection
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Moon, Joonhyeok | - |
dc.contributor.author | Jeon, Munsu | - |
dc.contributor.author | Jeong, Siheon | - |
dc.contributor.author | Oh, Ki-Yong | - |
dc.date.accessioned | 2023-11-14T08:09:40Z | - |
dc.date.available | 2023-11-14T08:09:40Z | - |
dc.date.created | 2023-11-07 | - |
dc.date.issued | 2024-03 | - |
dc.identifier.issn | 0031-3203 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/192133 | - |
dc.description.abstract | This study proposes the rotational bounding box with multi-level feature pyramid transformer (RoMP-Transformer)—a fast and accurate one-stage deep neural network for object detection. The proposed RoMP-Transformer exhibits three characteristics. First, a rotational bounding box is utilized to minimize the effect of the background during the construction of feature maps, enhancing the robustness of the RoMP-Transformer. Second, the RoMP-Transformer employs a multi-level feature pyramid transformer that combines a multi-level feature pyramid network with a pyramid vision transformer, effectively extracting high-quality features and achieving high accuracy. Third, the RoMP-Transformer performs bounding box optimization by minimizing an optimal intersection over union (IoU) loss that considers both the modified SKEW IoU and the distance IoU. The modified SKEW IoU significantly accelerates the calculation, and the fused IoU formulation improves prediction accuracy. Further, Bayesian optimization and weight lightening with half-precision tensors are performed to optimize the performance of the RoMP-Transformer for real-time applications. Experiments on three image sets—one on power transmission facilities, MSRA-TD500, and DOTA-v1.0—demonstrate that the proposed RoMP-Transformer outperforms other state-of-the-art neural networks for object detection in terms of accuracy, robustness, and calculation speed. Systematic analysis also reveals that the methods utilized by the RoMP-Transformer optimize object detection performance. The proposed architecture is expected to inspire further study of deep neural networks for object detection in real-world applications. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | Elsevier Ltd | - |
dc.title | RoMP-transformer: Rotational bounding box with multi-level feature pyramid transformer for object detection | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Oh, Ki-Yong | - |
dc.identifier.doi | 10.1016/j.patcog.2023.110067 | - |
dc.identifier.scopusid | 2-s2.0-85175162527 | - |
dc.identifier.bibliographicCitation | Pattern Recognition, v.147, pp.1 - 14 | - |
dc.relation.isPartOf | Pattern Recognition | - |
dc.citation.title | Pattern Recognition | - |
dc.citation.volume | 147 | - |
dc.citation.startPage | 1 | - |
dc.citation.endPage | 14 | - |
dc.type.rims | ART | - |
dc.type.docType | Article | - |
dc.description.journalClass | 1 | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scopus | - |
dc.subject.keywordPlus | Artificial intelligence for object detection | - |
dc.subject.keywordPlus | Autonomous flight | - |
dc.subject.keywordPlus | Autonomous surveillance | - |
dc.subject.keywordPlus | Feature map | - |
dc.subject.keywordPlus | Feature pyramid | - |
dc.subject.keywordPlus | Multi-level feature pyramid transformer | - |
dc.subject.keywordPlus | Multi-level multi-scale feature map | - |
dc.subject.keywordPlus | Multi-scale features | - |
dc.subject.keywordPlus | Multilevels | - |
dc.subject.keywordPlus | Objects detection | - |
dc.subject.keywordPlus | Pyramid vision transformer | - |
dc.subject.keywordAuthor | Artificial intelligence for object detection | - |
dc.subject.keywordAuthor | Autonomous flight and surveillance | - |
dc.subject.keywordAuthor | Multi-level feature pyramid transformer | - |
dc.subject.keywordAuthor | Multi-level multi-scale feature map | - |
dc.subject.keywordAuthor | Pyramid vision transformer | - |
dc.identifier.url | https://www.sciencedirect.com/science/article/pii/S0031320323007641?via%3Dihub | - |
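The abstract states that the loss fuses a modified SKEW IoU with the distance IoU (DIoU). As background only, the following is a minimal sketch of the standard axis-aligned DIoU term (IoU penalized by normalized center distance); the paper's actual rotated-box formulation and its modified SKEW IoU component are not reproduced here, and the function name is illustrative:

```python
def diou(box_a, box_b):
    """Distance-IoU for axis-aligned boxes given as (x1, y1, x2, y2).

    DIoU = IoU - d^2 / c^2, where d is the distance between box centers
    and c is the diagonal of the smallest box enclosing both inputs.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection area (zero if the boxes do not overlap)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter)
    # Squared distance between box centers
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
       + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest enclosing box
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2
    return iou - d2 / c2

# Identical boxes give DIoU = 1; disjoint boxes are penalized below 0.
print(diou((0, 0, 2, 2), (0, 0, 2, 2)))  # 1.0
```

Unlike plain IoU, this term yields a nonzero gradient signal even for non-overlapping boxes, which is why DIoU-style penalties are commonly used to speed up bounding-box regression.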