RoMP-transformer: Rotational bounding box with multi-level feature pyramid transformer for object detection
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Moon, Joonhyeok | - |
dc.contributor.author | Jeon, Munsu | - |
dc.contributor.author | Jeong, Siheon | - |
dc.contributor.author | Oh, Ki-Yong | - |
dc.date.accessioned | 2023-11-14T08:09:40Z | - |
dc.date.available | 2023-11-14T08:09:40Z | - |
dc.date.created | 2023-11-07 | - |
dc.date.issued | 2024-03 | - |
dc.identifier.issn | 0031-3203 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/192133 | - |
dc.description.abstract | This study proposes the rotational bounding box with multi-level feature pyramid transformer (RoMP-Transformer)—a fast and accurate one-stage deep neural network for object detection. The proposed RoMP-Transformer exhibits three characteristics. First, a rotational bounding box is utilized to minimize the effect of the background during the construction of feature maps, enhancing the robustness of the RoMP-Transformer. Second, the RoMP-Transformer employs a multi-level feature pyramid transformer that combines a multi-level feature pyramid network with a pyramid vision transformer, effectively extracting high-quality features and achieving high accuracy. Third, the RoMP-Transformer performs bounding box optimization by minimizing an optimal intersection over union (IoU) loss that considers both the modified SKEW IoU and the distance IoU. The modified SKEW IoU significantly accelerates the calculation, and the fused IoU formulation improves prediction accuracy. Further, Bayesian optimization and weight lightening with half-precision tensors are performed to optimize the performance of the RoMP-Transformer for real-time applications. Experiments on three image sets—one on power transmission facilities, MSRA-TD500, and DOTA-v1.0—demonstrate that the proposed RoMP-Transformer outperforms other state-of-the-art neural networks for object detection in terms of accuracy, robustness, and calculation speed. Systematic analysis also reveals that the methods utilized by the RoMP-Transformer optimize object detection performance. The proposed architecture is expected to inspire further study of deep neural networks for object detection in real-world applications. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | Elsevier Ltd | - |
dc.title | RoMP-transformer: Rotational bounding box with multi-level feature pyramid transformer for object detection | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Oh, Ki-Yong | - |
dc.identifier.doi | 10.1016/j.patcog.2023.110067 | - |
dc.identifier.scopusid | 2-s2.0-85175162527 | - |
dc.identifier.bibliographicCitation | Pattern Recognition, v.147, pp.1 - 14 | - |
dc.relation.isPartOf | Pattern Recognition | - |
dc.citation.title | Pattern Recognition | - |
dc.citation.volume | 147 | - |
dc.citation.startPage | 1 | - |
dc.citation.endPage | 14 | - |
dc.type.rims | ART | - |
dc.type.docType | Article | - |
dc.description.journalClass | 1 | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scopus | - |
dc.subject.keywordPlus | Artificial intelligence for object detection | - |
dc.subject.keywordPlus | Autonomous flight | - |
dc.subject.keywordPlus | Autonomous surveillance | - |
dc.subject.keywordPlus | Feature map | - |
dc.subject.keywordPlus | Feature pyramid | - |
dc.subject.keywordPlus | Multi-level feature pyramid transformer | - |
dc.subject.keywordPlus | Multi-level multi-scale feature map | - |
dc.subject.keywordPlus | Multi-scale features | - |
dc.subject.keywordPlus | Multilevels | - |
dc.subject.keywordPlus | Objects detection | - |
dc.subject.keywordPlus | Pyramid vision transformer | - |
dc.subject.keywordAuthor | Artificial intelligence for object detection | - |
dc.subject.keywordAuthor | Autonomous flight and surveillance | - |
dc.subject.keywordAuthor | Multi-level feature pyramid transformer | - |
dc.subject.keywordAuthor | Multi-level multi-scale feature map | - |
dc.subject.keywordAuthor | Pyramid vision transformer | - |
dc.identifier.url | https://www.sciencedirect.com/science/article/pii/S0031320323007641?via%3Dihub | - |
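The abstract states that the loss fuses a modified SKEW IoU with the distance IoU (DIoU). As background only, the following is a minimal sketch of the standard axis-aligned DIoU term (IoU penalized by normalized center distance); the paper's actual rotated-box formulation and its modified SKEW IoU component are not reproduced here, and the function name is illustrative:

```python
def diou(box_a, box_b):
    """Distance-IoU for axis-aligned boxes given as (x1, y1, x2, y2).

    DIoU = IoU - d^2 / c^2, where d is the distance between box centers
    and c is the diagonal of the smallest box enclosing both inputs.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection area (zero if the boxes do not overlap)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter)
    # Squared distance between box centers
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
       + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest enclosing box
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2
    return iou - d2 / c2

# Identical boxes give DIoU = 1; disjoint boxes are penalized below 0.
print(diou((0, 0, 2, 2), (0, 0, 2, 2)))  # 1.0
```

Unlike plain IoU, this term yields a nonzero gradient signal even for non-overlapping boxes, which is why DIoU-style penalties are commonly used to speed up bounding-box regression.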