Detailed Information

One-stage Detection Model based on Swin Transformer (open access)

Authors
Kim, Tae Yang; Niaz, Asim; Choi, Jung Sik; Choi, Kwang Nam
Issue Date
2024
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
Attention; Computational modeling; Computer-vision; Detectors; Feature extraction; Object Detection; Predictive models; single-stage detection; Task analysis; Transformer Network; Transformers; YOLO
Citation
IEEE Access, v.12, pp 60960 - 60972
Pages
13
Journal Title
IEEE Access
Volume
12
Start Page
60960
End Page
60972
URI
https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/73785
DOI
10.1109/ACCESS.2024.3393152
ISSN
2169-3536
Abstract
Object detection using vision transformers (ViTs) has recently garnered considerable research interest. ViTs perform image classification by splitting an image into patches and processing them with multi-head attention followed by an MLP head. However, conventional models prioritize object classification over predicting the bounding boxes that are crucial for precise object detection. To address this gap, a two-stage detector based on Transformers has been devised, which first extracts feature maps via a pre-trained CNN model. In contrast, our research introduces a one-stage object detector founded on the Swin-Transformer architecture. This one-stage detector performs object classification and bounding box prediction simultaneously using a pure Swin-Transformer Encoder Block, obviating the need for a pre-trained CNN model. Our proposed model is trained, validated, and evaluated on the COCO dataset, comprising 82,783 training images, 40,504 validation images, and 40,775 test images. The proposed model achieved an average precision (AP) of 30.2%, an improvement of 5.59% over the existing ViT-based one-stage detector.
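The patch-splitting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `split_into_patches`, the single-channel image, and the patch size of 2 are assumptions chosen only to keep the example small. It shows how an image is cut into non-overlapping square patches, each flattened into a vector before any attention layer sees it.

```python
from typing import List

Image = List[List[float]]  # single-channel image: H rows x W columns (assumption)


def split_into_patches(image: Image, patch_size: int) -> List[List[float]]:
    """Split an image into non-overlapping square patches.

    Patches are returned in row-major order, and each patch is
    flattened row by row into a single vector.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    height = len(image)
    width = len(image[0]) if height else 0
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches: List[List[float]] = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Collect the pixels of one patch, row by row.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Hypothetical 4x4 image whose pixel value encodes its position.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
patches = split_into_patches(image, patch_size=2)
print(len(patches), patches[0])
```

In a real model each flattened patch would then be linearly projected to an embedding; that step, and the shifted-window attention specific to Swin Transformer, are outside the scope of this sketch.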
Appears in Collections
College of Software > School of Computer Science and Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Choi, Kwang Nam
College of Software (School of Software)