Joint Representation of Temporal Image Sequences and Object Motion for Video Object Detection
- Authors
- Koh, Junho; Kim, Jaekyum; Shin, Younji; Lee, Byeongwon; Yang, Seungji; Choi, Jun Won
- Issue Date
- May-2021
- Publisher
- IEEE
- Citation
- 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), v.2021-May, pp.13370 - 13376
- Indexed
- SCOPUS
- Journal Title
- 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021)
- Volume
- 2021-May
- Start Page
- 13370
- End Page
- 13376
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/141863
- DOI
- 10.1109/ICRA48506.2021.9561778
- ISSN
- 1050-4729
- Abstract
- In this paper, we propose a new video object detection (VoD) method, referred to as temporal feature aggregation and motion-aware VoD (TM-VoD), that produces a joint representation of temporal image sequences and object motion. TM-VoD generates strong spatio-temporal features for VoD by exploiting the temporally redundant information in an image sequence and the motion context. These features are produced at the feature level in the region proposal stage and at the instance level in the refinement stage. In the region proposal stage, visual features are temporally fused with appropriate weights at the pixel level via a gated attention model. Furthermore, pixel-level motion features are obtained by capturing the changes between adjacent visual feature maps. In the refinement stage, the visual features are aligned and aggregated at the instance level. We propose a novel feature alignment method, which uses the initial region proposals as anchors to predict the box coordinates for all video frames. Moreover, the instance-level motion features are obtained by applying region of interest (RoI) pooling to the pixel-level motion features and by encoding the sequential changes in the box coordinates. Finally, all these instance-level features are concatenated to produce a joint representation of the objects. Experiments on the ImageNet VID dataset demonstrate that the proposed method significantly outperforms existing VoDs and achieves performance comparable with that of state-of-the-art VoDs.
- Appears in
Collections - College of Engineering (Seoul) > Major in Electrical Engineering (Seoul) > 1. Journal Articles
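
The abstract describes two pixel-level operations in the region proposal stage: fusing visual features over time with per-pixel weights, and deriving motion features from the changes between adjacent feature maps. The following is only a minimal sketch of those two ideas, not the authors' implementation. It assumes single-channel feature maps stored as nested lists, and it replaces the learned gated attention model with a hypothetical hand-written gate that scores each frame by its closeness to a reference frame. The names `fuse_temporal`, `motion_features`, and `ref_index` are illustrative and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_temporal(feature_maps, ref_index=-1):
    """Per-pixel weighted fusion of T feature maps, each H x W.

    Assumption: the gate below is a stand-in for a learned gated
    attention model. It simply favors frames whose value at a pixel
    is close to the reference frame's value at that pixel.
    """
    ref = feature_maps[ref_index]
    height, width = len(ref), len(ref[0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [fm[y][x] for fm in feature_maps]
            # Hypothetical gate score: negative distance to the reference.
            weights = softmax([-abs(v - ref[y][x]) for v in values])
            fused[y][x] = sum(w * v for w, v in zip(weights, values))
    return fused


def motion_features(feature_maps):
    """Differences between adjacent feature maps: T maps -> T-1 maps."""
    return [
        [[b - a for a, b in zip(row_prev, row_next)]
         for row_prev, row_next in zip(prev, nxt)]
        for prev, nxt in zip(feature_maps, feature_maps[1:])
    ]


# Two 1 x 2 frames; the last frame is the reference.
frames = [[[1.0, 2.0]], [[1.0, 4.0]]]
fused = fuse_temporal(frames)
motion = motion_features(frames)
```

In this toy input the first pixel is unchanged across frames, so it fuses to its own value and has zero motion, while the second pixel changes, so its fused value is pulled toward the reference frame and its motion feature is nonzero. A real detector would operate on multi-channel tensors and learn the gate from data.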