Open-Vocabulary Multi-Object Tracking with Domain Generalized and Temporally Adaptive Features
- Authors
- Li, Run; Zhang, Dawei; Wang, Yanchao; Jiang, Yunliang; Zheng, Zhonglong; Jeon, Sang-Woon; Wang, Hua
- Issue Date
- Apr-2025
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Keywords
- Domain generalization; Dynamic visual scenes; Open-vocabulary multi-object tracking; Temporal adaptability
- Citation
- IEEE Transactions on Multimedia, v.27, pp 1 - 15
- Pages
- 15
- Indexed
- SCIE; SCOPUS
- Journal Title
- IEEE Transactions on Multimedia
- Volume
- 27
- Start Page
- 1
- End Page
- 15
- URI
- https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/125217
- DOI
- 10.1109/TMM.2025.3557619
- ISSN
- 1520-9210; 1941-0077
- Abstract
- Open-vocabulary multi-object tracking (OVMOT) is a cutting-edge research direction within the multi-object tracking field. It employs large multi-modal models to address the challenge of tracking unseen objects within dynamic visual scenes. Although such models require robust domain generalization and temporal adaptability, OVTrack, the only existing open-vocabulary multi-object tracker, relies solely on static appearance information and lacks these adaptive capabilities. In this paper, we propose OVSORT, a new framework designed to improve domain generalization and temporal information processing. Specifically, we first propose the Adaptive Contextual Normalization (ACN) technique in OVSORT, which dynamically adjusts the feature maps based on the dataset's statistical properties, thereby fine-tuning our model to improve domain generalization. Then, we introduce motion cues to OVMOT for the first time. Using our Joint Motion and Appearance Tracking (JMAT) strategy, we obtain a joint similarity measure and subsequently apply the Hungarian algorithm for data association. Finally, our Hierarchical Adaptive Feature Update (HAFU) strategy adaptively adjusts feature updates according to the current state of each trajectory, which greatly improves the utilization of temporal information. Extensive experiments on the TAO validation and test sets confirm the superiority of OVSORT, which significantly improves the handling of both novel and base classes. It surpasses existing methods in accuracy and generalization, setting a new state of the art for OVMOT. © 1999-2012 IEEE.
- Appears in Collections
- COLLEGE OF ENGINEERING SCIENCES > SCHOOL OF ELECTRICAL ENGINEERING > 1. Journal Articles
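
The data-association step mentioned in the abstract (a joint motion and appearance similarity followed by the Hungarian algorithm) can be illustrated with a minimal sketch. This is not the authors' JMAT implementation: the abstract does not say how the two cues are combined, so the weighted sum of box IoU and cosine similarity, the weight `alpha`, the gating threshold `min_sim`, and all function names below are assumptions made purely for illustration.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either has zero norm)."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(x * y for x, y in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Standard O(n^2 * m) potentials-based Hungarian algorithm.
    Returns a list of (row, col) pairs covering every row.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)
    p, way = [0] * (m + 1), [0] * (m + 1)  # p[j]: row matched to column j (1-based)
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # walk back along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j]]

def associate(tracks, detections, alpha=0.5, min_sim=0.3):
    """Match tracks to detections; each item is a (box, feature) tuple.

    Joint similarity (an assumed form): alpha * appearance + (1 - alpha) * IoU.
    Returns sorted (track_index, detection_index) pairs with similarity >= min_sim.
    """
    if not tracks or not detections:
        return []
    sim = [[alpha * cosine(tf, df) + (1 - alpha) * iou(tb, db)
            for db, df in detections] for tb, tf in tracks]
    cost = [[1.0 - s for s in row] for row in sim]
    if len(tracks) <= len(detections):
        pairs = hungarian(cost)
    else:  # the solver needs rows <= cols, so transpose and swap indices back
        transposed = [list(col) for col in zip(*cost)]
        pairs = [(t, d) for d, t in hungarian(transposed)]
    return sorted((t, d) for t, d in pairs if sim[t][d] >= min_sim)

tracks = [((0, 0, 10, 10), [1.0, 0.0]), ((20, 20, 30, 30), [0.0, 1.0])]
detections = [((21, 21, 31, 31), [0.0, 1.0]), ((1, 1, 11, 11), [1.0, 0.0])]
matches = associate(tracks, detections)
```

With the two tracks and two detections above, each track is matched to the detection that overlaps it and shares its appearance vector. In practice, `scipy.optimize.linear_sum_assignment` could replace the hand-written solver; the pure-Python version is used here only to keep the sketch dependency-free.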
