Open-Vocabulary Multi-Object Tracking with Domain Generalized and Temporally Adaptive Features
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Li, Run | - |
dc.contributor.author | Zhang, Dawei | - |
dc.contributor.author | Wang, Yanchao | - |
dc.contributor.author | Jiang, Yunliang | - |
dc.contributor.author | Zheng, Zhonglong | - |
dc.contributor.author | Jeon, Sang-Woon | - |
dc.contributor.author | Wang, Hua | - |
dc.date.accessioned | 2025-05-07T08:30:46Z | - |
dc.date.available | 2025-05-07T08:30:46Z | - |
dc.date.issued | 2025-04 | - |
dc.identifier.issn | 1520-9210 | - |
dc.identifier.issn | 1941-0077 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/125217 | - |
dc.description.abstract | Open-vocabulary multi-object tracking (OVMOT) is a cutting-edge research direction within the multi-object tracking field. It employs large multi-modal models to effectively address the challenge of tracking unseen objects within dynamic visual scenes. While models require robust domain generalization and temporal adaptability, OVTrack, the only existing open-vocabulary multi-object tracker, relies solely on static appearance information and lacks these crucial adaptive capabilities. In this paper, we propose OVSORT, a new framework designed to improve domain generalization and temporal information processing. Specifically, we first propose the Adaptive Contextual Normalization (ACN) technique in OVSORT, which dynamically adjusts the feature maps based on the dataset's statistical properties, thereby fine-tuning our model to improve domain generalization. Then, we introduce motion cues for the first time. Using our Joint Motion and Appearance Tracking (JMAT) strategy, we obtain a joint similarity measure and subsequently apply the Hungarian algorithm for data association. Finally, our Hierarchical Adaptive Feature Update (HAFU) strategy adaptively adjusts feature updates according to the current state of each trajectory, which greatly improves the utilization of temporal information. Extensive experiments on the TAO validation set and test set confirm the superiority of OVSORT, which significantly improves the handling of novel and base classes. It surpasses existing methods in terms of accuracy and generalization, setting a new state-of-the-art for OVMOT. © 1999-2012 IEEE. | - |
dc.format.extent | 15 | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.title | Open-Vocabulary Multi-Object Tracking with Domain Generalized and Temporally Adaptive Features | - |
dc.type | Article | - |
dc.publisher.location | United States | - |
dc.identifier.doi | 10.1109/TMM.2025.3557619 | - |
dc.identifier.scopusid | 2-s2.0-105002153417 | - |
dc.identifier.wosid | 001498274900007 | - |
dc.identifier.bibliographicCitation | IEEE Transactions on Multimedia, v.27, pp 1 - 15 | - |
dc.citation.title | IEEE Transactions on Multimedia | - |
dc.citation.volume | 27 | - |
dc.citation.startPage | 1 | - |
dc.citation.endPage | 15 | - |
dc.type.docType | Article | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Telecommunications | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Software Engineering | - |
dc.relation.journalWebOfScienceCategory | Telecommunications | - |
dc.subject.keywordPlus | Visualization | - |
dc.subject.keywordPlus | Adaptation models | - |
dc.subject.keywordPlus | Trajectory | - |
dc.subject.keywordPlus | Training | - |
dc.subject.keywordPlus | Data models | - |
dc.subject.keywordPlus | Accuracy | - |
dc.subject.keywordPlus | Object recognition | - |
dc.subject.keywordPlus | Heuristic algorithms | - |
dc.subject.keywordPlus | Electronic mail | - |
dc.subject.keywordPlus | Vocabulary | - |
dc.subject.keywordAuthor | Domain generalization | - |
dc.subject.keywordAuthor | Dynamic visual scenes | - |
dc.subject.keywordAuthor | Open-vocabulary multi-object tracking | - |
dc.subject.keywordAuthor | Temporal adaptability | - |
dc.identifier.url | https://ieeexplore.ieee.org/document/10948331 | - |