Abstract
Current popular online multi-object tracking (MOT) solutions apply single object trackers (SOTs) to capture object motions, while often requiring an extra affinity network to associate objects, especially occluded ones. This brings extra computational overhead due to repetitive feature extraction for SOT and affinity computation. Meanwhile, the model size of the sophisticated affinity network is usually non-trivial. In this paper, we propose a novel MOT framework, named UMA, that unifies the object motion and affinity models into a single network in order to learn a compact feature that is discriminative for both object motion and affinity measure. In particular, UMA integrates single object tracking and metric learning into a unified triplet network by means of multi-task learning. This design brings the advantages of improved computational efficiency, a low memory requirement, and a simplified training procedure. In addition, we equip our model with a task-specific attention module, which is used to boost task-aware feature learning. The proposed UMA can be easily trained end-to-end and is elegant, requiring only one training stage. Experimental results show that it achieves promising performance on several MOT Challenge benchmarks.
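As a rough illustration of the multi-task idea described in the abstract, the sketch below combines a tracking-style loss with a metric-learning (triplet) loss computed on the same embeddings. It is not the paper's implementation: the logistic tracking loss, the hinge-style triplet loss, the `weight` and `margin` parameters, and all function names are assumptions chosen for illustration only.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (assumed form, not taken from the paper).

    Pulls the anchor toward the positive (same identity) and pushes it
    away from the negative (different identity) by at least `margin`.
    """
    d_pos = l2_distance(anchor, positive)
    d_neg = l2_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def logistic_tracking_loss(scores, labels):
    """Mean logistic loss over response-map scores (assumed SOT loss).

    `labels` are +1 for positions on the target and -1 elsewhere.
    """
    total = sum(math.log1p(math.exp(-y * s)) for s, y in zip(scores, labels))
    return total / len(scores)


def joint_loss(anchor, positive, negative, scores, labels,
               weight=1.0, margin=1.0):
    """Multi-task objective: tracking loss plus weighted affinity loss.

    Both terms would be driven by one shared backbone, so a single
    training stage updates features for motion and affinity together.
    """
    tracking = logistic_tracking_loss(scores, labels)
    affinity = triplet_loss(anchor, positive, negative, margin)
    return tracking + weight * affinity


# Hypothetical toy inputs: 2-D embeddings and a 2-cell response map.
example = joint_loss(
    anchor=[0.0, 0.0], positive=[0.1, 0.0], negative=[3.0, 4.0],
    scores=[2.0, -2.0], labels=[1, -1],
)
```

Summing the two terms in one objective is what lets a single shared feature serve both tasks; the relative `weight` is a hypothetical knob that would need tuning in practice.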
Original language | English |
---|---|
Article number | 9156557 |
Pages (from-to) | 6767-6776 |
Number of pages | 10 |
Journal | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
DOIs | |
State | Published - 2020 |
Event | 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020 - Virtual, Online, United States; Duration: Jun 14 2020 → Jun 19 2020 |
Bibliographical note
Publisher Copyright: © 2020 IEEE Computer Society. All rights reserved.
ASJC Scopus subject areas
- Software
- Computer Vision and Pattern Recognition