Conference item
Detect to track and track to detect
- Abstract:
- Recent approaches for high accuracy detection and tracking of object categories in video consist of complex multistage solutions that become more cumbersome each year. In this paper we propose a ConvNet architecture that jointly performs detection and tracking, solving the task in a simple and effective way. Our contributions are threefold: (i) we set up a ConvNet architecture for simultaneous detection and tracking, using a multi-task objective for frame-based object detection and across-frame track regression; (ii) we introduce correlation features that represent object co-occurrences across time to aid the ConvNet during tracking; and (iii) we link the frame level detections based on our across-frame tracklets to produce high accuracy detections at the video level. Our ConvNet architecture for spatiotemporal object detection is evaluated on the large-scale ImageNet VID dataset where it achieves state-of-the-art results. Our approach provides better single model performance than the winning method of the last ImageNet challenge while being conceptually much simpler. Finally, we show that by increasing the temporal stride we can dramatically increase the tracker speed.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Files:
- Accepted manuscript (pdf, 5.9MB)
- Publisher copy:
- 10.1109/ICCV.2017.330
Authors
- Publisher:
- Institute of Electrical and Electronics Engineers
- Host title:
- IEEE International Conference on Computer Vision 2017
- Journal:
- IEEE International Conference on Computer Vision 2017
- Pages:
- 3057-3065
- Publication date:
- 2017-12-25
- Acceptance date:
- 2017-11-08
- DOI:
- 10.1109/ICCV.2017.330
- ISSN:
- 1550-5499
- ISBN:
- 9781538610329, 978-1-5386-1033-6
- Pubs id:
- pubs:821116
- UUID:
- uuid:638773c5-b132-4d89-ad99-1ac49a3c878c
- Local pid:
- pubs:821116
- Source identifiers:
- 821116
- Deposit date:
- 2018-08-17
- ARK identifier:
Terms of use
- Copyright holder:
- IEEE
- Copyright date:
- 2017
- Notes:
- © 2017 IEEE. This is the author accepted manuscript (the version following peer review) of the article. The final version is available online from IEEE at: 10.1109/ICCV.2017.330