Conference item

Tracktention: leveraging point tracking to attend videos faster and better

Abstract:
Temporal consistency is critical in video prediction to ensure that outputs are coherent and free of artifacts. Traditional methods, such as temporal attention and 3D convolution, may struggle with significant object motion and may not capture long-range temporal dependencies in dynamic scenes. To address this gap, we propose the Tracktention Layer, a novel architectural component that explicitly integrates motion information using point tracks, i.e., sequences of corresponding points across frames. By incorporating these motion cues, the Tracktention Layer enhances temporal alignment and effectively handles complex object motions, maintaining consistent feature representations over time. Our approach is computationally efficient and can be seamlessly integrated into existing models, such as Vision Transformers, with minimal modification. It can be used to upgrade image-only models to state-of-the-art video ones, sometimes outperforming models natively designed for video prediction. We demonstrate this on video depth prediction and video colorization, where models augmented with the Tracktention Layer exhibit significantly improved temporal consistency compared to baselines. Project website: zlai0.github.io/TrackTention.
Publication status:
Published
Peer review status:
Peer reviewed

Publisher copy:
10.1109/cvpr52734.2025.02124

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author


Funder identifier:
https://ror.org/0472cxd90


Publisher:
IEEE
Host title:
2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Pages:
22809-22819
Publication date:
2025-08-13
Event title:
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2025)
Event location:
Nashville, Tennessee, USA
Event website:
https://cvpr.thecvf.com/Conferences/2025
Event start date:
2025-06-11
Event end date:
2025-06-15
DOI:
EISSN:
2575-7075
ISSN:
1063-6919
EISBN:
9798331543648
ISBN:
9798331543655

