Few-shot action recognition with permutation-invariant attention

Zhang, H; Zhang, L; Qi, X; Li, H; Torr, PHS; Koniusz, P

AI Collection

Conference item

Abstract:: Many few-shot learning models focus on recognising images. In contrast, we tackle the challenging task of few-shot action recognition from videos. We build on a C3D encoder for spatio-temporal video blocks to capture short-range action patterns. Such encoded blocks are aggregated by permutation-invariant pooling to make our approach robust to varying action lengths and long-range temporal dependencies whose patterns are unlikely to repeat even in clips of the same class. Subsequently, the pooled representations are combined into simple relation descriptors which encode so-called query and support clips. Finally, relation descriptors are fed to a comparator with the goal of similarity learning between query and support clips. Importantly, to re-weight block contributions during pooling, we exploit spatial and temporal attention modules and self-supervision. In naturalistic clips (of the same class) there exists a temporal distribution shift: the locations of discriminative temporal action hotspots vary. Thus, we permute the blocks of a clip and align the resulting attention regions with the similarly permuted attention regions of the non-permuted clip, training the attention mechanism to be invariant to block (and thus long-term hotspot) permutations. Our method outperforms the state of the art on the HMDB51, UCF101 and miniMIT datasets.
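
The two ideas the abstract relies on, attention-weighted pooling that ignores block order and an alignment term that penalises attention which changes when blocks are permuted, can be illustrated with a toy sketch. This is not the authors' implementation: the C3D encoder, the relation descriptors and the comparator are omitted, the block features are hand-written vectors, and every name here (`W`, `content_attention`, `position_biased_attention`, `attention_pool`, `alignment_loss`) is hypothetical. The squared-difference loss is an assumed stand-in for whatever alignment objective the paper uses.

```python
import math

# Hypothetical scoring weights; in a real model these would be learned.
W = [0.3, -0.2]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def content_attention(blocks):
    """Score each block from its own features only, so the weights
    follow the blocks when the blocks are reordered."""
    return softmax([sum(w * x for w, x in zip(W, b)) for b in blocks])


def position_biased_attention(blocks):
    """Add a bias that grows with the block index, so the weights
    depend on where a block sits in the clip."""
    return softmax([sum(w * x for w, x in zip(W, b)) + 0.5 * i
                    for i, b in enumerate(blocks)])


def attention_pool(blocks, weights):
    """Attention-weighted sum over blocks; the result is a single
    vector with the same dimension as one block feature."""
    dim = len(blocks[0])
    return [sum(a * b[d] for a, b in zip(weights, blocks)) for d in range(dim)]


def alignment_loss(blocks, perm, attend):
    """Squared distance between the attention of the permuted clip and
    the permuted attention of the original clip."""
    att = attend(blocks)
    att_of_permuted = attend([blocks[i] for i in perm])
    permuted_att = [att[i] for i in perm]
    return sum((p - q) ** 2 for p, q in zip(att_of_permuted, permuted_att))


# Three toy "encoded blocks" of one clip and one block permutation.
blocks = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
perm = [2, 0, 1]
permuted_blocks = [blocks[i] for i in perm]

att = content_attention(blocks)
pooled = attention_pool(blocks, att)
pooled_perm = attention_pool(permuted_blocks, content_attention(permuted_blocks))

print("pooled (original order):", pooled)
print("pooled (permuted order):", pooled_perm)
print("loss, content-only attention:", alignment_loss(blocks, perm, content_attention))
print("loss, position-biased attention:", alignment_loss(blocks, perm, position_biased_attention))
```

In this sketch the content-only attention yields the same pooled vector for both block orders (up to floating-point rounding) and an alignment loss near zero, while the position-biased attention yields a clearly positive loss. Under the stated assumptions, that positive loss is the kind of signal that would push an attention module towards permutation invariance during training.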

Publication status:: Published

Peer review status:: Peer reviewed


Cite this record

APA Style

Zhang, H., Zhang, L., Qi, X., Li, H., Torr, P. H. S., & Koniusz, P. (2020). Few-shot action recognition with permutation-invariant attention. European Conference on Computer Vision (ECCV), 2020, 12350, 525–542.

MLA Style

Zhang, H, et al. “Few-Shot Action Recognition with Permutation-Invariant Attention.” European Conference on Computer Vision (ECCV), 2020, Lecture Notes in Computer Science, vol. 12350, 2020, pp. 525–42.

Chicago Style

Zhang, H, L Zhang, X Qi, H Li, PHS Torr, and P Koniusz. 2020. “Few-Shot Action Recognition with Permutation-Invariant Attention.” In European Conference on Computer Vision (ECCV), 2020, 12350:525–42. Lecture Notes in Computer Science. Springer.

Access Document

Files:: ZhangetalAAM2020.pdf

(Accepted manuscript, pdf, 3.8MB)

Publisher copy:: 10.1007/978-3-030-58558-7_31

Authors

Zhang, H

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Role:: Author

Zhang, L

Institution:: University of Oxford
Division:: MPLS
Department:: Zoology
Role:: Author

Qi, X

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Role:: Author

Li, H

Role:: Author

Torr, PHS

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Role:: Author

More authors...

Publisher:: Springer
Host title:: Proceedings of the European Conference on Computer Vision (ECCV 2020)
Journal:: Proceedings of the European Conference on Computer Vision (ECCV 2020)
Volume:: 12350
Pages:: 525-542
Series:: Lecture Notes in Computer Science
Publication date:: 2020-10-29
Event title:: European Conference on Computer Vision (ECCV), 2020
Event location:: Online
Event website:: https://eccv2020.eu/
Event start date:: 2020-08-23
Event end date:: 2020-08-28
DOI:: 10.1007/978-3-030-58558-7_31
EISSN:: 1611-3349
ISSN:: 0302-9743
EISBN:: 978-3-030-58558-7
ISBN:: 9783030585570

Language:: English
Keywords:: FFR
Pubs id:: 1150997
Local pid:: pubs:1150997
Deposit date:: 2021-01-05
ARK identifier:: ark:/29072/ora_0f61cb08de9742f7aff63632500644a6

Terms of use

Copyright holder:: Springer
Notes:: This paper was presented at the European Conference on Computer Vision (ECCV 2020), 23-28 August 2020. This is the accepted manuscript version of the article. The final version is available from Springer at: https://doi.org/10.1007/978-3-030-58558-7_31

Licence:: Terms and Conditions of Use for Oxford University Research Archive

