Conference item

Dynamic image networks for action recognition

Abstract:
We introduce the concept of the dynamic image, a novel compact representation of videos that is useful for video analysis, especially when convolutional neural networks (CNNs) are used. The dynamic image is based on the rank pooling concept and is obtained through the parameters of a ranking machine that encodes the temporal evolution of the frames of the video. Dynamic images are obtained by directly applying rank pooling to the raw image pixels of a video, producing a single RGB image per video. This idea is simple but powerful, as it enables the use of existing CNN models directly on video data with fine-tuning. We present an efficient and effective approximate rank pooling operator that is orders of magnitude faster than rank pooling. Our new approximate rank pooling CNN layer allows us to generalize dynamic images to dynamic feature maps, and we demonstrate the power of our new representations on standard benchmarks in action recognition, achieving state-of-the-art performance.
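The abstract describes collapsing a video into a single image by pooling over its frames, but this record does not state the operator itself. The sketch below is only an illustration under stated assumptions: it treats approximate rank pooling as a fixed weighted sum of frames with linear weights alpha_t = 2t - T - 1 (an assumed, simplified choice; the published operator may use different coefficients), and the function names are hypothetical. Frames are represented as flat lists of pixel values to keep the example dependency-free.

```python
def approximate_rank_pooling_weights(num_frames):
    """Return assumed linear weights alpha_t = 2t - T - 1 for t = 1..T.

    Later frames get larger weights, so the weighted sum encodes
    the temporal order of the frames. The weights sum to zero.
    """
    T = num_frames
    return [2 * t - T - 1 for t in range(1, T + 1)]


def dynamic_image(frames):
    """Collapse a video into one image as a weighted sum of its frames.

    `frames` is a list of T frames; each frame is a flat list of pixel
    values, and all frames must have the same length.
    """
    if not frames:
        raise ValueError("need at least one frame")
    weights = approximate_rank_pooling_weights(len(frames))
    num_pixels = len(frames[0])
    image = [0.0] * num_pixels
    for weight, frame in zip(weights, frames):
        if len(frame) != num_pixels:
            raise ValueError("all frames must have the same number of pixels")
        # Accumulate this frame's contribution pixel by pixel.
        for i, value in enumerate(frame):
            image[i] += weight * value
    return image
```

Because these assumed weights sum to zero, a static video (all frames identical) maps to an all-zero image, so only temporal change survives in the result. In practice the output would typically be rescaled to a valid pixel range before being passed to a CNN.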
Publication status:
Published
Peer review status:
Peer reviewed

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author


Publisher:
Institute of Electrical and Electronics Engineers
Host title:
IEEE Conference on Computer Vision and Pattern Recognition, 2016
Journal:
IEEE Conference on Computer Vision and Pattern Recognition, 2016
Publication date:
2016-12-12
Acceptance date:
2016-03-02
Event location:
Las Vegas, Nevada, USA
Event start date:
2016-06-26


Pubs id:
pubs:624521
UUID:
uuid:0586572f-3a29-4603-ad6f-998a8bb4a7c7
Local pid:
pubs:624521
Source identifiers:
624521
Deposit date:
2016-05-27