Conference item
Human focused action localization in video
- Abstract:
We propose a novel human-centric approach to detect and localize human actions in challenging video data, such as Hollywood movies. Our goal is to localize actions in time through the video and spatially in each frame. We achieve this by first obtaining generic spatio-temporal human tracks and then detecting specific actions within these using a sliding window classifier.
We make the following contributions: (i) We show that splitting the action localization task into spatial and temporal search leads to an efficient localization algorithm where generic human tracks can be reused to recognize multiple human actions; (ii) We develop a human detector and tracker which is able to cope with a wide range of postures, articulations, motions and camera viewpoints. The tracker includes detection interpolation and a principled classification stage to suppress false positive tracks; (iii) We propose a track-aligned 3D-HOG action representation, investigate its parameters, and show that action localization benefits from using tracks; and (iv) We introduce a new action localization dataset based on Hollywood movies.
Results are presented on a number of real-world movies with crowded, dynamic environments, partial occlusion and cluttered backgrounds. On the Coffee&Cigarettes dataset we significantly improve over the state of the art. Furthermore, we obtain excellent results on the new Hollywood-Localization dataset.
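The second stage described in the abstract (running a sliding window classifier along an existing human track) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-frame scalar features, and the mean-based `score_fn` are hypothetical stand-ins for the track-aligned 3D-HOG descriptor and the trained classifier, chosen only to keep the example self-contained.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# (start_frame, end_frame_exclusive, score) for one temporal window on a track
Window = Tuple[int, int, float]


def score_track_windows(
    track_features: Sequence[float],
    window_lengths: Sequence[int],
    stride: int,
    score_fn: Callable[[Sequence[float]], float],
) -> List[Window]:
    """Slide temporal windows of several lengths along one human track.

    track_features holds one (hypothetical) feature per frame of the track;
    score_fn is a placeholder for an action classifier.
    """
    windows: List[Window] = []
    n_frames = len(track_features)
    for length in window_lengths:
        # Only windows that fit entirely inside the track are scored.
        for start in range(0, n_frames - length + 1, stride):
            segment = track_features[start:start + length]
            windows.append((start, start + length, score_fn(segment)))
    return windows


def best_window(windows: Sequence[Window]) -> Optional[Window]:
    """Return the highest-scoring window, or None if there are no windows."""
    if not windows:
        return None
    return max(windows, key=lambda w: w[2])


# Toy track: the "action" is active in frames 2, 3 and 4.
features = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]


def mean_score(segment: Sequence[float]) -> float:
    # Assumed scoring rule for illustration only.
    return sum(segment) / len(segment)


windows = score_track_windows(features, window_lengths=[3], stride=1,
                              score_fn=mean_score)
best = best_window(windows)
print(best)
```

Because the windows are defined over a track rather than over the whole frame, the same track features could in principle be scored with a different `score_fn` per action, which mirrors the reuse of generic human tracks mentioned in contribution (i).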
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Files:
- Accepted manuscript (pdf, 1.4MB)
- Publisher copy:
- 10.1007/978-3-642-35749-7_17
Authors
- Publisher:
- Springer
- Host title:
- Trends and Topics in Computer Vision: ECCV 2010 Workshops, Heraklion, Crete, Greece, September 10-11, 2010, Revised Selected Papers, Part I
- Volume:
- 6553
- Pages:
- 219–233
- Series:
- Lecture Notes in Computer Science
- Publication date:
- 2012-11-23
- Event title:
- 11th European Conference on Computer Vision (ECCV 2010)
- Event location:
- Heraklion, Greece
- Event website:
- https://projects.ics.forth.gr/eccv2010/intro.php
- Event start date:
- 2010-09-05
- Event end date:
- 2010-09-11
- DOI:
- 10.1007/978-3-642-35749-7_17
- EISSN:
- 1611-3349
- ISSN:
- 0302-9743
- EISBN:
- 9783642357497
- ISBN:
- 9783642357480
- Language:
- English
- Pubs id:
- 1770604
- Local pid:
- pubs:1770604
- Deposit date:
- 2024-07-19
- ARK identifier:
Terms of use
- Copyright holder:
- Springer
- Copyright date:
- 2012
- Rights statement:
- © 2012 Springer-Verlag Berlin Heidelberg.
- Notes:
- This is the accepted manuscript version of the article, which was presented at the 11th European Conference on Computer Vision (ECCV 2010), 5-11 September, 2010, Heraklion, Greece. The final version is available online from Springer at https://doi.org/10.1007/978-3-642-35749-7_17