Conference item

3D-aware instance segmentation and tracking in egocentric videos

Abstract:

Egocentric videos present unique challenges for 3D scene understanding due to rapid camera motion, frequent object occlusions, and limited object visibility. This paper introduces a novel approach to instance segmentation and tracking in first-person video that leverages 3D awareness to overcome these obstacles. Our method integrates scene geometry, 3D object centroid tracking, and instance segmentation to create a robust framework for analyzing dynamic egocentric scenes. By incorporating spatial and temporal cues, we achieve superior performance compared to state-of-the-art 2D approaches. Extensive evaluations on the challenging EPIC Fields dataset demonstrate significant improvements across a range of tracking and segmentation consistency metrics. Specifically, our method outperforms the next best performing approach by 7 points in Association Accuracy (AssA) and 4.5 points in IDF1 score, while reducing the number of ID switches by 73% to 80% across various object categories. Leveraging our tracked instance segmentations, we showcase downstream applications in 3D object reconstruction and amodal video object segmentation in these egocentric settings.

Publication status:
Published
Peer review status:
Peer reviewed


Publisher copy:
10.1007/978-981-96-0908-6_20

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author


Funder identifier:
https://ror.org/0439y7842
Grant:
EP/T028572/1


Publisher:
Springer
Host title:
Computer Vision – ACCV 2024
Pages:
347-364
Series:
Lecture Notes in Computer Science
Series number:
15474
Publication date:
2024-12-07
Acceptance date:
2024-09-20
Event title:
17th Asian Conference on Computer Vision (ACCV 2024)
Event location:
Hanoi, Vietnam
Event website:
https://accv2024.org/
Event start date:
2024-12-08
Event end date:
2024-12-12
DOI:
10.1007/978-981-96-0908-6_20
EISSN:
1611-3349
ISSN:
0302-9743
EISBN:
9789819609086
ISBN:
9789819609079


Language:
English
Pubs id:
2080992
Local pid:
pubs:2080992
Deposit date:
2025-01-28
