Conference item
PVUW 2024 challenge on complex video understanding: methods and results
- Abstract:
- The Pixel-level Video Understanding in the Wild Challenge (PVUW) focuses on complex video understanding. In this CVPR 2024 workshop, we add two new tracks: the Complex Video Object Segmentation track, based on the MOSE dataset, and the Motion Expression guided Video Segmentation track, based on the MeViS dataset. In the two new tracks, we provide additional videos and annotations that feature challenging elements, such as the disappearance and reappearance of objects, inconspicuous small objects, heavy occlusions, and crowded environments in MOSE. Moreover, we provide a new motion expression guided video segmentation dataset, MeViS, to study natural language-guided video understanding in complex environments. These new videos, sentences, and annotations enable us to foster the development of a more comprehensive and robust pixel-level understanding of video scenes in complex environments and realistic scenarios. The MOSE challenge had 140 registered teams in total; 65 teams participated in the validation phase and 12 teams made valid submissions in the final challenge phase. The MeViS challenge had 225 registered teams in total; 50 teams participated in the validation phase and 5 teams made valid submissions in the final challenge phase.
- Publication status:
- Accepted
- Peer review status:
- Peer reviewed
Access Document
- Publisher copy:
- 10.1007/978-3-031-91856-8_21
Authors
- Publisher:
- Springer
- Host title:
- Computer Vision – ECCV 2024 Workshops
- Pages:
- 361–377
- Series:
- Lecture Notes in Computer Science
- Series number:
- 15632
- Publication date:
- 2025-05-12
- Acceptance date:
- 2024-06-03
- Event title:
- 18th European Conference on Computer Vision (ECCV 2024)
- Event location:
- Milan, Italy
- Event website:
- https://eccv.ecva.net/Conferences/2024
- Event start date:
- 2024-09-29
- Event end date:
- 2024-10-04
- DOI:
- EISBN:
- 9783031918568
- ISBN:
- 9783031918551
- Language:
- English
- Pubs id:
- 2037051
- Local pid:
- pubs:2037051
- Deposit date:
- 2024-10-08
- ARK identifier:
Terms of use
- Copyright holder:
- Ding et al
- Copyright date:
- 2025
- Rights statement:
- © 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
- Notes:
- This paper was presented at the 18th European Conference on Computer Vision (ECCV 2024), 29th September - 4th October 2024, Milan, Italy. This is the accepted manuscript version of the article. The final version is available online from Springer at https://dx.doi.org/10.1007/978-3-031-91856-8_21