Conference item
Character-centric understanding of animated movies
- Abstract:
- Animated movies are captivating for their unique character designs and imaginative storytelling, yet they pose significant challenges for existing recognition systems. Unlike the consistent visual patterns detected by conventional face recognition methods, animated characters exhibit extreme diversity in their appearance, motion, and deformation. In this work, we propose an audio-visual pipeline to enable automatic and robust animated character recognition, and thereby enhance character-centric understanding of animated movies. Central to our approach is the automatic construction of an audio-visual character bank from online sources. This bank contains both visual exemplars and voice (audio) samples for each character, enabling subsequent multi-modal character recognition despite long-tailed appearance distributions. Building on accurate character recognition, we explore two downstream applications: Audio Description (AD) generation for visually impaired audiences, and character-aware subtitling for the hearing impaired. To support research in this domain, we introduce CMD-AM, a new dataset of 75 animated movies with comprehensive annotations. Our character-centric pipeline demonstrates significant improvements in both accessibility and narrative comprehension for animated content over prior face-detection-based approaches. For the code and dataset, visit https://www.robots.ox.ac.uk/~vgg/research/animated_ad/.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Publisher copy:
- 10.1145/3746027.3755041
Funding
- Funder:
- Engineering and Physical Sciences Research Council
- Funder identifier:
- https://ror.org/0439y7842
- Grant:
- EP/T028572/1
- Publisher:
- Association for Computing Machinery
- Host title:
- MM '25: Proceedings of the 33rd ACM International Conference on Multimedia
- Pages:
- 3300 - 3309
- Publication date:
- 2025-10-27
- Acceptance date:
- 2025-07-05
- Event title:
- 33rd ACM International Conference on Multimedia (MM 2025)
- Event location:
- Dublin, Ireland
- Event website:
- https://acmmm2025.org/
- Event start date:
- 2025-10-27
- Event end date:
- 2025-10-31
- ISBN:
- 9798400720352
- Language:
- English
- Pubs id:
- 2300220
- Local pid:
- pubs:2300220
- Deposit date:
- 2025-10-17
Terms of use
- Copyright holder:
- Gui et al
- Copyright date:
- 2025
- Rights statement:
- © 2025 Copyright held by the owner/author(s). This work is licensed under a Creative Commons Attribution 4.0 International License.
- Notes:
- This paper was presented at the 33rd ACM International Conference on Multimedia (MM 2025), 27th-31st October 2025, Dublin, Ireland. The author accepted manuscript (AAM) of this paper has been made available under the University of Oxford's Open Access Publications Policy, and a CC BY public copyright licence has been applied.
- Licence:
- CC Attribution (CC BY)