Thesis

Understanding video through the lens of language

Abstract:

The increasing abundance of video data online necessitates the development of systems capable of understanding such content. However, building these systems poses significant challenges, including the absence of scalable and robust supervision signals, computational complexity, and multimodal modelling. To address these issues, this thesis explores the role of language as a complementary learning signal for video, drawing inspiration from the success of self-supervised Large Language Model...

Authors


Oxford college:
Christ Church
Role:
Author

Contributors

Role:
Supervisor
ORCID:
0000-0002-8945-8573
Role:
Examiner
Institution:
King Abdullah University of Science and Technology
Role:
Examiner
Type of award:
DPhil
Level of award:
Doctoral
Awarding institution:
University of Oxford
DOI:
