Conference item

Learning structured video descriptions: Automated video knowledge extraction for video understanding tasks

Abstract:

Vision-to-language problems, such as video annotation or visual question answering, stand out from perceptual video understanding tasks (e.g., classification) through their cognitive nature and their tight connection to the field of natural language processing. While most current solutions to vision-to-language problems are inspired by machine translation methods and aim to map visual features directly to text, several recent results on image and video understanding have proven ...

Publication status:
Published
Peer review status:
Peer reviewed
Version:
Accepted manuscript

Files:
Publisher copy:
10.1007/978-3-030-02671-4_20

Authors


Institution:
University of Oxford
Division:
MPLS Division
Department:
Computer Science
Role:
Author
ORCID:
0000-0002-7644-1668
Publisher:
Springer
Publication date:
2018-10-18
Acceptance date:
2018-08-20
DOI:
10.1007/978-3-030-02671-4_20
Pubs id:
pubs:935180
URN:
uri:eab50226-444a-4620-83a1-0e43185a81ae
UUID:
uuid:eab50226-444a-4620-83a1-0e43185a81ae
Local pid:
pubs:935180
