Conference item

Look, listen and learn

Abstract:

We consider the question: what can be learnt by looking at and listening to a large number of unlabelled videos? There is a valuable, but so far untapped, source of information contained in the video itself – the correspondence between the visual and the audio streams, and we introduce a novel “Audio-Visual Correspondence” learning task that makes use of this. Training visual and audio networks from scratch, without any additional supervision other than the raw unconstrained videos themselves...

Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.1109/iccv.2017.73

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Oxford college:
Brasenose College
Role:
Author
ORCID:
0000-0002-8945-8573
Publisher:
IEEE
Journal:
2017 IEEE International Conference on Computer Vision (ICCV)
Host title:
2017 IEEE International Conference on Computer Vision (ICCV)
Publication date:
2017-12-25
Acceptance date:
2017-07-17
Event location:
Venice, Italy
DOI:
10.1109/iccv.2017.73
ISSN:
1550-5499
Source identifiers:
829574
Pubs id:
pubs:829574
UUID:
uuid:1323c912-31b9-46b9-b120-32b8251dcb07
Local pid:
pubs:829574
Deposit date:
2019-01-29
