
Conference item

Learnable PINs: Cross-modal embeddings for person identity

Abstract:

We propose and investigate an identity-sensitive joint embedding of face and voice. Such an embedding enables cross-modal retrieval from voice to face and from face to voice. We make the following four contributions: first, we show that the embedding can be learnt from videos of talking faces, without requiring any identity labels, using a form of cross-modal self-supervision; second, we develop a curriculum learning schedule for hard negative mining targeted to this task that is essential fo...

Publication status:
Published
Peer review status:
Peer reviewed
Version:
Accepted Manuscript

Publisher copy:
10.1007/978-3-030-01261-8_5

Authors


Institution:
University of Oxford
Division:
MPLS Division
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS Division
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS Division
Department:
Engineering Science
Oxford college:
Brasenose College
Role:
Author
ORCID:
0000-0002-8945-8573
Publisher:
Springer
Volume:
11217
Pages:
73-89
Publication date:
2018-10-06
Acceptance date:
2018-07-03
DOI:
10.1007/978-3-030-01261-8_5
ISSN:
0302-9743 and 1611-3349
Pubs id:
pubs:941298
URN:
uri:0ef631f1-97c5-4b3e-b913-6e3115cad81a
UUID:
uuid:0ef631f1-97c5-4b3e-b913-6e3115cad81a
Local pid:
pubs:941298
ISBN:
9783030012601
