Learnable PINs: Cross-modal embeddings for person identity

Nagrani, A; Albanie, S; Zisserman, A

AI Collection

Conference item

Abstract:: We propose and investigate an identity-sensitive joint embedding of face and voice. Such an embedding enables cross-modal retrieval from voice to face and from face to voice. We make the following four contributions: first, we show that the embedding can be learnt from videos of talking faces, without requiring any identity labels, using a form of cross-modal self-supervision; second, we develop a curriculum learning schedule for hard negative mining, targeted to this task, that is essential for learning to proceed successfully; third, we demonstrate and evaluate cross-modal retrieval for identities unseen and unheard during training over a number of scenarios, and establish a benchmark for this novel task; finally, we show an application of the joint embedding to automatically retrieving and labelling characters in TV dramas.
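
The two mechanisms named in the abstract, cross-modal retrieval in a joint embedding and a curriculum over negative hardness, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `hardness` parameter, the margin value of 0.6, and the toy 2-D embeddings are all assumptions made for illustration, and a standard contrastive loss is used here as a stand-in for whatever objective the paper trains with.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so distances are comparable."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def distance(a, b):
    """Euclidean distance between two embeddings in the joint space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query, gallery):
    """Cross-modal retrieval: gallery indices ordered nearest-first."""
    return sorted(range(len(gallery)), key=lambda i: distance(query, gallery[i]))

def pick_negative(anchor, negatives, hardness):
    """Choose a negative by hardness in [0, 1] (hypothetical knob).

    0.0 selects the easiest negative (farthest from the anchor),
    1.0 selects the hardest (nearest). A curriculum would raise
    `hardness` gradually as training progresses.
    """
    ranked = retrieve(anchor, negatives)[::-1]  # farthest first
    idx = min(int(hardness * len(ranked)), len(ranked) - 1)
    return ranked[idx]

def contrastive_loss(a, b, same_identity, margin=0.6):
    """Pull matching face/voice pairs together, push others past the margin."""
    d = distance(a, b)
    if same_identity:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Toy data (assumed): one face query and three voice embeddings.
voices = [l2_normalize(v) for v in [[1.0, 0.1], [0.0, 1.0], [-1.0, 0.0]]]
face_query = l2_normalize([0.9, 0.2])

ranking = retrieve(face_query, voices)                 # face-to-voice retrieval
easy = pick_negative(face_query, voices[1:], 0.0)      # early in the curriculum
hard = pick_negative(face_query, voices[1:], 1.0)      # late in the curriculum
```

Starting with easy negatives and moving toward hard ones is the general idea behind curriculum-style mining; the linear mapping from `hardness` to a rank used above is only one possible choice.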

Publication status:: Published

Peer review status:: Peer reviewed

Cite this record

APA Style

Nagrani, A., Albanie, S., & Zisserman, A. (2018). Learnable PINs: Cross-modal embeddings for person identity. Lecture Notes in Computer Science, 11217, 73–89.

MLA Style

Nagrani, A, et al. “Learnable PINs: Cross-Modal Embeddings for Person Identity.” Lecture Notes in Computer Science, vol. 11217, 2018, pp. 73–89.

Chicago Style

Nagrani, A, S Albanie, and A Zisserman. 2018. “Learnable PINs: Cross-Modal Embeddings for Person Identity.” Lecture Notes in Computer Science 11217: 73–89.

Access Document

Files:: nagrani18c.pdf

(Accepted manuscript, pdf, 3.0MB)

Publisher copy:: 10.1007/978-3-030-01261-8_5

Authors

Nagrani, A

Institution:: University of Oxford
Division:: MPLS Division
Department:: Engineering Science
Role:: Author

Albanie, S

Institution:: University of Oxford
Division:: MPLS Division
Department:: Engineering Science
Role:: Author

Zisserman, A

Institution:: University of Oxford
Division:: MPLS Division
Department:: Engineering Science
Oxford college:: Brasenose College
Role:: Author
ORCID:: 0000-0002-8945-8573

Engineering and Physical Sciences Research Council

Grant:: EP/M013774/1

Publisher:: Springer
Host title:: Lecture Notes in Computer Science
Journal:: Lecture Notes in Computer Science
Volume:: 11217
Pages:: 73-89
Publication date:: 2018-10-06
Acceptance date:: 2018-07-03
DOI:: 10.1007/978-3-030-01261-8_5
ISSN:: 0302-9743, 1611-3349
ISBN:: 9783030012601

Keywords:: multi-modal, self-supervised, face recognition, cross-modal, metric learning, joint embedding, speaker identification
Pubs id:: pubs:941298
UUID:: uuid:0ef631f1-97c5-4b3e-b913-6e3115cad81a
Local pid:: pubs:941298
Source identifiers:: 941298
Deposit date:: 2018-11-20
ARK identifier:: ark:/29072/ora_0ef631f197c54b3eb9136e3115cad81a

Terms of use

Copyright holder:: Springer Nature
Notes:: © Springer Nature Switzerland AG 2018. This paper was presented at the European Conference on Computer Vision 2018. This is the accepted manuscript version of the article. The final version is available online from Springer at: 10.1007/978-3-030-01261-8_5

Licence:: Terms and Conditions of Use for Oxford University Research Archive
