Conference item

Emotion Recognition in Speech using Cross-Modal Transfer in the Wild

Abstract:

Obtaining large, human-labelled speech datasets to train models for emotion recognition is a notoriously challenging task, hindered by annotation cost and label ambiguity. In this work, we consider the task of learning embeddings for speech classification without access to any form of labelled audio. We base our approach on a simple hypothesis: that the emotional content of speech correlates with the facial expression of the speaker. By exploiting this relationship, we show that annotations o...

Publication status:
Published
Peer review status:
Peer reviewed
Version:
Publisher's Version

Files:
Publisher copy:
10.1145/3240508.3240578

Authors


Institution:
University of Oxford
Division:
MPLS Division
Department:
Engineering Science
Oxford college:
Brasenose College
Role:
Author
ORCID:
0000-0002-8945-8573
Publisher:
ACM
Publication date:
2018-10-19
Acceptance date:
2018-07-07
DOI:
10.1145/3240508.3240578
Pubs id:
pubs:944586
URN:
uri:5dfbf3a3-fec7-48ec-a47f-9748760ec171
UUID:
uuid:5dfbf3a3-fec7-48ec-a47f-9748760ec171
Local pid:
pubs:944586
