Conference item

My lips are concealed: audio-visual speech enhancement through obstructions

Abstract:

Our objective is an audio-visual model for separating a single speaker from a mixture of sounds such as other speakers and background noise. Moreover, we wish to hear the speaker even when the visual cues are temporarily absent due to occlusion. To this end we introduce a deep audio-visual speech enhancement network that is able to separate a speaker’s voice by conditioning on the speaker’s lip movements, a representation of their voice, or both. The voice representation can be obtained by...

Publication status:
Published
Peer review status:
Reviewed (other)

Files:
Publisher copy:
10.21437/interspeech.2019-3114

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Oxford college:
Worcester College
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Oxford college:
Brasenose College
Role:
Author
ORCID:
0000-0002-8945-8573
Funding
Name:
Engineering & Physical Sciences Research Council
Grant:
EP/M013774/1
Publisher:
ISCA
Host title:
Proc. Interspeech 2019
Pages:
4295-4299
Publication date:
2019-09-15
Event title:
Interspeech 2019
Event location:
Graz, Austria
Event website:
https://www.interspeech2019.org/
Event start date:
2019-09-15
Event end date:
2019-09-19
ISSN:
1990-9772
Pubs id:
1091155
Local pid:
pubs:1091155
Deposit date:
2020-03-05
