Conference item

Voicevector: multimodal enrolment vectors for speaker separation

Abstract:
We present a transformer-based architecture for voice separation of a target speaker from multiple other speakers and ambient noise. We achieve this by using two separate neural networks: (A) an enrolment network designed to craft speaker-specific embeddings, exploiting various combinations of audio and visual modalities; and (B) a separation network that accepts both the noisy signal and enrolment vectors as inputs, outputting the clean signal of the target speaker. The novelties are: (i) the enrolment vector can be produced from audio only, audio-visual data (using lip movements), or visual data alone (using lip movements from silent video); and (ii) the flexibility in conditioning the separation on multiple positive and negative enrolment vectors. We compare to previous methods and obtain superior performance.
Publication status:
Published
Peer review status:
Peer reviewed

Publisher copy:
10.1109/ICASSPW62465.2024.10627309

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
ORCID:
0000-0002-8945-8573


Publisher:
IEEE
Host title:
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024)
Pages:
785-789
Publication date:
2024-08-15
Acceptance date:
2024-04-14
Event title:
International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024)
Event location:
COEX, Seoul, South Korea
Event website:
https://2024.ieeeicassp.org/
Event start date:
2024-04-14
Event end date:
2024-04-19
DOI:
10.1109/ICASSPW62465.2024.10627309
EISBN:
979-8-3503-7451-3
ISBN:
979-8-3503-7452-0


Language:
English
Pubs id:
1996107
Local pid:
pubs:1996107
Deposit date:
2024-05-14
