Conference item
Lip reading in profile
- Abstract:
- There has been a quantum leap in the performance of automated lip reading recently due to the application of neural network sequence models trained on a very large corpus of aligned text and face videos. However, this advance has only been demonstrated for frontal or near-frontal faces, and so the question remains: can lips be read in profile to the same standard? The objective of this paper is to answer that question. We make three contributions: first, we obtain a new large aligned training corpus that contains profile faces, and select these using a face pose regressor network; second, we propose a curriculum learning procedure that is able to extend SyncNet [10] (a network to synchronize face movements and speech) progressively from frontal to profile faces; third, we demonstrate lip reading in profile for unseen videos. The trained model is evaluated on a held-out test set, and is also shown to far surpass the state of the art on the OuluVS2 multi-view benchmark.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
- Publisher:
- British Machine Vision Association and Society for Pattern Recognition
- Host title:
- 28th British Machine Vision Conference, 2017, Imperial College London, 4th-7th September 2017
- Journal:
- British Machine Vision Conference, 2017
- Publication date:
- 2017-09-04
- Acceptance date:
- 2017-07-01
- Pubs id:
- pubs:821113
- UUID:
- uuid:9f06858c-349c-416f-8ace-87751cd401fc
- Local pid:
- pubs:821113
- Source identifiers:
- 821113
- Deposit date:
- 2018-08-17
Terms of use
- Copyright holder:
- Chung and Zisserman
- Copyright date:
- 2017
- Notes:
- This is the accepted version of the paper. The final version is available online from The British Machine Vision Association and Society for Pattern Recognition at: https://bmvc2017.london/