Conference item
Phonological Feature Based Mispronunciation Detection and Diagnosis using Multi-Task DNNs and Active Learning
- Abstract:
- This paper presents a phonological-feature-based computer-aided pronunciation training system for learners of a new language (L2). Phonological features allow learners’ mispronunciations to be analysed systematically and feedback to be rendered more effectively. The proposed acoustic model consists of a multi-task deep neural network, which uses a shared representation to estimate both the phonological features and the HMM state probabilities. Moreover, an active-learning-based scheme is proposed to reduce the cost of annotation, which is done by expert teachers, by selecting the most informative samples for annotation. Experimental evaluations are carried out for native German and Italian speakers speaking English. For mispronunciation detection, the proposed feature-based system outperforms conventional methods based on the goodness of pronunciation (GOP) measure and on classifiers, while providing a more detailed diagnosis. The evaluations also demonstrate the advantage of active-learning-based sampling over random sampling.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Publisher copy:
- 10.21437/Interspeech.2017-1350
Authors
- Publisher:
- International Speech Communication Association
- Host title:
- Interspeech 2017: Situated Interaction
- Journal:
- Interspeech
- Series:
- Proceedings of the Annual Conference of the International Speech Communication Association
- Publication date:
- 2017-08-20
- Acceptance date:
- 2017-05-22
- Event location:
- Stockholm
- Event start date:
- 2017-08-20
- Event end date:
- 2017-08-24
- DOI:
- 10.21437/Interspeech.2017-1350
- ISSN:
- 1990-9772
- Keywords:
- Pubs id:
- pubs:698191
- UUID:
- uuid:69032bc7-d9b0-45e6-b624-2822097a6f33
- Local pid:
- pubs:698191
- Source identifiers:
- 698191
- Deposit date:
- 2017-06-02
- ARK identifier:
Terms of use
- Copyright holder:
- International Speech Communication Association
- Copyright date:
- 2017
- Notes:
- Copyright © 2017 ISCA. This article was presented at Interspeech 2017: Situated Interaction (20-24 August 2017, Stockholm, Sweden). This is the accepted manuscript version of the article. The final version is available online from ISCA at: 10.21437/Interspeech.2017-1350.