Mispronunciation detection and diagnosis for Mandarin accented English speech

Subash Khanal, Mohammad Soleymanpour, Narjes Bozorg, Michael T. Johnson

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

3 Scopus citations

Abstract

This paper presents a Mispronunciation Detection and Diagnosis (MDD) system based on a range of Automatic Speech Recognition (ASR) models and feature types. The goals of this research are to assess the ability of speech recognition systems to detect and diagnose the common pronunciation errors seen in non-native (L2) speakers of English, and to assess how much the information offered by Electromagnetic Articulography (EMA) data improves the performance of such MDD systems. To evaluate the ability of the ASR systems to detect and diagnose pronunciation errors, the phoneme sequences recognized by the ASR models were aligned with human-labeled phonetic transcripts as well as with the original phonetic prompts. This three-way alignment determined the MDD-related metrics of each ASR system. The system architectures evaluated included GMM-HMM, DNN, and RNN based ASR engines. Articulatory features derived from the Electromagnetic Articulography corpus of Mandarin-Accented English (EMA-MAE) were used along with acoustic features to compare the performance of the MDD systems. The best-performing system, using a combination of acoustic and articulatory features, had an accuracy of 82.4%, a diagnostic accuracy of 75.8%, and a false rejection rate of 17.2%.
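
The abstract does not spell out how the three-way alignment is turned into metrics, so the following is only a minimal sketch under stated assumptions. It assumes the commonly used MDD hierarchy (true acceptance, false rejection, false acceptance, and true rejection split into correct diagnosis and diagnostic error), and it assumes the prompt, human-labeled, and recognized phoneme sequences have already been aligned to equal length (a gap symbol could stand in for insertions and deletions). The function names and the toy phoneme data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def classify(canonical, human, recognized):
    """Label one aligned position using the assumed MDD hierarchy."""
    mispronounced = human != canonical   # ground truth from the human label
    flagged = recognized != canonical    # what the ASR output implies
    if not mispronounced:
        return "FR" if flagged else "TA"  # false rejection / true acceptance
    if not flagged:
        return "FA"                       # false acceptance
    # True rejection: was the specific error identified correctly?
    return "TR_CD" if recognized == human else "TR_DE"


def ratio(num, den):
    """Safe division; NaN when the denominator is zero."""
    return num / den if den else float("nan")


def mdd_metrics(canonical_seq, human_seq, recognized_seq):
    """Compute MDD metrics from three pre-aligned phoneme sequences."""
    if not (len(canonical_seq) == len(human_seq) == len(recognized_seq)):
        raise ValueError("sequences must be pre-aligned to equal length")
    counts = Counter(
        classify(c, h, r)
        for c, h, r in zip(canonical_seq, human_seq, recognized_seq)
    )
    ta, fr, fa = counts["TA"], counts["FR"], counts["FA"]
    cd, de = counts["TR_CD"], counts["TR_DE"]
    tr = cd + de
    return {
        "accuracy": ratio(ta + tr, ta + fr + fa + tr),
        "false_rejection_rate": ratio(fr, ta + fr),
        "false_acceptance_rate": ratio(fa, fa + tr),
        "diagnostic_accuracy": ratio(cd, tr),
    }


# Hypothetical toy data (ARPAbet-style symbols), not from the EMA-MAE corpus.
prompt     = ["DH", "IH", "S", "IH", "Z", "TH", "R", "IY"]
human      = ["D",  "IH", "S", "IH", "S", "TH", "R", "IY"]
recognized = ["D",  "IH", "S", "IY", "Z", "TH", "R", "IY"]

metrics = mdd_metrics(prompt, human, recognized)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this toy example, the first phoneme is a mispronunciation that the recognizer both detects and diagnoses correctly, the fourth is a correct pronunciation that is wrongly rejected, and the fifth is a mispronunciation that is wrongly accepted. Whether the paper computes its reported accuracy, diagnostic accuracy, and false rejection rate with exactly these definitions is an assumption.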

Original language: English
Title of host publication: 2021 11th International Conference on Speech Technology and Human-Computer Dialogue, SpeD 2021
Pages: 62-67
Number of pages: 6
ISBN (Electronic): 9781665427869
State: Published - 2021
Event: 11th International Conference on Speech Technology and Human-Computer Dialogue, SpeD 2021 - Virtual, Bucharest, Romania
Duration: Oct 13 2021 to Oct 15 2021

Publication series

Name: 2021 11th International Conference on Speech Technology and Human-Computer Dialogue, SpeD 2021

Conference

Conference: 11th International Conference on Speech Technology and Human-Computer Dialogue, SpeD 2021
Country/Territory: Romania
City: Virtual, Bucharest
Period: 10/13/21 to 10/15/21

Bibliographical note

Publisher Copyright:
© 2021 IEEE.

Keywords

  • Articulatory features
  • Automatic speech recognition (ASR)
  • Mispronunciation detection and diagnosis

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Safety, Risk, Reliability and Quality
  • Communication
