Abstract
This paper presents Articulatory-WaveNet, a novel deep autoregressive method for Acoustic-to-Articulatory Inversion. Traditional methods such as the Gaussian Mixture Model-Hidden Markov Model (GMM-HMM) do not model the frame-level interdependency of observations. We address this problem by using dilated causal convolutional layers to predict articulatory trajectories from acoustic feature sequences. The new model achieves an average Root Mean Square Error (RMSE) of 1.08 mm and a correlation of 0.82 on the English speaker subset of the ElectroMagnetic Articulography-Mandarin Accented English (EMA-MAE) corpus, an improvement of 59% in RMSE and 30% in correlation over the previous GMM-HMM based inversion model. To the best of our knowledge, this is the first application of a WaveNet synthesis approach to Acoustic-to-Articulatory Inversion, and the results are comparable to or better than the best currently published systems.
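As a rough illustration of the two technical ingredients the abstract names, the sketch below shows a single-channel dilated causal convolution together with the RMSE and correlation metrics used for evaluation. It is a minimal pure-Python sketch and not the authors' implementation: the function names (`causal_dilated_conv1d`, `receptive_field`, `rmse`, `pearson_r`), the kernel size, and the dilation schedule are all assumptions chosen for illustration, and the gated activations and residual connections of a full WaveNet are omitted. The correlation is assumed here to be the Pearson coefficient.

```python
import math


def causal_dilated_conv1d(x, weights, dilation):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Samples before the start of the sequence are treated as zero,
    so y[t] never depends on any x[s] with s > t (causality).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation  # tap k looks k * dilation steps into the past
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input frames seen by one output frame of a layer stack."""
    return 1 + (kernel_size - 1) * sum(dilations)


def rmse(pred, target):
    """Root mean square error between two equal-length trajectories."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred))


def pearson_r(pred, target):
    """Pearson correlation between two equal-length trajectories."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (q - mean_t) for p, q in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((q - mean_t) ** 2 for q in target)
    return cov / math.sqrt(var_p * var_t)


if __name__ == "__main__":
    frames = [0, 1, 2, 3, 4, 5, 6, 7]  # hypothetical 1-D feature sequence
    print(causal_dilated_conv1d(frames, [1, 1], dilation=2))
    print(receptive_field(kernel_size=2, dilations=[1, 2, 4]))
```

Doubling the dilation at each layer makes the receptive field grow exponentially with depth while the parameter count grows only linearly, which is how a stack of such layers can capture dependencies across many frames.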
Original language | English |
---|---|
Title of host publication | Interspeech 2020 |
Pages | 3725-3729 |
Number of pages | 5 |
State | Published - 2020 |
Event | 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 - Shanghai, China; Duration: Oct 25 2020 → Oct 29 2020 |
Publication series
Name | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
---|---|
Volume | 2020-October |
ISSN (Print) | 2308-457X |
ISSN (Electronic) | 1990-9772 |
Conference
Conference | 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 |
---|---|
Country/Territory | China |
City | Shanghai |
Period | 10/25/20 → 10/29/20 |
Bibliographical note
Publisher Copyright: © 2020 ISCA
Keywords
- Acoustic-to-articulatory inversion
- Deep autoregressive model
- Speaker-dependent
- WaveNet
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modeling and Simulation