Autoregressive articulatory wavenet flow for speaker-independent acoustic-to-articulatory inversion

Narjes Bozorg, Michael T. Johnson, Mohammad Soleymanpour

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

In this paper we introduce a new speaker-independent method for acoustic-to-articulatory inversion. The proposed architecture, Speaker Independent-Articulatory WaveNet (SI-AWN), models the relationship between acoustic and articulatory features by conditioning the articulatory trajectories on acoustic features, and then applies the learned structure to unseen target speakers. We evaluate the proposed SI-AWN on the Electromagnetic Articulography corpus of Mandarin Accented English (EMA-MAE), using the pool of acoustic-articulatory information from 35 reference speakers and testing on target speakers that include male, female, native and non-native speakers. The results suggest that SI-AWN improves the performance of acoustic-to-articulatory inversion by 21 percent compared to the baseline Maximum Likelihood Linear Regression-Parallel Reference Speaker Weighting (MLLR-PRSW) method. To the best of our knowledge, this is the first application of a WaveNet-like synthesis approach to the problem of speaker-independent acoustic-to-articulatory inversion, and the results are comparable to or better than the best currently published systems.

Original language: English
Title of host publication: 2021 11th International Conference on Speech Technology and Human-Computer Dialogue, SpeD 2021
Pages: 156-161
Number of pages: 6
ISBN (Electronic): 9781665427869
State: Published - 2021
Event: 11th International Conference on Speech Technology and Human-Computer Dialogue, SpeD 2021 - Virtual, Bucharest, Romania
Duration: Oct 13 2021 to Oct 15 2021

Publication series

Name: 2021 11th International Conference on Speech Technology and Human-Computer Dialogue, SpeD 2021

Conference

Conference: 11th International Conference on Speech Technology and Human-Computer Dialogue, SpeD 2021
Country/Territory: Romania
City: Virtual, Bucharest
Period: 10/13/21 to 10/15/21

Bibliographical note

Publisher Copyright:
© 2021 IEEE.

Keywords

  • Acoustic-to-articulatory inversion
  • Deep autoregressive model
  • Speaker-independent
  • WaveNet

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Safety, Risk, Reliability and Quality
  • Communication
