Parallel Reference Speaker Weighting for Kinematic-Independent Acoustic-to-Articulatory Inversion

An Ji, Michael T. Johnson, Jeffrey J. Berry

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Acoustic-to-articulatory inversion, the estimation of articulatory kinematics from an acoustic waveform, is a challenging but important problem. Accurate estimation of articulatory movements has the potential for significant impact on our understanding of speech production, on our capacity to assess and treat pathologies in a clinical setting, and on speech technologies such as computer-aided pronunciation assessment and audio-video synthesis. However, because of the complex and speaker-specific relationship between articulation and acoustics, existing approaches for inversion do not generalize well across speakers. As acquiring speaker-specific kinematic data for training is not feasible in many practical applications, this remains an important and open problem. This paper proposes a novel approach to acoustic-to-articulatory inversion, Parallel Reference Speaker Weighting (PRSW), which requires no kinematic data for the target speaker and a small amount of acoustic adaptation data. PRSW hypothesizes that acoustic and kinematic similarities are correlated and uses speaker-adapted articulatory models derived from acoustically derived weights. The system was assessed using a 20-speaker data set of synchronous acoustic and Electromagnetic Articulography (EMA) kinematic data. Results demonstrate that by restricting the reference group to a subset consisting of speakers with strong individual speaker-dependent inversion performance, the PRSW method is able to attain kinematic-independent acoustic-to-articulatory inversion performance nearly matching that of the speaker-dependent model, with an average correlation of 0.62 versus 0.63. This indicates that given a sufficiently complete and appropriately selected reference speaker set for adaptation, it is possible to create effective articulatory models without kinematic training data.
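
The core idea in the abstract, weights estimated from acoustics alone and then reused on parallel articulatory models, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state how the weights are estimated or what form the speaker models take, so the sketch assumes each reference speaker is summarized by one acoustic vector and one articulatory vector, swaps in ordinary least squares for the weight estimation, and assumes the reported correlation is a Pearson correlation. All names (`estimate_weights`, `adapt_articulatory`, `pearson`) and the synthetic data are hypothetical.

```python
import numpy as np


def estimate_weights(ref_acoustic, target_acoustic):
    """Least-squares weights over reference speakers (an assumed stand-in
    for whatever estimator the paper uses).

    ref_acoustic:    (n_speakers, d_acoustic) one acoustic vector per reference speaker
    target_acoustic: (d_acoustic,) acoustic vector from the target's adaptation data
    """
    # Solve min_w || ref_acoustic.T @ w - target_acoustic ||^2
    w, *_ = np.linalg.lstsq(ref_acoustic.T, target_acoustic, rcond=None)
    return w


def adapt_articulatory(ref_articulatory, weights):
    """Apply the acoustically derived weights to the parallel articulatory vectors.

    ref_articulatory: (n_speakers, d_articulatory)
    """
    return weights @ ref_articulatory


def pearson(estimated, measured):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(estimated, measured)[0, 1])


# Synthetic demo in which the hypothesis holds by construction: the target is
# the same weighted mix of reference speakers in both domains.
rng = np.random.default_rng(0)
n_ref, d_ac, d_art = 5, 12, 6
ref_acoustic = rng.normal(size=(n_ref, d_ac))
ref_articulatory = rng.normal(size=(n_ref, d_art))
true_w = np.array([0.4, 0.3, 0.2, 0.1, 0.0])
target_acoustic = true_w @ ref_acoustic
target_articulatory = true_w @ ref_articulatory  # never shown to the estimator

w = estimate_weights(ref_acoustic, target_acoustic)
estimate = adapt_articulatory(ref_articulatory, w)
print("weights:", np.round(w, 3))
print("correlation:", round(pearson(estimate, target_articulatory), 3))
```

In this toy setting the articulatory estimate is exact because the acoustic and articulatory mixtures share the same weights. With real speakers that relationship is only approximate, which is consistent with the abstract's finding that the choice of reference speakers matters.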

Original language: English
Pages (from-to): 1865-1875
Number of pages: 11
Journal: IEEE/ACM Transactions on Audio Speech and Language Processing
Volume: 24
Issue number: 10
State: Published - Oct 2016

Bibliographical note

Publisher Copyright:
© 2014 IEEE.

Keywords

  • Acoustic-to-articulatory inversion
  • electromagnetic articulography

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering
