Abstract
Text-dependent speaker verification is becoming popular in the speaker recognition community. However, the conventional i-vector framework, which has been successful for speaker identification and similar tasks, performs relatively poorly on this task. Researchers have proposed several new methods to improve performance, but it is still unclear which model is the best choice, especially when the pass-phrases are prompted during enrollment and test. In this paper, we introduce four modeling methods and compare their performance on the newly published RedDots dataset. To further explore the influence of different frame alignments, both the Viterbi and forward-backward algorithms are used in the HMM-based models. Several bottleneck features are also investigated. Our experiments show that, by explicitly modeling the lexical content, HMM-based modeling achieves good results in the fixed-phrase condition. In the prompted-phrase condition, GMM-HMM and i-vector/HMM are not as successful. In both conditions, the forward-backward algorithm brings more benefits to the i-vector/HMM system. We also find that even though bottleneck features perform well for text-independent speaker verification, they do not outperform MFCCs on the most challenging Imposter-Correct trials on RedDots.
Original language | English |
---|---|
Title of host publication | 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings |
Pages | 629-636 |
Number of pages | 8 |
ISBN (Electronic) | 9781509047888 |
State | Published - Jul 2 2017 |
Event | 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Okinawa, Japan Duration: Dec 16 2017 → Dec 20 2017 |
Publication series
Name | 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings |
---|---|
Volume | 2018-January |
Conference
Conference | 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 |
---|---|
Country/Territory | Japan |
City | Okinawa |
Period | 12/16/17 → 12/20/17 |
Bibliographical note
Publisher Copyright: © 2017 IEEE.
Keywords
- RedDots
- Text-dependent speaker verification
- bottleneck feature
- frame alignment
- modeling methods
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition
- Human-Computer Interaction