Abstract
Text-dependent speaker verification is gaining popularity in the speaker recognition community. However, the conventional i-vector framework, which has been successful for speaker identification and similar tasks, performs relatively poorly on this task. Researchers have proposed several new methods to improve performance, but it remains unclear which model is the best choice, especially when the pass-phrases are prompted during enrollment and test. In this paper, we introduce four modeling methods and compare their performance on the newly published RedDots dataset. To further explore the influence of different frame alignments, both the Viterbi and forward-backward algorithms are used in the HMM-based models. Several bottleneck features are also investigated. Our experiments show that, by explicitly modeling the lexical content, HMM-based modeling achieves good results in the fixed-phrase condition. In the prompted-phrase condition, GMM-HMM and i-vector/HMM are not as successful. In both conditions, the forward-backward algorithm brings more benefit to the i-vector/HMM system. We also find that even though bottleneck features perform well for text-independent speaker verification, they do not outperform MFCCs on the most challenging Imposter-Correct trials on RedDots.
| Original language | English |
|---|---|
| Title of host publication | 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings |
| Pages | 629-636 |
| Number of pages | 8 |
| ISBN (Electronic) | 9781509047888 |
| DOI | |
| Status | Published - Jul 2 2017 |
| Event | 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Okinawa, Japan. Duration: Dec 16 2017 → Dec 20 2017 |
Publication series
| Name | 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings |
|---|---|
| Volume | 2018-January |
Conference
| Conference | 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 |
|---|---|
| Country/Territory | Japan |
| City | Okinawa |
| Period | 12/16/17 → 12/20/17 |
Bibliographical note
Publisher Copyright: © 2017 IEEE.
Funding
This work is supported by the National Natural Science Foundation of China under Grant Nos. 61370034, 61403224, and 61273268.
| Funders | Funder number |
|---|---|
| National Natural Science Foundation of China (NSFC) | 61403224, 61273268, 61370034 |
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition
- Human-Computer Interaction
Fingerprint
Dive into the research topics of 'Comparison of multiple features and modeling methods for text-dependent speaker verification'. Together they form a unique fingerprint.