Speech signal enhancement through adaptive wavelet thresholding

Michael T. Johnson, Xiaolong Yuan, Yao Ren

Research output: Contribution to journal › Article › peer-review

93 Scopus citations

Abstract

This paper demonstrates the application of the Bionic Wavelet Transform (BWT), an adaptive wavelet transform derived from a non-linear auditory model of the cochlea, to the task of speech signal enhancement. Results, measured objectively by signal-to-noise ratio (SNR) and segmental SNR (SSNR) and subjectively by Mean Opinion Score (MOS), are given for additive white Gaussian noise as well as four different types of realistic noise environments. Enhancement is accomplished by thresholding the adapted BWT coefficients, and the results are compared to a variety of speech enhancement techniques, including Ephraim-Malah filtering, iterative Wiener filtering, and spectral subtraction, as well as to wavelet denoising based on a perceptually scaled wavelet packet transform decomposition. Overall results indicate that SNR and SSNR improvements for the proposed approach are comparable to those of the Ephraim-Malah filter, with BWT enhancement giving the best results of all methods for the noisiest (-10 dB and -5 dB input SNR) conditions. Subjective measurements using MOS surveys across a variety of 0 dB SNR noise conditions indicate enhancement quality competitive with, though still lower than, that of Ephraim-Malah filtering and iterative Wiener filtering, and higher than that of the perceptually scaled wavelet method.
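For readers unfamiliar with the quantities named in the abstract, the following is a minimal Python sketch of coefficient thresholding and of the two objective measures, SNR and SSNR. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say which thresholding rule is applied to the BWT coefficients, so soft thresholding is shown as one common choice, and the BWT itself is not implemented here. The function names, the frame length of 256 samples, and the per-frame clamp of -10 dB to 35 dB are all assumptions chosen for the example.

```python
import math


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold` (soft rule).

    Assumption: a generic soft-thresholding rule, not necessarily the
    rule used in the paper.
    """
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def snr_db(clean, enhanced):
    """Global SNR in dB: clean-signal energy over residual-error energy."""
    signal_energy = sum(s * s for s in clean)
    error_energy = sum((s - e) ** 2 for s, e in zip(clean, enhanced))
    if signal_energy == 0.0:
        raise ValueError("clean signal has zero energy")
    if error_energy == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_energy / error_energy)


def segmental_snr_db(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNR values in dB over non-overlapping frames.

    Assumptions: frame_len, and the [lo, hi] clamp applied to each frame,
    are conventional illustrative values, not taken from the paper.
    Frames with no clean-signal energy are skipped.
    """
    n = min(len(clean), len(enhanced))
    frame_values = []
    for start in range(0, n - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in c)
        error_energy = sum((x - y) ** 2 for x, y in zip(c, e))
        if signal_energy == 0.0:
            continue
        if error_energy == 0.0:
            value = hi
        else:
            value = 10.0 * math.log10(signal_energy / error_energy)
        frame_values.append(min(hi, max(lo, value)))
    if not frame_values:
        raise ValueError("no usable frames")
    return sum(frame_values) / len(frame_values)
```

Because SSNR averages per-frame values, it weights quiet and loud stretches of speech equally, whereas the global SNR is dominated by high-energy segments; this is one reason the two measures are often reported together.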

Original language: English
Pages (from-to): 123-133
Number of pages: 11
Journal: Speech Communication
Volume: 49
Issue number: 2
State: Published - Feb 2007

Keywords

  • Adaptive wavelets
  • Bionic Wavelet Transform
  • Denoising
  • Speech enhancement

ASJC Scopus subject areas

  • Software
  • Modeling and Simulation
  • Communication
  • Language and Linguistics
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
