Abstract
In this study, the authors propose multichannel weighted Euclidean (WE) and weighted cosh (WCOSH) cost function estimators for speech enhancement in the distributed microphone scenario. The goal of the work is to illustrate the advantages of utilising additional microphones and modified cost functions for improving signal-to-noise ratio (SNR) and segmental SNR (SSNR), along with the log-likelihood ratio (LLR) and perceptual evaluation of speech quality (PESQ) objective metrics, over the corresponding single-channel baseline estimators. As with their single-channel counterparts, the perceptually motivated multichannel WE and WCOSH estimators are functions of a weighting law parameter, which influences attenuation of the noisy spectral amplitude through a spectral gain function, emphasises spectral peak (formant) information, and accounts for auditory masking effects. Based on the simulation results, the multichannel WE and WCOSH cost function estimators produced gains in SSNR improvement, LLR output and PESQ output over the single-channel baseline results and unweighted cost functions. The best improvements occurred with negative values of the weighting law parameter across all input SNR levels and noise types.
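For readers unfamiliar with the segmental SNR metric named in the abstract, the following is a minimal sketch of one common definition: the SNR is computed frame by frame in dB, each frame value is clamped to a fixed range, and the clamped values are averaged. The function name `segmental_snr`, the frame length of 256 samples, the absence of frame overlap, and the clamping range of -10 dB to 35 dB are all assumptions made for illustration; the abstract does not state the configuration used in the paper.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo_db, hi_db].

    Hypothetical helper: frame_len, lo_db and hi_db are illustrative
    defaults, not values taken from the paper.
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have equal length")
    eps = 1e-12  # guards against division by zero and log of zero
    frame_snrs = []
    # Non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in s)
        noise_energy = sum((x - y) ** 2 for x, y in zip(s, e))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))
        frame_snrs.append(min(max(snr_db, lo_db), hi_db))
    if not frame_snrs:
        raise ValueError("signal shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)
```

Under this sketch, an SSNR improvement figure would be the difference between `segmental_snr(clean, enhanced)` and `segmental_snr(clean, noisy)`, where `noisy` is the unprocessed microphone signal.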
Original language | English |
---|---|
Pages (from-to) | 337-344 |
Number of pages | 8 |
Journal | IET Signal Processing |
Volume | 7 |
Issue number | 4 |
State | Published - 2013 |
Bibliographical note
Publisher Copyright: © The Institution of Engineering and Technology 2013.
ASJC Scopus subject areas
- Signal Processing
- Electrical and Electronic Engineering