TY - JOUR
T1 - Evaluation of the importance of time-frequency contributions to speech intelligibility in noise
AU - Yu, Chengzhu
AU - Wójcicki, Kamil K.
AU - Loizou, Philipos C.
AU - Hansen, John H.L.
AU - Johnson, Michael T.
PY - 2014/5
Y1 - 2014/5
AB - Recent studies on binary masking techniques assume that each time-frequency (T-F) unit contributes equally to the overall intelligibility of speech. The present study demonstrates that the importance of each T-F unit to speech intelligibility varies with speech content. Specifically, T-F units are categorized into two classes: speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to speech intelligibility is highly related to the loudness of its target component, while the importance of each speech-absent T-F unit varies according to the loudness of its masker component. Two types of mask error are also considered: miss errors and false alarm errors. Consistent with previous work, false alarm errors are shown to be more harmful to speech intelligibility than miss errors when the mixture signal-to-noise ratio (SNR) is below 0 dB. However, the relative importance of the two types of error depends on the SNR level of the input speech signal. Based on these observations, a mask-based objective measure, the loudness-weighted hit-false, is proposed for predicting speech intelligibility. The proposed objective measure shows significantly higher correlation with intelligibility than two existing mask-based objective measures.
UR - http://www.scopus.com/inward/record.url?scp=84900409946&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84900409946&partnerID=8YFLogxK
U2 - 10.1121/1.4869088
DO - 10.1121/1.4869088
M3 - Article
C2 - 24815280
AN - SCOPUS:84900409946
SN - 0001-4966
VL - 135
SP - 3007
EP - 3016
JO - Journal of the Acoustical Society of America
JF - Journal of the Acoustical Society of America
IS - 5
ER -