On using acoustic environment classification for statistical model-based speech enhancement

Choi, Jae-Hun; Chang, Joon-Hyuk

doi:10.1016/j.specom.2011.10.009

Cited 18 times in Web of Science

Cited 24 times in Scopus

Authors: Choi, Jae-Hun; Chang, Joon-Hyuk

Issue Date: Mar-2012

Publisher: Elsevier BV

Keywords: Speech enhancement; Noise classification; Gaussian mixture model; DFT

Citation: Speech Communication, v.54, no.3, pp. 477-490

Pages: 14

Indexed: SCI; SCIE; SCOPUS

Journal Title: Speech Communication

Volume: 54

Number: 3

Start Page: 477

End Page: 490

URI: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/27582

DOI: 10.1016/j.specom.2011.10.009

ISSN: 0167-6393; 1872-7182

Abstract: In this paper, we present a statistical model-based speech enhancement technique that uses acoustic environment classification supported by a Gaussian mixture model (GMM). In the data training stage, the principal parameters of the statistical model-based speech enhancement algorithm, namely the weighting parameter in the decision-directed (DD) method, the long-term smoothing parameter of the noise estimation, and the control parameter of the minimum gain value, are each set to optimal operating points for the given noise type to ensure the best performance for each noise. These noise-specific optimal operating points are estimated using composite measures, the objective quality measures that correlate most highly with the actual quality of speech processed by noise suppression algorithms. In the on-line environment-aware speech enhancement step, noise classification is performed on a frame-by-frame basis using the maximum likelihood (ML)-based GMM. The speech absence probability (SAP) is used to detect speech absence periods and to update the likelihood of the GMM. According to the classified noise information for each frame, we assign the optimal values to the aforementioned three parameters for speech enhancement. We evaluated the performance of the proposed method using objective speech quality measures and subjective listening tests under various noise environments. Our experimental results showed that the proposed method yields better performance than a conventional algorithm with fixed parameters. © 2011 Elsevier B.V. All rights reserved.
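
The on-line step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the diagonal-covariance GMMs, the recursive score with a forgetting factor, the SAP threshold, the noise-type names, and every numeric value in OPERATING_POINTS are assumptions chosen only for illustration. The only part taken from standard practice is the decision-directed a priori SNR estimate, which shows where the DD weighting parameter enters.

```python
import numpy as np

# Hypothetical per-noise operating points. The three keys mirror the three
# parameters named in the abstract; the values are placeholders, not the
# tuned values from the paper.
OPERATING_POINTS = {
    "babble": {"alpha_dd": 0.98, "noise_smooth": 0.95, "gain_min": 0.10},
    "car": {"alpha_dd": 0.95, "noise_smooth": 0.90, "gain_min": 0.05},
}


def gmm_loglik(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM.

    x: (D,), weights: (M,), means and variances: (M, D).
    """
    diff = x - means
    log_comp = np.log(weights) - 0.5 * np.sum(
        np.log(2.0 * np.pi * variances) + diff**2 / variances, axis=1
    )
    peak = log_comp.max()  # log-sum-exp for numerical stability
    return peak + np.log(np.sum(np.exp(log_comp - peak)))


class NoiseClassifier:
    """Frame-by-frame ML noise classifier gated by speech absence probability."""

    def __init__(self, models, sap_threshold=0.5, forget=0.9):
        self.models = models  # name -> (weights, means, variances)
        self.sap_threshold = sap_threshold  # assumed gating rule
        self.forget = forget  # assumed forgetting factor
        self.scores = {name: 0.0 for name in models}

    def update(self, feature, sap):
        # Update the per-class scores only when speech is likely absent,
        # so the features describe the background noise alone.
        if sap > self.sap_threshold:
            for name, (w, mu, var) in self.models.items():
                self.scores[name] = self.forget * self.scores[name] + gmm_loglik(
                    feature, w, mu, var
                )
        # Maximum-likelihood decision over the noise classes.
        return max(self.scores, key=self.scores.get)


def dd_a_priori_snr(prev_clean_power, noise_power, post_snr, alpha_dd):
    """Standard decision-directed a priori SNR estimate for one frame."""
    return alpha_dd * prev_clean_power / noise_power + (1.0 - alpha_dd) * np.maximum(
        post_snr - 1.0, 0.0
    )
```

In use, each frame would call `update(feature, sap)` and then read `OPERATING_POINTS[label]` to set the three parameters before computing the gain. A hard switch per frame is the simplest reading of the abstract; whether the original method smooths the class decision over time is not stated here.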

Files in This Item: There are no files associated with this item.

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Chang, Joon-Hyuk: College of Engineering (School of Electronic Engineering)
