Sequential deep learning for speech bandwidth extension (open access)
- Authors
- Lee, Bong-Ki; Noh, Kyoungjin; Chang, Joon Hyuk; Choo, Kihyun; Oh, Eunmi
- Issue Date
- Jan-2018
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Citation
- IEEE Access, pp 1007 - 1009
- Pages
- 3
- Indexed
- OTHER
- Journal Title
- IEEE Access
- Start Page
- 1007
- End Page
- 1009
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/17841
- DOI
- 10.1109/ACCESS.2018.2833890
- ISSN
- 2169-3536
- Abstract
- In this paper, we propose a subband-based ensemble of sequential deep neural networks (DNNs) for bandwidth extension (BWE). First, the narrowband (NB) spectra are folded into the highband (HB) region to generate the HB spectra, and then the energy levels of the HB spectra are adjusted using DNNs that operate on log-power spectra features. To this end, we build multiple DNNs, each responsible for one subband of the HB, and connect them sequentially from lower to higher subbands. This sequential structure allows the DNN ensemble to perform both denoising and HB regression, which yields better estimates of the HB energy levels. In addition, we use voiced/unvoiced (V/UV) classification so that the DNN ensemble is applied differently to voiced and unvoiced sounds. To demonstrate the performance of the proposed BWE algorithm, we compare it with a speech production model-based BWE system and with a DNN-based BWE system in which the log-power spectra in the HB are estimated directly. The experimental results show that the proposed approach provides better speech quality than these conventional approaches.
- Appears in Collections
- Seoul College of Engineering > Seoul Division of Convergence Electronic Engineering > 1. Journal Articles
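
The two steps named in the abstract, folding the narrowband spectrum into the highband and then adjusting the energy of each highband subband in order from lowest to highest, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the mirror-style folding, the equal-width subband split, and the `estimators` callables (plain functions standing in for the trained DNNs) are all assumptions made for illustration; the abstract does not specify the network inputs, the number of subbands, or how voiced/unvoiced classification changes the processing.

```python
import numpy as np


def fold_spectrum(nb_log_power):
    """Mirror the narrowband bins about the band edge to fill the highband.

    Assumption: simple spectral mirroring; the paper may fold differently.
    """
    return nb_log_power[::-1].copy()


def adjust_subbands(nb_log_power, hb_folded, estimators):
    """Shift the level of each highband subband, lowest subband first.

    estimators: one callable per subband (hypothetical stand-ins for the
    trained DNNs). Each receives the context available so far, i.e. the
    narrowband bins plus the already adjusted lower subbands, and returns
    a target mean log-power for its own subband.
    """
    subbands = np.array_split(hb_folded, len(estimators))
    context = nb_log_power
    adjusted = []
    for band, estimate in zip(subbands, estimators):
        target = estimate(context)
        # An additive shift in the log-power domain is a gain in the linear
        # domain, so the folded spectral shape inside the subband is kept.
        band = band + (target - band.mean())
        adjusted.append(band)
        # Higher subbands see the adjusted lower ones (sequential structure).
        context = np.concatenate([context, band])
    return np.concatenate(adjusted)


# Toy usage with hand-written estimators instead of trained networks.
nb = np.linspace(0.0, -30.0, 8)            # made-up narrowband log-power
hb = fold_spectrum(nb)                     # folded highband, 8 bins
toy_estimators = [lambda ctx: ctx[-4:].mean() - 3.0] * 2
hb_adjusted = adjust_subbands(nb, hb, toy_estimators)
```

Passing the growing `context` to each estimator is what makes the chain sequential: an estimator for a higher subband can depend on the corrected levels of the subbands below it, rather than on the narrowband input alone.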