An experimental study of diffusion-based general speech restoration with predictive-guided conditioning
- Authors
- Yang, Da-Hee; Chang, Joon-Hyuk
- Issue Date
- Jul-2026
- Publisher
- ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD
- Keywords
- Score-based diffusion model; Predictive-guided conditioning; General speech restoration; Speech enhancement
- Citation
- COMPUTER SPEECH AND LANGUAGE, v.99, pp 1 - 11
- Pages
- 11
- Indexed
- SCIE; SCOPUS
- Journal Title
- COMPUTER SPEECH AND LANGUAGE
- Volume
- 99
- Start Page
- 1
- End Page
- 11
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/211432
- DOI
- 10.1016/j.csl.2026.101940
- ISSN
- 0885-2308; 1095-8363
- Abstract
- This study presents a hybrid speech restoration framework that integrates predictive-guided conditioning into a diffusion-based generative model to address complex distortions, including noise, reverberation, and bandwidth reduction. The proposed method employs the outputs of a predictive model to guide the diffusion process, enabling more accurate reconstruction under challenging acoustic conditions. Furthermore, during the final sampling stage, the outputs of the predictive and generative models are fused with a tunable ratio, balancing signal fidelity and perceptual naturalness. Experimental results demonstrate that the proposed approach significantly improves objective restoration metrics compared to conventional diffusion baselines. However, the perceptual quality varies with the fusion ratio, revealing a trade-off between objective gains and subjective preference. These findings highlight the potential of predictive-guided conditioning for robust speech restoration and provide insights into optimizing the balance between predictive and generative contributions.
- Appears in Collections
- College of Engineering (Seoul) > Department of Electronic Engineering (Seoul) > 1. Journal Articles
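
The abstract states that, at the final sampling stage, the predictive and generative outputs are fused with a tunable ratio. This record does not give the exact fusion rule, so the following is only a minimal sketch that assumes a sample-wise convex combination of two time-aligned signals. The function name `fuse_outputs`, its parameters, and the convention that `ratio` is the weight on the predictive output are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_outputs(
    predictive: Sequence[float],
    generative: Sequence[float],
    ratio: float,
) -> List[float]:
    """Blend two equal-length signals sample by sample.

    Assumption (not from the paper): the fusion is a convex combination,
    where `ratio` weights the predictive output and `1 - ratio` weights
    the generative (diffusion) output.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    if len(predictive) != len(generative):
        raise ValueError("signals must have the same length")
    # ratio = 1.0 keeps only the predictive estimate;
    # ratio = 0.0 keeps only the generative estimate.
    return [ratio * p + (1.0 - ratio) * g for p, g in zip(predictive, generative)]


if __name__ == "__main__":
    predictive_out = [1.0, 0.0]  # toy stand-in for a predictive model output
    generative_out = [0.0, 1.0]  # toy stand-in for a diffusion sample
    print(fuse_outputs(predictive_out, generative_out, ratio=0.25))
```

Under this assumed form, sweeping `ratio` between 0 and 1 is one simple way to explore the trade-off the abstract reports between objective metrics and perceived naturalness.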
