An experimental study of diffusion-based general speech restoration with predictive-guided conditioning
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Yang, Da-Hee | - |
| dc.contributor.author | Chang, Joon-Hyuk | - |
| dc.date.accessioned | 2026-03-23T02:00:28Z | - |
| dc.date.available | 2026-03-23T02:00:28Z | - |
| dc.date.issued | 2026-07 | - |
| dc.identifier.issn | 0885-2308 | - |
| dc.identifier.issn | 1095-8363 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/211432 | - |
| dc.description.abstract | This study presents a hybrid speech restoration framework that integrates predictive-guided conditioning into a diffusion-based generative model to address complex distortions, including noise, reverberation, and bandwidth reduction. The proposed method employs the outputs of a predictive model to guide the diffusion process, enabling more accurate reconstruction under challenging acoustic conditions. Furthermore, during the final sampling stage, the outputs of the predictive and generative models are fused with a tunable ratio, balancing signal fidelity and perceptual naturalness. Experimental results demonstrate that the proposed approach significantly improves objective restoration metrics compared to conventional diffusion baselines. However, the perceptual quality varies with the fusion ratio, revealing a trade-off between objective gains and subjective preference. These findings highlight the potential of predictive-guided conditioning for robust speech restoration and provide insights into optimizing the balance between predictive and generative contributions. | - |
| dc.format.extent | 11 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD | - |
| dc.title | An experimental study of diffusion-based general speech restoration with predictive-guided conditioning | - |
| dc.type | Article | - |
| dc.publisher.location | United Kingdom | - |
| dc.identifier.doi | 10.1016/j.csl.2026.101940 | - |
| dc.identifier.scopusid | 2-s2.0-105027726418 | - |
| dc.identifier.wosid | 001674828400001 | - |
| dc.identifier.bibliographicCitation | COMPUTER SPEECH AND LANGUAGE, v.99, pp 1 - 11 | - |
| dc.citation.title | COMPUTER SPEECH AND LANGUAGE | - |
| dc.citation.volume | 99 | - |
| dc.citation.startPage | 1 | - |
| dc.citation.endPage | 11 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
| dc.subject.keywordPlus | Acoustic noise | - |
| dc.subject.keywordPlus | Architectural acoustics | - |
| dc.subject.keywordPlus | Diffusion | - |
| dc.subject.keywordPlus | Restoration | - |
| dc.subject.keywordPlus | Speech communication | - |
| dc.subject.keywordPlus | Speech enhancement | - |
| dc.subject.keywordAuthor | Score-based diffusion model | - |
| dc.subject.keywordAuthor | Predictive-guided conditioning | - |
| dc.subject.keywordAuthor | General speech restoration | - |
| dc.subject.keywordAuthor | Speech enhancement | - |
| dc.identifier.url | https://www.sciencedirect.com/science/article/pii/S0885230826000045?via%3Dihub | - |
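
The abstract states that, at the final sampling stage, the predictive and generative outputs are fused with a tunable ratio, but this record does not give the fusion formula. As a minimal sketch, assuming the fusion is a sample-wise linear interpolation, the trade-off could look like the following. The function name `fuse_outputs`, the parameter `ratio`, and the convention that `ratio` weights the predictive output are all hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def fuse_outputs(
    predictive: Sequence[float],
    generative: Sequence[float],
    ratio: float,
) -> List[float]:
    """Blend two equal-length signals sample by sample.

    ASSUMPTION: linear interpolation, with `ratio` as the weight of the
    predictive output. The paper's actual fusion rule may differ.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    if len(predictive) != len(generative):
        raise ValueError("signals must have the same length")
    # ratio = 1.0 keeps only the predictive output,
    # ratio = 0.0 keeps only the generative (diffusion) output.
    return [ratio * p + (1.0 - ratio) * g for p, g in zip(predictive, generative)]


if __name__ == "__main__":
    predictive_out = [1.0, 2.0]   # toy stand-in for the predictive estimate
    generative_out = [3.0, 4.0]   # toy stand-in for the diffusion sample
    print(fuse_outputs(predictive_out, generative_out, 0.5))
```

Under this assumption, sweeping `ratio` between 0 and 1 is how one would explore the balance the abstract describes between signal fidelity and perceptual naturalness.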
