Guided conditioning with predictive network on score-based diffusion model for speech enhancement

Full metadata record
DC Field: Value
dc.contributor.author: Kim, Dail
dc.contributor.author: Yang, Da-Hee
dc.contributor.author: Kim, Donghyun
dc.contributor.author: Chang, Joon-Hyuk
dc.contributor.author: Yang, Jaemo
dc.contributor.author: Choi, Jeonghwan
dc.contributor.author: Lee, Moa
dc.contributor.author: Moon, Han-gil
dc.date.accessioned: 2025-04-10T08:30:17Z
dc.date.available: 2025-04-10T08:30:17Z
dc.date.issued: 2024-09
dc.identifier.issn: 1990-9772
dc.identifier.issn: 2308-457X
dc.identifier.uri: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/207030
dc.description.abstract: Although diffusion-based speech enhancement (SE) models have emerged, they exhibit lower ability in noise removal than other predictive-based SE models. This reflects a trade-off between generative models, which are capable of producing more natural speech based on estimated target distribution, and predictive models, which are more effective in noise removal. To mitigate this trade-off, we propose a novel conditioning method for score-based diffusion models. The proposed approach involves guiding the diffusion model with a pretrained predictive model without joint training, thereby enabling enhanced speech to offer the proper direction to the diffusion model. The effectiveness of the proposed method is highlighted by outperforming the baseline method, with only half the number of sampling steps.
dc.format.extent: 5
dc.language: English
dc.language.iso: ENG
dc.title: Guided conditioning with predictive network on score-based diffusion model for speech enhancement
dc.type: Article
dc.identifier.doi: 10.21437/Interspeech.2024-1545
dc.identifier.scopusid: 2-s2.0-85206651720
dc.identifier.wosid: 001331850101067
dc.identifier.bibliographicCitation: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pp 1190 - 1194
dc.citation.title: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
dc.citation.startPage: 1190
dc.citation.endPage: 1194
dc.type.docType: Proceedings Paper
dc.description.isOpenAccess: N
dc.description.journalRegisteredClass: scopus
dc.relation.journalResearchArea: Computer Science
dc.relation.journalWebOfScienceCategory: Computer Science, Artificial Intelligence
dc.subject.keywordPlus: Speech enhancement
dc.subject.keywordAuthor: Speech enhancement
dc.subject.keywordAuthor: score-based diffusion models
dc.subject.keywordAuthor: generative modeling
dc.subject.keywordAuthor: predictive modeling
dc.subject.keywordAuthor: conditioning
dc.identifier.url: https://www.isca-archive.org/interspeech_2024/kim24o_interspeech.html
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Chang, Joon-Hyuk
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)