Tokenized Generative Speech Enhancement With Language Model and Flow Matching
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Yang, Da-Hee | - |
| dc.contributor.author | Lee, Jaeuk | - |
| dc.contributor.author | Chang, Joon-Hyuk | - |
| dc.date.accessioned | 2025-08-26T02:00:11Z | - |
| dc.date.available | 2025-08-26T02:00:11Z | - |
| dc.date.issued | 2025-07 | - |
| dc.identifier.issn | 1070-9908 | - |
| dc.identifier.issn | 1558-2361 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208580 | - |
| dc.description.abstract | We propose a novel generative speech enhancement (SE) framework that integrates a language model (LM) and a flow-matching model. To utilize an LM with discrete tokens, we introduce dMel, which discretizes Mel spectrograms into a predefined set of quantized values on a linear scale without requiring additional neural networks. dMel preserves both semantic and acoustic characteristics, providing a compact and effective token-based alternative to Mel spectrograms. We design the first encoder-decoder LM for SE, which learns to map noisy dMel to enhanced ones. Subsequently, flow-matching de-quantizes the enhanced dMel into a continuous representation and refines it by learning the optimal transport-based probability path, improving perceptual quality. This unified approach enables structured reconstruction while effectively suppressing noise. Experimental results demonstrate the effectiveness of our method in enhancing speech quality, establishing a new paradigm for generative SE without reliance on neural codec-based representations. | - |
| dc.format.extent | 5 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | Institute of Electrical and Electronics Engineers | - |
| dc.title | Tokenized Generative Speech Enhancement With Language Model and Flow Matching | - |
| dc.type | Article | - |
| dc.publisher.location | United States | - |
| dc.identifier.doi | 10.1109/LSP.2025.3589128 | - |
| dc.identifier.scopusid | 2-s2.0-105012356240 | - |
| dc.identifier.wosid | 001536693600005 | - |
| dc.identifier.bibliographicCitation | IEEE Signal Processing Letters, v.32, pp 2828 - 2832 | - |
| dc.citation.title | IEEE Signal Processing Letters | - |
| dc.citation.volume | 32 | - |
| dc.citation.startPage | 2828 | - |
| dc.citation.endPage | 2832 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Engineering | - |
| dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
| dc.subject.keywordPlus | Computational linguistics | - |
| dc.subject.keywordPlus | Neural networks | - |
| dc.subject.keywordPlus | Optimization | - |
| dc.subject.keywordPlus | Semantics | - |
| dc.subject.keywordPlus | Speech coding | - |
| dc.subject.keywordPlus | Speech communication | - |
| dc.subject.keywordAuthor | Spectrogram | - |
| dc.subject.keywordAuthor | Noise measurement | - |
| dc.subject.keywordAuthor | Speech enhancement | - |
| dc.subject.keywordAuthor | Tokenization | - |
| dc.subject.keywordAuthor | Decoding | - |
| dc.subject.keywordAuthor | Training | - |
| dc.subject.keywordAuthor | Noise | - |
| dc.subject.keywordAuthor | Indexes | - |
| dc.subject.keywordAuthor | Computational modeling | - |
| dc.subject.keywordAuthor | Acoustics | - |
| dc.subject.keywordAuthor | tokenization | - |
| dc.subject.keywordAuthor | language model | - |
| dc.subject.keywordAuthor | flow-matching | - |
| dc.identifier.url | https://ieeexplore.ieee.org/document/11079998 | - |
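
The abstract describes dMel as discretizing Mel spectrograms into a predefined set of quantized values on a linear scale, with no additional neural network. The following is a minimal sketch of that general idea (uniform binning of log-Mel values), not the authors' implementation. The function names, the bin count `num_bins=16`, and the value range `[lo, hi]` are assumptions chosen for illustration; the record does not state them.

```python
import numpy as np

# Hypothetical settings for illustration only; the record does not give
# the number of bins or the value range used by the paper.
NUM_BINS = 16
LO, HI = -7.0, 2.0


def quantize_linear(log_mel, num_bins=NUM_BINS, lo=LO, hi=HI):
    """Map each log-Mel value to the index of a uniform bin on [lo, hi]."""
    step = (hi - lo) / num_bins
    clipped = np.clip(log_mel, lo, hi)
    idx = np.floor((clipped - lo) / step).astype(np.int64)
    # A value exactly equal to hi would land in bin num_bins; fold it back.
    return np.minimum(idx, num_bins - 1)


def dequantize_linear(tokens, num_bins=NUM_BINS, lo=LO, hi=HI):
    """Map bin indices back to the centre value of each bin."""
    step = (hi - lo) / num_bins
    return lo + (tokens + 0.5) * step


# Toy "spectrogram": 3 frames x 4 Mel channels of log-Mel values.
log_mel = np.array([
    [-7.0, -2.5, 0.0, 2.0],
    [-6.1, -3.3, 1.2, -0.4],
    [-9.0, 3.0, -1.0, 0.7],  # out-of-range values are clipped
])
tokens = quantize_linear(log_mel)
reconstructed = dequantize_linear(tokens)
```

In this sketch the tokens are plain integer indices, so a language model can consume them directly, and inside `[lo, hi]` the reconstruction error is bounded by half a bin width. The abstract assigns the finer de-quantization and refinement step to the flow-matching model, which this sketch does not attempt to reproduce.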
