Tokenized Generative Speech Enhancement With Language Model and Flow Matching

Yang, Da-Hee; Lee, Jaeuk; Chang, Joon-Hyuk

doi:10.1109/LSP.2025.3589128

Full metadata record

DC Field	Value	Language
dc.contributor.author	Yang, Da-Hee	-
dc.contributor.author	Lee, Jaeuk	-
dc.contributor.author	Chang, Joon-Hyuk	-
dc.date.accessioned	2025-08-26T02:00:11Z	-
dc.date.available	2025-08-26T02:00:11Z	-
dc.date.issued	2025-07	-
dc.identifier.issn	1070-9908	-
dc.identifier.issn	1558-2361	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208580	-
dc.description.abstract	We propose a novel generative speech enhancement (SE) framework that integrates a language model (LM) and a flow-matching model. To utilize an LM with discrete tokens, we introduce dMel, which discretizes Mel spectrograms into a predefined set of quantized values on a linear scale without requiring additional neural networks. dMel preserves both semantic and acoustic characteristics, providing a compact and effective token-based alternative to Mel spectrograms. We design the first encoder-decoder LM for SE, which learns to map noisy dMel tokens to enhanced ones. Subsequently, flow matching de-quantizes the enhanced dMel into a continuous representation and refines it by learning the optimal transport-based probability path, improving perceptual quality. This unified approach enables structured reconstruction while effectively suppressing noise. Experimental results demonstrate the effectiveness of our method in enhancing speech quality, establishing a new paradigm for generative SE without reliance on neural codec-based representations.	-
dc.format.extent	5	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Institute of Electrical and Electronics Engineers	-
dc.title	Tokenized Generative Speech Enhancement With Language Model and Flow Matching	-
dc.type	Article	-
dc.publisher.location	United States	-
dc.identifier.doi	10.1109/LSP.2025.3589128	-
dc.identifier.scopusid	2-s2.0-105012356240	-
dc.identifier.wosid	001536693600005	-
dc.identifier.bibliographicCitation	IEEE Signal Processing Letters, v.32, pp 2828 - 2832	-
dc.citation.title	IEEE Signal Processing Letters	-
dc.citation.volume	32	-
dc.citation.startPage	2828	-
dc.citation.endPage	2832	-
dc.type.docType	Article	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.subject.keywordPlus	Computational linguistics	-
dc.subject.keywordPlus	Neural networks	-
dc.subject.keywordPlus	Optimization	-
dc.subject.keywordPlus	Semantics	-
dc.subject.keywordPlus	Speech coding	-
dc.subject.keywordPlus	Speech communication	-
dc.subject.keywordAuthor	Spectrogram	-
dc.subject.keywordAuthor	Noise measurement	-
dc.subject.keywordAuthor	Speech enhancement	-
dc.subject.keywordAuthor	Tokenization	-
dc.subject.keywordAuthor	Decoding	-
dc.subject.keywordAuthor	Training	-
dc.subject.keywordAuthor	Noise	-
dc.subject.keywordAuthor	Indexes	-
dc.subject.keywordAuthor	Computational modeling	-
dc.subject.keywordAuthor	Acoustics	-
dc.subject.keywordAuthor	tokenization	-
dc.subject.keywordAuthor	language model	-
dc.subject.keywordAuthor	flow-matching	-
dc.identifier.url	https://ieeexplore.ieee.org/document/11079998	-
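
The abstract describes dMel as discretizing Mel spectrogram values into a predefined set of quantized levels on a linear scale, with no additional neural network. A minimal sketch of that idea follows. It is not the authors' implementation: the value range (-8.0 to 2.0), the number of levels (16), the toy input shape, and the function names make_bins, quantize, and dequantize are all assumptions chosen for illustration.

```python
import numpy as np


def make_bins(v_min: float, v_max: float, num_bins: int) -> np.ndarray:
    """Return evenly spaced (linear-scale) quantization levels.

    The range and the number of levels are illustrative assumptions,
    not values taken from the paper.
    """
    return np.linspace(v_min, v_max, num_bins)


def quantize(mel: np.ndarray, bins: np.ndarray) -> np.ndarray:
    """Map every Mel value to the index of its nearest level."""
    clipped = np.clip(mel, bins[0], bins[-1])
    # Broadcast against the levels and pick the closest one per value.
    return np.abs(clipped[..., None] - bins).argmin(axis=-1)


def dequantize(tokens: np.ndarray, bins: np.ndarray) -> np.ndarray:
    """Look up the level each token index refers to."""
    return bins[tokens]


# Toy stand-in for a log-Mel spectrogram: 5 frames, 80 Mel channels.
rng = np.random.default_rng(0)
mel = rng.uniform(-8.0, 2.0, size=(5, 80))

bins = make_bins(-8.0, 2.0, 16)
tokens = quantize(mel, bins)      # integer tokens, one per Mel channel
recon = dequantize(tokens, bins)  # coarse continuous reconstruction
max_error = np.abs(recon - mel).max()
```

In this sketch the reconstruction error is bounded by half the spacing between levels. Per the abstract, the paper's flow-matching stage is what de-quantizes and refines the enhanced tokens into a continuous representation; the table lookup here is only a stand-in for that step.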

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Chang, Joon-Hyuk: College of Engineering (School of Electronic Engineering)
