Quad-Net: Melspectrogram Vocoder with Convolutional Layers Restricted by the Quadrature Mirror Filter for Perfect Reconstruction

Song, Nam-Seok; Chang, Joon-Hyuk

doi:10.1109/ICASSP49660.2025.10890659

Full metadata record

DC Field	Value	Language
dc.contributor.author	Song, Nam-Seok	-
dc.contributor.author	Chang, Joon-Hyuk	-
dc.date.accessioned	2025-07-22T07:30:21Z	-
dc.date.available	2025-07-22T07:30:21Z	-
dc.date.issued	2025-03	-
dc.identifier.issn	0736-7791	-
dc.identifier.issn	1520-6149	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208313	-
dc.description.abstract	Recently, neural vocoders have applied signal processing methods to speech synthesis to reduce computational complexity. However, because they rely on fixed signal processing filters, most of these methods lack the benefits of a data-driven approach and offer no flexibility in hyper-parameters such as filter length. In this paper, we introduce Quad-Net, a network that includes restricted convolutional layers shaped by quadrature mirror synthesis filter banks. It is optimized with a perfect reconstruction loss derived from perfect reconstruction filter banks. This enables us to control the filter length and the degree of data-drivenness. The results show that the filter parameters trained in our model exhibit characteristics similar to those of other signal processing methods while using fewer parameters. Furthermore, by increasing the filter length of Quad-Net, we can obtain filters with complex frequency responses. This shows that the new approach enables the design of more complex filters that are adapted to neural networks, diverging from previous methods.	-
dc.format.extent	5	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	-
dc.title	Quad-Net: Melspectrogram Vocoder with Convolutional Layers Restricted by the Quadrature Mirror Filter for Perfect Reconstruction	-
dc.type	Article	-
dc.publisher.location	United States	-
dc.identifier.doi	10.1109/ICASSP49660.2025.10890659	-
dc.identifier.scopusid	2-s2.0-105009603255	-
dc.identifier.bibliographicCitation	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pp 1 - 5	-
dc.citation.title	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings	-
dc.citation.startPage	1	-
dc.citation.endPage	5	-
dc.type.docType	Conference paper	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordPlus	Audio signal processing	-
dc.subject.keywordPlus	Complex networks	-
dc.subject.keywordPlus	Computer vision	-
dc.subject.keywordPlus	Convolution	-
dc.subject.keywordPlus	Filter banks	-
dc.subject.keywordPlus	Frequency response	-
dc.subject.keywordPlus	Mirrors	-
dc.subject.keywordPlus	Neural networks	-
dc.subject.keywordPlus	Speech communication	-
dc.subject.keywordAuthor	perfect reconstruction	-
dc.subject.keywordAuthor	quadrature mirror filter	-
dc.subject.keywordAuthor	signal processing	-
dc.subject.keywordAuthor	vocoder	-
dc.identifier.url	https://ieeexplore.ieee.org/document/10890659	-
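
The abstract refers to quadrature mirror filter (QMF) banks and a perfect reconstruction loss without giving formulas. As a rough illustration only, the sketch below builds a textbook two-channel QMF bank from a single lowpass prototype and measures how far it is from perfect reconstruction. It is not the Quad-Net layer or the loss used in the paper: the function names (`qmf_bank`, `distortion_response`, `pr_loss`, `analyze_synthesize`), the squared-error form of the penalty, and the choice of the Haar filter as the example are all assumptions made here for illustration.

```python
import numpy as np

def qmf_bank(h0):
    """Derive a two-channel QMF bank from a lowpass prototype h0.

    Textbook relations (not taken from the paper):
      H1(z) = H0(-z), G0(z) = H0(z), G1(z) = -H0(-z)
    These choices cancel the aliasing term exactly.
    """
    h0 = np.asarray(h0, dtype=float)
    signs = (-1.0) ** np.arange(len(h0))
    h1 = signs * h0      # highpass analysis filter, mirror of h0
    g0 = h0.copy()       # lowpass synthesis filter
    g1 = -h1             # highpass synthesis filter
    return h0, h1, g0, g1

def distortion_response(h0):
    """Coefficients of T(z) = 0.5 * (H0(z) G0(z) + H1(z) G1(z))."""
    h0, h1, g0, g1 = qmf_bank(h0)
    return 0.5 * (np.convolve(h0, g0) + np.convolve(h1, g1))

def pr_loss(h0, delay):
    """Hypothetical penalty: squared distance of T(z) from a pure delay."""
    t = distortion_response(h0)
    target = np.zeros_like(t)
    target[delay] = 1.0
    return float(np.sum((t - target) ** 2))

def analyze_synthesize(x, h0):
    """Run x through analysis, 2x decimation, 2x expansion, and synthesis."""
    h0, h1, g0, g1 = qmf_bank(h0)
    out = np.zeros(len(x) + 2 * (len(h0) - 1))
    for h, g in ((h0, g0), (h1, g1)):
        sub = np.convolve(x, h)[::2]   # filter, keep even-indexed samples
        up = np.zeros(2 * len(sub))
        up[::2] = sub                  # insert zeros between samples
        y = np.convolve(up, g)
        out += y[:len(out)]            # any sample beyond len(out) is zero
    return out

# Example: the 2-tap Haar prototype reconstructs the input with a 1-sample delay.
haar = np.array([1.0, 1.0]) / np.sqrt(2.0)
x = np.arange(8.0)
y = analyze_synthesize(x, haar)
```

With the Haar prototype, `distortion_response(haar)` is a unit impulse at index 1, so `pr_loss(haar, delay=1)` is zero up to rounding and `y[1:9]` matches `x`. For longer prototypes under these same mirror relations the penalty is generally nonzero, which is the kind of quantity a training objective could drive toward zero; how the paper actually parameterizes and constrains its filters is not stated in this record.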

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Chang, Joon-Hyuk: COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)
