Detailed Information

Quad-Net: Melspectrogram Vocoder with Convolutional Layers Restricted by the Quadrature Mirror Filter for Perfect Reconstruction

Full metadata record
DC Field: Value
dc.contributor.author: Song, Nam-Seok
dc.contributor.author: Chang, Joon-Hyuk
dc.date.accessioned: 2025-07-22T07:30:21Z
dc.date.available: 2025-07-22T07:30:21Z
dc.date.issued: 2025-03
dc.identifier.issn: 0736-7791
dc.identifier.issn: 1520-6149
dc.identifier.uri: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208313
dc.description.abstract: Recently, neural vocoders have applied signal processing methods to speech synthesis to reduce computational complexity. However, because they rely on fixed signal processing filters, most of these methods lack the benefits of a data-driven approach and the flexibility to choose hyper-parameters such as filter length. In this paper, we introduce Quad-Net, a network that includes restricted convolutional layers shaped by quadrature mirror synthesis filter banks. It is optimized with a perfect reconstruction loss derived from perfect reconstruction filter banks. This enables us to control the filter lengths and the degree of data-drivenness. The results show that the filter parameters trained in our model exhibit characteristics similar to those of other signal processing methods while using fewer parameters. Furthermore, by increasing the filter length of Quad-Net, we can obtain filters that have complex frequency responses. This shows that the new approach enables the design of more complex filters that are adapted to neural networks, diverging from previous methods.
dc.format.extent: 5
dc.language: English
dc.language.iso: ENG
dc.publisher: Institute of Electrical and Electronics Engineers Inc.
dc.title: Quad-Net: Melspectrogram Vocoder with Convolutional Layers Restricted by the Quadrature Mirror Filter for Perfect Reconstruction
dc.type: Article
dc.publisher.location: United States
dc.identifier.doi: 10.1109/ICASSP49660.2025.10890659
dc.identifier.scopusid: 2-s2.0-105009603255
dc.identifier.bibliographicCitation: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pp. 1-5
dc.citation.title: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
dc.citation.startPage: 1
dc.citation.endPage: 5
dc.type.docType: Conference paper
dc.description.isOpenAccess: N
dc.description.journalRegisteredClass: scopus
dc.subject.keywordPlus: Audio signal processing
dc.subject.keywordPlus: Complex networks
dc.subject.keywordPlus: Computer vision
dc.subject.keywordPlus: Convolution
dc.subject.keywordPlus: Filter banks
dc.subject.keywordPlus: Frequency response
dc.subject.keywordPlus: Mirrors
dc.subject.keywordPlus: Neural networks
dc.subject.keywordPlus: Speech communication
dc.subject.keywordAuthor: perfect reconstruction
dc.subject.keywordAuthor: quadrature mirror filter
dc.subject.keywordAuthor: signal processing
dc.subject.keywordAuthor: vocoder
dc.identifier.url: https://ieeexplore.ieee.org/document/10890659
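
Illustrative note on the abstract: the quadrature mirror filter (QMF) and perfect reconstruction ideas it mentions can be sketched with a textbook two-channel filter bank. The sketch below is not the Quad-Net architecture or its loss, which this record does not describe in detail. It assumes the classic QMF relations H1(z) = H0(-z), F0(z) = H0(z), F1(z) = -H1(z), and the function names (qmf_pair, analysis_synthesis, pr_error) are hypothetical. The pr_error function is only one plausible way to measure how far a prototype filter is from perfect reconstruction.

```python
import numpy as np

def qmf_pair(h0):
    """Derive the other three filters of a two-channel QMF bank from h0."""
    h0 = np.asarray(h0, dtype=float)
    signs = (-1.0) ** np.arange(len(h0))
    h1 = signs * h0      # analysis high-pass: H1(z) = H0(-z)
    f0 = h0.copy()       # synthesis low-pass:  F0(z) = H1(-z) = H0(z)
    f1 = -h1             # synthesis high-pass: F1(z) = -H0(-z)
    return h0, h1, f0, f1

def analysis_synthesis(x, h0):
    """Split x into two subbands, then recombine them."""
    h0, h1, f0, f1 = qmf_pair(h0)
    out = 0.0
    for h, f in ((h0, f0), (h1, f1)):
        sub = np.convolve(x, h)[::2]     # filter, then downsample by 2
        up = np.zeros(2 * len(sub))
        up[::2] = sub                    # upsample by 2 (zero insertion)
        out = out + np.convolve(up, f)   # synthesis filtering
    return out

def pr_error(h0):
    """Squared distance of the bank's transfer function from a pure delay.

    T(z) = 0.5 * [H0(z) F0(z) + H1(z) F1(z)] equals z^-(N-1) for a
    perfect-reconstruction bank with filter length N.
    """
    h0, h1, f0, f1 = qmf_pair(h0)
    t = 0.5 * (np.convolve(h0, f0) + np.convolve(h1, f1))
    target = np.zeros_like(t)
    target[len(h0) - 1] = 1.0
    return float(np.sum((t - target) ** 2))

# Haar prototype: the simplest filter that satisfies the QMF PR condition.
haar = np.array([1.0, 1.0]) / np.sqrt(2.0)
x = np.random.default_rng(0).standard_normal(16)
y = analysis_synthesis(x, haar)
reconstructed = y[1:1 + len(x)]          # output is x delayed by one sample
```

With the Haar prototype, the recombined signal matches the input up to a one-sample delay and pr_error is numerically zero. A prototype that is not correctly scaled, such as [0.5, 0.5], gives a nonzero pr_error, which is the kind of quantity a reconstruction penalty could drive toward zero during training.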
Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Chang, Joon-Hyuk
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)