Quad-Net: Melspectrogram Vocoder with Convolutional Layers Restricted by the Quadrature Mirror Filter for Perfect Reconstruction
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Song, Nam-Seok | - |
| dc.contributor.author | Chang, Joon-Hyuk | - |
| dc.date.accessioned | 2025-07-22T07:30:21Z | - |
| dc.date.available | 2025-07-22T07:30:21Z | - |
| dc.date.issued | 2025-03 | - |
| dc.identifier.issn | 0736-7791 | - |
| dc.identifier.issn | 1520-6149 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208313 | - |
| dc.description.abstract | Recently, neural vocoders have adopted signal processing methods for speech synthesis to reduce computational complexity. However, because they rely on fixed signal processing filters, most of these methods lose the benefits of a data-driven approach and the flexibility of hyper-parameters such as filter length. In this paper, we introduce Quad-Net, a network that includes restricted convolutional layers shaped by quadrature mirror synthesis filter banks. It is optimized with a perfect reconstruction loss derived from perfect reconstruction filter banks. This enables us to control the filter lengths and the degree of data-drivenness. The results show that the filter parameters trained in our model exhibit characteristics similar to those of other signal processing methods with fewer parameters. Furthermore, by increasing the filter length of Quad-Net, we can obtain filters that have complex frequency responses. This shows that the new approach enables the design of more complex filters that are adaptive to neural networks, diverging from previous methods. | - |
| dc.format.extent | 5 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
| dc.title | Quad-Net: Melspectrogram Vocoder with Convolutional Layers Restricted by the Quadrature Mirror Filter for Perfect Reconstruction | - |
| dc.type | Article | - |
| dc.publisher.location | United States | - |
| dc.identifier.doi | 10.1109/ICASSP49660.2025.10890659 | - |
| dc.identifier.scopusid | 2-s2.0-105009603255 | - |
| dc.identifier.bibliographicCitation | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pp 1 - 5 | - |
| dc.citation.title | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | - |
| dc.citation.startPage | 1 | - |
| dc.citation.endPage | 5 | - |
| dc.type.docType | Conference paper | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.subject.keywordPlus | Audio signal processing | - |
| dc.subject.keywordPlus | Complex networks | - |
| dc.subject.keywordPlus | Computer vision | - |
| dc.subject.keywordPlus | Convolution | - |
| dc.subject.keywordPlus | Filter banks | - |
| dc.subject.keywordPlus | Frequency response | - |
| dc.subject.keywordPlus | Mirrors | - |
| dc.subject.keywordPlus | Neural networks | - |
| dc.subject.keywordPlus | Speech communication | - |
| dc.subject.keywordAuthor | perfect reconstruction | - |
| dc.subject.keywordAuthor | quadrature mirror filter | - |
| dc.subject.keywordAuthor | signal processing | - |
| dc.subject.keywordAuthor | vocoder | - |
| dc.identifier.url | https://ieeexplore.ieee.org/document/10890659 | - |
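
The abstract refers to quadrature mirror filter (QMF) banks and a perfect reconstruction loss. As a rough illustration of the underlying textbook idea only, the sketch below builds a classic two-channel QMF bank from a single low-pass prototype and checks whether the analysis/synthesis chain returns a delayed copy of the input. It is not the Quad-Net architecture or the paper's loss: the function names (`qmf_pair`, `analysis_synthesis`, `pr_residual`), the Haar prototype, and the squared-error residual are all assumptions chosen for this example.

```python
import numpy as np

def qmf_pair(h0):
    """Derive the other three filters of a two-channel QMF bank from h0.

    Textbook alias-cancelling choice (an assumption of this sketch):
    h1[n] = (-1)^n * h0[n],  g0 = h0,  g1 = -h1.
    """
    h0 = np.asarray(h0, dtype=float)
    h1 = ((-1.0) ** np.arange(len(h0))) * h0
    return h0, h1, h0.copy(), -h1

def analysis_synthesis(x, h0):
    """Run x through analysis (filter, downsample by 2) and synthesis."""
    h0, h1, g0, g1 = qmf_pair(h0)
    x = np.asarray(x, dtype=float)
    n = len(x) + len(h0) - 1          # length of the full convolution

    low = np.convolve(x, h0)[::2]     # low band, decimated
    high = np.convolve(x, h1)[::2]    # high band, decimated

    def upsample(v):
        u = np.zeros(n)               # zero insertion between samples
        u[::2] = v
        return u

    return np.convolve(upsample(low), g0) + np.convolve(upsample(high), g1)

def pr_residual(h0):
    """Hypothetical residual: squared distance between the bank's overall
    transfer function and a pure delay of len(h0) - 1 samples."""
    h0, h1, g0, g1 = qmf_pair(h0)
    t = 0.5 * (np.convolve(h0, g0) + np.convolve(h1, g1))
    target = np.zeros_like(t)
    target[len(h0) - 1] = 1.0
    return float(np.sum((t - target) ** 2))

haar = [1 / np.sqrt(2), 1 / np.sqrt(2)]
x = np.arange(1.0, 9.0)
y = analysis_synthesis(x, haar)       # x delayed by one sample, zero padded
```

With the length-2 Haar prototype the residual is zero up to rounding and `y[1:9]` matches `x`. A longer prototype that does not satisfy the perfect reconstruction condition, such as `[0.5, 0.5, 0.5, 0.5]`, gives a non-zero residual, and a quantity of this kind is what a trainable filter could in principle be penalized with.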
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
