Improved CNN-Transformer Using Broadcasted Residual Learning for Text-Independent Speaker Verification

Authors
Choi, Jeong-Hwan; Yang, Joon-Young; Jeoung, Ye-Rin; Chang, Joon-Hyuk
Issue Date
Sep-2022
Publisher
International Speech Communication Association
Keywords
attentive statistics pooling; hybrid deep neural network; Text-independent speaker verification; Transformer
Citation
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2022-September, pp. 2223-2227
Indexed
SCOPUS
Journal Title
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume
2022-September
Start Page
2223
End Page
2227
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/173091
DOI
10.21437/Interspeech.2022-88
ISSN
2308-457X
Abstract
This study proposes a novel speaker embedding extractor architecture that effectively combines convolutional neural networks (CNNs) and Transformers. Based on the recently proposed CNNs-meet-vision-Transformers (CMT) architecture, we propose two strategies for efficient speaker embedding extraction modeling. First, we apply broadcast residual learning techniques to the building blocks of the CMT, allowing us to extract frequency-aware temporal features shared across frequency dimensions with a reduced set of parameters. Second, frequency-statistics-dependent attentive statistics pooling is proposed to aggregate attentive temporal statistics acquired from the means and standard deviations of input feature maps weighted along the frequency axis using an attention mechanism. The experimental results on the VoxCeleb-1 dataset show that the proposed model outperforms several CNN- and Transformer-based models with a similar number of model parameters. Moreover, the effectiveness of the proposed modifications to the CMT architecture is validated through ablation studies.
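The pooling step mentioned in the abstract builds on attentive statistics pooling, which summarizes a variable-length feature sequence as an attention-weighted mean and standard deviation. The following is a minimal sketch of that generic idea only, not the frequency-statistics-dependent variant proposed in the paper. The function names, the pure-Python list representation, and the externally supplied `scores` are assumptions made for illustration; in a trained model the scores would come from a small learned network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_stats_pooling(frames, scores, eps=1e-12):
    """Pool T frames of dimension D into one vector of length 2 * D.

    frames: list of T frames, each a list of D floats.
    scores: list of T unnormalized attention scores (hypothetical input;
            a real model would predict these from the frames).
    Returns the attention-weighted mean followed by the weighted std.
    """
    if not frames or len(frames) != len(scores):
        raise ValueError("frames and scores must be non-empty and equal length")

    weights = softmax(scores)  # one weight per frame, summing to 1
    dim = len(frames[0])

    # Weighted first and second moments along the time axis.
    mean = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
    second = [sum(w * f[d] ** 2 for w, f in zip(weights, frames)) for d in range(dim)]

    # Clamp the variance so rounding error cannot produce a negative value.
    std = [math.sqrt(max(s - m * m, eps)) for s, m in zip(second, mean)]
    return mean + std
```

With equal scores the weights are uniform, so the result reduces to the ordinary mean and (population) standard deviation over time; unequal scores shift both statistics toward the frames the attention favors.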
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Chang, Joon-Hyuk
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)