Trainable Adaptive Score Normalization for Automatic Speaker Verification

Full metadata record
DC Field: Value
dc.contributor.author: Choi, Jeong-Hwan
dc.contributor.author: Seong, Ju-Seok
dc.contributor.author: Jeoung, Ye-Rin
dc.contributor.author: Chang, Joon-Hyuk
dc.date.accessioned: 2025-07-22T07:00:09Z
dc.date.available: 2025-07-22T07:00:09Z
dc.date.issued: 2025-03
dc.identifier.issn: 0736-7791
dc.identifier.issn: 1520-6149
dc.identifier.uri: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208312
dc.description.abstract: Adaptive S-norm (AS-norm) calibrates automatic speaker verification (ASV) scores by normalizing them using the scores of impostors that are similar to the input speaker. However, AS-norm does not involve any learning process, which limits its ability to provide appropriate regularization strength for various evaluation utterances. To address this limitation, we propose a trainable AS-norm (TAS-norm) that leverages learnable impostor embeddings (LIEs), which are used to compose the cohort. These LIEs are initialized to represent each speaker in a training dataset consisting of impostor speakers. Subsequently, the LIEs are fine-tuned by simulating an ASV evaluation. During fine-tuning, we apply a margin penalty to the selection of top-scoring impostor embeddings to prevent non-impostor speakers from being selected. In our experiments with ECAPA-TDNN, the proposed TAS-norm achieved 4.11% and 10.62% relative improvements in equal error rate and minimum detection cost function, respectively, on the VoxCeleb1-O trials compared with standard AS-norm without the proposed LIEs. We further validated the effectiveness of TAS-norm on additional ASV datasets in Persian and Chinese, demonstrating its robustness across different languages.
dc.format.extent: 5
dc.language: English
dc.language.iso: ENG
dc.publisher: Institute of Electrical and Electronics Engineers Inc.
dc.title: Trainable Adaptive Score Normalization for Automatic Speaker Verification
dc.type: Article
dc.publisher.location: United States
dc.identifier.doi: 10.1109/ICASSP49660.2025.10890182
dc.identifier.scopusid: 2-s2.0-105009602151
dc.identifier.bibliographicCitation: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pp. 1-5
dc.citation.title: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
dc.citation.startPage: 1
dc.citation.endPage: 5
dc.type.docType: Conference paper
dc.description.isOpenAccess: N
dc.description.journalRegisteredClass: scopus
dc.subject.keywordPlus: Artificial intelligence
dc.subject.keywordPlus: Embeddings
dc.subject.keywordPlus: Function evaluation
dc.subject.keywordPlus: Signal processing
dc.subject.keywordPlus: Speech communication
dc.subject.keywordAuthor: Automatic speaker verification
dc.subject.keywordAuthor: back-end
dc.subject.keywordAuthor: fine-tuning
dc.subject.keywordAuthor: score normalization
dc.subject.keywordAuthor: speaker embedding
dc.identifier.url: https://ieeexplore.ieee.org/document/10890182
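
Illustrative note: the abstract describes standard AS-norm as normalizing a trial score with the scores of the impostors most similar to the input speaker. The sketch below shows one commonly used formulation of that baseline, assuming cosine scoring and a fixed cohort of impostor embeddings. It is not the authors' implementation and does not include the trainable parts of TAS-norm (the LIEs or the margin penalty). The function names, the `top_k` parameter, and the random embeddings are assumptions made for illustration only.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors to unit length along the last axis."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def as_norm(enroll, test, cohort, top_k=20, eps=1e-8):
    """Adaptive S-norm of a single cosine-scored trial (illustrative sketch).

    enroll, test: 1-D speaker embeddings of the same dimension.
    cohort: 2-D array (num_impostors, dim) of impostor embeddings.
    top_k: number of highest-scoring impostors used per side (assumed value).
    """
    enroll = l2_normalize(np.asarray(enroll, dtype=float))
    test = l2_normalize(np.asarray(test, dtype=float))
    cohort = l2_normalize(np.asarray(cohort, dtype=float))

    raw = float(enroll @ test)  # raw cosine score of the trial

    def top_stats(embedding):
        # Score the embedding against every impostor, keep the top-k scores,
        # and return their mean and standard deviation.
        scores = cohort @ embedding
        k = min(top_k, scores.shape[0])
        top = np.sort(scores)[-k:]
        return top.mean(), top.std() + eps

    mu_e, sd_e = top_stats(enroll)
    mu_t, sd_t = top_stats(test)

    # Symmetric normalization: average of the enrollment-side and
    # test-side z-scores of the raw trial score.
    return 0.5 * ((raw - mu_e) / sd_e + (raw - mu_t) / sd_t)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cohort = rng.standard_normal((200, 64))       # stand-in impostor cohort
    enroll = rng.standard_normal(64)
    same = enroll + 0.1 * rng.standard_normal(64)  # simulated target trial
    other = rng.standard_normal(64)                # simulated non-target trial
    print("target score:", as_norm(enroll, same, cohort))
    print("non-target score:", as_norm(enroll, other, cohort))
```

In this baseline the cohort is fixed and nothing is learned. As the abstract describes it, TAS-norm instead composes the cohort from embeddings that are fine-tuned by simulating an ASV evaluation.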
Appears in Collections: College of Engineering (Seoul) > School of Electronic Engineering (Seoul) > 1. Journal Articles