Detailed Information


Zero-Shot Voice Cloning Using Variational Embedding with Attention Mechanism

Authors
Lee, Jaeuk; Kim, Jiye; Chang, Joon-Hyuk
Issue Date
Jan-2022
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
Global style token; Multi-speaker; Text-to-speech; Voice cloning
Citation
Proceedings of 2021 7th IEEE International Conference on Network Intelligence and Digital Content, IC-NIDC 2021, pp.344 - 348
Indexed
SCOPUS
Journal Title
Proceedings of 2021 7th IEEE International Conference on Network Intelligence and Digital Content, IC-NIDC 2021
Start Page
344
End Page
348
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/139794
DOI
10.1109/IC-NIDC54101.2021.9660599
ISSN
0000-0000
Abstract
Many voice cloning studies have been built on multi-speaker text-to-speech (TTS). Among voice cloning techniques, we focus on zero-shot voice cloning, where the most important design choice is which speaker embedding to use. In this study, two types of speaker embeddings are used. One is extracted from the mel spectrogram by a speaker encoder; the other is stored in an embedding dictionary, as in a vector quantized-variational autoencoder (VQ-VAE). To extract an embedding from the embedding dictionary, an attention mechanism is applied, which we call attention-VAE (AT-VAE). The embedding extracted by the speaker encoder serves as the query of the attention mechanism, and the attention weights are computed over the entries of the embedding dictionary. This mechanism allows the extraction of a speaker embedding that represents unseen speakers. In addition, training is applied to make our model robust to unseen speakers, and this training stage further improves the system. The performance of the proposed method was validated using various metrics, and it was demonstrated that the proposed method enables voice cloning without adaptation training.
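The dictionary lookup described in the abstract can be illustrated with a minimal sketch. It assumes scaled dot-product scoring and uses the dictionary entries as both keys and values; the abstract does not specify the scoring function, so this is an illustrative choice rather than the authors' implementation. The function names and the toy vectors are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_to_dictionary(
    query: Vector, dictionary: List[Vector]
) -> Tuple[Vector, List[float]]:
    """Return (speaker_embedding, attention_weights).

    Assumption: scaled dot-product attention, with each dictionary entry
    acting as both key and value. The paper may score entries differently.
    """
    dim = len(query)
    # Similarity between the speaker-encoder output and each dictionary entry.
    scores = [
        sum(q * k for q, k in zip(query, entry)) / math.sqrt(dim)
        for entry in dictionary
    ]
    weights = softmax(scores)
    # The speaker embedding is the attention-weighted sum of the entries.
    embedding = [
        sum(w * entry[i] for w, entry in zip(weights, dictionary))
        for i in range(dim)
    ]
    return embedding, weights


# Toy values for illustration only: a 2-dimensional query standing in for
# the speaker-encoder output, and a dictionary with three entries.
query = [1.0, 0.0]
dictionary = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
embedding, weights = attend_to_dictionary(query, dictionary)
```

Because the weights are a soft mixture rather than a hard nearest-entry lookup, an unseen speaker's query can still map to a combination of dictionary entries, which is the property the abstract relies on for zero-shot cloning.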
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Chang, Joon-Hyuk
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)
