Improving Joint Speech and Emotion Recognition Using Global Style Tokens

Full metadata record
DC Field | Value | Language
dc.contributor.author | 경제현 | -
dc.contributor.author | 성주석 | -
dc.contributor.author | 최정환 | -
dc.contributor.author | 정예린 | -
dc.contributor.author | Chang, Joon-Hyuk | -
dc.date.accessioned | 2023-10-10T02:35:37Z | -
dc.date.available | 2023-10-10T02:35:37Z | -
dc.date.issued | 2023-08 | -
dc.identifier.issn | 1990-9772 | -
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/191792 | -
dc.description.abstract | Automatic speech recognition (ASR) and speech emotion recognition (SER) are closely related in that the acoustic features of speech, such as pitch, tone, and intensity, can vary according to the speaker's emotional state. Our study focuses on a joint ASR and SER task, in which an emotion token is tagged and recognized along with the text. To further improve the joint recognition performance, we propose a novel training method that adopts global style tokens (GSTs). The style embedding is extracted from the GST module to help the joint ASR and SER model capture emotional information from speech. Specifically, a conformer-based joint ASR and SER model pre-trained on a large-scale dataset is jointly fine-tuned with the style embedding to improve both ASR and SER. The experimental results on the IEMOCAP dataset show that the proposed model achieves a word error rate of 15.8% and, for four-class emotion classification, weighted and unweighted accuracies of 75.1% and 76.3%, respectively. | -
dc.format.extent | 5 | -
dc.language | English | -
dc.language.iso | ENG | -
dc.title | Improving Joint Speech and Emotion Recognition Using Global Style Tokens | -
dc.type | Article | -
dc.identifier.doi | 10.21437/Interspeech.2023-2375 | -
dc.identifier.scopusid | 2-s2.0-85171588429 | -
dc.identifier.wosid | 001186650304138 | -
dc.identifier.bibliographicCitation | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v.2023, pp 4528 - 4532 | -
dc.citation.title | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | -
dc.citation.volume | 2023 | -
dc.citation.startPage | 4528 | -
dc.citation.endPage | 4532 | -
dc.type.docType | Proceedings Paper | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Acoustics | -
dc.relation.journalResearchArea | Audiology & Speech-Language Pathology | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalWebOfScienceCategory | Acoustics | -
dc.relation.journalWebOfScienceCategory | Audiology & Speech-Language Pathology | -
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | -
dc.relation.journalWebOfScienceCategory | Computer Science, Software Engineering | -
dc.subject.keywordPlus | Character recognition | -
dc.subject.keywordPlus | Classification (of information) | -
dc.subject.keywordPlus | Continuous speech recognition | -
dc.subject.keywordPlus | Embeddings | -
dc.subject.keywordPlus | Emotion Recognition | -
dc.subject.keywordPlus | Speech communication | -
dc.subject.keywordPlus | Acoustic features | -
dc.subject.keywordPlus | Automatic speech recognition | -
dc.subject.keywordPlus | Embeddings | -
dc.subject.keywordPlus | Emotion recognition | -
dc.subject.keywordPlus | Emotional state | -
dc.subject.keywordPlus | Global style token | -
dc.subject.keywordPlus | Performance | -
dc.subject.keywordPlus | Recognition models | -
dc.subject.keywordPlus | Speech emotion recognition | -
dc.subject.keywordPlus | Training methods | -
dc.subject.keywordPlus | Large dataset | -
dc.subject.keywordAuthor | automatic speech recognition | -
dc.subject.keywordAuthor | global style tokens | -
dc.subject.keywordAuthor | speech emotion recognition | -
dc.identifier.url | https://www.isca-speech.org/archive/interspeech_2023/kyung23_interspeech.html | -
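
Note on the reported metrics: the abstract gives a word error rate and weighted and unweighted accuracies without defining them. The Python sketch below shows how these three figures are commonly computed. It assumes the usual conventions (word error rate as word-level edit distance divided by reference length, weighted accuracy as overall accuracy, unweighted accuracy as the mean of per-class recalls). The record does not confirm that the authors used exactly these definitions, and the function names and toy labels are illustrative only.

```python
from collections import defaultdict


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the processed ref prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1] / len(ref)


def weighted_accuracy(y_true, y_pred) -> float:
    """Overall accuracy: every utterance counts equally."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def unweighted_accuracy(y_true, y_pred) -> float:
    """Mean of per-class recalls: every emotion class counts equally."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy data (hypothetical, not taken from IEMOCAP)
wer = word_error_rate("the cat sat down", "the cat sit")
labels = ["ang", "ang", "ang", "hap"]
preds = ["ang", "ang", "hap", "hap"]
wa = weighted_accuracy(labels, preds)
ua = unweighted_accuracy(labels, preds)
```

On imbalanced classes the two accuracies differ, and the unweighted figure can exceed the weighted one, as it does in the abstract: in the toy data above the rarer class is always recognized correctly, which lifts the per-class average above the overall accuracy.
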
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles
