Proper Error Estimation and Calibration for Attention-Based Encoder-Decoder Models

Full metadata record
DC Field | Value | Language
dc.contributor.author | Lee, Mun-Hak | -
dc.contributor.author | Chang, Joon-Hyuk | -
dc.date.accessioned | 2024-12-06T05:30:18Z | -
dc.date.available | 2024-12-06T05:30:18Z | -
dc.date.issued | 2024-11 | -
dc.identifier.issn | 2329-9290 | -
dc.identifier.issn | 2329-9304 | -
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/202075 | -
dc.description.abstract | An attention-based automatic speech recognition (ASR) model generates a probability distribution over the token set at each time step. Recent studies have shown that calibration errors exist in the output probability distributions of attention-based ASR models trained to minimize the negative log-likelihood. This study analyzes the causes of calibration errors in ASR model outputs and their impact on model performance. Based on the analysis, we argue that conventional methods for estimating calibration errors at the token level are unsuitable for ASR tasks. Accordingly, we propose a new calibration measure that estimates the calibration error at the sequence level. Moreover, we present a new post-hoc calibration function and training objective to mitigate the calibration error of the ASR model at the sequence level. Through experiments using the ASR benchmark, we show that the proposed methods effectively alleviate the calibration error of the ASR model and improve the generalization performance. | -
dc.format.extent | 12 | -
dc.language | English | -
dc.language.iso | ENG | -
dc.publisher | IEEE Advancing Technology for Humanity | -
dc.title | Proper Error Estimation and Calibration for Attention-Based Encoder-Decoder Models | -
dc.type | Article | -
dc.publisher.location | United States | -
dc.identifier.doi | 10.1109/TASLP.2024.3492799 | -
dc.identifier.scopusid | 2-s2.0-85209104995 | -
dc.identifier.wosid | 001361960400006 | -
dc.identifier.bibliographicCitation | IEEE/ACM Transactions on Audio, Speech, and Language Processing, v.32, pp. 4919-4930 | -
dc.citation.title | IEEE/ACM Transactions on Audio, Speech, and Language Processing | -
dc.citation.volume | 32 | -
dc.citation.startPage | 4919 | -
dc.citation.endPage | 4930 | -
dc.type.docType | Article | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Acoustics | -
dc.relation.journalResearchArea | Engineering | -
dc.relation.journalWebOfScienceCategory | Acoustics | -
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | -
dc.subject.keywordPlus | Encoding (symbols) | -
dc.subject.keywordPlus | Signal encoding | -
dc.subject.keywordPlus | Speech recognition | -
dc.subject.keywordAuthor | Calibration | -
dc.subject.keywordAuthor | Probability distribution | -
dc.subject.keywordAuthor | Decoding | -
dc.subject.keywordAuthor | Accuracy | -
dc.subject.keywordAuthor | Training | -
dc.subject.keywordAuthor | Analytical models | -
dc.subject.keywordAuthor | Measurement uncertainty | -
dc.subject.keywordAuthor | Error analysis | -
dc.subject.keywordAuthor | Data models | -
dc.subject.keywordAuthor | Speech processing | -
dc.subject.keywordAuthor | Speech recognition | -
dc.subject.keywordAuthor | calibration | -
dc.subject.keywordAuthor | post-hoc calibration methods | -
dc.subject.keywordAuthor | attention-based encoder-decoder | -
dc.subject.keywordAuthor | sequence-level training | -
dc.identifier.url | https://ieeexplore.ieee.org/document/10745647 | -
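
Illustrative note on the abstract: the record does not define the proposed sequence-level measure, so the following is only a minimal sketch of the general idea, not the authors' method. It applies the standard binned expected calibration error (ECE) from the calibration literature either to individual tokens or to whole hypotheses. The function names, the use of the product of token probabilities as the sequence confidence, and the "every token correct" criterion for sequence correctness are all assumptions made for illustration.

```python
import math


def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE: sample-weighted mean of |accuracy - confidence| per bin."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, is_correct in zip(confidences, correct):
        # Equal-width bins on [0, 1]; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, is_correct))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(y for _, y in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


def sequence_confidence(token_probs):
    """Assumed sequence confidence: product of token probabilities (each must be > 0)."""
    return math.exp(sum(math.log(p) for p in token_probs))


def token_level_ece(hypotheses, n_bins=10):
    """Pool every token from every hypothesis and score them individually."""
    confs = [p for probs, _ in hypotheses for p in probs]
    flags = [f for _, flags_ in hypotheses for f in flags_]
    return expected_calibration_error(confs, flags, n_bins)


def sequence_level_ece(hypotheses, n_bins=10):
    """Score each hypothesis once; it counts as correct only if all tokens are correct."""
    confs = [sequence_confidence(probs) for probs, _ in hypotheses]
    flags = [int(all(flags_)) for _, flags_ in hypotheses]
    return expected_calibration_error(confs, flags, n_bins)


# Hypothetical toy data: (token probabilities, per-token correctness flags).
hypotheses = [
    ([0.9, 0.8, 0.95], [1, 1, 1]),
    ([0.9, 0.6, 0.7], [1, 0, 1]),
]
print("token-level ECE:", token_level_ece(hypotheses))
print("sequence-level ECE:", sequence_level_ece(hypotheses))
```

The two numbers generally differ because a single wrong token makes the whole hypothesis wrong, which is one way to see why token-level and sequence-level estimates can disagree for a sequence task such as ASR.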
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Chang, Joon-Hyuk
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)