Cited 0 times in Web of Science; cited 1 time in Scopus
Deep neural network calibration for e2e speech recognition system

Full metadata record
DC Field | Value | Language
dc.contributor.author | Lee, Mun-Hak | -
dc.contributor.author | Chang, Joon-Hyuk | -
dc.date.accessioned | 2022-07-06T14:45:30Z | -
dc.date.available | 2022-07-06T14:45:30Z | -
dc.date.created | 2021-12-08 | -
dc.date.issued | 2021-08 | -
dc.identifier.issn | 2308-457X | -
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/141231 | -
dc.description.abstract | Cross-entropy loss, which is commonly used to train deep neural network (DNN)-based classification models, induces models to assign a high probability value to one class. Networks trained in this fashion tend to be overconfident, which causes a problem in the decoding process of a speech recognition system, as decoding uses the combined probability distribution of multiple independently trained networks. Overconfidence in neural networks can be quantified as a calibration error, which is the difference between the output probability of a model and the likelihood of obtaining an actual correct answer. We show that the deep-learning-based components of an end-to-end (E2E) speech recognition system with high classification accuracy contain calibration errors and quantify them using various calibration measures. In addition, we show experimentally that a calibration function trained to minimize calibration errors effectively mitigates those of the speech recognition system and, as a result, can improve the performance of beam search during decoding. | -
dc.language | English | -
dc.language.iso | en | -
dc.publisher | International Speech Communication Association | -
dc.title | Deep neural network calibration for e2e speech recognition system | -
dc.type | Article | -
dc.contributor.affiliatedAuthor | Chang, Joon-Hyuk | -
dc.identifier.doi | 10.21437/Interspeech.2021-176 | -
dc.identifier.scopusid | 2-s2.0-85119203334 | -
dc.identifier.bibliographicCitation | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v.3, pp.4064 - 4068 | -
dc.relation.isPartOf | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | -
dc.citation.title | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | -
dc.citation.volume | 3 | -
dc.citation.startPage | 4064 | -
dc.citation.endPage | 4068 | -
dc.type.rims | ART | -
dc.type.docType | Conference Paper | -
dc.description.journalClass | 1 | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scopus | -
dc.subject.keywordPlus | Calibration | -
dc.subject.keywordPlus | Decoding | -
dc.subject.keywordPlus | Errors | -
dc.subject.keywordPlus | Probability distributions | -
dc.subject.keywordPlus | Speech | -
dc.subject.keywordPlus | Speech communication | -
dc.subject.keywordPlus | Speech recognition | -
dc.subject.keywordAuthor | E2E speech recognition | -
dc.subject.keywordAuthor | deep neural network calibration | -
dc.identifier.url | https://www.isca-speech.org/archive/interspeech_2021/lee21f_interspeech.html | -
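
Illustrative note on the abstract: calibration error is described there as the difference between a model's output probability and the likelihood that its answer is actually correct. One widely used way to estimate such a gap is the expected calibration error (ECE), which bins predictions by confidence and averages the per-bin difference between accuracy and mean confidence. The sketch below is an assumption for illustration only; this record does not state which calibration measures the paper uses, and the function name, the equal-width binning, and the default of 10 bins are all hypothetical choices.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Estimate ECE with equal-width confidence bins.

    confidences: top-class probabilities in [0, 1], one per prediction.
    correct:     1/True if the prediction was right, 0/False otherwise.
    n_bins:      number of equal-width bins (an illustrative default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins  # summed confidence per bin
    hit_sum = [0] * n_bins     # number of correct predictions per bin
    count = [0] * n_bins       # number of predictions per bin

    for p, hit in zip(confidences, correct):
        # Map p to a bin index; p == 1.0 goes into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        conf_sum[idx] += p
        hit_sum[idx] += 1 if hit else 0
        count[idx] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue
        avg_conf = conf_sum[b] / count[b]
        accuracy = hit_sum[b] / count[b]
        # Weight each bin's gap by the fraction of predictions it holds.
        ece += (count[b] / n) * abs(accuracy - avg_conf)
    return ece
```

Under this definition, an overconfident model (for example, 90% confidence but only 50% accuracy) yields a large ECE, while a model whose confidence matches its accuracy yields a value near zero.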
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles