Deep neural network calibration for e2e speech recognition system

Lee, Mun-Hak; Chang, Joon-Hyuk

doi:10.21437/Interspeech.2021-176

Cited 0 times in Web of Science

Cited 1 time in Scopus

Full metadata record

DC Field	Value	Language
dc.contributor.author	Lee, Mun-Hak	-
dc.contributor.author	Chang, Joon-Hyuk	-
dc.date.accessioned	2022-07-06T14:45:30Z	-
dc.date.available	2022-07-06T14:45:30Z	-
dc.date.created	2021-12-08	-
dc.date.issued	2021-08	-
dc.identifier.issn	2308-457X	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/141231	-
dc.description.abstract	Cross-entropy loss, which is commonly used to train deep neural network (DNN)-based classification models, induces models to assign a high probability to a single class. Networks trained in this fashion tend to be overconfident, which causes a problem in the decoding process of a speech recognition system, because decoding uses the combined probability distribution of multiple independently trained networks. Overconfidence in neural networks can be quantified as a calibration error, which is the difference between the output probability of a model and the likelihood that its prediction is actually correct. We show that the deep-learning-based components of an end-to-end (E2E) speech recognition system with high classification accuracy contain calibration errors, and we quantify them using various calibration measures. In addition, we show experimentally that a calibration function trained to minimize calibration error effectively mitigates the calibration errors of the speech recognition system and, as a result, can improve the performance of beam search during decoding.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	International Speech Communication Association	-
dc.title	Deep neural network calibration for e2e speech recognition system	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	Chang, Joon-Hyuk	-
dc.identifier.doi	10.21437/Interspeech.2021-176	-
dc.identifier.scopusid	2-s2.0-85119203334	-
dc.identifier.bibliographicCitation	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v.3, pp.4064 - 4068	-
dc.relation.isPartOf	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH	-
dc.citation.title	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH	-
dc.citation.volume	3	-
dc.citation.startPage	4064	-
dc.citation.endPage	4068	-
dc.type.rims	ART	-
dc.type.docType	Conference Paper	-
dc.description.journalClass	1	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordPlus	Calibration	-
dc.subject.keywordPlus	Decoding	-
dc.subject.keywordPlus	Errors	-
dc.subject.keywordPlus	Probability distributions	-
dc.subject.keywordPlus	Speech	-
dc.subject.keywordPlus	Speech communication	-
dc.subject.keywordPlus	Speech recognition	-
dc.subject.keywordAuthor	E2E speech recognition	-
dc.subject.keywordAuthor	deep neural network calibration	-
dc.identifier.url	https://www.isca-speech.org/archive/interspeech_2021/lee21f_interspeech.html	-
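
The abstract defines calibration error as the gap between a model's output probability and how often it is actually correct. A minimal sketch of one common way to measure that gap, the expected calibration error (ECE), is given below, together with temperature scaling as one simple example of a calibration function. This is an illustration only: the record does not say which calibration measures or calibration functions the paper uses, and the function names, the bin count, the temperature value, and the toy logits are all assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits; temperature > 1 flattens the output."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin-weighted mean of |accuracy - mean confidence| over equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / n) * abs(accuracy - avg_conf)
    return ece


def top1_confidences(logit_rows, labels, temperature=1.0):
    """Return the top-1 probability and a correctness flag for each example."""
    confs, correct = [], []
    for logits, label in zip(logit_rows, labels):
        probs = softmax(logits, temperature)
        pred = max(range(len(probs)), key=probs.__getitem__)
        confs.append(probs[pred])
        correct.append(pred == label)
    return confs, correct


# Toy data (hypothetical): four confident predictions of class 0, one of them wrong.
toy_logits = [[4.0, 0.0]] * 4
toy_labels = [0, 0, 0, 1]

ece_raw = expected_calibration_error(*top1_confidences(toy_logits, toy_labels))
# The temperature 4.0 is picked by hand here; in practice it would be fitted on held-out data.
ece_scaled = expected_calibration_error(
    *top1_confidences(toy_logits, toy_labels, temperature=4.0)
)
print(f"ECE before scaling: {ece_raw:.3f}, after scaling: {ece_scaled:.3f}")
```

In this toy case the model is correct on three of four examples but assigns each prediction a probability well above 0.75, so dividing the logits by a temperature greater than 1 moves the confidence closer to the observed accuracy and lowers the ECE. Temperature scaling does not change which class is ranked first, so classification accuracy is unaffected.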

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Chang, Joon-Hyuk: COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)