Deep neural network calibration for e2e speech recognition system
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Lee, Mun-Hak | - |
| dc.contributor.author | Chang, Joon-Hyuk | - |
| dc.date.accessioned | 2022-07-06T14:45:30Z | - |
| dc.date.available | 2022-07-06T14:45:30Z | - |
| dc.date.issued | 2021-08 | - |
| dc.identifier.issn | 2308-457X | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/141231 | - |
| dc.description.abstract | Cross-entropy loss, which is commonly used to train deep neural network (DNN)-based classification models, induces models to assign a high probability value to one class. Networks trained in this fashion tend to be overconfident, which causes a problem in the decoding process of a speech recognition system, because decoding uses the combined probability distribution of multiple independently trained networks. Overconfidence in neural networks can be quantified as a calibration error, which is the difference between the output probability of a model and the likelihood that its prediction is actually correct. We show that the deep-learning-based components of an end-to-end (E2E) speech recognition system with high classification accuracy contain calibration errors, and we quantify them using various calibration measures. In addition, we show experimentally that a calibration function trained to minimize calibration errors effectively mitigates those of the speech recognition system and, as a result, can improve the performance of beam search during decoding. | - |
| dc.format.extent | 5 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | International Speech Communication Association | - |
| dc.title | Deep neural network calibration for e2e speech recognition system | - |
| dc.type | Article | - |
| dc.publisher.location | France | - |
| dc.identifier.doi | 10.21437/Interspeech.2021-176 | - |
| dc.identifier.scopusid | 2-s2.0-85119203334 | - |
| dc.identifier.wosid | 000841879504031 | - |
| dc.identifier.bibliographicCitation | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v.3, pp 4064 - 4068 | - |
| dc.citation.title | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | - |
| dc.citation.volume | 3 | - |
| dc.citation.startPage | 4064 | - |
| dc.citation.endPage | 4068 | - |
| dc.type.docType | Proceedings Paper | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Audiology & Speech-Language Pathology | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalWebOfScienceCategory | Audiology & Speech-Language Pathology | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Software Engineering | - |
| dc.subject.keywordPlus | Calibration | - |
| dc.subject.keywordPlus | Decoding | - |
| dc.subject.keywordPlus | Errors | - |
| dc.subject.keywordPlus | Probability distributions | - |
| dc.subject.keywordPlus | Speech | - |
| dc.subject.keywordPlus | Speech communication | - |
| dc.subject.keywordPlus | Speech recognition | - |
| dc.subject.keywordAuthor | E2E speech recognition | - |
| dc.subject.keywordAuthor | deep neural network calibration | - |
| dc.identifier.url | https://www.isca-speech.org/archive/interspeech_2021/lee21f_interspeech.html | - |
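
The abstract defines calibration error as the gap between a model's output probability and the likelihood that its prediction is correct. One widely used estimate of that gap is the expected calibration error (ECE), which groups predictions into confidence bins and averages the difference between mean confidence and accuracy in each bin. This record does not say which calibration measures the paper uses, so the following is only an illustrative sketch; the function name `expected_calibration_error`, the equal-width binning, and the default of 10 bins are assumptions made for this example.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Estimate ECE using equal-width confidence bins (illustrative sketch)."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")

    # Per-bin running sums: [count, summed confidence, number correct].
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidences must lie in [0, 1]")
        # Map the confidence to a bin index; 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += int(hit)

    total = len(confidences)
    ece = 0.0
    for count, conf_sum, hits in bins:
        if count == 0:
            continue  # empty bins contribute nothing
        # Gap between mean confidence and accuracy, weighted by bin size.
        gap = abs(conf_sum / count - hits / count)
        ece += (count / total) * gap
    return ece
```

For example, a model that reports 0.9 confidence on four predictions but gets only two of them right is overconfident by about 0.4, and `expected_calibration_error([0.9] * 4, [True, True, False, False])` returns approximately that value.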
