Intra-ensemble: A New Method for Combining Intermediate Outputs in Transformer-based Automatic Speech Recognition
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, DoHee | - |
dc.contributor.author | Choi, Jieun | - |
dc.contributor.author | Chang, Joon-Hyuk | - |
dc.date.accessioned | 2023-10-10T02:36:06Z | - |
dc.date.available | 2023-10-10T02:36:06Z | - |
dc.date.created | 2023-10-04 | - |
dc.date.issued | 2023-08 | - |
dc.identifier.issn | 2308-457X | - |
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/191795 | - |
dc.description.abstract | Deep learning models employ various regularization techniques to prevent overfitting and enhance generalization. In particular, the auxiliary loss proposed for connectionist temporal classification (CTC) models showed that intermediate predictions can be useful by enabling sub-models to recognize speech accurately. We propose a new method, Intra-ensemble, which combines these accurate intermediate outputs into a single output for both training and inference, weighting the importance of each intermediate layer with learnable parameters. Our approach is applicable to CTC models, attention-based encoder-decoder models, and transducer structures, and demonstrated performance improvements of 13.5%, 3.0%, and 4.1%, respectively, on the LibriSpeech evaluation. Furthermore, through various analytical experiments, we found that the sub-models contributed significantly to the performance improvement. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | International Speech Communication Association | - |
dc.title | Intra-ensemble: A New Method for Combining Intermediate Outputs in Transformer-based Automatic Speech Recognition | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Chang, Joon-Hyuk | - |
dc.identifier.doi | 10.21437/Interspeech.2023-1255 | - |
dc.identifier.scopusid | 2-s2.0-85171529529 | - |
dc.identifier.bibliographicCitation | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v.2023-August, pp.2203 - 2207 | - |
dc.relation.isPartOf | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | - |
dc.citation.title | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | - |
dc.citation.volume | 2023-August | - |
dc.citation.startPage | 2203 | - |
dc.citation.endPage | 2207 | - |
dc.type.rims | ART | - |
dc.type.docType | Conference paper | - |
dc.description.journalClass | 1 | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scopus | - |
dc.subject.keywordPlus | Deep learning | - |
dc.subject.keywordPlus | Speech communication | - |
dc.subject.keywordPlus | Automatic speech recognition | - |
dc.subject.keywordPlus | Classification models | - |
dc.subject.keywordPlus | Ensemble | - |
dc.subject.keywordPlus | Generalisation | - |
dc.subject.keywordPlus | Learning models | - |
dc.subject.keywordPlus | Overfitting | - |
dc.subject.keywordPlus | Performance | - |
dc.subject.keywordPlus | Regularization technique | - |
dc.subject.keywordPlus | Submodels | - |
dc.subject.keywordPlus | Temporal classification | - |
dc.subject.keywordPlus | Speech recognition | - |
dc.subject.keywordAuthor | ensemble | - |
dc.subject.keywordAuthor | speech recognition | - |
dc.identifier.url | https://www.isca-speech.org/archive/interspeech_2023/kim23e_interspeech.html | - |
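
The abstract describes combining intermediate layer outputs into a single output, with the importance of each intermediate layer controlled by learnable parameters. A minimal sketch of that idea follows; the function name `intra_ensemble`, the parameter name `alpha`, and the softmax normalization of the layer weights are illustrative assumptions, not details confirmed by the paper.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def intra_ensemble(intermediate_logits, alpha):
    """Combine per-layer outputs with learnable importance weights.

    intermediate_logits: list of (T, V) arrays, one per intermediate layer
                         (T frames, V vocabulary entries).
    alpha: (num_layers,) learnable parameters (hypothetical name).
    """
    w = softmax(alpha)                       # normalize so layer weights sum to 1
    stacked = np.stack(intermediate_logits)  # (L, T, V)
    return np.tensordot(w, stacked, axes=1)  # weighted sum over layers -> (T, V)

# Toy example: 3 intermediate layers, 5 frames, vocabulary of 4.
rng = np.random.default_rng(0)
logits = [rng.standard_normal((5, 4)) for _ in range(3)]
alpha = np.zeros(3)  # equal weights, e.g. before any training
combined = intra_ensemble(logits, alpha)
print(combined.shape)  # (5, 4)
```

With `alpha` at zero the combination reduces to a plain average of the intermediate outputs; during training, gradient updates to `alpha` would let the model emphasize the more accurate intermediate layers.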