UNDERSTANDING THE ROLE OF SELF ATTENTION FOR EFFICIENT SPEECH RECOGNITION

Full metadata record
DC Field: Value
dc.contributor.author: Shim, Kyuhong
dc.contributor.author: Choi, Jung wook
dc.contributor.author: Sung, Wonyong
dc.date.accessioned: 2023-05-03T09:39:50Z
dc.date.available: 2023-05-03T09:39:50Z
dc.date.created: 2023-04-06
dc.date.issued: 2022-04
dc.identifier.uri: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/184848
dc.description.abstract: Self-attention (SA) is a critical component of Transformer neural networks that have succeeded in automatic speech recognition (ASR). In this paper, we analyze the role of SA in Transformer-based ASR models for not only understanding the mechanism of improved recognition accuracy but also lowering the computational complexity. We reveal that SA performs two distinct roles: phonetic and linguistic localization. Especially, we show by experiments that phonetic localization in the lower layers extracts phonologically meaningful features from speech and reduces the phonetic variance in the utterance for proper linguistic localization in the upper layers. From this understanding, we discover that attention maps can be reused as long as their localization capability is preserved. To evaluate this idea, we implement the layer-wise attention map reuse on real GPU platforms and achieve up to 1.96 times speedup in inference and 33% savings in training time with noticeably improved ASR performance for the challenging benchmark on LibriSpeech dev/test-other dataset.
dc.language: English
dc.language.iso: en
dc.publisher: International Conference on Learning Representations, ICLR
dc.title: UNDERSTANDING THE ROLE OF SELF ATTENTION FOR EFFICIENT SPEECH RECOGNITION
dc.type: Article
dc.contributor.affiliatedAuthor: Choi, Jung wook
dc.identifier.scopusid: 2-s2.0-85150355364
dc.identifier.bibliographicCitation: ICLR 2022 - 10th International Conference on Learning Representations, pp.1 - 19
dc.relation.isPartOf: ICLR 2022 - 10th International Conference on Learning Representations
dc.citation.title: ICLR 2022 - 10th International Conference on Learning Representations
dc.citation.startPage: 1
dc.citation.endPage: 19
dc.type.rims: ART
dc.type.docType: Conference Paper
dc.description.journalClass: 1
dc.description.isOpenAccess: N
dc.description.journalRegisteredClass: scopus
dc.subject.keywordPlus: Benchmarking
dc.subject.keywordPlus: Speech recognition
dc.subject.keywordPlus: Linguistics
dc.subject.keywordPlus: Automatic speech recognition
dc.subject.keywordPlus: Critical component
dc.subject.keywordPlus: Layer-wise
dc.subject.keywordPlus: Localisation
dc.subject.keywordPlus: Neural-networks
dc.subject.keywordPlus: Recognition accuracy
dc.subject.keywordPlus: Recognition models
dc.subject.keywordPlus: Reuse
dc.subject.keywordPlus: Training time
dc.subject.keywordPlus: Upper layer
dc.identifier.url: https://openreview.net/forum?id=AvcfxqRy4Y
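
The abstract's idea of layer-wise attention map reuse can be illustrated with a small sketch. This is not the authors' implementation: it is a single-head, pure-Python toy with no feed-forward blocks or normalization, and the names `attention_map`, `run_stack`, and `reuse_group` are hypothetical. It assumes that the first layer of each group computes an attention map from its queries and keys, and that the remaining layers in the group skip that computation and apply the stored map to their own values.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def attention_map(x, w_q, w_k):
    """Scaled dot-product attention map: softmax(Q K^T / sqrt(d))."""
    q, k = matmul(x, w_q), matmul(x, w_k)
    scale = 1.0 / math.sqrt(len(w_q[0]))
    scores = [[s * scale for s in row] for row in matmul(q, transpose(k))]
    return softmax_rows(scores)


def run_stack(x, layers, reuse_group=2):
    """Run a toy attention stack, recomputing the map once per group of layers.

    Assumption (hypothetical grouping scheme): layer i computes a fresh map only
    when i is a multiple of `reuse_group`; other layers reuse the stored map.
    """
    maps_computed = 0
    attn = None
    for i, layer in enumerate(layers):
        if i % reuse_group == 0:
            attn = attention_map(x, layer["w_q"], layer["w_k"])
            maps_computed += 1
        v = matmul(x, layer["w_v"])       # every layer still has its own values
        context = matmul(attn, v)         # reused or fresh map applied to values
        x = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(x, context)]  # residual
    return x, maps_computed


rng = random.Random(0)
frames, dim, num_layers = 6, 8, 4
layers = [
    {name: random_matrix(dim, dim, rng) for name in ("w_q", "w_k", "w_v")}
    for _ in range(num_layers)
]
x = random_matrix(frames, dim, rng)
out, maps_computed = run_stack(x, layers, reuse_group=2)
print(maps_computed, len(out), len(out[0]))
```

With four layers and `reuse_group=2`, only two attention maps are computed. The saving in this toy comes from skipping the query and key projections and the softmax in the reusing layers; the speedup figures in the abstract come from the authors' GPU implementation and are not reproduced here.
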
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Choi, Jung wook
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)