Detailed Information

FiLM Conditioning with Enhanced Feature to the Transformer-based End-to-End Noisy Speech Recognition

Full metadata record
DC Field | Value | Language
dc.contributor.author | Yang, Da-Hee | -
dc.contributor.author | Chang, Joon-Hyuk | -
dc.date.accessioned | 2022-12-20T06:24:43Z | -
dc.date.available | 2022-12-20T06:24:43Z | -
dc.date.created | 2022-11-02 | -
dc.date.issued | 2022-09 | -
dc.identifier.issn | 2308-457X | -
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/173084 | -
dc.description.abstract | Ensuring robustness against environmental noise is an important concern in the design of automatic speech recognition (ASR) systems. This is typically achieved by utilizing a speech enhancement (SE) network in an ASR system to boost noise robustness. The performance of ASR systems can be improved using SE networks as a front-end or by retraining the ASR system on enhanced speech. Although the SE network is effective, it does not always result in improved performance in the ASR system owing to artifacts. To address this problem, we propose the use of enhanced speech from an SE network as a conditioning feature instead of a direct input feature of the ASR system. This is achieved by stacking a feature-wise linear modulation (FiLM) layer on each transformer layer of the end-to-end ASR encoder and combining the input and conditioning features. The results indicate that the proposed FiLM training method exhibits greater robustness against noise owing to the use of enhanced speech as conditioning information rather than as direct ASR input. | -
dc.language | English | -
dc.language.iso | en | -
dc.publisher | International Speech Communication Association | -
dc.title | FiLM Conditioning with Enhanced Feature to the Transformer-based End-to-End Noisy Speech Recognition | -
dc.type | Article | -
dc.contributor.affiliatedAuthor | Chang, Joon-Hyuk | -
dc.identifier.doi | 10.21437/Interspeech.2022-161 | -
dc.identifier.scopusid | 2-s2.0-85140067940 | -
dc.identifier.wosid | 000900724504056 | -
dc.identifier.bibliographicCitation | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v.2022-September, pp.4098 - 4102 | -
dc.relation.isPartOf | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | -
dc.citation.title | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | -
dc.citation.volume | 2022-September | -
dc.citation.startPage | 4098 | -
dc.citation.endPage | 4102 | -
dc.type.rims | ART | -
dc.type.docType | Proceedings Paper | -
dc.description.journalClass | 1 | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Acoustics | -
dc.relation.journalResearchArea | Audiology & Speech-Language Pathology | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalResearchArea | Engineering | -
dc.relation.journalWebOfScienceCategory | Acoustics | -
dc.relation.journalWebOfScienceCategory | Audiology & Speech-Language Pathology | -
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | -
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | -
dc.subject.keywordPlus | Modulation | -
dc.subject.keywordPlus | Speech communication | -
dc.subject.keywordPlus | Speech recognition | -
dc.subject.keywordPlus | Speech enhancement | -
dc.subject.keywordPlus | Automatic speech recognition | -
dc.subject.keywordPlus | Automatic speech recognition system | -
dc.subject.keywordPlus | End to end | -
dc.subject.keywordPlus | End-to-end speech recognition | -
dc.subject.keywordPlus | Enhancement | -
dc.subject.keywordPlus | Feature-wise linear modulation | -
dc.subject.keywordPlus | Linear modulations | -
dc.subject.keywordPlus | Noisy speech recognition | -
dc.subject.keywordPlus | Performance | -
dc.subject.keywordPlus | Robust speech recognition | -
dc.subject.keywordAuthor | end-to-end speech recognition | -
dc.subject.keywordAuthor | enhancement | -
dc.subject.keywordAuthor | feature-wise linear modulation | -
dc.subject.keywordAuthor | robust speech recognition | -
dc.identifier.url | https://www.isca-speech.org/archive/interspeech_2022/yang22b_interspeech.html | -
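
Illustrative note on the abstract: the abstract describes using enhanced speech as a conditioning feature through a feature-wise linear modulation (FiLM) layer stacked on each transformer layer of the ASR encoder. The following is a minimal sketch of that idea in plain Python, not the authors' implementation. The names `FiLM`, `linear`, `init_linear`, and `film_encoder`, the use of per-frame linear projections to produce the scale (gamma) and shift (beta), and the choice to apply the modulation after each encoder layer are all assumptions made for illustration; this record does not specify those details.

```python
import random


def linear(vec, weight, bias):
    """Affine map of one frame; weight is a list of rows (out_dim x in_dim)."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weight, bias)]


def init_linear(in_dim, out_dim, rng):
    """Small random weights and zero bias (illustrative initialisation only)."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
              for _ in range(out_dim)]
    return weight, [0.0] * out_dim


class FiLM:
    """Feature-wise linear modulation: out = gamma(c) * x + beta(c).

    x is a frame of encoder features, c is the matching frame of
    conditioning features (here: features of the enhanced speech).
    """

    def __init__(self, cond_dim, feat_dim, rng):
        # Hypothetical parameterisation: one linear projection each
        # for the scale (gamma) and the shift (beta).
        self.gamma_w, self.gamma_b = init_linear(cond_dim, feat_dim, rng)
        self.beta_w, self.beta_b = init_linear(cond_dim, feat_dim, rng)

    def __call__(self, frames, cond_frames):
        out = []
        for x, c in zip(frames, cond_frames):
            gamma = linear(c, self.gamma_w, self.gamma_b)
            beta = linear(c, self.beta_w, self.beta_b)
            out.append([g * xi + b for g, xi, b in zip(gamma, x, beta)])
        return out


def film_encoder(noisy, enhanced, film_layers, encoder_layers):
    """Run the noisy features through the encoder, modulating after each layer.

    Assumption: FiLM is applied to the output of each encoder layer; the
    encoder layers are passed in as plain callables standing in for
    transformer layers.
    """
    hidden = noisy
    for film, layer in zip(film_layers, encoder_layers):
        hidden = layer(hidden)
        hidden = film(hidden, enhanced)
    return hidden


if __name__ == "__main__":
    rng = random.Random(0)
    noisy = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2]]   # 2 frames, 3 features
    enhanced = [[0.1, 0.5], [0.3, -0.4]]           # 2 frames, 2 cond. features
    films = [FiLM(cond_dim=2, feat_dim=3, rng=rng) for _ in range(2)]
    identity_layers = [lambda h: h, lambda h: h]   # stand-ins for transformer layers
    print(film_encoder(noisy, enhanced, films, identity_layers))
```

In this sketch the noisy features remain the direct encoder input, while the enhanced features only scale and shift the hidden representation, which mirrors the distinction the abstract draws between conditioning information and direct ASR input.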
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Chang, Joon-Hyuk
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)