FiLM Conditioning with Enhanced Feature to the Transformer-based End-to-End Noisy Speech Recognition

Yang, Da-Hee; Chang, Joon-Hyuk

doi:10.21437/Interspeech.2022-161

Full metadata record

DC Field	Value	Language
dc.contributor.author	Yang, Da-Hee	-
dc.contributor.author	Chang, Joon-Hyuk	-
dc.date.accessioned	2022-12-20T06:24:43Z	-
dc.date.available	2022-12-20T06:24:43Z	-
dc.date.issued	2022-09	-
dc.identifier.issn	1990-9772	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/173084	-
dc.description.abstract	Ensuring robustness against environmental noise is an important concern in the design of automatic speech recognition (ASR) systems. This is typically achieved by utilizing a speech enhancement (SE) network in an ASR system to boost noise robustness. The performance of ASR systems can be improved using SE networks as a front-end or by retraining the ASR system on enhanced speech. Although the SE network is effective, it does not always result in improved performance in the ASR system owing to artifacts. To address this problem, we propose the use of enhanced speech from an SE network as a conditioning feature instead of a direct input feature of the ASR system. This is achieved by stacking a feature-wise linear modulation (FiLM) layer on each transformer layer of the end-to-end ASR encoder and combining the input and conditioning features. The results indicate that the proposed FiLM training method exhibits greater robustness against noise owing to the use of enhanced speech as conditioning information rather than as direct ASR input.	-
dc.format.extent	5	-
dc.language	English	-
dc.language.iso	ENG	-
dc.title	FiLM Conditioning with Enhanced Feature to the Transformer-based End-to-End Noisy Speech Recognition	-
dc.type	Article	-
dc.identifier.doi	10.21437/Interspeech.2022-161	-
dc.identifier.scopusid	2-s2.0-85140067940	-
dc.identifier.wosid	000900724504056	-
dc.identifier.bibliographicCitation	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v.2022-September, pp 4098 - 4102	-
dc.citation.title	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH	-
dc.citation.volume	2022-September	-
dc.citation.startPage	4098	-
dc.citation.endPage	4102	-
dc.type.docType	Proceedings Paper	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Acoustics	-
dc.relation.journalResearchArea	Audiology & Speech-Language Pathology	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalWebOfScienceCategory	Acoustics	-
dc.relation.journalWebOfScienceCategory	Audiology & Speech-Language Pathology	-
dc.relation.journalWebOfScienceCategory	Computer Science, Artificial Intelligence	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.subject.keywordPlus	Modulation	-
dc.subject.keywordPlus	Speech communication	-
dc.subject.keywordPlus	Speech recognition	-
dc.subject.keywordPlus	Speech enhancement	-
dc.subject.keywordPlus	Automatic speech recognition	-
dc.subject.keywordPlus	Automatic speech recognition system	-
dc.subject.keywordPlus	End to end	-
dc.subject.keywordPlus	End-to-end speech recognition	-
dc.subject.keywordPlus	Enhancement	-
dc.subject.keywordPlus	Feature-wise linear modulation	-
dc.subject.keywordPlus	Linear modulations	-
dc.subject.keywordPlus	Noisy speech recognition	-
dc.subject.keywordPlus	Performance	-
dc.subject.keywordPlus	Robust speech recognition	-
dc.subject.keywordAuthor	end-to-end speech recognition	-
dc.subject.keywordAuthor	enhancement	-
dc.subject.keywordAuthor	feature-wise linear modulation	-
dc.subject.keywordAuthor	robust speech recognition	-
dc.identifier.url	https://www.isca-speech.org/archive/interspeech_2022/yang22b_interspeech.html	-
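
The abstract describes stacking a feature-wise linear modulation (FiLM) layer on each transformer layer of the ASR encoder, with enhanced speech as the conditioning feature. The following is a minimal pure-Python sketch of that idea, not the authors' implementation. It assumes the common FiLM form gamma(c) * h + beta(c), with gamma and beta produced by per-frame affine maps of the conditioning feature. The names `linear`, `film`, and `encode`, the parameter layout, and the choice to apply FiLM after each layer are all illustrative assumptions; the record does not specify them.

```python
def linear(x, weight, bias):
    """Affine map y = W x + b for a single frame (x is a list of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def film(hidden, cond, w_gamma, b_gamma, w_beta, b_beta):
    """Feature-wise linear modulation, applied frame by frame.

    hidden: T frames of encoder features (size D each), from the noisy input
    cond:   T frames of conditioning features (size C each), assumed here
            to come from the enhanced speech
    Returns T frames of size D: gamma(cond) * hidden + beta(cond).
    """
    out = []
    for h_t, c_t in zip(hidden, cond):
        gamma = linear(c_t, w_gamma, b_gamma)  # per-feature scale
        beta = linear(c_t, w_beta, b_beta)     # per-feature shift
        out.append([g * h + b for g, h, b in zip(gamma, h_t, beta)])
    return out


def encode(noisy, enhanced, layers, film_params):
    """Run encoder layers, modulating each layer's output with FiLM.

    layers:      list of callables standing in for transformer layers
    film_params: one (w_gamma, b_gamma, w_beta, b_beta) tuple per layer
    Placing FiLM after each layer is an assumption of this sketch.
    """
    h = noisy
    for layer, params in zip(layers, film_params):
        h = layer(h)
        h = film(h, enhanced, *params)
    return h


if __name__ == "__main__":
    # Toy example: 1 frame, D = 2 encoder features, C = 1 conditioning feature.
    noisy = [[1.0, 2.0]]
    enhanced = [[3.0]]
    params = ([[2.0], [0.0]], [0.0, 1.0],   # w_gamma, b_gamma
              [[0.0], [1.0]], [0.5, 0.0])   # w_beta, b_beta
    identity_layer = lambda frames: frames  # placeholder for a real layer
    print(encode(noisy, enhanced, [identity_layer], [params]))
```

In this form the noisy features remain the direct input to the encoder, while the enhanced features only scale and shift them. This matches the abstract's point that enhanced speech is used as conditioning information rather than as direct ASR input.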

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Chang, Joon-Hyuk: COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)
