Detailed Information

Selective Film Conditioning with CTC-Based ASR Probability for Speech Enhancement

Full metadata record
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | 양다희 | - |
| dc.contributor.author | Chang, Joon-Hyuk | - |
| dc.date.accessioned | 2024-11-28T10:31:10Z | - |
| dc.date.available | 2024-11-28T10:31:10Z | - |
| dc.date.issued | 2023-06 | - |
| dc.identifier.issn | 0736-7791 | - |
| dc.identifier.issn | 1520-6149 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/196127 | - |
| dc.description.abstract | Enhancing speech quality and intelligibility for automatic speech recognition (ASR) plays an important role in modeling speech enhancement (SE) systems. However, improving the ASR performance by utilizing SE networks is not guaranteed, owing to the discrepancy in the training methods of the two systems. Therefore, recent studies have gradually incorporated ASR information into SE systems by jointly training ASR and SE systems. Although prior studies have improved the performance, they are inefficient because the two networks are combined and require large model sizes. To address this limitation, we propose an efficient way to use feature-wise linear modulation (FiLM) conditioning with CTC-based ASR probabilities for the SE system. The proposed model is designed by stacking a FiLM layer with selective learning on each temporal convolutional network of the SE estimation module. This allows the SE network to adaptively select ASR information based on the relationship between context and acoustic information. The proposed method improves SE and ASR performance, resulting in more robust results against noise with only a small increase in the number of parameters. | - |
| dc.format.extent | 5 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
| dc.title | Selective Film Conditioning with CTC-Based ASR Probability for Speech Enhancement | - |
| dc.type | Article | - |
| dc.publisher.location | United States | - |
| dc.identifier.doi | 10.1109/ICASSP49357.2023.10096375 | - |
| dc.identifier.scopusid | 2-s2.0-85180535669 | - |
| dc.identifier.bibliographicCitation | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pp 1 - 5 | - |
| dc.citation.title | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | - |
| dc.citation.startPage | 1 | - |
| dc.citation.endPage | 5 | - |
| dc.type.docType | Conference paper | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.subject.keywordPlus | Automatic speech recognition | - |
| dc.subject.keywordPlus | Feature-wise linear modulation conditioning | - |
| dc.subject.keywordPlus | Frame-wise CTC-based posterior probability | - |
| dc.subject.keywordPlus | Linear modulations | - |
| dc.subject.keywordPlus | Posterior probability | - |
| dc.subject.keywordPlus | Speech enhancement system | - |
| dc.subject.keywordPlus | Speech quality | - |
| dc.subject.keywordPlus | Speech recognition performance | - |
| dc.subject.keywordPlus | Speech recognition probability | - |
| dc.subject.keywordPlus | Training methods | - |
| dc.subject.keywordAuthor | FiLM conditioning | - |
| dc.subject.keywordAuthor | frame-wise CTC-based posterior probability | - |
| dc.subject.keywordAuthor | speech enhancement | - |
| dc.identifier.url | https://ieeexplore.ieee.org/document/10096375 | - |
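
The abstract in the record above describes FiLM conditioning driven by frame-wise CTC-based ASR posteriors, with a selective step that lets the SE network decide how much ASR information to use. The record gives no equations, so the following is only a minimal pure-Python sketch of the general idea for a single frame. The FiLM part is the standard scale-and-shift form (`gamma * x + beta`). The gating rule in `selective_film`, the parameter names, the layer sizes, and the random initialization are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def linear(weights, bias, vec):
    """Dense layer: `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def film(feature, cond, params):
    """Standard FiLM: per-channel scale (gamma) and shift (beta),
    both predicted from the conditioning vector."""
    gamma = linear(params["w_gamma"], params["b_gamma"], cond)
    beta = linear(params["w_beta"], params["b_beta"], cond)
    return [g * x + b for g, x, b in zip(gamma, feature, beta)]


def selective_film(feature, cond, params):
    """ASSUMED selection rule (not from the paper): a sigmoid gate,
    computed from the acoustic feature and the ASR posterior together,
    blends the FiLM output with the unmodified feature."""
    modulated = film(feature, cond, params)
    gate_logits = linear(params["w_gate"], params["b_gate"], feature + cond)
    gate = [sigmoid(z) for z in gate_logits]
    return [g * m + (1.0 - g) * x
            for g, m, x in zip(gate, modulated, feature)]


def init_params(n_channels, n_tokens, seed=0):
    """Hypothetical small random initialization for the sketch."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                for _ in range(rows)]

    return {
        "w_gamma": mat(n_channels, n_tokens), "b_gamma": [1.0] * n_channels,
        "w_beta": mat(n_channels, n_tokens), "b_beta": [0.0] * n_channels,
        "w_gate": mat(n_channels, n_channels + n_tokens),
        "b_gate": [0.0] * n_channels,
    }


# One frame: 4 acoustic channels, CTC posterior over 3 tokens (made-up values).
params = init_params(n_channels=4, n_tokens=3)
frame_feature = [0.5, -1.0, 0.25, 2.0]
ctc_posterior = [0.7, 0.2, 0.1]
enhanced = selective_film(frame_feature, ctc_posterior, params)
```

In this sketch a gate value near 0 leaves the acoustic feature untouched and a value near 1 applies the full FiLM modulation, which is one simple way to read "adaptively select ASR information". In a real system the same operation would be applied to every frame inside each temporal convolutional block, with the parameters learned during training.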
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Chang, Joon-Hyuk
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)