Detailed Information

Cited 0 times in Web of Science · Cited 0 times in Scopus

Convolutional Recurrent Neural Network with Auxiliary Stream for Robust Variable-Length Acoustic Scene Classification

Full metadata record
DC Field                                 Value
dc.contributor.author                    Choi, Won-Gook
dc.contributor.author                    Chang, Joon-Hyuk
dc.date.accessioned                      2022-12-20T06:25:06Z
dc.date.available                        2022-12-20T06:25:06Z
dc.date.created                          2022-11-02
dc.date.issued                           2022-09
dc.identifier.issn                       2308-457X
dc.identifier.uri                        https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/173089
dc.description.abstract                  Deep learning has proven well suited to acoustic scene classification (ASC), yielding significant performance improvements over earlier approaches. However, most studies employ convolutional neural networks (CNNs) rather than recurrent neural networks (RNNs) or convolutional recurrent neural networks (CRNNs), even though acoustic scene data is a temporal signal. In practice, CRNNs are rarely adopted and rank lower in recent Detection and Classification of Acoustic Scenes and Events (DCASE) challenges for fixed-length (i.e., 10 s) ASC. In this paper, an auxiliary stream technique is proposed that improves the performance of CRNNs over that of CNNs by controlling the inductive bias of the RNN. The auxiliary stream trains the CNN to extract effective embeddings and is connected only during training; it therefore adds no model complexity at inference. Experimental results demonstrate the superiority of the proposed method regardless of the CNN backbone used in the CRNN. Additionally, the proposed method is robust to variable-length ASC through streaming inference, demonstrating the importance of CRNNs.
dc.language                              English
dc.language.iso                          en
dc.publisher                             International Speech Communication Association
dc.title                                 Convolutional Recurrent Neural Network with Auxiliary Stream for Robust Variable-Length Acoustic Scene Classification
dc.type                                  Article
dc.contributor.affiliatedAuthor          Chang, Joon-Hyuk
dc.identifier.doi                        10.21437/Interspeech.2022-959
dc.identifier.scopusid                   2-s2.0-85140094712
dc.identifier.wosid                      000900724502119
dc.identifier.bibliographicCitation      Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v.2022-September, pp.2418 - 2422
dc.relation.isPartOf                     Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
dc.citation.title                        Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
dc.citation.volume                       2022-September
dc.citation.startPage                    2418
dc.citation.endPage                      2422
dc.type.rims                             ART
dc.type.docType                          Proceedings Paper
dc.description.journalClass              1
dc.description.isOpenAccess              N
dc.description.journalRegisteredClass    scopus
dc.relation.journalResearchArea          Acoustics
dc.relation.journalResearchArea          Audiology & Speech-Language Pathology
dc.relation.journalResearchArea          Computer Science
dc.relation.journalResearchArea          Engineering
dc.relation.journalWebOfScienceCategory  Acoustics
dc.relation.journalWebOfScienceCategory  Audiology & Speech-Language Pathology
dc.relation.journalWebOfScienceCategory  Computer Science, Artificial Intelligence
dc.relation.journalWebOfScienceCategory  Engineering, Electrical & Electronic
dc.subject.keywordPlus                   Convolution
dc.subject.keywordPlus                   Convolutional neural networks
dc.subject.keywordPlus                   Speech communication
dc.subject.keywordPlus                   Recurrent neural networks
dc.subject.keywordPlus                   Acoustic scene classification
dc.subject.keywordPlus                   Convolutional neural network
dc.subject.keywordPlus                   Convolutional recurrent neural network
dc.subject.keywordPlus                   Inductive bias
dc.subject.keywordPlus                   Neural-networks
dc.subject.keywordPlus                   Performance
dc.subject.keywordPlus                   Scene classification
dc.subject.keywordPlus                   Streaming
dc.subject.keywordPlus                   Temporal signals
dc.subject.keywordPlus                   Variable length
dc.subject.keywordAuthor                 acoustic scene classification
dc.subject.keywordAuthor                 convolutional recurrent neural network
dc.subject.keywordAuthor                 streaming
dc.subject.keywordAuthor                 variable-length
dc.identifier.url                        https://www.isca-speech.org/archive/interspeech_2022/chang22d_interspeech.html
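The auxiliary-stream idea summarized in the abstract can be illustrated with a minimal sketch. This is purely hypothetical code, not the paper's implementation: the toy `cnn_embed`/`rnn_pool` functions, all shapes, and the mean-pooled auxiliary head are assumptions chosen only to show the structural point — the auxiliary classifier is attached to the CNN embeddings during training and dropped at inference, so the deployed CRNN's complexity is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_embed(x, w):
    """Toy stand-in for the CNN front-end: per-frame linear map + ReLU."""
    return np.maximum(x @ w, 0.0)

def rnn_pool(h, u):
    """Toy stand-in for the RNN: simple recurrent accumulation over frames."""
    state = np.zeros(u.shape[1])
    for frame in h:
        state = np.tanh(frame @ u + state)
    return state

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical toy dimensions: frames per clip, input features,
# embedding size, number of scene classes.
n_frames, n_feat, n_emb, n_cls = 8, 16, 12, 4
w_cnn = rng.normal(size=(n_feat, n_emb))
u_rnn = rng.normal(size=(n_emb, n_emb))
w_main = rng.normal(size=(n_emb, n_cls))   # main CRNN classifier
w_aux = rng.normal(size=(n_emb, n_cls))    # auxiliary head, training only

x = rng.normal(size=(n_frames, n_feat))    # one toy input clip

def forward(x, training):
    h = cnn_embed(x, w_cnn)
    main_probs = softmax(rnn_pool(h, u_rnn) @ w_main)
    if training:
        # Auxiliary stream: classify from pooled CNN embeddings directly,
        # giving the CNN an extra training signal alongside the CRNN path.
        aux_probs = softmax(h.mean(axis=0) @ w_aux)
        return main_probs, aux_probs
    return main_probs, None                # auxiliary stream detached

train_main, train_aux = forward(x, training=True)
infer_main, infer_aux = forward(x, training=False)
assert infer_aux is None                   # no auxiliary cost at inference
assert np.allclose(train_main, infer_main) # main path is identical
```

In a real training loop the losses of both heads would be combined (the paper's exact weighting is not reproduced here); the sketch only demonstrates that removing the auxiliary branch leaves the main CRNN path byte-for-byte unchanged.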
Appears in Collections
College of Engineering (Seoul) > Department of Electronic Engineering (Seoul) > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Chang, Joon-Hyuk
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)
