Convolutional Recurrent Neural Network with Auxiliary Stream for Robust Variable-Length Acoustic Scene Classification
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Choi, Won-Gook | - |
dc.contributor.author | Chang, Joon-Hyuk | - |
dc.date.accessioned | 2022-12-20T06:25:06Z | - |
dc.date.available | 2022-12-20T06:25:06Z | - |
dc.date.created | 2022-11-02 | - |
dc.date.issued | 2022-09 | - |
dc.identifier.issn | 2308-457X | - |
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/173089 | - |
dc.description.abstract | Deep learning has proven well suited to acoustic scene classification (ASC), with neural networks yielding significant performance improvements. However, most studies have used convolutional neural networks (CNNs) rather than recurrent neural networks (RNNs) or convolutional recurrent neural networks (CRNNs), even though acoustic scene data is a temporal signal. In practice, CRNNs are rarely adopted and rank lower in recent Detection and Classification of Acoustic Scenes and Events (DCASE) challenges for fixed-length (i.e., 10 s) ASC. In this paper, an auxiliary stream technique is proposed that improves the performance of CRNNs over that of CNNs by controlling the inductive bias of the RNN. The auxiliary stream trains the CNN to extract effective embeddings and is connected only during training, so it does not affect model complexity at inference. Experimental results demonstrate the superiority of the proposed method regardless of the CNN model used in the CRNN. Additionally, the proposed method is robust on variable-length ASC via streaming inference, demonstrating the importance of CRNNs. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | International Speech Communication Association | - |
dc.title | Convolutional Recurrent Neural Network with Auxiliary Stream for Robust Variable-Length Acoustic Scene Classification | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Chang, Joon-Hyuk | - |
dc.identifier.doi | 10.21437/Interspeech.2022-959 | - |
dc.identifier.scopusid | 2-s2.0-85140094712 | - |
dc.identifier.wosid | 000900724502119 | - |
dc.identifier.bibliographicCitation | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v.2022-September, pp.2418 - 2422 | - |
dc.relation.isPartOf | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | - |
dc.citation.title | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | - |
dc.citation.volume | 2022-September | - |
dc.citation.startPage | 2418 | - |
dc.citation.endPage | 2422 | - |
dc.type.rims | ART | - |
dc.type.docType | Proceedings Paper | - |
dc.description.journalClass | 1 | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Acoustics | - |
dc.relation.journalResearchArea | Audiology & Speech-Language Pathology | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalWebOfScienceCategory | Acoustics | - |
dc.relation.journalWebOfScienceCategory | Audiology & Speech-Language Pathology | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.subject.keywordPlus | Convolution | - |
dc.subject.keywordPlus | Convolutional neural networks | - |
dc.subject.keywordPlus | Speech communication | - |
dc.subject.keywordPlus | Recurrent neural networks | - |
dc.subject.keywordPlus | Acoustic scene classification | - |
dc.subject.keywordPlus | Convolutional neural network | - |
dc.subject.keywordPlus | Convolutional recurrent neural network | - |
dc.subject.keywordPlus | Inductive bias | - |
dc.subject.keywordPlus | Neural-networks | - |
dc.subject.keywordPlus | Performance | - |
dc.subject.keywordPlus | Scene classification | - |
dc.subject.keywordPlus | Streaming | - |
dc.subject.keywordPlus | Temporal signals | - |
dc.subject.keywordPlus | Variable length | - |
dc.subject.keywordAuthor | acoustic scene classification | - |
dc.subject.keywordAuthor | convolutional recurrent neural network | - |
dc.subject.keywordAuthor | streaming | - |
dc.subject.keywordAuthor | variable-length | - |
dc.identifier.url | https://www.isca-speech.org/archive/interspeech_2022/chang22d_interspeech.html | - |
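The abstract's key idea — an auxiliary classification head that gives the CNN front end a direct training signal but is detached at inference, leaving the recurrent path to handle variable-length input — can be illustrated with a toy sketch. This is not the paper's architecture; all dimensions, the single-layer "CNN" projection, the plain RNN, and the mean-pooled auxiliary head are simplified stand-ins chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions (not from the paper):
# T time frames, F features per frame, H hidden units, C scene classes.
T, F, H, C = 20, 8, 16, 3

# Shared "CNN" front end, reduced here to one per-frame linear projection.
W_cnn = rng.normal(0, 0.1, (F, H))

# Recurrent path: a simple RNN unrolled over the projected frames.
W_in = rng.normal(0, 0.1, (H, H))
W_rec = rng.normal(0, 0.1, (H, H))
W_out = rng.normal(0, 0.1, (H, C))

# Auxiliary stream: classifies the time-pooled CNN embedding directly,
# giving the CNN a short gradient path during training.
W_aux = rng.normal(0, 0.1, (H, C))

def forward(x, training):
    """x: (T, F) spectrogram-like input; T may vary between calls."""
    emb = np.tanh(x @ W_cnn)              # (T, H) frame embeddings
    h = np.zeros(H)
    for t in range(emb.shape[0]):         # streaming-friendly: one frame at a time
        h = np.tanh(emb[t] @ W_in + h @ W_rec)
    main_logits = h @ W_out               # CRNN prediction
    if training:
        aux_logits = emb.mean(axis=0) @ W_aux   # auxiliary head, training only
        return main_logits, aux_logits          # both feed the combined loss
    return main_logits                    # aux head detached: no inference cost

x = rng.normal(size=(T, F))
main_logits, aux_logits = forward(x, training=True)
infer_logits = forward(x, training=False)
```

During training, a weighted sum of the two cross-entropy losses would be backpropagated; at inference only the recurrent path runs, so the auxiliary head adds no parameters or compute, and the per-frame loop accepts any sequence length.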