Instance-level loss based multiple-instance learning framework for acoustic scene classification

Choi, Won-Gook; Chang, Joon-Hyuk; Yang, Jae-Mo; Moon, Han-Gil

doi:10.1016/j.apacoust.2023.109757

Full metadata record

DC Field	Value	Language
dc.contributor.author	Choi, Won-Gook	-
dc.contributor.author	Chang, Joon-Hyuk	-
dc.contributor.author	Yang, Jae-Mo	-
dc.contributor.author	Moon, Han-Gil	-
dc.date.accessioned	2024-04-21T23:00:19Z	-
dc.date.available	2024-04-21T23:00:19Z	-
dc.date.issued	2024-01	-
dc.identifier.issn	0003-682X	-
dc.identifier.issn	1872-910X	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/194707	-
dc.description.abstract	An acoustic scene is inferred by detecting properties that combine diverse sounds and acoustic environments. This study aims to discover these properties effectively using multiple-instance learning (MIL). MIL, a weakly supervised learning approach, extracts an instance vector from each audio chunk of an audio clip and uses these unlabeled instances to infer the scene corresponding to the input data. However, many studies have pointed out that MIL suffers from an underestimation problem. In this study, we propose an enhanced MIL framework better suited to acoustic scene classification (ASC) systems by defining instance-level labels and an instance-level loss to extract and cluster instances effectively. Furthermore, we design a lightweight convolutional neural network named FUSE comprising frequency-sided depthwise, temporal-sided depthwise, and pointwise convolutional filters. Experimental results show that the confidence and proportion of positive instances increase significantly compared to vanilla MIL, overcoming the underestimation problem and raising classification accuracy above that of supervised learning. With 139 K parameters, the proposed system achieved a performance of 81.1%, 72.3%, and 58.3% on the TAU urban acoustic scenes 2019, 2020 mobile, and 2022 mobile datasets, respectively. In particular, it achieves the highest performance among systems with fewer than 1 M parameters on the TAU urban acoustic scenes 2019 dataset.	-
dc.format.extent	13	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Pergamon Press Ltd.	-
dc.title	Instance-level loss based multiple-instance learning framework for acoustic scene classification	-
dc.type	Article	-
dc.publisher.location	United Kingdom	-
dc.identifier.doi	10.1016/j.apacoust.2023.109757	-
dc.identifier.scopusid	2-s2.0-85178633756	-
dc.identifier.wosid	001203295900001	-
dc.identifier.bibliographicCitation	Applied Acoustics, v.216, pp 1 - 13	-
dc.citation.title	Applied Acoustics	-
dc.citation.volume	216	-
dc.citation.startPage	1	-
dc.citation.endPage	13	-
dc.type.docType	Article	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Acoustics	-
dc.relation.journalWebOfScienceCategory	Acoustics	-
dc.subject.keywordPlus	Acoustic scene classification	-
dc.subject.keywordPlus	Learning frameworks	-
dc.subject.keywordPlus	Multiple-instance learning	-
dc.subject.keywordPlus	Performance	-
dc.subject.keywordPlus	Property	-
dc.subject.keywordPlus	Scene classification	-
dc.subject.keywordPlus	Sound and acoustic	-
dc.subject.keywordPlus	Sound environment	-
dc.subject.keywordPlus	Urban acoustics	-
dc.subject.keywordPlus	Weakly supervised learning	-
dc.subject.keywordAuthor	Acoustic scene classification	-
dc.subject.keywordAuthor	Multiple-instance learning	-
dc.subject.keywordAuthor	Weakly supervised learning	-
dc.identifier.url	https://www.sciencedirect.com/science/article/pii/S0003682X23005558?via%3Dihub	-
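
The abstract in the record above describes MIL in terms of instances (audio chunks) that are aggregated into a clip-level prediction, and it reports the proportion of positive instances as a measure of the underestimation problem. The following is a minimal sketch of those two ideas only, not the authors' implementation. Everything in it is an assumption made for illustration: the mean-pooling aggregation, the rule that every instance inherits the clip label in `instance_loss`, the function names, and the example logits. The record does not say how the paper defines its instance-level labels or loss.

```python
import math


def softmax(logits):
    """Convert one instance's class logits into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bag_prediction(instance_logits):
    """Mean-pool per-instance probabilities into one clip-level distribution.

    Mean pooling is an assumption; other MIL aggregators (max, attention) exist.
    """
    probs = [softmax(row) for row in instance_logits]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def positive_instance_ratio(instance_logits, bag_label):
    """Fraction of instances whose top-scoring class equals the clip label."""
    hits = sum(1 for row in instance_logits if row.index(max(row)) == bag_label)
    return hits / len(instance_logits)


def bag_loss(instance_logits, bag_label):
    """Clip-level cross-entropy, as in a plain (vanilla) MIL setup."""
    return -math.log(bag_prediction(instance_logits)[bag_label])


def instance_loss(instance_logits, bag_label):
    """Hypothetical instance-level term: each instance inherits the clip label.

    This labeling rule is a placeholder, not the rule defined in the paper.
    """
    losses = [-math.log(softmax(row)[bag_label]) for row in instance_logits]
    return sum(losses) / len(losses)


# Hypothetical logits for a clip split into 4 chunks, with 3 scene classes.
clip_logits = [
    [2.0, 0.0, 0.0],
    [0.0, 0.1, 0.0],
    [0.0, 0.0, 0.2],
    [1.5, 0.0, 0.0],
]
clip_label = 0

pooled = bag_prediction(clip_logits)
predicted_scene = pooled.index(max(pooled))
ratio = positive_instance_ratio(clip_logits, clip_label)
print(predicted_scene, ratio)
print(bag_loss(clip_logits, clip_label), instance_loss(clip_logits, clip_label))
```

In this toy clip the pooled prediction matches the clip label even though only half of the instances individually agree with it. A clip-level loss alone puts little pressure on the remaining instances, which is one way to read the underestimation problem the abstract mentions. A per-instance term such as the placeholder above penalizes those instances directly.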

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Chang, Joon-Hyuk: COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)
