Instance-level loss based multiple-instance learning framework for acoustic scene classification
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Choi, Won-Gook | - |
| dc.contributor.author | Chang, Joon-Hyuk | - |
| dc.contributor.author | Yang, Jae-Mo | - |
| dc.contributor.author | Moon, Han-Gil | - |
| dc.date.accessioned | 2024-04-21T23:00:19Z | - |
| dc.date.available | 2024-04-21T23:00:19Z | - |
| dc.date.issued | 2024-01 | - |
| dc.identifier.issn | 0003-682X | - |
| dc.identifier.issn | 1872-910X | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/194707 | - |
| dc.description.abstract | An acoustic scene is inferred by detecting properties that combine diverse sounds and acoustic environments. This study aims to discover these properties effectively using multiple-instance learning (MIL). MIL, a weakly supervised learning approach, extracts an instance vector from each audio chunk composing an audio clip and uses these unlabeled instances to infer the scene corresponding to the input data. However, many studies have pointed out an underestimation problem in MIL. In this study, we propose an enhanced MIL framework better suited to acoustic scene classification (ASC) systems, defining instance-level labels and an instance-level loss to extract and cluster instances effectively. Furthermore, we design a lightweight convolutional neural network named FUSE, comprising frequency-sided depthwise, temporal-sided depthwise, and pointwise convolutional filters. Experimental results show that the confidence and proportion of positive instances increase significantly compared to vanilla MIL, overcoming the underestimation problem and raising the classification accuracy above that of supervised learning. The proposed system achieved a performance of 81.1%, 72.3%, and 58.3% on the TAU urban acoustic scenes 2019, 2020 mobile, and 2022 mobile datasets, respectively, with 139 K parameters. In particular, it achieves the highest performance among systems with fewer than 1 M parameters on the TAU urban acoustic scenes 2019 dataset. | - |
| dc.format.extent | 13 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | Pergamon Press Ltd. | - |
| dc.title | Instance-level loss based multiple-instance learning framework for acoustic scene classification | - |
| dc.type | Article | - |
| dc.publisher.location | United Kingdom | - |
| dc.identifier.doi | 10.1016/j.apacoust.2023.109757 | - |
| dc.identifier.scopusid | 2-s2.0-85178633756 | - |
| dc.identifier.wosid | 001203295900001 | - |
| dc.identifier.bibliographicCitation | Applied Acoustics, v.216, pp 1 - 13 | - |
| dc.citation.title | Applied Acoustics | - |
| dc.citation.volume | 216 | - |
| dc.citation.startPage | 1 | - |
| dc.citation.endPage | 13 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Acoustics | - |
| dc.relation.journalWebOfScienceCategory | Acoustics | - |
| dc.subject.keywordPlus | Acoustic scene classification | - |
| dc.subject.keywordPlus | Learning frameworks | - |
| dc.subject.keywordPlus | Multiple-instance learning | - |
| dc.subject.keywordPlus | Performance | - |
| dc.subject.keywordPlus | Property | - |
| dc.subject.keywordPlus | Scene classification | - |
| dc.subject.keywordPlus | Sound and acoustic | - |
| dc.subject.keywordPlus | Sound environment | - |
| dc.subject.keywordPlus | Urban acoustics | - |
| dc.subject.keywordPlus | Weakly supervised learning | - |
| dc.subject.keywordAuthor | Acoustic scene classification | - |
| dc.subject.keywordAuthor | Multiple-instance learning | - |
| dc.subject.keywordAuthor | Weakly supervised learning | - |
| dc.identifier.url | https://www.sciencedirect.com/science/article/pii/S0003682X23005558?via%3Dihub | - |
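
The abstract describes MIL as scoring unlabeled instances (audio chunks) and using them to infer a clip-level scene, and it reports the proportion of positive instances as an evaluation measure. The following is a minimal sketch of that general idea, not the authors' implementation: mean pooling of instance probabilities is an assumption made for illustration (the abstract does not state the aggregation rule), and the function names and logit values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


def bag_prediction(instance_logits):
    """Average per-instance class probabilities into one clip-level distribution.

    Mean pooling is an illustrative assumption; the paper's aggregation
    rule is not given in the abstract.
    """
    probs = [softmax(logits) for logits in instance_logits]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def positive_instance_proportion(instance_logits, scene_index):
    """Fraction of instances whose top-scoring class matches the clip's scene."""
    hits = sum(1 for logits in instance_logits if argmax(logits) == scene_index)
    return hits / len(instance_logits)


# Hypothetical logits: one clip split into 4 chunks, scored over 3 scene classes.
clip = [
    [2.0, 0.1, 0.0],
    [1.5, 0.2, 0.1],
    [0.0, 1.0, 0.2],
    [2.2, 0.0, 0.3],
]
bag = bag_prediction(clip)
predicted_scene = argmax(bag)
proportion = positive_instance_proportion(clip, predicted_scene)
```

In this sketch, a low `proportion` for the correct scene would correspond to the underestimation problem the abstract mentions, where few instances individually support the clip-level label.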
