Deeply supervised curriculum learning for deep neural network-based sound source localization
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Baek, Min-Sang | - |
dc.contributor.author | Yang, Joon-Young | - |
dc.contributor.author | Chang, Joon-Hyuk | - |
dc.date.accessioned | 2023-10-10T02:36:55Z | - |
dc.date.available | 2023-10-10T02:36:55Z | - |
dc.date.created | 2023-10-04 | - |
dc.date.issued | 2023-08 | - |
dc.identifier.issn | 2308-457X | - |
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/191802 | - |
dc.description.abstract | Deep neural networks (DNNs) have made impressive progress on sound source localization (SSL) tasks using hard n-hot labels that represent specific directions-of-arrival (DOAs). However, a recent study suggested soft DOA labels that account for the correlations between target and nearby DOAs. In this study, to effectively train a DNN using soft labels, we propose deeply supervised curriculum learning (DSCL), which adopts two techniques: deep supervision (DS) and curriculum learning (CL). We train the DNN to solve SSL problems progressing from easier to harder, expecting it to gradually narrow the angular region of the target DOAs. This is achieved by assigning soft targets of different resolutions to different DNN layers to deeply supervise the network, while increasing the angular selectivity of the targets from the early to late stages of training via CL. The proposed method was verified on multi-speaker datasets and outperformed hard-label methods by a clear margin. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | International Speech Communication Association | - |
dc.title | Deeply supervised curriculum learning for deep neural network-based sound source localization | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Chang, Joon-Hyuk | - |
dc.identifier.doi | 10.21437/Interspeech.2023-2451 | - |
dc.identifier.scopusid | 2-s2.0-85171561839 | - |
dc.identifier.bibliographicCitation | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v.2023-August, pp.3744 - 3748 | - |
dc.relation.isPartOf | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | - |
dc.citation.title | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | - |
dc.citation.volume | 2023-August | - |
dc.citation.startPage | 3744 | - |
dc.citation.endPage | 3748 | - |
dc.type.rims | ART | - |
dc.type.docType | Conference paper | - |
dc.description.journalClass | 1 | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scopus | - |
dc.subject.keywordPlus | Acoustic generators | - |
dc.subject.keywordPlus | Deep neural networks | - |
dc.subject.keywordPlus | Direction of arrival | - |
dc.subject.keywordPlus | Speech communication | - |
dc.subject.keywordPlus | Angular regions | - |
dc.subject.keywordPlus | Curriculum learning | - |
dc.subject.keywordPlus | Deep supervision | - |
dc.subject.keywordPlus | Direction-of-arrival (DOA) | - |
dc.subject.keywordPlus | Localization problems | - |
dc.subject.keywordPlus | Network-based | - |
dc.subject.keywordPlus | Soft labels | - |
dc.subject.keywordPlus | Soft targets | - |
dc.subject.keywordPlus | Sound source localization | - |
dc.subject.keywordPlus | Target direction | - |
dc.subject.keywordPlus | Curricula | - |
dc.subject.keywordAuthor | curriculum learning | - |
dc.subject.keywordAuthor | deep neural network | - |
dc.subject.keywordAuthor | deep supervision | - |
dc.subject.keywordAuthor | direction-of-arrival | - |
dc.subject.keywordAuthor | sound source localization | - |
dc.identifier.url | https://www.isca-speech.org/archive/interspeech_2023/baek23_interspeech.html | - |
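The abstract above describes training with soft DOA labels whose angular selectivity increases over the course of training (the curriculum), with coarser targets supervising earlier layers (deep supervision). A minimal sketch of the soft-label side of that idea is below; it uses Gaussian-smoothed labels on a discretized azimuth grid. The grid size, the sigma schedule, and the peak normalization are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def soft_doa_label(target_deg, n_classes=36, sigma_deg=10.0):
    """Gaussian-smoothed soft label over a discretized azimuth grid.

    The grid covers 0-360 degrees in n_classes bins; sigma_deg controls
    the angular selectivity (smaller sigma -> sharper, more 'hard-label-like'
    target). All parameter values here are assumptions for illustration.
    """
    grid = np.arange(n_classes) * (360.0 / n_classes)
    # circular angular distance between each grid point and the target DOA
    diff = np.abs(grid - target_deg)
    diff = np.minimum(diff, 360.0 - diff)
    label = np.exp(-0.5 * (diff / sigma_deg) ** 2)
    return label / label.max()  # peak normalized to 1

# Curriculum sketch: sharpen the target from early to late training stages.
# The same generator with larger sigma could supervise earlier DNN layers.
for stage, sigma in enumerate([30.0, 15.0, 5.0]):
    y = soft_doa_label(target_deg=90.0, sigma_deg=sigma)
    print(f"stage {stage}: sigma={sigma}, bins above 0.1: {(y > 0.1).sum()}")
```

With this schedule, the number of grid bins receiving significant probability mass shrinks as sigma decreases, i.e. the angular region around the target DOA narrows as training progresses, matching the easier-to-harder progression the abstract describes.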