Deeply supervised curriculum learning for deep neural network-based sound source localization
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Baek, Min-Sang | - |
dc.contributor.author | Yang, Joon-Young | - |
dc.contributor.author | Chang, Joon-Hyuk | - |
dc.date.accessioned | 2023-10-10T02:36:55Z | - |
dc.date.available | 2023-10-10T02:36:55Z | - |
dc.date.created | 2023-10-04 | - |
dc.date.issued | 2023-08 | - |
dc.identifier.issn | 2308-457X | - |
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/191802 | - |
dc.description.abstract | Deep neural networks (DNNs) have made impressive progress on sound source localization (SSL) tasks using hard n-hot labels that represent specific directions-of-arrival (DOAs). However, a recent study suggested soft DOA labels that account for the correlations between target and nearby DOAs. In this study, to effectively train a DNN using soft labels, we propose deeply supervised curriculum learning (DSCL), which adopts two techniques: deep supervision (DS) and curriculum learning (CL). We train the DNN to solve SSL problems progressing from easier to harder, expecting it to gradually narrow the angular region of the target DOAs. This is achieved by assigning soft targets of different resolutions to different DNN layers to deeply supervise the network, while increasing the angular selectivity of the targets from the early to late stages of training via CL. The proposed method was verified on multi-speaker datasets and outperformed hard-label methods by a clear margin. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | International Speech Communication Association | - |
dc.title | Deeply supervised curriculum learning for deep neural network-based sound source localization | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Chang, Joon-Hyuk | - |
dc.identifier.doi | 10.21437/Interspeech.2023-2451 | - |
dc.identifier.scopusid | 2-s2.0-85171561839 | - |
dc.identifier.bibliographicCitation | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v.2023-August, pp.3744 - 3748 | - |
dc.relation.isPartOf | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | - |
dc.citation.title | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | - |
dc.citation.volume | 2023-August | - |
dc.citation.startPage | 3744 | - |
dc.citation.endPage | 3748 | - |
dc.type.rims | ART | - |
dc.type.docType | Conference paper | - |
dc.description.journalClass | 1 | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scopus | - |
dc.subject.keywordPlus | Acoustic generators | - |
dc.subject.keywordPlus | Deep neural networks | - |
dc.subject.keywordPlus | Direction of arrival | - |
dc.subject.keywordPlus | Speech communication | - |
dc.subject.keywordPlus | Angular regions | - |
dc.subject.keywordPlus | Curriculum learning | - |
dc.subject.keywordPlus | Deep supervision | - |
dc.subject.keywordPlus | Direction-of-arrival (DOA) | - |
dc.subject.keywordPlus | Localization problems | - |
dc.subject.keywordPlus | Network-based | - |
dc.subject.keywordPlus | Soft labels | - |
dc.subject.keywordPlus | Soft targets | - |
dc.subject.keywordPlus | Sound source localization | - |
dc.subject.keywordPlus | Target direction | - |
dc.subject.keywordPlus | Curricula | - |
dc.subject.keywordAuthor | curriculum learning | - |
dc.subject.keywordAuthor | deep neural network | - |
dc.subject.keywordAuthor | deep supervision | - |
dc.subject.keywordAuthor | direction-of-arrival | - |
dc.subject.keywordAuthor | sound source localization | - |
dc.identifier.url | https://www.isca-speech.org/archive/interspeech_2023/baek23_interspeech.html | - |
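The abstract above describes training with soft DOA labels whose angular selectivity increases over the course of training (the curriculum), with coarser targets supervising earlier layers (deep supervision). A minimal sketch of the soft-label side of that idea is below; it uses Gaussian-smoothed labels on a discretized azimuth grid. The grid size, the sigma schedule, and the peak normalization are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def soft_doa_label(target_deg, n_classes=36, sigma_deg=10.0):
    """Gaussian-smoothed soft label over a discretized azimuth grid.

    The grid covers 0-360 degrees in n_classes bins; sigma_deg controls
    the angular selectivity (smaller sigma -> sharper, more 'hard-label-like'
    target). All parameter values here are assumptions for illustration.
    """
    grid = np.arange(n_classes) * (360.0 / n_classes)
    # circular angular distance between each grid point and the target DOA
    diff = np.abs(grid - target_deg)
    diff = np.minimum(diff, 360.0 - diff)
    label = np.exp(-0.5 * (diff / sigma_deg) ** 2)
    return label / label.max()  # peak normalized to 1

# Curriculum sketch: sharpen the target from early to late training stages.
# The same generator with larger sigma could supervise earlier DNN layers.
for stage, sigma in enumerate([30.0, 15.0, 5.0]):
    y = soft_doa_label(target_deg=90.0, sigma_deg=sigma)
    print(f"stage {stage}: sigma={sigma}, bins above 0.1: {(y > 0.1).sum()}")
```

With this schedule, the number of grid bins receiving significant probability mass shrinks as sigma decreases, i.e. the angular region around the target DOA narrows as training progresses, matching the easier-to-harder progression the abstract describes.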