Spatially Weighted Contrastive Learning for Robust Sound Source Localization

Kim, Hyun-Soo; Yang, Da-Hee; Chang, Joon-Hyuk

doi:10.21437/Interspeech.2025-2666

Full metadata record

DC Field	Value	Language
dc.contributor.author	Kim, Hyun-Soo	-
dc.contributor.author	Yang, Da-Hee	-
dc.contributor.author	Chang, Joon-Hyuk	-
dc.date.accessioned	2025-11-20T01:30:30Z	-
dc.date.available	2025-11-20T01:30:30Z	-
dc.date.issued	2025-08	-
dc.identifier.issn	2958-1796	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/209223	-
dc.description.abstract	We propose a spatially weighted contrastive loss (SWeC loss) for sound source localization in real-world scenarios using multi-channel speech data. In multi-channel localization, phase differences between microphone channels provide critical cues for estimating the azimuth angle of incoming speech. To effectively extract azimuth information, we leverage contrastive learning and introduce a novel loss function that incorporates spatial relationships between azimuth classes. Specifically, our loss assigns weights to negative pairs based on their angular distance, penalizing high similarity between embeddings corresponding to distant angles. Furthermore, we propose a contrastive data generation method tailored to multi-channel localization, enhancing the effectiveness of contrastive learning. Experimental results demonstrate that the proposed loss function and data generation strategy significantly improve localization performance.	-
dc.format.extent	5	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	International Speech Communication Association	-
dc.title	Spatially Weighted Contrastive Learning for Robust Sound Source Localization	-
dc.type	Article	-
dc.identifier.doi	10.21437/Interspeech.2025-2666	-
dc.identifier.scopusid	2-s2.0-105020083243	-
dc.identifier.bibliographicCitation	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pp 2490 - 2494	-
dc.citation.title	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH	-
dc.citation.startPage	2490	-
dc.citation.endPage	2494	-
dc.type.docType	Conference paper	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordPlus	Contrastive Learning	-
dc.subject.keywordPlus	Microphones	-
dc.subject.keywordPlus	Speech communication	-
dc.subject.keywordAuthor	contrastive learning	-
dc.subject.keywordAuthor	sound source localization	-
dc.identifier.url	https://www.isca-archive.org/interspeech_2025/kim25v_interspeech.html	-
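
The abstract describes a contrastive loss that weights negative pairs by their angular distance, so that high similarity between embeddings of distant azimuths is penalized more. The record does not give the SWeC formula, so the following is only a minimal sketch of one plausible form: an InfoNCE-style loss in which each negative term in the denominator is scaled by `1 + alpha * distance / 180`. The function names, the linear weighting, and the `temperature` and `alpha` parameters are all assumptions made for illustration and are not taken from the paper.

```python
import math


def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two azimuths, in degrees (0 to 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as sequences of floats."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def weighted_contrastive_loss(embeddings, azimuths_deg, temperature=0.1, alpha=1.0):
    """Hypothetical angular-distance-weighted contrastive loss (not the paper's formula).

    Samples sharing an azimuth are treated as positives. Every other sample is a
    negative whose denominator term is scaled by an assumed linear weight that
    grows with angular distance.
    """
    n = len(embeddings)
    losses = []
    for i in range(n):
        positive_terms = []
        denom = 0.0
        for j in range(n):
            if j == i:
                continue
            term = math.exp(cosine(embeddings[i], embeddings[j]) / temperature)
            dist = angular_distance(azimuths_deg[i], azimuths_deg[j])
            if dist == 0.0:
                # Same azimuth class: positive pair, unweighted.
                positive_terms.append(term)
                denom += term
            else:
                # Assumed weighting: distant negatives count more.
                denom += (1.0 + alpha * dist / 180.0) * term
        for term in positive_terms:
            losses.append(-math.log(term / denom))
    # Anchors without any positive contribute nothing.
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: two samples at 0 degrees, one embedding close to them, one orthogonal.
embeddings = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
similar_is_far = weighted_contrastive_loss(embeddings, [0, 0, 180, 10])
similar_is_near = weighted_contrastive_loss(embeddings, [0, 0, 10, 180])
print(similar_is_far > similar_is_near)
```

In this toy batch the loss is larger when the embedding that resembles the anchor belongs to the distant azimuth, which is the behavior the abstract describes. Setting `alpha=0` removes the weighting and reduces the sketch to an ordinary unweighted contrastive loss.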

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Chang, Joon-Hyuk: COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)
