Spatially Weighted Contrastive Learning for Robust Sound Source Localization
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Kim, Hyun-Soo | - |
| dc.contributor.author | Yang, Da-Hee | - |
| dc.contributor.author | Chang, Joon-Hyuk | - |
| dc.date.accessioned | 2025-11-20T01:30:30Z | - |
| dc.date.available | 2025-11-20T01:30:30Z | - |
| dc.date.issued | 2025-08 | - |
| dc.identifier.issn | 2958-1796 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/209223 | - |
| dc.description.abstract | We propose a spatially weighted contrastive loss (SWeC loss) for sound source localization in real-world scenarios using multi-channel speech data. In multi-channel localization, phase differences between microphone channels provide critical cues for estimating the azimuth angle of incoming speech. To effectively extract azimuth information, we leverage contrastive learning and introduce a novel loss function that incorporates spatial relationships between azimuth classes. Specifically, our loss assigns weights to negative pairs based on their angular distance, penalizing high similarity between embeddings corresponding to distant angles. Furthermore, we propose a contrastive data generation method tailored to multi-channel localization, enhancing the effectiveness of contrastive learning. Experimental results demonstrate that the proposed loss function and data generation strategy significantly improve localization performance. | - |
| dc.format.extent | 5 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | International Speech Communication Association | - |
| dc.title | Spatially Weighted Contrastive Learning for Robust Sound Source Localization | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.21437/Interspeech.2025-2666 | - |
| dc.identifier.scopusid | 2-s2.0-105020083243 | - |
| dc.identifier.bibliographicCitation | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pp 2490 - 2494 | - |
| dc.citation.title | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH | - |
| dc.citation.startPage | 2490 | - |
| dc.citation.endPage | 2494 | - |
| dc.type.docType | Conference paper | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.subject.keywordPlus | Contrastive Learning | - |
| dc.subject.keywordPlus | Microphones | - |
| dc.subject.keywordPlus | Speech communication | - |
| dc.subject.keywordAuthor | contrastive learning | - |
| dc.subject.keywordAuthor | sound source localization | - |
| dc.identifier.url | https://www.isca-archive.org/interspeech_2025/kim25v_interspeech.html | - |
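
The abstract describes a contrastive loss that weights negative pairs by the angular distance between their azimuth classes, so that high similarity between embeddings of distant angles is penalized more. This record does not give the formula, so the following is only a minimal sketch of that idea under stated assumptions, not the authors' SWeC loss: an InfoNCE-style objective, cosine similarity, samples with equal azimuth treated as positives, and a linear weight `1 + alpha * d / 180`. The names `circular_distance`, `weighted_contrastive_loss`, `tau`, and `alpha` are hypothetical.

```python
import math


def circular_distance(a_deg, b_deg):
    """Smallest angle between two azimuths, in degrees (range 0 to 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as sequences of floats."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def weighted_contrastive_loss(embeddings, azimuths, tau=0.1, alpha=1.0):
    """Illustrative angle-weighted contrastive loss (assumed form, not from the paper).

    Negatives are scaled by 1 + alpha * d / 180, where d is the circular
    distance in degrees, so a similar embedding from a distant angle
    contributes more to the denominator than one from a nearby angle.
    """
    n = len(embeddings)
    losses = []
    for i in range(n):
        pos = 0.0
        neg = 0.0
        for j in range(n):
            if j == i:
                continue
            e = math.exp(cosine(embeddings[i], embeddings[j]) / tau)
            d = circular_distance(azimuths[i], azimuths[j])
            if d == 0.0:
                pos += e  # same azimuth class: positive pair
            else:
                neg += (1.0 + alpha * d / 180.0) * e  # weighted negative pair
        if pos > 0.0:  # anchors without any positive are skipped
            losses.append(-math.log(pos / (pos + neg)))
    return sum(losses) / len(losses) if losses else 0.0


# Usage: the same confusable negative costs more when it comes from a far angle.
emb = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
loss_far = weighted_contrastive_loss(emb, [0.0, 0.0, 180.0])
loss_near = weighted_contrastive_loss(emb, [0.0, 0.0, 10.0])
```

In this sketch `loss_far` exceeds `loss_near`, which is the qualitative behavior the abstract describes. The actual weighting function and positive-pair definition used in the paper may differ.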
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
