IMPROVING TARGET SOUND EXTRACTION WITH TIMESTAMP KNOWLEDGE DISTILLATION

Full metadata record
DC Field | Value | Language
dc.contributor.author | Kim, Dail | -
dc.contributor.author | Baek, Min-Sang | -
dc.contributor.author | Kim, Yungyeo | -
dc.contributor.author | Chang, Joon-Hyuk | -
dc.date.accessioned | 2024-11-28T16:01:46Z | -
dc.date.available | 2024-11-28T16:01:46Z | -
dc.date.issued | 2024-04 | -
dc.identifier.issn | 0736-7791 | -
dc.identifier.issn | 1520-6149 | -
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/197476 | -
dc.description.abstract | In this paper, we propose a timestamp knowledge distillation (TKD) method that adopts privileged knowledge distillation to enhance the performance of deep neural network (DNN)-based target sound extraction (TSE). While previous studies have mainly used n-hot vectors, termed weak labels (WLs), to indicate the type of target sound events (SEs), recent studies have demonstrated that timestamp knowledge of SEs is meaningful information for improving TSE performance. To utilize timestamp knowledge, we use oracle strong labels (OSLs), which indicate the occurrence of target SEs in the audio clip, as privileged information. However, OSLs are more difficult to obtain than WLs in real-world applications. We thus propose TKD, which transfers the timestamp knowledge from a teacher model trained using both WLs and OSLs to a student model trained using only WLs via a loss function. Experimental results across multiple DNN architectures confirmed that the OSLs enhanced TSE significantly. Moreover, TKD notably improved the student model's performance compared to the baseline trained only with WLs. | -
dc.format.extent | 5 | -
dc.language | English | -
dc.language.iso | ENG | -
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | -
dc.title | IMPROVING TARGET SOUND EXTRACTION WITH TIMESTAMP KNOWLEDGE DISTILLATION | -
dc.type | Article | -
dc.publisher.location | United States | -
dc.identifier.doi | 10.1109/ICASSP48485.2024.10447525 | -
dc.identifier.scopusid | 2-s2.0-85195372806 | -
dc.identifier.wosid | 001285850001142 | -
dc.identifier.bibliographicCitation | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pp 1396 - 1400 | -
dc.citation.title | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | -
dc.citation.startPage | 1396 | -
dc.citation.endPage | 1400 | -
dc.type.docType | Proceedings Paper | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Acoustics | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalResearchArea | Engineering | -
dc.relation.journalResearchArea | Imaging Science & Photographic Technology | -
dc.relation.journalWebOfScienceCategory | Acoustics | -
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | -
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | -
dc.relation.journalWebOfScienceCategory | Imaging Science & Photographic Technology | -
dc.subject.keywordAuthor | privileged knowledge distillation | -
dc.subject.keywordAuthor | Target sound extraction | -
dc.subject.keywordAuthor | timestamp information | -
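
Illustrative sketch (not part of the record): the abstract contrasts weak labels (n-hot class vectors) with strong labels that carry timestamps, and says only that the teacher's timestamp knowledge reaches the student "via a loss function". The Python below is a minimal sketch of those ideas under stated assumptions. The function names, the negative-SNR extraction loss, the feature-matching (mean squared error) distillation term, and the weight alpha are all hypothetical choices made for illustration, not the authors' published formulation.

```python
import numpy as np

def weak_label(active_classes, num_classes):
    """n-hot vector: marks which sound-event classes are targets, with no timing."""
    wl = np.zeros(num_classes, dtype=np.float32)
    wl[list(active_classes)] = 1.0
    return wl

def strong_label(events, num_classes, num_frames, frame_hop_s):
    """Frame-level activity matrix (classes x frames).

    events: iterable of (class_index, onset_seconds, offset_seconds).
    """
    sl = np.zeros((num_classes, num_frames), dtype=np.float32)
    for cls, onset, offset in events:
        start = int(np.floor(onset / frame_hop_s))
        stop = min(num_frames, int(np.ceil(offset / frame_hop_s)))
        sl[cls, start:stop] = 1.0
    return sl

def neg_snr(estimate, target, eps=1e-8):
    """Negative signal-to-noise ratio in dB (assumed extraction loss)."""
    noise = target - estimate
    return -10.0 * np.log10((np.sum(target ** 2) + eps) / (np.sum(noise ** 2) + eps))

def tkd_loss(student_out, student_feat, teacher_feat, target, alpha=0.5):
    """Assumed student objective: extraction loss plus feature-matching distillation.

    teacher_feat comes from a teacher that saw both weak and strong labels;
    in a real framework it would be treated as a constant (no gradient).
    """
    distill = np.mean((student_feat - teacher_feat) ** 2)
    return neg_snr(student_out, target) + alpha * distill

# Example: a 10 s clip, 0.25 s frame hop, 5 classes, two target events.
wl = weak_label({1, 3}, num_classes=5)
sl = strong_label([(1, 0.5, 1.0), (3, 2.0, 2.5)],
                  num_classes=5, num_frames=40, frame_hop_s=0.25)
# Collapsing the strong label over time recovers the weak label.
print(np.array_equal(sl.max(axis=1), wl))
```

In this sketch the student never receives `sl` directly; timing information can only reach it through `teacher_feat`, which is the sense in which the strong labels act as privileged information.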
Files in This Item
There are no files associated with this item.
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Chang, Joon-Hyuk
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)