Detailed Information

Optimizing CLAP Reward with LLM Feedback for Semantically Aligned and Diverse Automated Audio Captioning

Authors
Ahn, Seyun; Byun, Pil Moo; Choi, Won-Gook; Chang, Joon-Hyuk
Issue Date
Aug-2025
Publisher
International Speech Communication Association
Keywords
Automated Audio Captioning; LLM; Pre-trained Model; Reinforcement Learning
Citation
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, pp 3140 - 3144
Pages
5
Indexed
SCOPUS
Journal Title
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Start Page
3140
End Page
3144
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/209224
DOI
10.21437/Interspeech.2025-1313
ISSN
2958-1796
Abstract
Deep learning-based automated audio captioning (AAC) systems describe audio well, yet they often overfit to reference styles. To address this, reinforcement learning (RL) techniques have been adopted to directly optimize evaluation metrics, but these methods often suffer from word repetition and contextual distortion. Embedding-based rewards, such as those derived from contrastive language-audio pretraining (CLAP), may bias the model toward specific words or phrases that human evaluators find unnatural. In this paper, we propose a novel reward system that combines a CLAP-based reward with a repetition penalty (CRRP) and a large language model (LLM) evaluator. CRRP computes rewards using CLAP similarity, applies a repetition penalty and reward clipping to stabilize training, and uses LLM feedback to enhance naturalness. Our method shows outstanding performance in semantic evaluations and both human and AI-based assessments, with results available at https://yunniya097.github.io/CRRP/.
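The reward shaping described in the abstract (CLAP similarity, a repetition penalty, and reward clipping) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the formula, so the cosine similarity over placeholder embedding vectors, the bigram-based repetition measure, and the names `sketch_reward`, `penalty_weight`, `clip_min`, and `clip_max` are all assumptions made for illustration. The LLM feedback component is omitted.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (stand-ins for CLAP embeddings)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def repetition_ratio(caption, n=2):
    """Fraction of n-grams in the caption that repeat an earlier n-gram (assumed measure)."""
    tokens = caption.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeated = sum(c - 1 for c in counts.values())
    return repeated / len(ngrams)


def sketch_reward(audio_emb, text_emb, caption,
                  penalty_weight=1.0, clip_min=0.0, clip_max=1.0):
    """Hypothetical reward: similarity minus weighted repetition, clipped to a fixed range."""
    similarity = cosine_similarity(audio_emb, text_emb)
    raw = similarity - penalty_weight * repetition_ratio(caption)
    return max(clip_min, min(clip_max, raw))
```

In this sketch a caption that loops over the same phrase earns less reward than an equally similar caption without repeats, and clipping keeps any single sample from dominating a policy-gradient update.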
Appears in
Collections
College of Engineering (Seoul) > School of Electronic Engineering (Seoul) > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Chang, Joon-Hyuk
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)