Partitioning Attention Weight: Mitigating Adverse Effect of Incorrect Pseudo-labels for Self-Supervised ASR

Authors
Lee, Jae-Hong; Chang, Joon-Hyuk
Issue Date
Dec-2023
Publisher
IEEE
Keywords
Computational modeling; Data augmentation; Data models; End-to-end speech recognition; Pseudo-labeling; Self-supervised learning; Self-training; Semi-supervised learning; Task analysis; Transformers
Citation
IEEE/ACM Transactions on Audio, Speech, and Language Processing, v.32, pp. 891-905
Pages
15
Indexed
SCIE
SCOPUS
Journal Title
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume
32
Start Page
891
End Page
905
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/196609
DOI
10.1109/TASLP.2023.3343615
ISSN
2329-9290
2329-9304
Abstract
The performance of automatic speech recognition (ASR) models has improved significantly owing to advances in deep learning and end-to-end approaches. However, these models require large amounts of labeled data, which are expensive to obtain. Semi-supervised learning techniques, such as pseudo-labeling and self-supervised learning, have emerged as potential ways to reduce the reliance on labeled data. Recently, some studies have combined self-supervised learning and pseudo-labeling to further enhance ASR performance. However, these methods suffer from incorrect pseudo-labels, which propagate errors and degrade ASR performance. In this paper, we propose a novel method called partitioning attention weight (PAW) to mitigate the adverse effects of incorrect labels without requiring additional language models. The proposed method isolates audio segments by partitioning a fully connected attention weight into sub-attention weights. This prevents incorrect labels from causing the model to learn the wrong context across the entire attention weight, and it also guards against overfitting. The method is simple, requires few changes to existing learning frameworks, and leverages the alignment information obtained during the pseudo-labeling process. Our experimental results show consistent improvements in ASR performance across various semi-supervised learning scenarios.
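The partitioning idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes single-head scaled dot-product attention and assumes the segment lengths are already known (in the paper they would come from the alignment produced during pseudo-labeling, whose details the abstract does not give). The names `block_diagonal_mask`, `partitioned_attention`, and `segment_lengths` are hypothetical.

```python
import numpy as np

def block_diagonal_mask(segment_lengths):
    """Boolean (T, T) mask: True where frames i and j share a segment.

    Assumption: segment_lengths lists the number of frames per segment
    and sums to the sequence length T.
    """
    seg_ids = np.repeat(np.arange(len(segment_lengths)), segment_lengths)
    return seg_ids[:, None] == seg_ids[None, :]

def partitioned_attention(q, k, v, segment_lengths):
    """Scaled dot-product attention restricted to within-segment pairs.

    q, k, v: arrays of shape (T, d). Returns (output, weights).
    Illustrative only; not the method as specified in the paper.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    mask = block_diagonal_mask(segment_lengths)
    # Cross-segment scores are set to -inf so they get zero weight.
    scores = np.where(mask, scores, -np.inf)
    # Numerically stable softmax; each row keeps at least its diagonal.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

With segment lengths such as `[2, 3, 1]` for a six-frame input, the resulting weight matrix is block diagonal, so a frame in one segment receives no contribution from frames in another. Under this sketch, an error confined to one segment's label cannot shape the attention pattern of the remaining segments.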
Appears in Collections
Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Chang, Joon-Hyuk
COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)