General-purpose Adversarial Training for Enhanced Automatic Speech Recognition Model Generalization

Kim, Dohee; Shim, Daeyeol; Chang, Joon-Hyuk

doi:10.21437/Interspeech.2023-2389

Full metadata record

DC Field	Value	Language
dc.contributor.author	Kim, Dohee	-
dc.contributor.author	Shim, Daeyeol	-
dc.contributor.author	Chang, Joon-Hyuk	-
dc.date.accessioned	2023-10-10T02:36:18Z	-
dc.date.available	2023-10-10T02:36:18Z	-
dc.date.created	2023-10-04	-
dc.date.issued	2023-08	-
dc.identifier.issn	2308-457X	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/191796	-
dc.description.abstract	We present a new adversarial training method, general-purpose adversarial training (GPAT), that enhances the performance of automatic speech recognition models. GPAT comprises two components: (1) a plausible adversarial examples converter (PAC) and (2) a distribution matching regularization term (DM reg.). Compared to previous studies that directly compute gradients with respect to the input, PAC incorporates non-linearity to achieve performance improvement while eliminating the need for extra forward passes. Furthermore, unlike previous studies that use fixed norms, GPAT can generate similar yet diverse samples through DM reg. We demonstrate that GPAT improves the performance of various models on the LibriSpeech dataset. Specifically, applying GPAT to the conformer model yielded an average relative improvement of 5.3%. In the wav2vec 2.0 experiments, our method yielded word error rates of 2.0%/4.4% on the LibriSpeech test sets without a language model.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	International Speech Communication Association	-
dc.title	General-purpose Adversarial Training for Enhanced Automatic Speech Recognition Model Generalization	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	Chang, Joon-Hyuk	-
dc.identifier.doi	10.21437/Interspeech.2023-2389	-
dc.identifier.scopusid	2-s2.0-85171526253	-
dc.identifier.bibliographicCitation	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, v.2023-August, pp.889 - 893	-
dc.relation.isPartOf	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH	-
dc.citation.title	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH	-
dc.citation.volume	2023-August	-
dc.citation.startPage	889	-
dc.citation.endPage	893	-
dc.type.rims	ART	-
dc.type.docType	Conference paper	-
dc.description.journalClass	1	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordPlus	Speech communication	-
dc.subject.keywordPlus	Adversarial training	-
dc.subject.keywordPlus	Automatic speech recognition	-
dc.subject.keywordPlus	Data augmentation	-
dc.subject.keywordPlus	Distribution matching	-
dc.subject.keywordPlus	Model generalization	-
dc.subject.keywordPlus	Performance	-
dc.subject.keywordPlus	Recognition models	-
dc.subject.keywordPlus	Regularization terms	-
dc.subject.keywordPlus	Training methods	-
dc.subject.keywordPlus	Word error rate	-
dc.subject.keywordPlus	Speech recognition	-
dc.subject.keywordAuthor	adversarial training	-
dc.subject.keywordAuthor	data augmentation	-
dc.subject.keywordAuthor	speech recognition	-
dc.identifier.url	https://www.isca-speech.org/archive/interspeech_2023/kim23l_interspeech.html	-
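
The abstract above reports an average relative improvement of 5.3% for the conformer model. As a reading aid, here is a minimal sketch of how a relative word error rate (WER) improvement is commonly computed. It assumes the average is a simple mean of per-test-set relative reductions, which this record does not state. The function names and the WER pairs are hypothetical illustrations and are not taken from the paper.

```python
from typing import Iterable, Tuple


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction in percent (positive means the new model is better)."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


def average_relative_improvement(pairs: Iterable[Tuple[float, float]]) -> float:
    """Simple mean of per-set relative improvements (an assumed averaging rule)."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one (baseline, new) pair is required")
    return sum(relative_improvement(b, n) for b, n in pairs) / len(pairs)


# Hypothetical (baseline WER, new WER) pairs in percent; NOT results from the paper.
example_pairs = [(4.0, 3.8), (10.0, 9.4)]

if __name__ == "__main__":
    print(f"{average_relative_improvement(example_pairs):.1f}% average relative improvement")
```

With these made-up numbers the two sets improve by about 5% and 6%, so the mean is about 5.5%. A relative figure of this kind depends on the baseline: the same absolute WER drop counts for more when the baseline WER is already low.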

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Chang, Joon-Hyuk: COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)
