Improving transformer-based acoustic model performance using sequence discriminative training

이채원; 장준혁

doi:10.7776/ASK.2022.41.3.335

Full metadata record

DC Field	Value	Language
dc.contributor.author	이채원	-
dc.contributor.author	장준혁	-
dc.date.accessioned	2023-09-26T08:50:07Z	-
dc.date.available	2023-09-26T08:50:07Z	-
dc.date.issued	2022-05	-
dc.identifier.issn	1225-4428	-
dc.identifier.issn	2287-3775	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/191208	-
dc.description.abstract	본 논문에서는 기존 자연어 처리 분야에서 뛰어난 성능을 보이는 트랜스포머를 하이브리드 음성인식에서의 음향모델로 사용하였다. 트랜스포머 음향모델은 attention 구조를 사용하여 시계열 데이터를 처리하며 연산량이 낮으면서 높은 성능을 보인다. 본 논문은 이러한 트랜스포머 AM에 기존 DNN-HMM 모델에서 사용하는 가중 유한 상태 전이기(weighted Finite-State Transducer, wFST) 기반 학습인 시퀀스 분류 학습의 네 가지 알고리즘을 각각 적용하여 성능을 높이는 방법을 제안한다. 또한 기존 Cross Entropy(CE)를 사용한 학습방식과 비교하여 5 %의 상대적 word error rate(WER) 감소율을 보였다.	-
dc.description.abstract	In this paper, we adopt the transformer, which shows remarkable performance in natural language processing, as the acoustic model (AM) in hybrid speech recognition. The transformer acoustic model uses an attention structure to process sequential data and achieves high performance at low computational cost. We propose a method that improves the performance of the transformer AM by applying each of four sequence discriminative training algorithms, a weighted finite-state transducer (wFST)-based training approach used in existing DNN-HMM models. Compared with training based on Cross Entropy (CE), sequence discriminative training achieves a 5 % relative reduction in Word Error Rate (WER).	-
dc.format.extent	7	-
dc.language	Korean	-
dc.language.iso	KOR	-
dc.publisher	한국음향학회	-
dc.title	Sequence discriminative training 기법을 사용한 트랜스포머 기반 음향 모델 성능 향상	-
dc.title.alternative	Improving transformer-based acoustic model performance using sequence discriminative training	-
dc.type	Article	-
dc.publisher.location	Republic of Korea	-
dc.identifier.doi	10.7776/ASK.2022.41.3.335	-
dc.identifier.scopusid	2-s2.0-85133242678	-
dc.identifier.wosid	000810472000009	-
dc.identifier.bibliographicCitation	한국음향학회지, v.41, no.3, pp 335 - 341	-
dc.citation.title	한국음향학회지	-
dc.citation.volume	41	-
dc.citation.number	3	-
dc.citation.startPage	335	-
dc.citation.endPage	341	-
dc.type.docType	Article	-
dc.identifier.kciid	ART002844665	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.description.journalRegisteredClass	esci	-
dc.description.journalRegisteredClass	kci	-
dc.relation.journalResearchArea	Acoustics	-
dc.relation.journalWebOfScienceCategory	Acoustics	-
dc.subject.keywordAuthor	Speech recognition	-
dc.subject.keywordAuthor	Transformer	-
dc.subject.keywordAuthor	Sequence discriminative training	-
dc.subject.keywordAuthor	Weighted finite state transducer	-
dc.subject.keywordAuthor	음성인식	-
dc.subject.keywordAuthor	트랜스포머	-
dc.subject.keywordAuthor	시퀀스 분류 학습	-
dc.subject.keywordAuthor	가중 유한 상태 전이기	-
dc.identifier.url	http://koreascience.or.kr/article/JAKO202216466662641.page	-
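
The "5 % relative WER reduction" reported in the abstract is a ratio against the baseline WER, not a difference in percentage points. The arithmetic can be sketched as follows. This is a minimal illustration: the function name `relative_wer_reduction` and the WER values are hypothetical and are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical WERs in percent, chosen only to illustrate the arithmetic.
# They are NOT results reported in the paper.
ce_wer = 10.0    # assumed baseline trained with cross entropy (CE)
sdt_wer = 9.5    # assumed model after sequence discriminative training

reduction = relative_wer_reduction(ce_wer, sdt_wer)
print(f"Relative WER reduction: {reduction:.1%}")
```

With these assumed values, an absolute drop of 0.5 percentage points corresponds to a 5 % relative reduction, which is why the baseline WER has to be known before a relative figure can be interpreted.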

Files in This Item

Sequence dicriminative training 기법을 사용한 트랜스포머 기반 음향 모델 성능 향상.pdf 550.74 kB

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher

Chang, Joon-Hyuk: COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)
