Task-Specific Optimization of Virtual Channel Linear Prediction-Based Speech Dereverberation Front-End for Far-Field Speaker Verification

Yang, Joon-Young; Chang, Joon-Hyuk

doi:10.1109/TASLP.2022.3205752

Full metadata record

DC Field	Value	Language
dc.contributor.author	Yang, Joon-Young	-
dc.contributor.author	Chang, Joon-Hyuk	-
dc.date.accessioned	2022-12-20T06:28:27Z	-
dc.date.available	2022-12-20T06:28:27Z	-
dc.date.created	2022-11-02	-
dc.date.issued	2022-09	-
dc.identifier.issn	2329-9290	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/173113	-
dc.description.abstract	Developing a single-microphone speech denoising or dereverberation front-end for robust automatic speaker verification (ASV) in noisy far-field speaking scenarios is challenging. To address this problem, we present a novel front-end design based on a recently proposed extension of the weighted prediction error (WPE) speech dereverberation algorithm, the virtual acoustic channel expansion (VACE)-WPE. We demonstrate experimentally that, unlike the conventional WPE algorithm, the VACE-WPE can be explicitly trained to cancel both late reverberation and background noise. To build the front-end, the VACE-WPE is first (pre)trained to preserve the noise components in the input signals and produce "noisy" dereverberated output signals, thereby inductively biasing the front-end to preserve as many noise components as possible and to perform dereverberation only. Subsequently, given a pretrained speaker embedding model, the VACE-WPE is fine-tuned within a task-specific optimization (TSO) framework so that the speaker embedding extracted from the processed signal becomes similar to that extracted from the "noise-free" target signal. Consequently, the front-end is optimized to avoid excessive denoising, thus achieving "generally safe" dereverberation and denoising for far-field ASV. Moreover, to prevent the front-end from degrading unconstrained "in-the-wild" ASV performance under more general, non-far-field conditions, we propose a distortion regularization method within the TSO framework. The effectiveness of the proposed approach is verified on both far-field and in-the-wild ASV benchmarks, demonstrating its superiority over fully neural front-ends and other TSO methods in various cases.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC	-
dc.title	Task-Specific Optimization of Virtual Channel Linear Prediction-Based Speech Dereverberation Front-End for Far-Field Speaker Verification	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	Chang, Joon-Hyuk	-
dc.identifier.doi	10.1109/TASLP.2022.3205752	-
dc.identifier.scopusid	2-s2.0-85139437720	-
dc.identifier.wosid	000865086500002	-
dc.identifier.bibliographicCitation	IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, v.30, pp.3144 - 3159	-
dc.relation.isPartOf	IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING	-
dc.citation.title	IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING	-
dc.citation.volume	30	-
dc.citation.startPage	3144	-
dc.citation.endPage	3159	-
dc.type.rims	ART	-
dc.type.docType	Article	-
dc.description.journalClass	1	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Acoustics	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalWebOfScienceCategory	Acoustics	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.subject.keywordPlus	NEURAL-NETWORKS	-
dc.subject.keywordPlus	ENHANCEMENT	-
dc.subject.keywordPlus	VOICES	-
dc.subject.keywordAuthor	Noise reduction	-
dc.subject.keywordAuthor	Training	-
dc.subject.keywordAuthor	Noise measurement	-
dc.subject.keywordAuthor	Task analysis	-
dc.subject.keywordAuthor	Optimization	-
dc.subject.keywordAuthor	Microphones	-
dc.subject.keywordAuthor	Reverberation	-
dc.subject.keywordAuthor	Deep neural network	-
dc.subject.keywordAuthor	offline processing	-
dc.subject.keywordAuthor	speaker verification	-
dc.subject.keywordAuthor	speech dereverberation	-
dc.subject.keywordAuthor	single microphone	-
dc.subject.keywordAuthor	virtual acoustic channel expansion	-
dc.subject.keywordAuthor	weighted prediction error	-
dc.identifier.url	https://ieeexplore.ieee.org/document/9889165	-
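
The fine-tuning objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss in closed form, so everything here is an assumption made for illustration: the function names, the cosine distance used as the embedding-similarity measure, the mean absolute difference between front-end input and output used as the distortion regularizer, and the `reg_weight` value. It is not the authors' implementation.

```python
import numpy as np


def cosine_distance(a, b, eps=1e-8):
    """Return 1 - cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b) + eps
    return 1.0 - float(np.dot(a, b) / denom)


def tso_loss(emb_processed, emb_target, x_in, x_out, reg_weight=0.1):
    """Hypothetical TSO objective (illustrative assumption, not from the paper).

    emb_processed: speaker embedding of the front-end output
    emb_target:    speaker embedding of the "noise-free" target signal
    x_in, x_out:   front-end input and output waveforms (same length)
    reg_weight:    assumed weight of the distortion regularizer
    """
    # Task-specific term: pull the processed embedding toward the target one.
    embed_term = cosine_distance(emb_processed, emb_target)
    # Assumed regularizer: penalize how much the front-end alters its input.
    x_in = np.asarray(x_in, dtype=float)
    x_out = np.asarray(x_out, dtype=float)
    distortion_term = float(np.mean(np.abs(x_out - x_in)))
    return embed_term + reg_weight * distortion_term
```

In an actual training loop both terms would be computed with a differentiable framework so that gradients reach the front-end parameters while the pretrained speaker embedding model stays fixed; the NumPy version above only shows how the two terms combine.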

Appears in Collections: College of Engineering (Seoul) > School of Electronic Engineering (Seoul) > 1. Journal Articles

Chang, Joon-Hyuk: COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)

Read more

Altmetrics

Total Views & Downloads

RSS_1.0 RSS_2.0 ATOM_1.0

222, Wangsimni-ro, Seongdong-gu, Seoul, 04763, Korea+82-2-2220-1366


Detailed Information

Related Researcher

Altmetrics

Total Views & Downloads

BROWSE