
Diagnosis-aware multitask fine-tuning of Whisper for dysarthric speech recognition

Full metadata record
DC Field | Value | Language
dc.contributor.author | Chung, Yoona | -
dc.contributor.author | Hong, Jeongmin | -
dc.contributor.author | Lee, Jaehyuk | -
dc.contributor.author | Kim, Eunchan | -
dc.date.accessioned | 2026-04-21T06:30:35Z | -
dc.date.available | 2026-04-21T06:30:35Z | -
dc.date.issued | 2026-05 | -
dc.identifier.issn | 0167-6393 | -
dc.identifier.issn | 1872-7182 | -
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/212288 | -
dc.description.abstract | Individuals with dysarthria exhibit irregular speech patterns that vary by disease, significantly reducing the accuracy of conventional speech-recognition systems. Previous studies have typically focused on a single disease group or used aggregated data without accounting for inter-disease variation, thereby limiting disease-specific insights. In this study, fluency metrics were extracted from a Korean dysarthric speech corpus across three disease groups (stroke, cerebral palsy, and peripheral neuropathy) and the diseases were classified based on these features. The performance of the disease-specific speech-recognition models was evaluated using the weighted character error rate (Weighted-CER). Results showed that classification based on fluency metrics achieved 99% accuracy. The disease-specific models improved the CER by up to 18.34 and 1.05 percentage points compared with the Whisper–Small model and a model trained on the entire dataset, respectively. In terms of Weighted-CER, the error rate decreased by up to 15.27 and 1.49 percentage points, respectively. These findings indicate that disease-specific models can meaningfully enhance speech recognition and underscore the importance of developing speech-recognition systems that can adapt to individual speech characteristics in patients with dysarthria. | -
dc.format.extent | 17 | -
dc.language | English | -
dc.language.iso | ENG | -
dc.publisher | ELSEVIER | -
dc.title | Diagnosis-aware multitask fine-tuning of Whisper for dysarthric speech recognition | -
dc.type | Article | -
dc.publisher.location | Netherlands | -
dc.identifier.doi | 10.1016/j.specom.2026.103393 | -
dc.identifier.scopusid | 2-s2.0-105034726630 | -
dc.identifier.wosid | 001738286000001 | -
dc.identifier.bibliographicCitation | SPEECH COMMUNICATION, v.180, pp 1 - 17 | -
dc.citation.title | SPEECH COMMUNICATION | -
dc.citation.volume | 180 | -
dc.citation.startPage | 1 | -
dc.citation.endPage | 17 | -
dc.type.docType | Article | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Acoustics | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalWebOfScienceCategory | Acoustics | -
dc.relation.journalWebOfScienceCategory | Computer Science, Interdisciplinary Applications | -
dc.subject.keywordPlus | AUTOMATIC SPEECH | -
dc.subject.keywordPlus | FUNCTION APPROXIMATION | -
dc.subject.keywordPlus | PARAMETERS | -
dc.subject.keywordPlus | DISORDERS | -
dc.subject.keywordPlus | LANGUAGE | -
dc.subject.keywordPlus | CHILDREN | -
dc.subject.keywordPlus | MODEL | -
dc.subject.keywordAuthor | Dysarthria | -
dc.subject.keywordAuthor | Fluency metrics | -
dc.subject.keywordAuthor | Voice quality | -
dc.subject.keywordAuthor | Pathology fine-tuning | -
dc.subject.keywordAuthor | ASR | -
dc.identifier.url | https://www.sciencedirect.com/science/article/pii/S0167639326000415?via%3Dihub | -
Appears in Collections
Seoul College of Engineering > Seoul Department of Information Systems > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Kim, Eunchan
COLLEGE OF ENGINEERING (DEPARTMENT OF INFORMATION SYSTEMS)