A Self-Evaluated Bilingual Automatic Speech Recognition System for Mandarin–English Mixed Conversations

Hai, Xinhe; Aranganadin, Kaviya; Yeh, Cheng Cheng; Hua, Zhengmao; Huang, Chenyun; Hsu, Huayi; Lin, M. C.

doi:10.3390/app15147691

Full metadata record

DC Field	Value	Language
dc.contributor.author	Hai, Xinhe	-
dc.contributor.author	Aranganadin, Kaviya	-
dc.contributor.author	Yeh, Cheng Cheng	-
dc.contributor.author	Hua, Zhengmao	-
dc.contributor.author	Huang, Chenyun	-
dc.contributor.author	Hsu, Huayi	-
dc.contributor.author	Lin, M. C.	-
dc.date.accessioned	2025-09-10T02:30:24Z	-
dc.date.available	2025-09-10T02:30:24Z	-
dc.date.issued	2025-07	-
dc.identifier.issn	2076-3417	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208701	-
dc.description.abstract	Bilingual communication is increasingly prevalent in this globally connected world, where cultural exchanges and international interactions are unavoidable. Existing automatic speech recognition (ASR) systems are often limited to single languages. However, the growing demand for bilingual ASR in human–computer interactions, particularly in medical services, has become indispensable. This article addresses this need by creating an application programming interface (API)-based platform using VOSK, a popular open-source single-language ASR toolkit, to efficiently deploy a self-evaluated bilingual ASR system that seamlessly handles both primary and secondary languages in tasks like Mandarin–English mixed-speech recognition. The mixed error rate (MER) is used as a performance metric, and a workflow is outlined for its calculation using the edit distance algorithm. Results show a remarkable reduction in the Mandarin–English MER, dropping from ∼65% to under 13%, after implementing the self-evaluation framework and mixed-language algorithms. These findings highlight the importance of a well-designed system to manage the complexities of mixed-language speech recognition, offering a promising method for building a bilingual ASR system using existing monolingual models. The framework might be further extended to a trilingual or multilingual ASR system by preparing mixed-language datasets and computer development without involving complex training.	-
dc.format.extent	19	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	MDPI	-
dc.title	A Self-Evaluated Bilingual Automatic Speech Recognition System for Mandarin–English Mixed Conversations	-
dc.title.alternative	A Self-Evaluated Bilingual Automatic Speech Recognition System for Mandarin-English Mixed Conversations	-
dc.type	Article	-
dc.publisher.location	Switzerland	-
dc.identifier.doi	10.3390/app15147691	-
dc.identifier.scopusid	2-s2.0-105011753018	-
dc.identifier.wosid	001535538800001	-
dc.identifier.bibliographicCitation	Applied Sciences-basel, v.15, no.14, pp 1 - 19	-
dc.citation.title	Applied Sciences-basel	-
dc.citation.volume	15	-
dc.citation.number	14	-
dc.citation.startPage	1	-
dc.citation.endPage	19	-
dc.type.docType	Article	-
dc.description.isOpenAccess	Y	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Chemistry	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalResearchArea	Materials Science	-
dc.relation.journalResearchArea	Physics	-
dc.relation.journalWebOfScienceCategory	Chemistry, Multidisciplinary	-
dc.relation.journalWebOfScienceCategory	Engineering, Multidisciplinary	-
dc.relation.journalWebOfScienceCategory	Materials Science, Multidisciplinary	-
dc.relation.journalWebOfScienceCategory	Physics, Applied	-
dc.subject.keywordPlus	MODELS	-
dc.subject.keywordPlus	INFORMATION	-
dc.subject.keywordAuthor	API	-
dc.subject.keywordAuthor	Automatic Speech Recognition	-
dc.subject.keywordAuthor	Bilingual	-
dc.subject.keywordAuthor	Mandarin–English	-
dc.subject.keywordAuthor	Mixed Error Rate	-
dc.subject.keywordAuthor	Computer Systems Programming	-
dc.subject.keywordAuthor	Human Computer Interaction	-
dc.subject.keywordAuthor	Linguistics	-
dc.subject.keywordAuthor	Open Source Software	-
dc.subject.keywordAuthor	Open Systems	-
dc.subject.keywordAuthor	Speech Communication	-
dc.subject.keywordAuthor	Speech Recognition	-
dc.subject.keywordAuthor	Applications Programming Interfaces	-
dc.subject.keywordAuthor	Automatic Speech Recognition System	-
dc.subject.keywordAuthor	Bilinguals	-
dc.subject.keywordAuthor	Computer Interaction	-
dc.subject.keywordAuthor	Error Rate	-
dc.subject.keywordAuthor	Growing Demand	-
dc.subject.keywordAuthor	Mixed Errors	-
dc.subject.keywordAuthor	Application Programming Interfaces (API)	-
dc.identifier.url	https://www.mdpi.com/2076-3417/15/14/7691	-
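
The abstract states that the mixed error rate (MER) is calculated using the edit distance algorithm. As a rough illustration only, the Python sketch below shows one way such a calculation could look. It assumes that Mandarin is scored per character and English per word, and that MER is the edit distance divided by the number of reference tokens; the paper's actual tokenization and workflow may differ. The names tokenize, edit_distance, and mixed_error_rate are hypothetical and are not taken from the paper or from VOSK.

```python
import re

# Assumption: one token per CJK character, one token per English word or number.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9']+")


def tokenize(text):
    """Split mixed Mandarin-English text into comparable tokens (lowercased)."""
    return [token.lower() for token in _TOKEN.findall(text)]


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (unit costs)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1      # reference token missing from hypothesis
            insertion = curr[j - 1] + 1  # extra token in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def mixed_error_rate(reference, hypothesis):
    """Edit distance over mixed tokens, normalized by reference length."""
    ref = tokenize(reference)
    hyp = tokenize(hypothesis)
    if not ref:
        raise ValueError("reference contains no tokens")
    return edit_distance(ref, hyp) / len(ref)
```

For example, comparing the reference "我想要 check in" with the hypothesis "我要 check in" gives five reference tokens and one deletion, so this sketch would report an MER of 0.2.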

Appears in Collections: Seoul College of Engineering > Seoul Major in Electrical Engineering > 1. Journal Articles

Related Researcher

Lin, Ming Chieh: COLLEGE OF ENGINEERING (MAJOR IN ELECTRICAL ENGINEERING)
