Detailed Information

Cited 3 times in Web of Science; cited 0 times in Scopus

Visual speech recognition of Korean words using convolutional neural network

Authors
Lee, S.-W.; Yu, J.-H.; Park, S.M.; Sim, K.-B.
Issue Date
Mar-2019
Publisher
Korean Institute of Intelligent Systems
Keywords
Convolutional neural network; Human-robot interaction; Korean word recognition; Viola-Jones algorithm; Visual speech recognition
Citation
International Journal of Fuzzy Logic and Intelligent Systems, v. 19, no. 1, pp. 1-9
Pages
9
Journal Title
International Journal of Fuzzy Logic and Intelligent Systems
Volume
19
Number
1
Start Page
1
End Page
9
URI
https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/26394
DOI
10.5391/IJFIS.2019.19.1.1
ISSN
1598-2645
2093-744X
Abstract
In recent studies, speech recognition performance has been greatly improved by using hidden Markov models (HMMs) and convolutional neural networks (CNNs). HMMs statistically model the voice to construct an acoustic model, and CNNs reduce the error rate by predicting speech from images of the mouth region. In this paper, we propose visual speech recognition (VSR) using lip images. To implement VSR, we repeatedly recorded three subjects speaking 53 words chosen from an emergency medical service vocabulary book. Audio signals were used to extract images of consonants, vowels, and final consonants from the recorded video. The Viola-Jones algorithm was used for lip tracking on the extracted images. The lip-tracking images were grouped and then classified using CNNs. To classify the components of a syllable (consonants, vowels, and final consonants), two CNN structures were used: VGG-s and a modified LeNet-5, which has more layers. After all syllable components were classified, the word was identified by Euclidean distance. In this experiment, a classification rate of 72.327% on a total of 318 test words was obtained with VGG-s. When LeNet-5 was applied as the word classifier, however, the classification rate was 22.327%. © The Korean Institute of Intelligent Systems.
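The final step in the abstract, finding a word by Euclidean distance once its syllable components have been classified, can be illustrated with a minimal sketch. This is not the authors' implementation. The vocabulary entries, the encoding of each word as a fixed-length vector of component class indices, and the names `VOCAB`, `euclidean`, and `nearest_word` are all hypothetical and chosen only for illustration. The abstract does not say how words are encoded.

```python
import math

# Hypothetical vocabulary: each word is encoded as a fixed-length vector of
# class indices for its syllable components. The values are made up.
VOCAB = {
    "word_a": [1, 4, 0, 7, 2, 3],
    "word_b": [1, 5, 0, 7, 2, 3],
    "word_c": [9, 0, 2, 3, 8, 1],
}


def euclidean(u, v):
    """Return the Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_word(predicted, vocab):
    """Return the vocabulary word whose encoding is closest to `predicted`."""
    return min(vocab, key=lambda word: euclidean(predicted, vocab[word]))
```

For example, `nearest_word([1, 4, 0, 7, 2, 4], VOCAB)` returns `"word_a"`: the prediction differs from that entry in one component, at distance 1, and is farther from the other two entries. A nearest-neighbour lookup of this kind lets a word be recovered even when one component is misclassified, provided the vocabulary encodings are well separated.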
Files in This Item
There are no files associated with this item.
Appears in Collections
College of ICT Engineering > School of Electrical and Electronics Engineering > 1. Journal Articles


Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
