Improvement of the end-to-end scene text recognition method for "text-to-speech" conversion
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Makhmudov, F. | - |
dc.contributor.author | Mukhiddinov, M. | - |
dc.contributor.author | Akmalbek, Abdusalomov | - |
dc.contributor.author | Avazov, K. | - |
dc.contributor.author | Khamdamov, U. | - |
dc.contributor.author | Cho, Young Im | - |
dc.date.available | 2021-01-06T03:40:51Z | - |
dc.date.created | 2020-11-13 | - |
dc.date.issued | 2020-11 | - |
dc.identifier.issn | 0219-6913 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/79619 | - |
dc.description.abstract | Methods for text detection and recognition in images of natural scenes have become an active research topic in computer vision and have obtained encouraging achievements over several benchmarks. In this paper, we introduce a robust yet simple pipeline that produces accurate and fast text detection and recognition for the Uzbek language in natural scene images using a fully convolutional network and the Tesseract OCR engine. First, the text detection step quickly predicts text in random orientations in full-color images with a single fully convolutional neural network, discarding redundant intermediate stages. Then, the text recognition step recognizes the Uzbek language, including both the Latin and Cyrillic alphabets, using a trained Tesseract OCR engine. Finally, the recognized text can be pronounced using the Uzbek language text-to-speech synthesizer. The proposed method was tested on the ICDAR 2013, ICDAR 2015 and MSRA-TD500 datasets, and it showed an advantage in efficiently detecting and recognizing text from natural scene images for assisting the visually impaired. © 2020 | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | WORLD SCIENTIFIC PUBL CO PTE LTD | - |
dc.relation.isPartOf | International Journal of Wavelets, Multiresolution and Information Processing | - |
dc.title | Improvement of the end-to-end scene text recognition method for "text-to-speech" conversion | - |
dc.type | Article | - |
dc.type.rims | ART | - |
dc.description.journalClass | 1 | - |
dc.identifier.wosid | 000599931700007 | - |
dc.identifier.doi | 10.1142/S0219691320500526 | - |
dc.identifier.bibliographicCitation | International Journal of Wavelets, Multiresolution and Information Processing, v.18, no.6 | - |
dc.description.isOpenAccess | N | - |
dc.identifier.scopusid | 2-s2.0-85095450703 | - |
dc.citation.title | International Journal of Wavelets, Multiresolution and Information Processing | - |
dc.citation.volume | 18 | - |
dc.citation.number | 6 | - |
dc.contributor.affiliatedAuthor | Makhmudov, F. | - |
dc.contributor.affiliatedAuthor | Mukhiddinov, M. | - |
dc.contributor.affiliatedAuthor | Akmalbek, Abdusalomov | - |
dc.contributor.affiliatedAuthor | Avazov, K. | - |
dc.contributor.affiliatedAuthor | Cho, Young Im | - |
dc.type.docType | Article | - |
dc.subject.keywordAuthor | fully convolutional network | - |
dc.subject.keywordAuthor | natural scene images | - |
dc.subject.keywordAuthor | optical character recognition | - |
dc.subject.keywordAuthor | scene text detection | - |
dc.subject.keywordAuthor | text recognition | - |
dc.subject.keywordAuthor | text-to-speech synthesizer | - |
dc.subject.keywordAuthor | visually impaired | - |
dc.subject.keywordPlus | Character recognition | - |
dc.subject.keywordPlus | Convolution | - |
dc.subject.keywordPlus | Convolutional neural networks | - |
dc.subject.keywordPlus | Engines | - |
dc.subject.keywordPlus | Speech synthesis | - |
dc.subject.keywordPlus | Convolutional networks | - |
dc.subject.keywordPlus | Full color images | - |
dc.subject.keywordPlus | Intermediate stage | - |
dc.subject.keywordPlus | Natural scene images | - |
dc.subject.keywordPlus | Random orientations | - |
dc.subject.keywordPlus | Text recognition | - |
dc.subject.keywordPlus | Text to speech synthesizers | - |
dc.subject.keywordPlus | Visually impaired | - |
dc.subject.keywordPlus | Speech recognition | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Mathematics | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Software Engineering | - |
dc.relation.journalWebOfScienceCategory | Mathematics, Interdisciplinary Applications | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |