Improvement of the end-to-end scene text recognition method for "text-to-speech" conversion

Makhmudov, F.; Mukhiddinov, M.; Akmalbek, Abdusalomov; Avazov, K.; Khamdamov, U.; Cho, Young Im

Detailed Information

Cited 16 times in Web of Science

Cited 18 times in Scopus

Full metadata record

DC Field	Value	Language
dc.contributor.author	Makhmudov, F.	-
dc.contributor.author	Mukhiddinov, M.	-
dc.contributor.author	Akmalbek, Abdusalomov	-
dc.contributor.author	Avazov, K.	-
dc.contributor.author	Khamdamov, U.	-
dc.contributor.author	Cho, Young Im	-
dc.date.available	2021-01-06T03:40:51Z	-
dc.date.created	2020-11-13	-
dc.date.issued	2020-11	-
dc.identifier.issn	0219-6913	-
dc.identifier.uri	https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/79619	-
dc.description.abstract	Methods for text detection and recognition in images of natural scenes have become an active research topic in computer vision and have obtained encouraging achievements over several benchmarks. In this paper, we introduce a robust yet simple pipeline that produces accurate and fast text detection and recognition for the Uzbek language in natural scene images using a fully convolutional network and the Tesseract OCR engine. First, the text detection step quickly predicts text in random orientations in full-color images with a single fully convolutional neural network, discarding redundant intermediate stages. Then, the text recognition step recognizes the Uzbek language, including both the Latin and Cyrillic alphabets, using a trained Tesseract OCR engine. Finally, the recognized text can be pronounced using the Uzbek language text-to-speech synthesizer. The proposed method was tested on the ICDAR 2013, ICDAR 2015 and MSRA-TD500 datasets, and it showed an advantage in efficiently detecting and recognizing text from natural scene images for assisting the visually impaired. © 2020	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	WORLD SCIENTIFIC PUBL CO PTE LTD	-
dc.relation.isPartOf	International Journal of Wavelets, Multiresolution and Information Processing	-
dc.title	Improvement of the end-to-end scene text recognition method for "text-to-speech" conversion	-
dc.type	Article	-
dc.type.rims	ART	-
dc.description.journalClass	1	-
dc.identifier.wosid	000599931700007	-
dc.identifier.doi	10.1142/S0219691320500526	-
dc.identifier.bibliographicCitation	International Journal of Wavelets, Multiresolution and Information Processing, v.18, no.6	-
dc.description.isOpenAccess	N	-
dc.identifier.scopusid	2-s2.0-85095450703	-
dc.citation.title	International Journal of Wavelets, Multiresolution and Information Processing	-
dc.citation.volume	18	-
dc.citation.number	6	-
dc.contributor.affiliatedAuthor	Makhmudov, F.	-
dc.contributor.affiliatedAuthor	Mukhiddinov, M.	-
dc.contributor.affiliatedAuthor	Akmalbek, Abdusalomov	-
dc.contributor.affiliatedAuthor	Avazov, K.	-
dc.contributor.affiliatedAuthor	Cho, Young Im	-
dc.type.docType	Article	-
dc.subject.keywordAuthor	fully convolutional network	-
dc.subject.keywordAuthor	natural scene images	-
dc.subject.keywordAuthor	optical character recognition	-
dc.subject.keywordAuthor	Scene text detection	-
dc.subject.keywordAuthor	text recognition	-
dc.subject.keywordAuthor	text-to-speech synthesizer	-
dc.subject.keywordAuthor	visually impaired	-
dc.subject.keywordPlus	Character recognition	-
dc.subject.keywordPlus	Convolution	-
dc.subject.keywordPlus	Convolutional neural networks	-
dc.subject.keywordPlus	Engines	-
dc.subject.keywordPlus	Speech synthesis	-
dc.subject.keywordPlus	Convolutional networks	-
dc.subject.keywordPlus	Full color images	-
dc.subject.keywordPlus	Intermediate stage	-
dc.subject.keywordPlus	Natural scene images	-
dc.subject.keywordPlus	Random orientations	-
dc.subject.keywordPlus	Text recognition	-
dc.subject.keywordPlus	Text to speech synthesizers	-
dc.subject.keywordPlus	Visually impaired	-
dc.subject.keywordPlus	Speech recognition	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Mathematics	-
dc.relation.journalWebOfScienceCategory	Computer Science, Software Engineering	-
dc.relation.journalWebOfScienceCategory	Mathematics, Interdisciplinary Applications	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
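
The abstract above describes three stages: text detection, text recognition, and speech synthesis. The record contains no code, so the following is only a minimal sketch of how such stages could be chained. Every name here (`run_pipeline`, `PipelineResult`, the `fake_*` stubs) is hypothetical, and the reading-order sort is an assumption for illustration, not something the abstract states. In a real system, `detect` might wrap a fully convolutional detector, `recognize` a trained Tesseract engine, and `synthesize` an Uzbek text-to-speech synthesizer.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Tuple

# Hypothetical box format: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class PipelineResult:
    boxes: List[Box]   # detected regions, in assumed reading order
    texts: List[str]   # non-empty recognized strings, same order
    audio: bytes       # synthesizer output (empty if no text was found)


def run_pipeline(
    image: Any,
    detect: Callable[[Any], Iterable[Box]],
    recognize: Callable[[Any, Box], str],
    synthesize: Callable[[str], bytes],
) -> PipelineResult:
    """Chain detection -> recognition -> speech synthesis."""
    boxes = list(detect(image))
    # Assumption: read top-to-bottom, then left-to-right.
    boxes.sort(key=lambda b: (b[1], b[0]))
    # Recognize each region and drop regions that yield no text.
    texts = [t for t in (recognize(image, b).strip() for b in boxes) if t]
    audio = synthesize(" ".join(texts)) if texts else b""
    return PipelineResult(boxes=boxes, texts=texts, audio=audio)


# Stub stages standing in for real models, for demonstration only.
def fake_detect(image: Any) -> List[Box]:
    return [(120, 10, 60, 20), (10, 10, 100, 20), (10, 50, 80, 20)]


def fake_recognize(image: Any, box: Box) -> str:
    lookup = {
        (10, 10, 100, 20): "Toshkent ",
        (120, 10, 60, 20): "shahri",
        (10, 50, 80, 20): "  ",  # a false-positive region with no text
    }
    return lookup[box]


def fake_synthesize(text: str) -> bytes:
    # A real synthesizer would return audio samples; this stub returns UTF-8 bytes.
    return text.encode("utf-8")
```

Passing the stages in as callables keeps the sketch independent of any particular detector, OCR engine, or synthesizer, so each stage can be swapped or tested separately.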

Files in This Item: There are no files associated with this item.

Appears in Collections: College of IT Convergence > Department of Computer Engineering > 1. Journal Articles

Related Researcher

ugli, Mukhiddinov Mukhriddin Nuriddin: College of IT Convergence (School of Computer Engineering, Computer Engineering major)

Certain data included herein are derived from the © Web of Science of Clarivate Analytics. All rights reserved.
You may not copy or re-distribute this material in whole or in part without the prior written consent of Clarivate Analytics.
