A space-efficient alphabet-independent Four-Russians' lookup table and a multithreaded Four-Russians' edit distance algorithm

Full metadata record
DC Field | Value | Language
dc.contributor.author | Kim, Youngho | -
dc.contributor.author | Na, Joong Chae | -
dc.contributor.author | Park, Heejin | -
dc.contributor.author | Sim, Jeong Seop | -
dc.date.accessioned | 2022-07-15T02:47:22Z | -
dc.date.available | 2022-07-15T02:47:22Z | -
dc.date.created | 2021-05-12 | -
dc.date.issued | 2016-12 | -
dc.identifier.issn | 0304-3975 | -
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/153478 | -
dc.description.abstract | Given two strings $X$ ($|X| = m$) and $Y$ ($|Y| = n$) over an alphabet $\Sigma$, the edit distance between $X$ and $Y$ can be computed in $O(mn/t)$ time with the help of the Four-Russians' lookup table whose block size is $t$. The Four-Russians' lookup table can be constructed in $O((3|\Sigma|)^{2t} t^2)$ time using $O((3|\Sigma|)^{2t} t)$ space. However, the construction time and space requirement of the lookup table grow very fast as the alphabet size increases, and thus it has been used only when $|\Sigma|$ is very small. For example, when a string is a protein sequence, $|\Sigma| = 20$, and thus it is almost impossible to use the Four-Russians' lookup table on typical workstations. In this paper, we present an efficient alphabet-independent Four-Russians' lookup table. It requires $O(3^{2t} (2t)!\, t)$ space and can be constructed in $O(3^{2t} (2t)!\, t^2)$ time. Thus, the Four-Russians' lookup table can be constructed and used irrespective of the alphabet size. The time and space complexity were achieved by compacting the lookup table using a clever encoding of the preprocessed strings. Experimental results show that the space requirement of the lookup table is reduced to about 1/5,172,030 of its original size when $|\Sigma| = 26$ and $t = 4$. Furthermore, we present efficient multithreaded parallel algorithms for edit distance computation using the Four-Russians' lookup table. The parallel algorithm for lookup table construction runs in $O(t)$ time and the parallel algorithm for edit distance computation between $X$ and $Y$ runs in $O(m + n)$ time. Experiments performed on a CUDA-supported GPU show that our algorithm runs about 942 times faster than the sequential version of the original Four-Russians' algorithm for 100 pairs of random strings of length approximately 1,000 when $|\Sigma| = 4$ and $t = 4$. | -
dc.language | English | -
dc.language.iso | en | -
dc.publisher | ELSEVIER | -
dc.title | A space-efficient alphabet-independent Four-Russians' lookup table and a multithreaded Four-Russians' edit distance algorithm | -
dc.type | Article | -
dc.contributor.affiliatedAuthor | Park, Heejin | -
dc.identifier.doi | 10.1016/j.tcs.2016.04.028 | -
dc.identifier.scopusid | 2-s2.0-84965130966 | -
dc.identifier.wosid | 000390971000007 | -
dc.identifier.bibliographicCitation | THEORETICAL COMPUTER SCIENCE, v.656, pp.173 - 179 | -
dc.relation.isPartOf | THEORETICAL COMPUTER SCIENCE | -
dc.citation.title | THEORETICAL COMPUTER SCIENCE | -
dc.citation.volume | 656 | -
dc.citation.startPage | 173 | -
dc.citation.endPage | 179 | -
dc.type.rims | ART | -
dc.type.docType | Article | -
dc.description.journalClass | 1 | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalWebOfScienceCategory | Computer Science, Theory & Methods | -
dc.subject.keywordPlus | Approximate string matching | -
dc.subject.keywordPlus | Construction time | -
dc.subject.keywordPlus | Edit distance | -
dc.subject.keywordPlus | Parallelizations | -
dc.subject.keywordPlus | Protein sequences | -
dc.subject.keywordPlus | Space efficient | -
dc.subject.keywordPlus | Space requirements | -
dc.subject.keywordPlus | Time and space complexity | -
dc.subject.keywordAuthor | Approximate string matching | -
dc.subject.keywordAuthor | Edit distance | -
dc.subject.keywordAuthor | Four-Russians' algorithm | -
dc.subject.keywordAuthor | Parallelization | -
dc.identifier.url | https://www.sciencedirect.com/science/article/pii/S0304397516300676?via%3Dihub | -
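
As a point of reference for the abstract in the record above, here is a minimal Python sketch of the classic quadratic-time edit distance recurrence that the Four-Russians' technique accelerates, together with the ratio of the two space bounds quoted in the abstract. The function names `edit_distance` and `bound_ratio` are illustrative assumptions and do not come from the paper, and the code does not implement the paper's lookup table or its encoding. Because `bound_ratio` compares asymptotic bounds and ignores constant factors, it should only be expected to land near the measured reduction of about 1/5,172,030, not to reproduce it exactly.

```python
from math import factorial


def edit_distance(x: str, y: str) -> int:
    """Classic O(mn) dynamic program with unit-cost insert, delete, substitute."""
    # prev[j] holds the distance between the processed prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        curr = [i]  # distance from x[:i] to the empty prefix of y
        for j, cy in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cx
                curr[j - 1] + 1,           # insert cy
                prev[j - 1] + (cx != cy),  # match (cost 0) or substitute (cost 1)
            ))
        prev = curr
    return prev[-1]


def bound_ratio(sigma: int, t: int) -> float:
    """Ratio of the space bounds (3*sigma)^(2t) * t and 3^(2t) * (2t)! * t.

    The factors 3^(2t) and t cancel, leaving sigma^(2t) / (2t)!.
    """
    return sigma ** (2 * t) / factorial(2 * t)


print(edit_distance("kitten", "sitting"))  # 3
print(round(bound_ratio(26, 4)))           # 5179243
```

For an alphabet of size 26 and block size 4, the ratio of the bounds is 26^8 / 8!, roughly 5.18 million, which is the same order of magnitude as the experimentally reported reduction.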
Appears in Collections
Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles


Related Researcher

Park, Hee jin
COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)