An efficient DNA sequence searching method using position specific weighting scheme

Kim, Woo-Cheol; Park, Sanghyun; Won, Jung-Im; Kim, Sang-Wook; Yoon, Jee-Hee

doi:10.1177/0165551506062329


Authors: Kim, Woo-Cheol; Park, Sanghyun; Won, Jung-Im; Kim, Sang-Wook; Yoon, Jee-Hee

Issue Date: Apr-2006

Publisher: SAGE Publications

Keywords: DNA database; indexing; query processing; exact match; wildcard match; k-mismatch

Citation: Journal of Information Science, v.32, no.2, pp. 176-190

Pages: 15

Indexed: SCIE; SCOPUS

Journal Title: Journal of Information Science

Volume: 32

Number: 2

Start Page: 176

End Page: 190

URI: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/181935

DOI: 10.1177/0165551506062329

ISSN: 0165-5515; 1741-6485

Abstract: Exact match queries, wildcard match queries, and k-mismatch queries are widely used in various molecular biology applications, including searches of ESTs (Expressed Sequence Tags) and DNA transcription factors. In this paper, we propose an efficient indexing and processing mechanism for such queries. Our indexing method places a sliding window at every possible location of a DNA sequence and extracts its signature by considering the occurrence frequency of each nucleotide. It then stores the set of signatures in a multi-dimensional index such as the R*-tree. In addition, by assigning a weight to each position of a window, it prevents signatures from being concentrated around a few spots in the indexing space. Our query processing method converts a query sequence into a multi-dimensional rectangle and searches the index for the signatures overlapping with the rectangle. Experiments with real biological data sets show that the proposed approach is at least 4.4 times, 2.1 times, and several orders of magnitude faster than the previous approach in performing exact match, wildcard match, and k-mismatch queries, respectively.
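
The indexing and query steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the linear weighting in `position_weights` is hypothetical, the wildcard symbol `N` is chosen for illustration, the query is assumed to be exactly as long as the window, and a plain list scan stands in for the R*-tree. All function names are invented for this example.

```python
from typing import List, Tuple

NUCLEOTIDES = "ACGT"  # one signature dimension per nucleotide


def position_weights(w: int) -> List[int]:
    # Hypothetical weighting scheme (assumption): position i gets weight i + 1.
    return [i + 1 for i in range(w)]


def signature(window: str, weights: List[int]) -> Tuple[int, ...]:
    # Weighted occurrence count of each nucleotide in the window.
    sig = [0, 0, 0, 0]
    for ch, wt in zip(window, weights):
        sig[NUCLEOTIDES.index(ch)] += wt
    return tuple(sig)


def build_index(seq: str, w: int) -> List[Tuple[Tuple[int, ...], int]]:
    # Slide a window over every position and record (signature, offset).
    # A real system would insert these points into a multi-dimensional index.
    weights = position_weights(w)
    return [(signature(seq[i:i + w], weights), i)
            for i in range(len(seq) - w + 1)]


def query_rectangle(query: str, weights: List[int], wildcard: str = "N"):
    # Fixed positions contribute to the lower bound of their dimension.
    # A wildcard position could be any nucleotide, so its weight widens
    # the upper bound of every dimension.
    low = [0, 0, 0, 0]
    slack = 0
    for ch, wt in zip(query, weights):
        if ch == wildcard:
            slack += wt
        else:
            low[NUCLEOTIDES.index(ch)] += wt
    high = [v + slack for v in low]
    return low, high


def search(seq: str, index, query: str, wildcard: str = "N") -> List[int]:
    # Filter step: keep signatures inside the query rectangle.
    # Refine step: verify each candidate against the actual sequence.
    w = len(query)
    low, high = query_rectangle(query, position_weights(w), wildcard)
    hits = []
    for sig, pos in index:
        if all(l <= s <= h for l, s, h in zip(low, sig, high)):
            window = seq[pos:pos + w]
            if all(q == wildcard or q == c for q, c in zip(query, window)):
                hits.append(pos)
    return hits
```

For example, with `seq = "ACGTACGTTACG"` and `index = build_index(seq, 4)`, calling `search(seq, index, "ANGT")` scans the index for signatures inside the rectangle derived from the query and then verifies the surviving windows. With uniform weights, the windows `ACGT` and `TGCA` would share one signature; the position weights separate them, which is the effect the abstract attributes to the weighting scheme.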


Appears in Collections: Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles


Related Researcher


Kim, Sang-Wook: COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)
