An Efficient Search Algorithm for Finding Genomic-Range Overlaps Based on the Maximum Range Length

Authors
Seok, Ho-Sik; Song, Taemin; Kong, Sek Won; Hwang, Kyu-Baek
Issue Date
Jul-2015
Publisher
IEEE COMPUTER SOC
Keywords
Genome analysis; next-generation sequencing; range search; RNA sequencing; variant annotation; variant prioritization; whole-genome sequencing
Citation
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, v.12, no.4, pp.778 - 784
Journal Title
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
Volume
12
Number
4
Start Page
778
End Page
784
URI
http://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/8698
DOI
10.1109/TCBB.2014.2369042
ISSN
1545-5963
Abstract
Efficient search algorithms for finding genomic-range overlaps are essential for various bioinformatics applications. Most fast algorithms for finding the overlaps between a query range (e.g., a genomic variant) and a set of N reference ranges (e.g., exons) have a time complexity of O(k + log N), where k denotes a term related to the length and location of the reference ranges. Here, we present a simple but efficient algorithm that reduces k, based on the maximum reference range length. Specifically, for a given query range and the maximum reference range length, the proposed method divides the reference range set into three subsets: always overlapping, potentially overlapping, and never overlapping. Search effort can therefore be reduced by excluding the never-overlapping subset. We demonstrate that the running time of the proposed algorithm is proportional to the size of the potentially overlapping subset, which in turn is proportional to the maximum reference range length if all other conditions are the same. Moreover, an implementation of our algorithm was 13.8 to 30.0 percent faster than one of the fastest range search methods available when tested on various genomic-range data sets. The proposed algorithm has been incorporated into a disease-linked variant prioritization pipeline for WGS (http://gnome.tchlab.org), and its implementation is available at http://ml.ssu.ac.kr/gSearch.
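The three-way split described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' gSearch implementation; it is one plausible reading of the idea under stated assumptions: ranges are closed integer intervals (start, end) on a single chromosome, the reference set is sorted by start position, and the names build_index and find_overlaps are hypothetical. With maximum reference length L, any reference range starting at or before q_start - L must end before the query begins, and any range starting after q_end cannot reach the query, so both groups are skipped. Ranges starting inside the query always overlap, and only the ranges starting in the window just before q_start need their end positions checked.

```python
from bisect import bisect_left, bisect_right


def build_index(ranges):
    """Sort closed (start, end) ranges by start and record the maximum length.

    Hypothetical helper for illustration; not part of any published API.
    """
    ordered = sorted(ranges)
    starts = [s for s, _ in ordered]
    max_len = max((e - s + 1 for s, e in ordered), default=0)
    return ordered, starts, max_len


def find_overlaps(index, q_start, q_end):
    """Return reference ranges overlapping the closed query [q_start, q_end].

    Assumes q_start <= q_end.
    """
    ordered, starts, max_len = index
    # Starts below q_start - max_len + 1 end before q_start: never overlapping.
    lo = bisect_left(starts, q_start - max_len + 1)
    # Starts in [q_start, q_end] lie inside the query: always overlapping.
    mid = bisect_left(starts, q_start)
    hi = bisect_right(starts, q_end)
    # Starts in [lo, mid) are potentially overlapping: check each end position.
    hits = [r for r in ordered[lo:mid] if r[1] >= q_start]
    hits.extend(ordered[mid:hi])
    # Starts at index hi or later begin after q_end: never overlapping.
    return hits


exons = [(1, 5), (3, 4), (10, 20), (15, 16), (30, 40)]
index = build_index(exons)
print(find_overlaps(index, 5, 12))  # [(1, 5), (10, 20)]
```

In this sketch the three binary searches account for the log N term, and the only per-range work is the scan over the potentially overlapping window, whose width depends on the maximum reference range length. That matches the abstract's observation that running time grows with the size of the potentially overlapping subset.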
Appears in Collections
College of Information Technology > School of Computer Science and Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Hwang, Kyu Baek
College of Information Technology (School of Computer Science and Engineering)