Detailed Information

Cited 0 time in webofscience Cited 0 time in scopus
Metadata Downloads

Performance bottleneck of subsequence matching in time-series databases: Observation, solution, and performance evaluation

Authors
Kim, Sang WookJeong, Byeong Soo
Issue Date
Nov-2007
Publisher
ELSEVIER SCIENCE INC
Keywords
data mining; time-series databases; similar sequence matching; performance
Citation
INFORMATION SCIENCES, v.177, no.22, pp.4841 - 4858
Indexed
SCIE
SCOPUS
Journal Title
INFORMATION SCIENCES
Volume
177
Number
22
Start Page
4841
End Page
4858
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/179410
DOI
10.1016/j.ins.2007.06.032
ISSN
0020-0255
Abstract
Subsequence matching is an operation that finds subsequences whose changing patterns are similar to a given query sequence from time-series databases. This paper identifies a performance bottleneck in subsequence matching, and then proposes an effective method that substantially improves the performance of entire subsequence matching by resolving the performance bottleneck. First, we analyze the disk access and CPU processing times required during the index searching and post-processing steps of subsequence matching through preliminary experiments. Based on these results, we show that the post-processing step is a main performance, bottleneck in subsequence matching. Then, we argue that the optimization of the post-processing step is a crucial issue overlooked in previous approaches. In order to resolve the performance bottleneck, we propose a simple yet highly effective method for expediting the post-processing step. By rearranging the order of candidate subsequences to be compared with a query sequence, our method completely eliminates the redundancies of disk accesses and CPU processing that occur in the post-processing step. Our method is fairly efficient, and does not incur any false dismissal. We quantitatively demonstrate the superiority of our method through extensive experimentation. The results show that our method produces a significantly faster post-processing step; When using a data set of real-world stock sequences, our method was 43.36-96.75 times faster than previous methods, and when using data sets of large numbers of synthetic sequences, our method was 12.48-26.95 times faster than previous methods. Also, the results show that our method reduces the weight of the post-processing step over entire subsequence matching from more than 97% to less than 67%. This implies that our method successfully resolves the performance bottleneck in subsequence matching. As a result, our method provides excellent performance in entire subsequence matching. Compared with previous methods, our method is 16.17-32.64 times faster when using a data set of real-world stock sequences and 8.64-14.29 times faster when using data sets of large numbers of synthetic sequences.
Files in This Item
Go to Link
Appears in
Collections
서울 공과대학 > 서울 컴퓨터소프트웨어학부 > 1. Journal Articles

qrcode

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Researcher Kim, Sang-Wook photo

Kim, Sang-Wook
COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)
Read more

Altmetrics

Total Views & Downloads

BROWSE