Efficient Processing of Substring Match Queries with Inverted q-gram Indexes
- Authors
- Kim, Younghoon; Woo, Kyoung-Gu; Park, Hyoungmin; Shim, Kyuseok
- Issue Date
- Mar-2010
- Publisher
- IEEE COMPUTER SOC
- Citation
- Proceedings - International Conference on Data Engineering, ICDE 2010, pp.721 - 732
- Indexed
- SCIE; SCOPUS
- Journal Title
- Proceedings - International Conference on Data Engineering, ICDE 2010
- Start Page
- 721
- End Page
- 732
- URI
- https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/40537
- DOI
- 10.1109/ICDE.2010.5447866
- ISSN
- 1063-6382
- Abstract
- With the spread of the Internet, text-based data sources have become ubiquitous, and the demand for effective support for string matching queries keeps increasing. The relational query language SQL supports the LIKE clause over string data to handle substring matching queries. Because such queries are popular, there have been many studies on designing efficient indexes to support the LIKE clause in SQL. Among them, q-gram based indexes have been studied extensively. However, how to process substring matching queries efficiently with such indexes received very little attention until recently. In this paper, we show that the choice of which q-gram posting lists to intersect for a substring matching query, and how to intersect them, must be made judiciously to obtain an optimal execution. We then present optimal and approximate algorithms, based on cost estimation, for substring matching queries. A performance study confirms that our techniques significantly improve query execution time with q-gram indexes compared to traditional algorithms.
- Appears in Collections
- COLLEGE OF COMPUTING > DEPARTMENT OF ARTIFICIAL INTELLIGENCE > 1. Journal Articles
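
The setting described in the abstract, answering a LIKE-style substring query by intersecting q-gram posting lists and then verifying the surviving candidates, can be illustrated with a minimal Python sketch. All names here (`qgrams`, `build_index`, `substring_match`, `stop_at`) are hypothetical and chosen for illustration. The early-stop rule is a simple heuristic assumed for this sketch; it is not the cost-based optimal or approximate algorithm from the paper.

```python
from collections import defaultdict


def qgrams(s, q):
    """Return all overlapping substrings of length q in s."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def build_index(records, q):
    """Map each q-gram to the sorted list of record ids containing it."""
    index = defaultdict(set)
    for rid, text in enumerate(records):
        for g in qgrams(text, q):
            index[g].add(rid)
    return {g: sorted(ids) for g, ids in index.items()}


def substring_match(records, index, q, pattern, stop_at=1):
    """Return ids of records that contain pattern as a substring.

    Assumption for this sketch: posting lists are intersected from
    shortest to longest, and intersection stops once at most `stop_at`
    candidates remain, because verifying them directly is then cheap.
    """
    if len(pattern) < q:
        # The pattern yields no q-gram, so fall back to a full scan.
        return [rid for rid, text in enumerate(records) if pattern in text]

    lists = []
    for g in set(qgrams(pattern, q)):
        if g not in index:
            return []  # a q-gram of the pattern occurs in no record
        lists.append(index[g])

    lists.sort(key=len)  # shortest posting list first
    candidates = set(lists[0])
    for postings in lists[1:]:
        if len(candidates) <= stop_at:
            break  # skip the remaining intersections
        candidates &= set(postings)

    # Sharing q-grams is necessary but not sufficient, so verify each candidate.
    return sorted(rid for rid in candidates if pattern in records[rid])
```

Because the final verification step checks every candidate against the actual record, skipping some intersections changes only the cost of the query, not its result. That trade-off between intersection cost and verification cost is the kind of decision the abstract says should be made judiciously.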
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.