DIAFM: An improved and novel approach for incremental frequent Itemset mining
- Authors
- 이영문
- Issue Date
- Dec-2024
- Publisher
- MDPI
- Keywords
- distributed data mining; MapReduce; large-scale data processing; big data analytics
- Citation
- MATHEMATICS, v.12, no.24, pp 1 - 29
- Pages
- 29
- Indexed
- SCIE
- SCOPUS
- Journal Title
- MATHEMATICS
- Volume
- 12
- Number
- 24
- Start Page
- 1
- End Page
- 29
- URI
- https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/121896
- DOI
- 10.3390/math12243930
- ISSN
- 2227-7390
- Abstract
- Traditional data mining approaches are generally designed for small, centralized, and static datasets. When a dataset grows rapidly, these algorithms become infeasible because they consume excessive computational and I/O resources. Frequent itemset mining (FIM) is one of the key tasks in data mining and has applications in a variety of domains; however, traditional FIM algorithms struggle to process large and dynamic datasets efficiently. This research introduces a distributed incremental approximation frequent itemset mining (DIAFM) algorithm that addresses these challenges using shard-based approximation within the MapReduce framework. DIAFM minimizes computational overhead by reducing dataset scans, bypassing exact support checks, and incorporating shard-level error thresholds to balance efficiency against accuracy. In extensive experiments, DIAFM reduces runtime by 40-60% compared to traditional methods, with accuracy losses within 1-5%, even for datasets of over 500,000 transactions. Because the algorithm is incremental, new data increments are handled without reprocessing the entire dataset, which makes it particularly suitable for real-time, large-scale applications such as transaction analysis and IoT data streams. These results demonstrate the scalability, robustness, and practical applicability of DIAFM and establish it as a competitive and efficient solution for mining frequent itemsets in distributed, dynamic environments.
- Appears in Collections
- COLLEGE OF ENGINEERING SCIENCES > DEPARTMENT OF ROBOT ENGINEERING > 1. Journal Articles
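
The abstract names three ideas: shard-level error thresholds, no exact support check, and incremental handling of new data. The sketch below illustrates them in plain Python. It is not the DIAFM algorithm from the paper, whose details this record does not give. It is a generic partition-style approximation, and every name in it (`shard_counts`, `local_candidates`, `IncrementalMiner`, `min_support`, `error`, `max_size`) is a hypothetical choice made for illustration. The map and reduce steps are ordinary functions here rather than MapReduce jobs.

```python
from collections import Counter
from itertools import combinations


def shard_counts(shard, max_size=2):
    """Map step (assumed): count every itemset of up to max_size items in one shard."""
    counts = Counter()
    for transaction in shard:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def local_candidates(shard, min_support, error, max_size=2):
    """Keep itemsets whose shard-local support reaches the relaxed threshold.

    The relaxed threshold (min_support - error) stands in for a shard-level
    error threshold; it is an assumption, not the paper's definition.
    """
    counts = shard_counts(shard, max_size)
    threshold = (min_support - error) * len(shard)
    return Counter({s: c for s, c in counts.items() if c >= threshold})


class IncrementalMiner:
    """Running totals that absorb new shards without rescanning old ones."""

    def __init__(self, min_support, error, max_size=2):
        self.min_support = min_support
        self.error = error
        self.max_size = max_size
        self.totals = Counter()
        self.n_transactions = 0

    def add_shard(self, shard):
        """Reduce step for one increment: merge its local candidates."""
        self.totals.update(
            local_candidates(shard, self.min_support, self.error, self.max_size)
        )
        self.n_transactions += len(shard)

    def frequent(self):
        """Approximate answer: no second pass to verify exact supports.

        Counts from shards where an itemset was pruned locally are missing,
        so the reported supports are lower bounds on the true supports.
        """
        threshold = (self.min_support - self.error) * self.n_transactions
        return {s: c for s, c in self.totals.items() if c >= threshold}


miner = IncrementalMiner(min_support=0.5, error=0.1)
miner.add_shard([["a", "b"], ["a", "b", "c"], ["a"], ["b", "c"]])
miner.add_shard([["a", "c"], ["a", "b"], ["a", "b"], ["c"]])  # later increment
print(miner.frequent())
```

Skipping the verification pass is what saves a dataset scan in this sketch. The cost is that an itemset frequent overall but rare in some shards can be undercounted, which is the kind of efficiency and accuracy trade-off the abstract describes.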
