DIAFM: An Improved and Novel Approach for Incremental Frequent Itemset Mining
DC Field | Value | Language |
---|---|---|
dc.contributor.author | 이영문 | - |
dc.date.accessioned | 2025-01-09T06:00:21Z | - |
dc.date.available | 2025-01-09T06:00:21Z | - |
dc.date.issued | 2024-12 | - |
dc.identifier.issn | 2227-7390 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/121896 | - |
dc.description.abstract | Traditional data mining approaches are generally designed for small, centralized, static datasets. When a dataset grows rapidly, these algorithms become infeasible because of their heavy consumption of computational and I/O resources. Frequent itemset mining (FIM) is a key data mining task with applications in a variety of domains, but traditional FIM algorithms struggle to process large, dynamic datasets efficiently. This research introduces a distributed incremental approximation frequent itemset mining (DIAFM) algorithm that addresses these challenges using shard-based approximation within the MapReduce framework. DIAFM reduces computational overhead by reducing dataset scans, bypassing exact support checks, and incorporating shard-level error thresholds that trade off efficiency against accuracy. Extensive experiments demonstrate that DIAFM reduces runtime by 40-60% compared to traditional methods, with accuracy losses of 1-5%, even for datasets of over 500,000 transactions. Because the algorithm is incremental, new data increments are handled without reprocessing the entire dataset, which makes it particularly suitable for real-time, large-scale applications such as transaction analysis and IoT data streams. These results demonstrate the scalability, robustness, and practical applicability of DIAFM and establish it as a competitive and efficient solution for mining frequent itemsets in distributed, dynamic environments. | - |
dc.format.extent | 29 | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | MDPI | - |
dc.title | DIAFM: An Improved and Novel Approach for Incremental Frequent Itemset Mining | - |
dc.type | Article | - |
dc.identifier.doi | 10.3390/math12243930 | - |
dc.identifier.scopusid | 2-s2.0-85213380705 | - |
dc.identifier.wosid | 001384605800001 | - |
dc.identifier.bibliographicCitation | MATHEMATICS, v.12, no.24, pp 1 - 29 | - |
dc.citation.title | MATHEMATICS | - |
dc.citation.volume | 12 | - |
dc.citation.number | 24 | - |
dc.citation.startPage | 1 | - |
dc.citation.endPage | 29 | - |
dc.type.docType | Proceeding | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Mathematics | - |
dc.relation.journalWebOfScienceCategory | MATHEMATICS | - |
dc.subject.keywordPlus | MAPREDUCE; ALGORITHM; HADOOP | - |
dc.subject.keywordAuthor | distributed data mining; MapReduce; large-scale data processing; big data analytics | - |
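
The abstract describes shard-level approximation with incremental updates. The sketch below illustrates that general idea only and is not the DIAFM algorithm from the paper: the names `mine_shard` and `IncrementalMiner`, the parameters `shard_error` and `max_size`, and the pruning rule are all assumptions made for illustration, and plain Python functions stand in for the MapReduce map and reduce steps.

```python
from collections import Counter
from itertools import combinations


def mine_shard(transactions, min_support, shard_error, max_size=2):
    """Map step (hypothetical): count itemsets in one shard and keep
    only those that pass a relaxed, shard-local support threshold."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    # Assumed pruning rule: tolerate shard_error below min_support.
    cutoff = (min_support - shard_error) * len(transactions)
    return {itemset: n for itemset, n in counts.items() if n >= cutoff}


class IncrementalMiner:
    """Reduce step plus running state (hypothetical). Each increment is
    scanned once as a new shard; earlier data is never re-read."""

    def __init__(self, min_support, shard_error, max_size=2):
        self.min_support = min_support
        self.shard_error = shard_error
        self.max_size = max_size
        self.counts = Counter()  # lower bounds on the true global counts
        self.total = 0           # transactions seen so far

    def add_increment(self, transactions):
        local = mine_shard(
            transactions, self.min_support, self.shard_error, self.max_size
        )
        self.counts.update(local)  # sums counts per itemset
        self.total += len(transactions)

    def frequent_itemsets(self):
        if self.total == 0:
            return {}
        cutoff = self.min_support * self.total
        return {
            itemset: n / self.total
            for itemset, n in self.counts.items()
            if n >= cutoff
        }


miner = IncrementalMiner(min_support=0.5, shard_error=0.1)
miner.add_increment([["a", "b"], ["a", "c"], ["a", "b"], ["b"]])
miner.add_increment([["a", "b"], ["c"]])  # new data only, no rescan
print(miner.frequent_itemsets())
```

Because itemsets pruned inside a shard lose their local counts, the accumulated counts are lower bounds: in this sketch a reported itemset is always truly frequent, but a borderline itemset can be missed. That is the kind of efficiency-versus-accuracy trade-off the abstract refers to; the actual thresholds and error analysis are in the paper.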
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.