PSU: Particle Stacking Undersampling Method for Highly Imbalanced Big Data (open access)
- Authors
- Jeon, Yong-Seok; Lim, Dong-Joon
- Issue Date
- 2020
- Publisher
- IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
- Keywords
- Support vector machines; Training; Stacking; Computational efficiency; Classification algorithms; Licenses; Kernel; Data mining; imbalanced data; undersampling; big data
- Citation
- IEEE ACCESS, v.8, pp. 131920-131927
- Pages
- 8
- Indexed
- SCIE; SCOPUS
- Journal Title
- IEEE ACCESS
- Volume
- 8
- Start Page
- 131920
- End Page
- 131927
- URI
- https://scholarworks.bwise.kr/skku/handle/2021.sw.skku/7228
- DOI
- 10.1109/ACCESS.2020.3009753
- ISSN
- 2169-3536
- Abstract
- Imbalanced classes are a common problem in machine learning, and the computational cost of proper resampling increases with data size. In this study, a simple and effective undersampling method, named particle stacking undersampling (PSU), was proposed. Compared with competing undersampling methods, PSU can significantly reduce computational costs while minimizing information loss to prevent prediction bias. A performance benchmark on 55 binary classification problems indicated that the proposed method not only achieved better classification performance than other well-known undersampling methods (random undersampling, NearMiss-1, NearMiss-2, cluster centroid, edited nearest neighbor, condensed nearest neighbor, and Tomek links) but also offered computational simplicity that scales to large data. Moreover, an experiment verified that the two propositions forming the basis of the PSU algorithm can also be applied to other undersampling methods to achieve methodological improvements.
- Appears in Collections
- Engineering > Department of Systems Management Engineering > 1. Journal Articles
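For readers unfamiliar with the baselines listed in the abstract, the following is a minimal sketch of random undersampling, the simplest of them. It is not the PSU algorithm, which this record does not describe. The function name `random_undersample`, the `seed` parameter, and the toy data are assumptions made purely for illustration.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Randomly drop rows so that every class has as many rows as the smallest class.

    Illustrative baseline only (hypothetical helper); this is NOT the PSU method.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)

    # Size of the smallest (minority) class.
    n_min = min(len(ids) for ids in by_class.values())

    # Sample n_min indices from each class without replacement.
    keep = []
    for ids in by_class.values():
        keep.extend(rng.sample(ids, n_min))
    keep.sort()  # preserve the original row order

    return [X[i] for i in keep], [y[i] for i in keep]


# Toy imbalanced data: 95 majority rows (label 0) and 5 minority rows (label 1).
X = [[float(i), float(i % 7)] for i in range(100)]
y = [1 if i % 20 == 0 else 0 for i in range(100)]

X_res, y_res = random_undersample(X, y)
class_counts = Counter(y_res)
print(class_counts)
```

Because the majority rows are discarded at random, informative samples can be lost; this is the information loss that the abstract says PSU aims to minimize.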