Online dependence clustering of multivariate streaming data using one-class SVMs

Lee, Geonseok; Lee, Kichun

doi:10.1002/int.22716

Authors: Lee, Geonseok; Lee, Kichun

Issue Date: Jun-2022

Publisher: WILEY

Keywords: dependence clustering; one-class support vector machine; online data analysis; outlier detection; unsupervised learning

Citation: INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, v.37, no.6, pp 3682 - 3708

Pages: 27

Indexed: SCIE; SCOPUS

Journal Title: INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS

Volume: 37

Number: 6

Start Page: 3682

End Page: 3708

URI: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/197034

DOI: 10.1002/int.22716

ISSN: 0884-8173; 1098-111X

Abstract: Online clustering of multivariate streaming data has attracted considerable interest in recent years due to the abundance of data sources. Numerous studies have been performed in this field, but they usually suffer from practical problems associated with discovering arbitrarily shaped clusters, specifying major parameters in advance, and detecting aberrant observations. Addressing these issues is important for online-clustering tasks, where data arrive in continuous streams and group behaviors change simultaneously. In this paper, we propose a kernel-based online dependence clustering method, KODC, that not only estimates cluster membership using one-class support vector machines (OC-SVMs) but also detects outliers distant from the identified clusters by aggregating OC-SVM decisions in real time. At the base level, we use a new measure of connective dependence that forms a graph connected via modified Markovian transitions to enable large-scale clustering. The proposed framework introduces a coherence threshold to extract data points that can represent the cluster to which they belong, thus controlling the computational complexity without degrading the clustering performance. To track pattern evolution over time, KODC also updates the classifier configuration so as to maximize the total group connective dependence. We evaluate this framework on several synthetic and real-world data sets involving multivariate streaming data and compare it experimentally with other popular online-clustering methods in terms of four evaluation metrics. The results show that our framework effectively identifies clusters and outliers, especially in variously shaped data subject to change over time, without requiring any prior knowledge of the data.
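
The aggregation idea in the abstract (one one-class decision function per cluster, with a point flagged as an outlier when no cluster accepts it) can be illustrated with a minimal sketch. This is not the authors' KODC implementation: the scorer below is a hypothetical stand-in for a trained OC-SVM (mean RBF similarity to the cluster's points minus a fixed threshold), and the names `rbf`, `make_scorer`, `assign`, and the `gamma` and `threshold` values are assumptions chosen only for illustration. The connective-dependence measure, the coherence threshold, and the online update step are not modeled.

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as tuples."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def make_scorer(members, gamma=1.0, threshold=0.3):
    """Hypothetical stand-in for one trained OC-SVM (assumption, not the paper's model).

    Returns a function whose value is positive when x looks like it belongs
    to the cluster and negative otherwise.
    """
    def score(x):
        mean_similarity = sum(rbf(x, m, gamma) for m in members) / len(members)
        return mean_similarity - threshold
    return score

def assign(x, scorers):
    """Aggregate the per-cluster decisions for one point.

    The cluster with the largest positive score wins; if every score is
    non-positive, the point is reported as an outlier (-1).
    """
    values = [s(x) for s in scorers]
    best = max(range(len(values)), key=values.__getitem__)
    return best if values[best] > 0 else -1

# Two small, well-separated toy clusters (made-up data).
clusters = [
    [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2)],
    [(5.0, 5.0), (5.1, 4.8), (4.9, 5.2)],
]
scorers = [make_scorer(c) for c in clusters]

# One point near each cluster and one point far from both.
labels = [assign(p, scorers) for p in [(0.1, 0.1), (5.0, 5.1), (2.5, 2.5)]]
print(labels)
```

In a streaming setting, `assign` would be called on each arriving observation; retraining or replacing the per-cluster scorers as patterns drift is the part the paper addresses and this sketch leaves out.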

Appears in Collections: College of Engineering (Seoul) > Department of Industrial Engineering (Seoul) > 1. Journal Articles

Related Researcher

Lee, Kichun: COLLEGE OF ENGINEERING (DEPARTMENT OF INDUSTRIAL ENGINEERING)
