C-Affinity: A Novel Similarity Measure for Effective Data Clustering
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hong, Jiwon | - |
dc.contributor.author | Kim, Sang-Wook | - |
dc.date.accessioned | 2023-06-01T07:00:18Z | - |
dc.date.available | 2023-06-01T07:00:18Z | - |
dc.date.issued | 2023-04 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/185837 | - |
dc.description.abstract | Clustering is widely employed in various applications as it is one of the most useful data mining techniques. In performing clustering, a similarity measure, which defines how similar a pair of data objects are, plays an important role. A similarity measure is employed by considering a target dataset's characteristics. Current similarity measures (or distances) do not reflect the distribution of data objects in a dataset at all. From the clustering point of view, this fact may limit the clustering accuracy. In this paper, we propose c-affinity, a new notion of a similarity measure that reflects the distribution of objects in the given dataset from a clustering point of view. We design c-affinity between any two objects to have a higher value as they are more likely to belong to the same cluster by learning the data distribution. We use random walk with restart (RWR) on the k-nearest neighbor graph of the given dataset to measure (1) how similar a pair of objects are and (2) how densely other objects are distributed between them. Via extensive experiments on sixteen synthetic and real-world datasets, we verify that replacing the existing similarity measure with our c-affinity improves the clustering accuracy significantly. | - |
dc.format.extent | 4 | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | Association for Computing Machinery, Inc | - |
dc.title | C-Affinity: A Novel Similarity Measure for Effective Data Clustering | - |
dc.type | Article | - |
dc.publisher.location | United States | - |
dc.identifier.doi | 10.1145/3543873.3587307 | - |
dc.identifier.scopusid | 2-s2.0-85159592684 | - |
dc.identifier.bibliographicCitation | ACM Web Conference 2023 - Companion of the World Wide Web Conference, WWW 2023, pp 41 - 44 | - |
dc.citation.title | ACM Web Conference 2023 - Companion of the World Wide Web Conference, WWW 2023 | - |
dc.citation.startPage | 41 | - |
dc.citation.endPage | 44 | - |
dc.type.docType | Conference paper | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scopus | - |
dc.subject.keywordPlus | C (programming language) | - |
dc.subject.keywordPlus | Cluster analysis | - |
dc.subject.keywordPlus | Clustering algorithms | - |
dc.subject.keywordPlus | Data mining | - |
dc.subject.keywordPlus | Nearest neighbor search | - |
dc.subject.keywordPlus | Clustering accuracy | - |
dc.subject.keywordPlus | Clustering affinity | - |
dc.subject.keywordPlus | Clusterings | - |
dc.subject.keywordPlus | Data clustering | - |
dc.subject.keywordPlus | Data objects | - |
dc.subject.keywordPlus | Near neighbor graph | - |
dc.subject.keywordPlus | Nearest-neighbour | - |
dc.subject.keywordPlus | Neighbor graph | - |
dc.subject.keywordPlus | Similarity measure | - |
dc.subject.keywordAuthor | clustering | - |
dc.subject.keywordAuthor | clustering affinity | - |
dc.subject.keywordAuthor | nearest neighbor graph | - |
dc.subject.keywordAuthor | similarity measure | - |
dc.identifier.url | https://dl.acm.org/doi/10.1145/3543873.3587307 | - |
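
The abstract describes running random walk with restart (RWR) on the k-nearest neighbor graph of a dataset. The record does not give the c-affinity formula itself, so the following is only a minimal sketch of that building block, not the authors' measure. The function names `knn_graph` and `rwr_scores`, the Euclidean distance, the restart probability of 0.15, and the toy two-blob data are all assumptions made for illustration.

```python
import numpy as np

def knn_graph(X, k):
    """Undirected k-nearest-neighbor adjacency matrix (Euclidean distance assumed)."""
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # a point is not its own neighbor
    nbrs = np.argsort(d, axis=1)[:, :k]    # indices of the k closest points
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)              # keep an edge if either end chose it

def rwr_scores(A, restart=0.15):
    """Row i is the stationary distribution of a walker that restarts at node i."""
    n = A.shape[0]
    P = A / A.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    # Solve r_i = (1 - c) P^T r_i + c e_i for every seed i at once.
    M = restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * P.T)
    return M.T

# Hypothetical toy data: two well-separated Gaussian blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(20, 2)),
               rng.normal(10.0, 0.5, size=(20, 2))])

R = rwr_scores(knn_graph(X, k=5))
S = (R + R.T) / 2.0                        # one simple way to get a symmetric score

within = S[:20, :20].mean()                # pairs inside the first blob
across = S[:20, 20:].mean()                # pairs spanning the two blobs
print(within > across)
```

In this sketch a pair of points scores higher when the walker can reach one from the other through many short paths of neighbors, which is one way to capture both closeness and the density of objects lying between them. How the paper combines these quantities into c-affinity is not stated in this record.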