Efficient link-based clustering in a large scaled blog network
- Authors
- Yoon, Seok-Ho; Song, Suk-Soon; Kim, Sang-Wook
- Issue Date
- Feb-2011
- Publisher
- Association for Computing Machinery
- Keywords
- Blog data; Link-based clustering; LinkClus
- Citation
- Proceedings of the 5th International Conference on Ubiquitous Information Management and Communication, ICUIMC 2011, pp.1 - 5
- Indexed
- SCOPUS
- Journal Title
- Proceedings of the 5th International Conference on Ubiquitous Information Management and Communication, ICUIMC 2011
- Start Page
- 1
- End Page
- 5
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/169150
- DOI
- 10.1145/1968613.1968699
- Abstract
- In this paper, we address the efficient processing of link-based clustering in a large-scale data environment. LinkClus is a link-based clustering method that provides good accuracy and reasonable performance. This paper first shows that LinkClus is not sufficiently scalable to be applied to a huge volume of real-world blog data. We then observe that the performance bottleneck of LinkClus lies in its initial clustering step, and we propose a new method to overcome this bottleneck. The proposed method first efficiently identifies the seed sets for initial clustering. Here, each seed set consists of a small number (two or three) of objects that are highly similar to one another. The method then adds every other object to the seed set that is most similar to that object. It also eliminates objects with very few links, which negatively affect accuracy, thereby enhancing the overall processing performance. Through experiments with real-world blog data, we verify the scalability and accuracy of the proposed method.
- Appears in Collections
- Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles
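
The initial clustering steps described in the abstract (drop objects with very few links, form small seed sets of highly similar objects, then attach every remaining object to its most similar seed set) can be illustrated with a minimal sketch. This is not the authors' implementation. Jaccard similarity over link sets is swapped in as a stand-in for the similarity measure used by LinkClus, and the function name `initial_clusters` and the parameters `min_links`, `seed_threshold`, and `max_seed_size` are hypothetical. The brute-force pairwise comparison is used only for readability and does not reproduce the efficiency the paper claims.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two link sets (a stand-in similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def initial_clusters(links, min_links=2, seed_threshold=0.5, max_seed_size=3):
    """Sketch of seed-based initial clustering; all parameters are assumptions."""
    # Step 1: eliminate objects with very few links.
    kept = {obj: set(ls) for obj, ls in links.items() if len(ls) >= min_links}

    # Step 2: build seed sets from the most similar pairs (brute force here).
    pairs = sorted(
        ((jaccard(kept[a], kept[b]), a, b) for a, b in combinations(sorted(kept), 2)),
        key=lambda t: (-t[0], t[1], t[2]),
    )
    seeds, used = [], set()
    for sim, a, b in pairs:
        if sim < seed_threshold:
            break  # remaining pairs are even less similar
        if a in used or b in used:
            continue
        seeds.append([a, b])
        used.update((a, b))

    # Optionally grow each seed set up to max_seed_size members.
    for seed in seeds:
        for obj in [o for o in sorted(kept) if o not in used]:
            if len(seed) >= max_seed_size:
                break
            if all(jaccard(kept[obj], kept[m]) >= seed_threshold for m in seed):
                seed.append(obj)
                used.add(obj)

    # Step 3: add every other object to its most similar seed set.
    clusters = [list(seed) for seed in seeds]
    if not seeds:
        return clusters
    for obj in sorted(kept):
        if obj in used:
            continue
        best = max(
            range(len(seeds)),
            key=lambda i: sum(jaccard(kept[obj], kept[m]) for m in seeds[i]) / len(seeds[i]),
        )
        clusters[best].append(obj)
    return clusters


# Toy data (made up for illustration): each blog maps to the set of blogs it links to.
example_links = {
    "a": {"x", "y", "z"},
    "b": {"x", "y", "z"},
    "c": {"p", "q", "r"},
    "d": {"p", "q", "r"},
    "e": {"x", "y"},
    "f": {"p"},  # only one link, so it is eliminated in step 1
}
example_clusters = initial_clusters(example_links, seed_threshold=0.8)
```

In this toy run, `a`/`b` and `c`/`d` form the two seed sets, `e` falls below the seed threshold and is attached in step 3 to the seed set it resembles most, and `f` is dropped for having too few links.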
