Clustering texts using feature similarity based AHC algorithm
- Authors
- Jo, Taeho
- Issue Date
- 2018
- Publisher
- IOS PRESS
- Keywords
- Feature value similarity; feature similarity; AHC algorithm; text clustering
- Citation
- JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, v.35, no.6, pp.5993 - 6003
- Journal Title
- JOURNAL OF INTELLIGENT & FUZZY SYSTEMS
- Volume
- 35
- Number
- 6
- Start Page
- 5993
- End Page
- 6003
- URI
- https://scholarworks.bwise.kr/hongik/handle/2020.sw.hongik/13148
- DOI
- 10.3233/JIFS-169840
- ISSN
- 1064-1246
- Abstract
- This article proposes a modified AHC (Agglomerative Hierarchical Clustering) algorithm that takes feature similarity into account, and applies it to text clustering. The words used as features for encoding texts into numerical vectors are semantically related entities rather than independent ones, so combining word clustering with text clustering is expected to produce a synergy effect. In this research, we define a similarity metric between numerical vectors that considers feature similarity, and modify the AHC algorithm to adopt this metric as an approach to text clustering. The proposed AHC algorithm is empirically validated as a better approach for clustering texts drawn from news articles and opinions. The significance of this research is that it improves clustering performance by utilizing feature similarities.
- Appears in Collections
- School of Games > Game Software Major > 1. Journal Articles
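
The idea summarized in the abstract, a similarity between numerical vectors that also counts similarity between features, plugged into AHC, can be illustrated with a small sketch. The abstract does not give the paper's actual metric, so the code below is an assumption: it substitutes a soft-cosine-style measure, `a^T S b` normalized by `sqrt(a^T S a) * sqrt(b^T S b)`, where `S` is a feature-to-feature similarity matrix. The average-link merge rule, the toy vocabulary, the values in `S`, and all function names are likewise hypothetical choices made for illustration.

```python
from itertools import combinations
from math import sqrt

def bilinear(a, b, S):
    """Compute a^T S b: every feature pair (i, j) contributes, weighted by S[i][j]."""
    return sum(a[i] * S[i][j] * b[j] for i in range(len(a)) for j in range(len(b)))

def feature_aware_similarity(a, b, S):
    """Soft-cosine-style similarity (an assumed stand-in, not the paper's metric)."""
    den = sqrt(bilinear(a, a, S)) * sqrt(bilinear(b, b, S))
    return bilinear(a, b, S) / den if den > 0 else 0.0

def average_link(c1, c2, X, S):
    """Mean pairwise similarity between the members of two clusters."""
    total = sum(feature_aware_similarity(X[i], X[j], S) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def ahc(X, S, k):
    """Start from singleton clusters and merge the most similar pair until k remain."""
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > k:
        p, q = max(
            combinations(range(len(clusters)), 2),
            key=lambda pq: average_link(clusters[pq[0]], clusters[pq[1]], X, S),
        )
        clusters[p] += clusters[q]  # p < q, so deleting q leaves index p valid
        del clusters[q]
    return clusters

# Hypothetical toy vocabulary: ["soccer", "football", "stock", "market"].
# S marks soccer/football and stock/market as related words (assumed values).
S = [
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.7],
    [0.0, 0.0, 0.7, 1.0],
]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Four one-word "texts", each using a different word.
X = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

clusters = ahc(X, S, 2)
print(clusters)  # [[0, 1], [2, 3]]
```

With the identity matrix in place of `S`, the measure reduces to the ordinary cosine similarity, under which these four texts share no words and are all mutually dissimilar. With `S`, the two sports texts and the two finance texts become similar, which is the kind of effect the abstract attributes to considering feature similarity.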
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.