Extracting and evaluating topics by region
- Authors
- Noh, Joonho; Lee, Soowon
- Issue Date
- Oct-2016
- Publisher
- SPRINGER
- Keywords
- Topic extraction; Text mining; Clustering validity index
- Citation
- MULTIMEDIA TOOLS AND APPLICATIONS, v.75, no.20, pp.12765 - 12777
- Journal Title
- MULTIMEDIA TOOLS AND APPLICATIONS
- Volume
- 75
- Number
- 20
- Start Page
- 12765
- End Page
- 12777
- URI
- http://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/7486
- DOI
- 10.1007/s11042-016-3528-6
- ISSN
- 1380-7501
- Abstract
- Analyzing streaming data that contains regional information can reveal the interest trends of a region and how they differ from those of other regions. The results of such analysis can inform important decisions in areas such as regional marketing and national policy making. In this paper, we propose a method to extract topics that represent regional interests from news articles collected by region. The proposed method consists of a novel word-weighting step that extracts regional keywords and a word-clustering step that extracts regional topics based on the associations between the extracted keywords. The validity of the extracted regional topics is evaluated by comparing them with a ground-truth topic set. Since each topic is represented by a set of words, and a regional topic set is therefore a family of sets, we propose a new clustering validity index for families of sets for a given set of regions. Using the proposed index, we determine the optimal parameters for the collected data through experiments.
- Appears in Collections
- College of Information Technology > School of Software > 1. Journal Articles
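The abstract describes evaluating extracted regional topics against a ground-truth topic set, where each topic is a set of words and each region's topics form a family of sets. The record does not give the paper's validity index, so the following is only a minimal sketch of the general idea under stated assumptions: it substitutes a symmetric best-match Jaccard average for the authors' index, and the names `jaccard`, `best_match_score`, and `family_similarity` are hypothetical.

```python
from typing import FrozenSet, Iterable

Topic = FrozenSet[str]  # a topic is a set of words


def jaccard(a: Topic, b: Topic) -> float:
    """Jaccard similarity of two word sets (1.0 when both are empty)."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def best_match_score(source: Iterable[Topic], target: Iterable[Topic]) -> float:
    """Average, over topics in `source`, of the best Jaccard match in `target`."""
    source, target = list(source), list(target)
    if not source:
        return 1.0 if not target else 0.0
    if not target:
        return 0.0
    return sum(max(jaccard(s, t) for t in target) for s in source) / len(source)


def family_similarity(extracted: Iterable[Topic], truth: Iterable[Topic]) -> float:
    """Symmetric similarity between two families of sets, in [0, 1].

    Assumption: this stands in for the paper's validity index, which is
    not specified in this record.
    """
    extracted, truth = list(extracted), list(truth)
    return 0.5 * (best_match_score(extracted, truth)
                  + best_match_score(truth, extracted))


if __name__ == "__main__":
    # Hypothetical topics for one region; the words are illustrative only.
    extracted = [frozenset({"festival", "beach", "tourism"}),
                 frozenset({"election", "mayor"})]
    truth = [frozenset({"festival", "beach"}),
             frozenset({"election", "mayor", "vote"})]
    print(family_similarity(extracted, truth))
```

A score like this could be computed per region for each parameter setting and then averaged across regions to compare settings, which is one plausible reading of how a validity index would guide parameter selection.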