Company name discrimination in tweets using topic signatures extracted from news corpus
- Authors
- 이상호; 홍범석; 김양곤
- Issue Date
- Dec-2016
- Publisher
- Korean Institute of Information Scientists and Engineers
- Keywords
- Topic signature; Tweet; Twitter; Word sense discrimination
- Citation
- Journal of Computing Science and Engineering, v.10, no.4, pp.128 - 136
- Journal Title
- Journal of Computing Science and Engineering
- Volume
- 10
- Number
- 4
- Start Page
- 128
- End Page
- 136
- URI
- http://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/5780
- DOI
- 10.5626/JCSE.2016.10.4.128
- ISSN
- 1976-4677
- Abstract
- It is impossible for any human being to analyze the more than 500 million tweets that are generated per day. Lexical ambiguities on Twitter make it difficult to retrieve the desired data and relevant topics. Most of the solutions for the word sense disambiguation problem rely on knowledge base systems. Unfortunately, it is expensive and time-consuming to manually create a knowledge base system, resulting in a knowledge acquisition bottleneck. To solve the knowledge acquisition bottleneck, a topic signature is used to disambiguate words. In this paper, we evaluate the effectiveness of various features of newspapers on the topic signature extraction for word sense discrimination in tweets. Based on our results, topic signatures obtained from a snippet feature exhibit higher accuracy in discriminating company names than those from the article body. We conclude that topic signatures extracted from news articles improve the accuracy of word sense discrimination in the automated analysis of tweets. © 2016. The Korean Institute of Information Scientists and Engineers.
- Appears in Collections
- College of Information Technology > School of Software > 1. Journal Articles