Semantic based Clustering and Summarization of Tweets exploiting Folksonomy and User Influence

허지욱; 이동호

Detailed Information


Title: Semantic based Clustering and Summarization of Tweets exploiting Folksonomy and User Influence

Original Title (Korean): 폭소노미와 사용자 영향력을 활용한 의미기반 트윗 군집화 및 요약 기법

Authors: 허지욱; 이동호

Issue Date: Aug-2015

Publisher: Korean Institute of Information Scientists and Engineers (한국정보과학회)

Keywords: Twitter; Clustering; Document Summarization; Tag Cluster; K-Means

Citation: 데이타베이스연구 (Database Research), v.31, no.2, pp. 104-119

Pages: 16

Indexed: KCI

Journal Title: 데이타베이스연구 (Database Research)

Volume: 31

Number: 2

Start Page: 104

End Page: 119

URI: https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/19712

ISSN: 1598-9798

Abstract: With the recent development of Internet technologies and the spread of smart devices, many users can now easily access large amounts of information. As a result, the use of social network services such as Twitter and Facebook has increased rapidly, generating massive amounts of data on a wide range of topics. However, finding the information a user actually wants among this mass of tweets takes considerable time and effort, because the tweets must be reviewed manually. In this paper, we propose a semantic-based K-Means clustering algorithm for clustering large volumes of tweets. Unlike the conventional K-Means algorithm, which measures only the similarity between data represented in the vector space model, the proposed algorithm also takes the semantic similarity between the data into account. To extract the most meaningful tweets from each cluster, we also propose a tweet summarization technique that analyzes user information to measure each user's influence and exploits our previously proposed document summarization method. Finally, experiments on the RepLab2013 Twitter dataset show the superiority of the semantic-based K-Means clustering algorithm and the tweet summarization technique.
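The idea in the abstract, combining vector-space similarity with a semantic similarity inside K-Means and then ranking tweets in a cluster by user influence, can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything specific here is an assumption for illustration only: the semantic term is a Jaccard overlap of hypothetical tag sets, `alpha` is a hypothetical mixing weight, the "semantic centroid" is the union of member tags, and `influence` is a precomputed hypothetical score multiplied into the ranking. It is not the authors' algorithm.

```python
import math
from collections import Counter


def make_tweet(text, tags, influence):
    """Build a toy tweet record; the field names are hypothetical."""
    return {"text": text, "vec": Counter(text.split()),
            "tags": set(tags), "influence": influence}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def jaccard(a, b):
    """Jaccard overlap of two tag sets (assumed stand-in for semantic similarity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(tweet, centroid, alpha=0.5):
    """Weighted mix of vector-space and (assumed) semantic similarity."""
    return (alpha * cosine(tweet["vec"], centroid["vec"])
            + (1 - alpha) * jaccard(tweet["tags"], centroid["tags"]))


def make_centroid(members):
    """Mean term vector plus the union of member tags (an assumption)."""
    total = Counter()
    for m in members:
        total.update(m["vec"])
    vec = {t: w / len(members) for t, w in total.items()}
    tags = set().union(*(m["tags"] for m in members))
    return {"vec": vec, "tags": tags}


def semantic_kmeans(tweets, seeds, alpha=0.5, max_iter=20):
    """K-Means that assigns each tweet to its most similar centroid."""
    centroids = [make_centroid([tweets[i]]) for i in seeds]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            max(range(len(centroids)),
                key=lambda k: combined_similarity(t, centroids[k], alpha))
            for t in tweets
        ]
        if new_assignment == assignment:  # converged
            break
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [t for t, a in zip(tweets, assignment) if a == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = make_centroid(members)
    return assignment, centroids


def pick_representative(tweets, assignment, centroids, k, alpha=0.5):
    """Pick the cluster-k tweet maximizing influence * similarity to the centroid."""
    members = [t for t, a in zip(tweets, assignment) if a == k]
    return max(members,
               key=lambda t: t["influence"] * combined_similarity(t, centroids[k], alpha))


tweets = [
    make_tweet("new phone battery great", ["mobile"], 0.2),
    make_tweet("phone battery life poor", ["mobile"], 0.9),
    make_tweet("bank raises interest rates", ["finance"], 0.5),
    make_tweet("interest rates loan bank", ["finance"], 0.1),
]
assignment, centroids = semantic_kmeans(tweets, seeds=[0, 2])
summary = [pick_representative(tweets, assignment, centroids, k)["text"]
           for k in range(len(centroids))]
print(assignment, summary)
```

In this toy data the two phone tweets and the two bank tweets end up in separate clusters, and within each cluster the tweet from the higher-influence user is chosen as the summary. A real system would replace the tag-set Jaccard with a similarity derived from folksonomy tag clusters and derive influence from user information, as the abstract describes.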

Appears in Collections: COLLEGE OF COMPUTING > DEPARTMENT OF ARTIFICIAL INTELLIGENCE > 1. Journal Articles

Related Researcher

Lee, Dong Ho: ERICA College of Computing (소프트웨어융합대학), Department of Artificial Intelligence
