A Comparative Study on the Correlation Between Similarity and Length of News from Telecommunications and Media Companies
- Authors
- Park, Yougyung; Joe, Inwhee
- Issue Date
- Apr-2023
- Publisher
- Springer International Publishing AG
- Keywords
- correlation analysis; cosine similarity; News length; News similarity; TF-IDF
- Citation
- Lecture Notes in Networks and Systems, v.724 LNNS, pp 555 - 569
- Pages
- 15
- Indexed
- SCOPUS
- Journal Title
- Lecture Notes in Networks and Systems
- Volume
- 724 LNNS
- Start Page
- 555
- End Page
- 569
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/192231
- DOI
- 10.1007/978-3-031-35314-7_49
- ISSN
- 2367-3370; 2367-3389
- Abstract
- This study is motivated by the growing damage caused by fake news, abusing articles, plagiarism, and near-duplicate stories in online news; activities and systems for monitoring news copyright violations are emerging in response. Copyright violations can be detected from the similarity of news content. However, checking similarity is very difficult when hundreds of thousands of articles are distributed every day, many of them covering similar topics and content. Noting that news articles are becoming shorter as reading shifts to mobile devices, we investigate the correlation between news length and news similarity. To this end, we collected news from five sections registered on the Naver portal in the first half of 2022, measured the length of each article, measured news similarity, and analyzed the correlation between the two. Similarity was computed using TF-IDF and cosine similarity, and the analysis was restricted to news with a similarity of 0.7 or higher. Two approaches were used to analyze the correlation between news length and news similarity. The first classifies news by section. The second first classifies news by producer, into telecommunication companies and media companies. The first approach showed a weaker correlation, whereas the second showed a negative correlation: the shorter the news, the higher the similarity. Telecommunication companies showed a stronger correlation between news length and similarity than media companies.
- Appears in Collections
- Seoul College of Engineering > Seoul Division of Computer Software > 1. Journal Articles
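
The abstract names three building blocks: TF-IDF weighting, cosine similarity with a 0.7 cutoff, and a correlation between news length and similarity. The following is a minimal sketch of how these pieces might fit together, not the authors' implementation. Several details are assumptions made for illustration because the abstract does not specify them: whitespace tokenization (Korean text would normally need morphological analysis), the smoothed IDF formula, character count as the length measure, pairing each similar article pair with the mean length of its two articles, the Pearson coefficient as the correlation measure, and the sample texts themselves.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    # Assumption: whitespace tokenization stands in for a real tokenizer.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Assumption: smoothed IDF; the paper's exact weighting is not stated.
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_pairs(docs, threshold=0.7):
    """Return (i, j, similarity) for document pairs at or above the threshold."""
    vectors = tfidf_vectors(docs)
    pairs = []
    for i, j in combinations(range(len(docs)), 2):
        sim = cosine_similarity(vectors[i], vectors[j])
        if sim >= threshold:
            pairs.append((i, j, sim))
    return pairs


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if it is undefined."""
    n = len(xs)
    if n < 2:
        return float("nan")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def length_similarity_correlation(docs, threshold=0.7):
    """Correlate pair length with pair similarity over the similar pairs."""
    pairs = similar_pairs(docs, threshold)
    # Assumption: length of a pair = mean character count of its two articles.
    lengths = [(len(docs[i]) + len(docs[j])) / 2 for i, j, _ in pairs]
    sims = [sim for _, _, sim in pairs]
    return pearson(lengths, sims)


if __name__ == "__main__":
    # Hypothetical sample texts, not data from the study.
    sample = [
        "carrier launches new 5g plan",
        "carrier launches new 5g plan today",
        "carrier launches new 5g plan with wider coverage in rural areas",
        "city council approves budget for new library",
    ]
    for i, j, sim in similar_pairs(sample, threshold=0.7):
        print(i, j, round(sim, 3))
    print(length_similarity_correlation(sample, threshold=0.7))
```

To mirror the second approach in the abstract, `length_similarity_correlation` would be called once per producer group (for example, one list of articles from telecommunication companies and one from media companies) and the resulting coefficients compared.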
