Improving K Nearest Neighbor into String Vector Version for Text Categorization

Authors
Jo, Taeho
Issue Date
2019
Publisher
IEEE
Keywords
String Vector; K Nearest Neighbor; Text Categorization
Citation
2019 21ST INTERNATIONAL CONFERENCE ON ADVANCED COMMUNICATION TECHNOLOGY (ICACT): ICT FOR 4TH INDUSTRIAL REVOLUTION, pp.1091 - 1097
Journal Title
2019 21ST INTERNATIONAL CONFERENCE ON ADVANCED COMMUNICATION TECHNOLOGY (ICACT): ICT FOR 4TH INDUSTRIAL REVOLUTION
Start Page
1091
End Page
1097
URI
https://scholarworks.bwise.kr/hongik/handle/2020.sw.hongik/28050
ISSN
1738-9445
Abstract
This research is concerned with a string vector based version of KNN (K Nearest Neighbor) as an approach to text categorization. Traditionally, texts have been encoded into numerical vectors in order to use the traditional version of KNN, and this encoding leads to three main problems: huge dimensionality, sparse distribution, and poor transparency. To solve these problems, this research proposes that texts be encoded into string vectors, that a similarity measure between string vectors be defined, and that KNN be modified into a version that takes string vectors as its input. The proposed KNN version is validated empirically by comparing it with the traditional KNN version on three collections: NewsPage.com, Opiniopsis, and 20NewsGroups. The goal of this research is to improve text categorization performance by solving these problems.
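The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not state how texts are encoded or how similarity between string vectors is defined, so the choices here are assumptions made only for illustration. The hypothetical `encode` keeps the most frequent words of a text as its string vector, and the hypothetical `similarity` is the fraction of positions at which two string vectors hold the same word, standing in for the similarity measure the paper defines.

```python
from collections import Counter


def encode(text, dim=3):
    """Encode a text as a string vector (assumed scheme: its `dim` most
    frequent words, padded with empty strings when the text is short)."""
    counts = Counter(text.lower().split())
    # Sort by descending frequency, then alphabetically for determinism.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    vec = ranked[:dim]
    return tuple(vec + [""] * (dim - len(vec)))


def similarity(u, v):
    """Placeholder similarity (an assumption, not the paper's measure):
    fraction of positions where both vectors hold the same non-empty word."""
    matches = sum(1 for a, b in zip(u, v) if a and a == b)
    return matches / len(u)


def knn_classify(query, training, k=3):
    """Label `query` by majority vote among its k most similar
    training string vectors; `training` is a list of (vector, label)."""
    ranked = sorted(training, key=lambda item: similarity(query, item[0]),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data, invented for this example only.
train = [
    (("game", "team", "score"), "sports"),
    (("game", "player", "score"), "sports"),
    (("market", "stock", "price"), "business"),
    (("market", "bank", "price"), "business"),
]
label = knn_classify(("game", "coach", "score"), train, k=3)
```

Because each element of a string vector is a word rather than a number, the neighbors that drive a decision can be read directly, which is the transparency the abstract refers to. A similarity measure that accounts for related but non-identical words would replace the exact-match placeholder.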