Text classification using parallel word-level and character-level embeddings in convolutional neural networks
- Authors
- Kim, Geonu; Jang, Jungyeon; Lee, Juwon; Kim, Kitae; Yeo, Woonyoung; Kim, Jong Woo
- Issue Date
- Dec-2019
- Publisher
- Korean Society of Management Information Systems
- Keywords
- Word-level Embedding; Character-level Embedding; Convolutional Neural Network; Text Classification
- Citation
- Asia Pacific Journal of Information Systems, v.29, no.4, pp.771 - 788
- Indexed
- SCOPUS; KCI
- Journal Title
- Asia Pacific Journal of Information Systems
- Volume
- 29
- Number
- 4
- Start Page
- 771
- End Page
- 788
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/11582
- DOI
- 10.14329/apjis.2019.29.4.771
- ISSN
- 2288-5404
- Abstract
- Deep learning techniques such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) outperform traditional approaches such as Support Vector Machines (SVMs) and Naïve Bayesian approaches in text classification. When CNNs are used for text classification, word embedding or character embedding transforms words or characters into fixed-size vectors before they are fed into the convolutional layers. In this paper, we propose a parallel word-level and character-level embedding approach in CNNs for text classification. The proposed approach captures word-level and character-level patterns concurrently in CNNs. To show the usefulness of the proposed approach, we perform experiments with two English and three Korean text datasets. The experimental results show that character-level embedding works better in Korean and word-level embedding performs well in English. The results also reveal that the proposed approach performs better than traditional CNNs with word-level embedding or character-level embedding alone, on both Korean and English documents. On more detailed investigation, we find that the proposed approach tends to outperform the traditional embedding approaches when the amount of data is relatively small.
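The parallel embedding idea described in the abstract can be illustrated with a minimal, untrained forward pass. This is only a sketch under stated assumptions, not the authors' architecture. The abstract does not give layer sizes, kernel widths, tokenization, or training details, so every name and number here is hypothetical: the classes `Branch` and `ParallelCNN`, the embedding dimensions, the filter counts, and the whitespace tokenizer. Each branch embeds its units (words or characters), applies a 1D convolution with ReLU, and max-pools over time. The two pooled vectors are then concatenated and passed to a softmax layer.

```python
import math
import random

PAD, UNK = 0, 1  # reserved ids for padding and unknown units


def build_vocab(units):
    """Map each distinct unit (word or character) to an integer id."""
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for u in units:
        vocab.setdefault(u, len(vocab))
    return vocab


def rand_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class Branch:
    """Embedding -> 1D convolution -> ReLU -> max-over-time pooling."""

    def __init__(self, vocab, emb_dim, kernel_size, n_filters, max_len, rng):
        self.vocab, self.max_len, self.k = vocab, max_len, kernel_size
        self.emb = rand_matrix(len(vocab), emb_dim, rng)
        # Each filter is stored flattened: kernel_size * emb_dim weights.
        self.filters = rand_matrix(n_filters, kernel_size * emb_dim, rng)

    def features(self, units):
        ids = [self.vocab.get(u, UNK) for u in units][: self.max_len]
        ids += [PAD] * (self.max_len - len(ids))  # pad to a fixed length
        seq = [self.emb[i] for i in ids]
        pooled = []
        for w in self.filters:
            best = 0.0  # ReLU floor, so pooled values are never negative
            for s in range(self.max_len - self.k + 1):
                window = [x for vec in seq[s : s + self.k] for x in vec]
                best = max(best, sum(a * b for a, b in zip(w, window)))
            pooled.append(best)
        return pooled


class ParallelCNN:
    """Word-level and character-level branches run side by side."""

    def __init__(self, texts, n_classes, seed=0):
        rng = random.Random(seed)
        words = [w for t in texts for w in t.lower().split()]
        chars = [c for t in texts for c in t.lower()]
        # Hypothetical sizes: (emb_dim, kernel_size, n_filters, max_len).
        self.word_branch = Branch(build_vocab(words), 8, 2, 4, 10, rng)
        self.char_branch = Branch(build_vocab(chars), 4, 3, 6, 40, rng)
        self.out = rand_matrix(n_classes, 4 + 6, rng)  # dense output layer

    def features(self, text):
        text = text.lower()
        word_feats = self.word_branch.features(text.split())
        char_feats = self.char_branch.features(list(text))
        return word_feats + char_feats  # concatenate the two branches

    def predict_proba(self, text):
        feats = self.features(text)
        logits = [sum(w * f for w, f in zip(row, feats)) for row in self.out]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(z - m) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]


model = ParallelCNN(["good movie", "bad movie"], n_classes=2)
probs = model.predict_proba("a good film")
```

Because the weights are random, `probs` carries no meaning beyond showing the data flow. A real model would learn the embeddings, filters, and output layer by backpropagation, typically in a deep learning framework. Iterating over a Python string yields one element per character, so the character branch handles Korean syllable blocks the same way as Latin letters. Whether the paper decomposes syllables further is not stated in the abstract.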